Communicating With Your SSD
Understanding SMART Attributes
SMART (also written S.M.A.R.T.), which stands for Self-Monitoring, Analysis and Reporting Technology, is an industry-standard reliability prediction indicator for both IDE/ATA and SCSI storage drives. When analyzing SMART attributes, it is very important to remember that they vary in meaning and interpretation by manufacturer. SMART simply refers to a signaling method between sensors in the drive and the host computer: the communication protocol is standardized, but what it reports is not.
SMART monitors computer drives to detect and report on various reliability indicators. The technology aims to anticipate failures and warn users of impending drive failure, allowing the user to replace an ailing drive to avoid data loss and/or unexpected outages. Of course, SMART can only warn of predictable errors, which result from slow processes like mechanical wear and can be predicted by analyzing certain indicators (such mechanical problems accounted for 60% of HDD failures). Unpredictable failures, like a sudden mechanical failure resulting from an electrical surge, have no measurable variables to track and analyze. Modern SMART implementations (in HDDs) also try to prevent failures by attempting to detect and repair sector errors. During periods of inactivity, all data and all sectors are tested to confirm the drive’s health.
In addition to the functions discussed above and the individual SMART attributes outlined in the next section, SMART-enabled drives are also capable of reporting a SMART status. This status takes one of two values, usually “drive OK” and “drive fail” or “threshold not exceeded” and “threshold exceeded.” A “drive fail” or “threshold exceeded” value indicates there is a high probability the drive will fail in the future; however, the failure may not be catastrophic: the SMART status simply indicates that the drive will not perform within the manufacturer’s declared specifications. So, for example, rather than complete data loss, the drive may simply begin to run slower. As with any technology, the SMART status is not infallible and may not necessarily indicate past or present reliability. The SMART sensors may malfunction, for instance, or a serious mechanical failure may destroy access to the SMART status.
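As a rough illustration of how a two-valued status could be derived from per-attribute thresholds, the sketch below flags “threshold exceeded” when any attribute’s normalized value has fallen to or below its threshold. The class, the function name, and the sample values are hypothetical, and the comparison rule (normalized value at or below a non-zero threshold) is an assumption based on common SMART tooling conventions, not a description of any particular drive’s firmware.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    """Hypothetical container for one reported SMART attribute."""
    attr_id: int
    name: str
    normalized: int  # vendor-scaled health value (higher is assumed healthier)
    threshold: int   # vendor-defined failure threshold


def smart_status(attributes):
    """Return a two-valued status string for a list of attributes.

    Assumption: an attribute counts as failed when its normalized value
    is at or below a non-zero threshold.
    """
    exceeded = any(
        a.threshold > 0 and a.normalized <= a.threshold for a in attributes
    )
    return "threshold exceeded" if exceeded else "threshold not exceeded"


# Illustrative values only; real drives report their own numbers.
sample = [
    SmartAttribute(5, "Reallocated Sector Count", normalized=100, threshold=10),
    SmartAttribute(177, "Wear Leveling Count", normalized=99, threshold=5),
]
print(smart_status(sample))  # prints "threshold not exceeded"
```

Because the meaning of each attribute varies by manufacturer, a check like this only mirrors the drive’s own verdict; it does not add information beyond what the vendor’s thresholds encode.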
Finally, it is important to remember that SMART attributes vary in both meaning and interpretation by manufacturer. Some attributes are considered trade secrets, and not all drives report the same SMART attributes. A manufacturer, in theory, could report only one SMART value and advertise its drive as SMART-enabled. The SMART standard simply refers to a signaling method between sensors in the drive and the host computer, not a standardization of the attributes themselves.
Device manufacturers who implement SMART technology enable a set of attributes and corresponding thresholds. Please note that names and descriptions may vary by OEM. Also, many attributes were defined for use with traditional HDDs. As a result, some attributes are used with modified meaning by SSD vendors since their names are not applicable to SSD technology. Below are the SMART Attributes associated with the Samsung 840 and 840 PRO Series SSDs, which are displayed in decimal format.
ID # 5 Reallocated Sector Count
The raw value of this attribute represents the number of sectors that have been moved as a result of a read error, write error, or a verification error. If the firmware detects any of these types of errors, all valid data in the block the error originates from must be transferred to a new block. This number should be low because a high number would indicate a large number of failures.
ID # 9 Power-On Hours
The raw value of this attribute shows the total count of hours the drive has spent in the power-on state. When the system or SSD is in Hibernation Mode, the Power-On Hours value does not increment. Samsung’s SSDs support the DIPM (Device Initiated Power Management) feature. Thus, with this feature enabled, this attribute excludes any time the device spends in a “sleep” state. With DIPM off, the recorded value will include all three device power states: active, idle, and sleep.
ID # 12 Power-On Count
The raw value of this attribute reports the cumulative number of power on/off cycles. This includes both sudden power off and normal power off cases.
ID # 177 Wear Leveling Count
This attribute represents the number of media program and erase operations (the number of times a block has been erased). This value is directly related to the lifetime of the SSD. The raw value of this attribute shows the total count of P/E Cycles.
ID # 179 Used Reserved Block Count (total)
This attribute represents the number of reserved blocks that have been used as a result of a read, program or erase failure. This value is related to attribute 5 (Reallocated Sector Count) and will vary based on SSD density.
ID # 181 Program Fail Count (total)
This attribute represents a total count of the number of failed program requests (failed writes).
ID # 182 Erase Fail Count (total)
This attribute represents a total count of the number of failed erase requests.
ID # 183 Runtime Bad Count (total)
This attribute is equal to the sum of the Program Fail Count (attribute 181), the Erase Fail Count (attribute 182), and the Read Fail count. This summary value represents the total count of all read, program, and erase failures.
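The relationship described above is a simple sum, which can be sketched as follows. The function name and the sample counts are hypothetical. Note that the Read Fail count is not listed as a separate attribute in this section, so in practice it may only be inferable as the remainder once attributes 181 and 182 are subtracted.

```python
def runtime_bad_count(program_fail, erase_fail, read_fail):
    """Sum of program, erase, and read failures, per the description above."""
    return program_fail + erase_fail + read_fail


def implied_read_fail(runtime_bad, program_fail, erase_fail):
    """Read failures implied by attribute 183 minus attributes 181 and 182."""
    return runtime_bad - program_fail - erase_fail


# Illustrative raw values only.
total = runtime_bad_count(program_fail=2, erase_fail=1, read_fail=3)
print(total)                           # prints 6
print(implied_read_fail(total, 2, 1))  # prints 3
```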
ID # 187 Uncorrectable Error Count
The total number of errors that could not be recovered using ECC.
ID # 190 Airflow Temperature
The current temperature of the area surrounding the NAND chips inside of the SSD.
ID # 195 ECC Error Rate
The percentage of ECC-correctable errors.
ID # 199 CRC Error Count
The number of Cyclic Redundancy Check (CRC) errors. If there is a problem between the host and the DRAM or NAND flash, the CRC engine will tally the error and store it in this attribute.
ID # 235 Power Recovery Count
A count of the number of sudden power off cases. If there is a sudden power off, the firmware must recover all of the mapping and user data during the next power on. This is a count of the number of times this has happened.
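Since the Power-On Count (attribute 12) includes both sudden and normal power-off cases, while this attribute counts only the sudden ones, the number of normal power-offs can be estimated by subtraction. The sketch below is an approximation under that reading; the function names and sample values are hypothetical, and the result may be off by one depending on whether the current power-on session is counted.

```python
def normal_power_off_count(power_on_count, power_recovery_count):
    """Estimate normal (graceful) power-offs: attribute 12 minus attribute 235."""
    return max(power_on_count - power_recovery_count, 0)


def sudden_power_off_ratio(power_on_count, power_recovery_count):
    """Fraction of power cycles that ended in a sudden power-off."""
    if power_on_count == 0:
        return 0.0
    return power_recovery_count / power_on_count


# Illustrative raw values only.
print(normal_power_off_count(400, 20))  # prints 380
print(sudden_power_off_ratio(400, 20))  # prints 0.05
```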
ID # 241 Total LBAs Written
This attribute represents the total number of LBAs (Logical Block Addresses) written by all of the write requests sent to the SSD from the OS. To calculate the total size in bytes, multiply the raw value of this attribute by 512 bytes. Alternatively, users may simply consult the Total Bytes Written indicator in Magician 4.0.
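The conversion described above can be sketched in a few lines. The 512-byte multiplier comes from the text; the function names, the binary-unit formatting helper, and the sample raw value are illustrative assumptions.

```python
LBA_SIZE_BYTES = 512  # multiplier given in the attribute description


def total_bytes_written(raw_lbas_written):
    """Convert the raw value of attribute 241 into bytes."""
    return raw_lbas_written * LBA_SIZE_BYTES


def human_readable(num_bytes):
    """Format a byte count using binary units (a hypothetical helper)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(num_bytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


# Illustrative raw value only: 2**31 LBAs of 512 bytes each is 2**40 bytes.
raw = 2_147_483_648
print(total_bytes_written(raw))                  # prints 1099511627776
print(human_readable(total_bytes_written(raw)))  # prints "1.00 TiB"
```

The helper reports binary units (1 TiB = 2^40 bytes); tools that report decimal terabytes will show a slightly larger number for the same raw value.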