How do you check the health of a hard drive?
Whether you are a private user or work in an IT department, you can sometimes be confronted with a stark reality: a hard drive no longer works as it should, or data suddenly disappears. Unless there is an obvious symptom such as a loud scratching or ‘clicking’ noise, it is not always immediately clear whether a hard drive is physically defective. Many users are therefore unsettled at first and do not know how to react: is it a one-off event or a recurring problem? Hard drive failures can happen relatively quickly, as the average lifetime of these devices is usually around 3 to 5 years.
If the affected hard drive is still responding, a variety of diagnostic software products can be used to assess the condition of the device. The quality of these tools varies: some are relatively harmless and cannot cause any (further) damage, while others interfere so deeply with the hardware that the hard drive (and its data) can be severely damaged by the tool.
A smart approach
The most important technology used in checking a hard drive is ‘SMART’ (Self-Monitoring, Analysis and Reporting Technology). Almost all modern hard drives support SMART analysis tools for checking the physical condition of the media. According to a study conducted by Google several years ago, two out of three hard drive crashes can be predicted using such an analysis. SMART tools include 10 critical indicators that can be queried; when one of them signals a problem, it is flagged as a "Possible Indicator for an Impending Electromechanical Failure". In addition, SMART tools cover about 30 other attributes that can be queried and serve as indicators of impending failures. This helps you decide whether your hardware needs to be replaced to avoid potential data loss.
One of these attributes is the SSD indicator "SSD Life Left". It shows the approximate remaining lifetime of an SSD in terms of program/erase cycles or the flash blocks that are still available, which makes it very handy for monitoring the health of your solid-state drives.
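As a rough illustration of how such attributes can be interpreted, the sketch below flags any attribute whose normalized value has dropped to or below the drive's threshold, and also flags non-zero raw counts on a few counters that are commonly treated as critical. The attribute IDs used here (5, 10, 197, 198) are an illustrative subset chosen for this example, not the full list of critical indicators referred to above, and the `SmartAttribute` type and `assess` function are hypothetical names.

```python
from dataclasses import dataclass

# Illustrative subset of attributes often treated as critical (an assumption
# made for this sketch, not the complete list mentioned in the article).
CRITICAL_IDS = {
    5: "Reallocated Sectors Count",
    10: "Spin Retry Count",
    197: "Current Pending Sector Count",
    198: "Offline Uncorrectable Sector Count",
}


@dataclass
class SmartAttribute:
    attr_id: int    # attribute number reported by the drive
    value: int      # normalized value (higher is usually better)
    threshold: int  # manufacturer threshold; 0 means no threshold is defined
    raw: int        # raw counter as reported by the drive


def assess(attributes):
    """Return a list of human-readable warnings for a set of attributes."""
    warnings = []
    for attr in attributes:
        name = CRITICAL_IDS.get(attr.attr_id, f"Attribute {attr.attr_id}")
        # A normalized value at or below a non-zero threshold means the
        # drive itself considers this attribute to have failed.
        if attr.threshold > 0 and attr.value <= attr.threshold:
            warnings.append(
                f"{name}: value {attr.value} <= threshold {attr.threshold}"
            )
        # For the critical counters, any non-zero raw count deserves attention.
        elif attr.attr_id in CRITICAL_IDS and attr.raw > 0:
            warnings.append(f"{name}: raw count is {attr.raw}")
    return warnings


sample = [
    SmartAttribute(5, 100, 36, 8),    # some sectors already reallocated
    SmartAttribute(9, 98, 0, 12000),  # power-on hours, informational only
    SmartAttribute(197, 100, 0, 0),   # no pending sectors
]
for line in assess(sample):
    print(line)  # prints "Reallocated Sectors Count: raw count is 8"
```

The two checks are kept separate on purpose: a threshold breach is the drive's own verdict, whereas a growing raw counter is only an early hint that is worth watching.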
Don’t leave it too late
Users often first come into contact with SMART tools only after damage has already occurred (and when it is often too late). For this reason, you should use SMART tools at regular intervals throughout the life of your device and monitor the results over time. Unfortunately, most operating systems do not automatically collect the diagnostic data and present it to the user in a clear format. Instead, the data has to be collected with the help of special tools.
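Monitoring at regular intervals mostly comes down to comparing today's readings with earlier ones. The following minimal sketch assumes you already store the raw counters of each check as a simple `{attribute_id: raw_value}` dictionary; the function name and the default set of watched IDs are illustrative choices, not part of any particular tool.

```python
def growing_counters(previous, current, watched=(5, 197, 198)):
    """Compare two snapshots of raw SMART counters ({id: raw_value}).

    Returns {id: (old, new)} for every watched attribute whose raw
    count has increased since the previous snapshot.
    """
    grown = {}
    for attr_id in watched:
        old = previous.get(attr_id, 0)
        new = current.get(attr_id, 0)
        if new > old:
            grown[attr_id] = (old, new)
    return grown


last_month = {5: 0, 197: 0, 198: 0}
today = {5: 16, 197: 2, 198: 0}
print(growing_counters(last_month, today))  # {5: (0, 16), 197: (0, 2)}
```

A counter that keeps rising between checks is usually a stronger warning sign than a single non-zero reading, which is why the comparison looks at the change and not at the absolute value.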
Windows Check Disk tool
Under Windows, the ‘CHKDSK’ tool can be started quickly by right-clicking the hard drive and choosing ‘Properties > Tools > Error checking’. If you only want to find out whether something is wrong, make sure that you have not ticked either the "Automatically fix file system errors" or the "Scan for and attempt recovery of bad sectors" option. The tool then checks the selected hard drive and delivers a report. Note that the tool does not list all the information available from a SMART analysis; it only states in general terms whether the storage medium is functioning or not.
On the other hand, the detailed CHKDSK report contains important information about whether the index entries of the hard disk are correct, whether the security descriptors of the files are correct and whether there are problems with the data structures. This is certainly important information, but only as a second step, because you first want to determine the physical condition of the hard drive and how long it is likely to last. That type of analysis is only possible with dedicated SMART diagnostic tools.
A general word of warning: it is extremely important to use CHKDSK with caution. Used incorrectly, CHKDSK can do more harm than good and can permanently destroy data that might otherwise have been recoverable.
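The same read-only check can be run from the command line: `chkdsk` followed only by the volume reports problems without fixing them, whereas switches such as `/f` and `/r` allow it to modify the volume. The sketch below is one possible way to guard against passing a repair switch by accident. The helper names are hypothetical, the list of rejected switches is not exhaustive, and actually running the check requires Windows and an elevated prompt.

```python
import subprocess

# CHKDSK switches that repair or otherwise modify the volume.
# This list is an assumption for the sketch and is not exhaustive.
REPAIR_SWITCHES = {"/f", "/r", "/x", "/b"}


def readonly_chkdsk_command(volume, *extra):
    """Build a CHKDSK command line that reports problems without repairing them."""
    for switch in extra:
        if switch.lower() in REPAIR_SWITCHES:
            raise ValueError(f"{switch} would let CHKDSK modify the volume")
    return ["chkdsk", volume, *extra]


def run_readonly_chkdsk(volume="C:"):
    """Run the check and return the report text (Windows, elevated prompt)."""
    result = subprocess.run(
        readonly_chkdsk_command(volume),
        capture_output=True, text=True, check=False,
    )
    return result.stdout


print(readonly_chkdsk_command("C:"))  # ['chkdsk', 'C:']
```

Calling `run_readonly_chkdsk("D:")` on a Windows machine would return the text of the report for that volume, which you can then read through before deciding on any repair.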
What can SMART tools offer?
SMART tools normally perform several different tests on the hard drive. For example, there is a ‘fast SMART check’, which queries the most important indicators in the firmware of the hard drive (according to the manufacturer's definition). These indicators include bad sectors, reassigned sectors, the number of repeated attempts the spindle needs to reach full speed (Spin Retry Count), and many others. The most important tests are the ‘Drive Self Test (DST)’, the ‘Short Test’ and the ‘Long Test’; the last of these reads every single sector of the hard drive, which can take an extremely long time. Different tests provide different types of information in their reports, allowing you to build a picture of the overall health of the drive. An overview of some of the tools available can be found here.
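To make the short and long tests more concrete, here is a minimal sketch that uses `smartctl` from the smartmontools package (discussed further down in this article) to start a self-test and to summarise the drive's self-test log. It assumes smartmontools 7.0 or later for the `-j` JSON output, and the JSON field names used in `summarise_self_tests` are assumptions based on typical ATA reports, so check them against the output of your own drive. The function names and the sample data are made up for illustration.

```python
import json
import subprocess


def start_self_test(device, kind="short"):
    """Ask the drive to start a self-test ('short' or 'long') via smartctl."""
    if kind not in ("short", "long"):
        raise ValueError("kind must be 'short' or 'long'")
    subprocess.run(["smartctl", "-t", kind, device], check=False)


def read_self_test_log(device):
    """Return the parsed JSON report of `smartctl -j -l selftest <device>`."""
    result = subprocess.run(
        ["smartctl", "-j", "-l", "selftest", device],
        capture_output=True, text=True, check=False,
    )
    return json.loads(result.stdout)


def summarise_self_tests(report):
    """Turn a parsed self-test report into (type, status, passed) tuples."""
    log = report.get("ata_smart_self_test_log", {}).get("standard", {})
    return [
        (
            entry["type"]["string"],
            entry["status"]["string"],
            entry["status"].get("passed", False),
        )
        for entry in log.get("table", [])
    ]


# Hand-written sample in the assumed format, so the sketch runs without a drive.
sample = json.loads("""
{"ata_smart_self_test_log": {"standard": {"table": [
  {"type": {"string": "Short offline"},
   "status": {"string": "Completed without error", "passed": true},
   "lifetime_hours": 12034}
]}}}
""")
print(summarise_self_tests(sample))
```

On a real system you would call `start_self_test("/dev/sda")`, wait for the test to finish, and then pass the result of `read_self_test_log("/dev/sda")` to `summarise_self_tests`; the device path is only an example.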
Manufacturer's own SMART tools
Many hard drive manufacturers provide their own diagnostic tools for their devices; these present large quantities of important information and make it available to the user. In addition, some of the tools can change the hard drive settings. However, manufacturers usually reject any responsibility for data loss resulting from the use of their tools, so be careful!
The information provided by these supplied or downloadable tools depends on the specific manufacturer. They usually list defects and offer the option to repair problematic sectors. As mentioned before, you use these tools at your own risk. Here are some links to the free diagnostic tools from Seagate, Western Digital and Fujitsu:
Commercial diagnostic tools
There are also a variety of commercial tools (both free and paid) that differ widely in scope and in the information they offer. What makes them interesting is that they combine different diagnostic checks in one solution. In addition to the SMART diagnostic checks, they usually offer drive and file benchmarks, disk monitoring and error scans, as well as power consumption and temperature checks.
Comprehensive tools that are free for personal use include CrystalDiskInfo, DiskCheckup, HD Tune and HDDScan. All four are easy to use and inform the user about the health status of the tested hard drive within a very short time. HDDScan is the only one of the four that can test RAID volumes and perform a surface test on them. With these features, the software is also suitable as a fast diagnostic tool for smaller company environments.
If you want an open source solution for professional environments, the smartmontools (SMART Monitoring Tools) are a good choice. They are available not only for Linux, Mac and Windows, but also for FreeBSD, NetBSD, OpenBSD, Solaris, OS/2, Cygwin, QNX and eComStation. In addition to individual disks, the tools also support RAID setups.
Another tool is SpeedFan. What makes it unique is that it displays not only the health condition based on SMART and other indicators, but also the predicted remaining life span. It uses an online database (called ‘in-depth online analysis’) to compare the results it obtained with those of comparable hard drives and thereby calculates the remaining lifetime of the disk. Make sure you are fully familiar with the program and its features before experimenting and changing settings; you don’t want to end up causing damage inadvertently.
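As a minimal sketch of how smartmontools can be scripted, the example below calls `smartctl` with `-H` (overall health), `-A` (attributes) and `-j` (JSON output, available from smartmontools 7.0) and extracts the overall verdict plus the raw attribute values. The JSON field names are assumptions based on typical ATA reports and may differ for NVMe or RAID-attached drives; the function names and the sample data are hypothetical.

```python
import json
import subprocess


def read_smart_report(device):
    """Run smartctl and return its JSON report as a dict (needs root/admin rights)."""
    result = subprocess.run(
        ["smartctl", "-H", "-A", "-j", device],
        capture_output=True, text=True, check=False,
    )
    return json.loads(result.stdout)


def summarise(report):
    """Return (overall_passed, {attribute_id: raw_value}) from a parsed report."""
    passed = report.get("smart_status", {}).get("passed")
    table = report.get("ata_smart_attributes", {}).get("table", [])
    raw_values = {row["id"]: row["raw"]["value"] for row in table}
    return passed, raw_values


# Hand-written sample in the assumed format, so the sketch runs without a drive.
sample_output = """
{"smart_status": {"passed": true},
 "ata_smart_attributes": {"table": [
   {"id": 5, "name": "Reallocated_Sector_Ct", "value": 100, "thresh": 36,
    "raw": {"value": 0}},
   {"id": 194, "name": "Temperature_Celsius", "value": 64, "thresh": 0,
    "raw": {"value": 36}}
 ]}}
"""
passed, raw_values = summarise(json.loads(sample_output))
print(passed, raw_values)  # True {5: 0, 194: 36}
```

On a real machine, `summarise(read_smart_report("/dev/sda"))` would produce the same kind of result for that drive; the device path is only an example, and `passed` stays `None` if the report contains no overall health verdict.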
The same is true for the ‘most dangerous’ software tools for diagnosing hard drives: low-level HDD diagnostic tools. These low-level programs can also be used to identify hard disks that are no longer detected by the operating system or even by the BIOS. One such tool is MHDD. In addition to performing SMART scans, it can be used to remap defective blocks to the reserve sectors. However, this reserve space (which is normally not accessible) will eventually fill up. If remapping is done too often, no space may be left for a "normal" remapping triggered by the firmware of the hard drive. At that point, drive failure and most likely data loss will occur.
In addition, this tool allows you to automatically erase blocks that have a long access time. In short, such a tool should only be used by users who are familiar with the topic and have read and fully understood the manual.
Seek prevention over cure
No matter which diagnostic tool you have used, if it showed you SMART errors, this does not mean that the hard drive will stop working immediately. However, you should assume that it is already failing (or close to it) and that a complete breakdown of the device is inevitable. That may take minutes, weeks or months, but don’t risk using the drive any longer.
Ensure that you have backed up your data to another storage medium, such as an external hard drive, a CD or DVD, or magnetic tape. Make sure that the backup is current, as the drive could fail at any time. With a backup at hand, you should replace the drive with a new one as soon as possible. A hard drive that fails SMART tests should not be considered reliable! Even if it does not fail completely, it could still damage parts of your data.
Of course, hardware is never perfect: hard drives can fail without any SMART warning. Nevertheless, you can still treat SMART as an early-warning technology that indicates your hard drive might soon fail; some indication is better than none at all!
Do you use a specific tool for checking the health of your devices? Have you ever used SMART tools to make a decision on replacing failing hardware? Let us know by commenting below, or tweet @OntrackUKIE