5 Steps for More Dependable Hard Drives
Media production is synonymous with data creation, so to say that we’re completely dependent on our hard drive resources would be an understatement. This is why we all adhere to systematic data backup routines.
Some creative pros hesitate to wade into the acronym-rich, apparently esoteric world of hard drive diagnostics. Still others approach the topic too eager to apply outdated lessons or irrelevant myths.
While there is no foolproof approach to preventing or predicting hard disk drive failure, this article will introduce five steps that media production pros can take toward stable, dependable hard drive resources.
Step 1: ‘Burn In’ New Hard Drives.
It’s probably no secret to many readers that so-called ‘infant mortality’ is an all-too-common occurrence with hard drives. This experience is verified in Google’s illuminating study Failure Trends in a Large Disk Drive Population.
Specifically, the Google study found that when a typical hard drive fails in connection with high or low utilization, it is most likely to do so in the first three months of the drive’s useful life.
‘Burning in’ a new hard drive provides a stress test (whether highly systematic or less formal) in which the drives that are doomed to fail early in life can either do so before they’re in use, or give strong indications that they will in the near future.
There are really two ways to approach burn-in:
- Use a disk utility or specialized burn-in utility to run a proprietary heavy-load test on your new drive(s) before integrating them into your system. These utilities will either cause a doomed drive to fail, or provide diagnostic feedback that will predict early failure.
- Devise an informal use-specific test of your own. For example, fill a disk to about 60% capacity, then create a high track count, edit-dense DAW session with a long duration. Leave it in loop playback for 12 or more hours. The disk will catalog some important diagnostic information that can help predict early failure (see Step 2).
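If you want to automate the informal approach, a scripted read-back verification pass can serve a similar purpose. The sketch below is a minimal example, not a replacement for a dedicated burn-in utility: it writes pseudo-random data to a scratch file on the target drive, reads it back, and verifies checksums. The function name, chunk sizes, and file name are my own choices, and a real burn-in run would use much larger sizes and far more passes.

```python
import hashlib
import os
import tempfile

def burn_in_pass(target_dir: str, chunk_mb: int = 64, chunks: int = 4) -> bool:
    """Write pseudo-random chunks to a scratch file on the target drive,
    read them back, and verify checksums. Returns True if every chunk
    reads back intact."""
    chunk_size = chunk_mb * 1024 * 1024
    path = os.path.join(target_dir, "burnin.tmp")
    checksums = []
    with open(path, "wb") as f:
        for _ in range(chunks):
            data = os.urandom(chunk_size)
            checksums.append(hashlib.sha256(data).hexdigest())
            f.write(data)
        f.flush()
        os.fsync(f.fileno())  # force the data to the physical disk
    ok = True
    with open(path, "rb") as f:
        for expected in checksums:
            data = f.read(chunk_size)
            ok = ok and hashlib.sha256(data).hexdigest() == expected
    os.remove(path)
    return ok
```

Point `target_dir` at a folder on the new drive and repeat the pass over many hours to approximate a sustained stress test; any verification failure is a strong sign the drive should go back to the vendor.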
Step 2: Be Smart About Using S.M.A.R.T. Attributes.
Contemporary hard drives are almost always equipped with the standardized internal diagnostic system called SMART (Self-Monitoring, Analysis, and Reporting Technology).
There are a number of SMART utilities on the market today that give us access to this abundant data. Knowing which parameters or “attributes” to focus on can make SMART a helpful tool.
Another important finding in Google’s unrivaled study is that there are particular SMART attributes that more closely correlate with drive failure than others. Specifically:
- Scan Errors are errors that can result from physical defects on the surface of a hard disk. Google’s study showed that drives with one or more scan errors during burn-in were 39 times more likely to fail in the first 60 days of use.
- Reallocation Count reflects the number of times a drive has remapped a presumably faulty sector to a new location on the disk. In Google’s study, reallocation counts greater than zero meant a drive was 14 times more likely to fail within 60 days.
There are limits to what these attributes can tell us. It’s also important to note that the SMART parameters selected above predicted only 44% of the failures in Google’s study. Still, these two attributes alone present a significant advantage over working without diagnostics.
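A SMART utility will surface these attributes for you, but the watch-list logic is simple enough to sketch. The example below parses text in the tabular format produced by tools like smartctl and flags any watched attribute with a nonzero raw value. Note the assumptions: scan errors are reported under vendor-specific names, so I watch Offline_Uncorrectable as a rough stand-in, and the sample table is invented for illustration.

```python
# Attributes to watch, per the correlations discussed above.
# "Scan errors" have no single standard name; Offline_Uncorrectable
# is used here as an assumed stand-in alongside the reallocation count.
RISK_ATTRIBUTES = {"Reallocated_Sector_Ct", "Offline_Uncorrectable"}

def flag_risky_attributes(smart_output: str) -> dict:
    """Return {attribute_name: raw_value} for any watched SMART
    attribute whose raw value is greater than zero."""
    flags = {}
    for line in smart_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] in RISK_ATTRIBUTES:
            raw = int(fields[9].split()[0])  # raw value is the last column
            if raw > 0:
                flags[fields[1]] = raw
    return flags

# Hypothetical sample of a SMART attribute table (not from a real drive):
sample = """\
  5 Reallocated_Sector_Ct   0x0033   099   099   010    Pre-fail  Always       -       8
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       1489
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
"""
```

Here `flag_risky_attributes(sample)` would report the reallocated-sector count of 8, the kind of nonzero reading that, per the study, should prompt you to back up and plan a replacement.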
Step 3: Defrag Media Drives Every 6 to 12 Months.
Defragmentation is the process of reorganizing data on a disk into as few contiguous regions as possible. Low fragmentation can drastically improve drive performance, especially for large disks. Media storage specialists like Studio Network Solutions recommend a regular 6 to 12 month cycle of defragmentation.
There are integrated operating system utilities and third-party utilities that let you defrag your internal and external drives. Be sure to have your data completely backed up before beginning this process.
Step 4: Don’t Get Distracted by Apparent Myths.
In addition to the factors mentioned above that the Google study was able to correlate with drive failure, there were a number of factors that were exposed as apparently irrelevant (or at least distracting). Among these factors:
- High operating temperature doesn’t correlate with drive failure.
- Higher utilization doesn’t predict failure, provided the drive isn’t frequently powered on and off. Of course, this only applies to drives beyond the 6-month mark.
- Seek errors don’t reliably predict failure.
Step 5: End-of-Life Planning.
Every drive will eventually reach an age when all indications point to oncoming failure. Heed the warnings, and be prepared to replace aging drives. For drives that make it past the critical early stages, mortality becomes very make/model specific, but fairly uniform within a given model. In general you can expect to get 2 to 4 years out of a well-cared-for media drive.
I keep a log that shows when each drive in my system was introduced, its defrag schedule, and any warning signs from SMART logs. This has proven to be a very useful tool in maintaining drive health and predicting drive failure.
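That kind of log can be as simple as a spreadsheet, but a small structured record makes the reminders automatic. The sketch below is one possible shape for it; the field names, the 6-month defrag cycle, and the 3-year age threshold are my own assumptions, chosen to match the maintenance intervals discussed above.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class DriveRecord:
    """One entry in a drive maintenance log (illustrative sketch)."""
    label: str
    introduced: date
    last_defrag: date
    smart_warnings: list = field(default_factory=list)

    def defrag_due(self, today: date, cycle_months: int = 6) -> bool:
        """True if the last defrag is older than the chosen cycle."""
        return today - self.last_defrag > timedelta(days=cycle_months * 30)

    def age_warning(self, today: date, lifespan_years: int = 3) -> bool:
        """True once the drive passes a typical 2-to-4-year service life."""
        return today - self.introduced > timedelta(days=lifespan_years * 365)
```

A quick scan over all records at the start of a session flags anything due for a defrag pass, nearing retirement age, or carrying SMART warnings, before it flags you with lost work.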
Equally important, I try to stay up-to-date on known issues with the hardware and software systems in my studio. Many DAW and I/O hardware problems can masquerade as drive problems. Being able to eliminate the most likely and routine drive maintenance issues accelerates the troubleshooting process, and keeps downtime to a minimum.