An Analysis of Latent Sector Errors in Disk Drives

Lakshmi N. Bairavasundaram, Garth R. Goodson*, Shankar Pasupathy*, Jiri Schindler*
Department of Computer Sciences, University of Wisconsin-Madison
*Network Appliance, Inc.


The reliability measures in today's disk drive-based storage systems focus predominantly on protecting against complete disk failures. Previous disk reliability studies have analyzed empirical data in an attempt to better understand and predict disk failure rates. Yet very little is known about the incidence of latent sector errors, i.e., errors that go undetected until the corresponding disk sectors are accessed.
     Our study analyzes data collected from production storage systems over 32 months across 1.53 million disks (both nearline and enterprise class). We analyze factors that impact latent sector errors, observe trends, and explore their implications for the design of reliability mechanisms in storage systems. To the best of our knowledge, this is the first study of such large scale (our sample size is at least an order of magnitude larger than that of previously published studies) and the first one to focus specifically on latent sector errors and their implications for the design and reliability of storage systems.

