
The ADvanced Systems Laboratory (ADSL)
Publication abstract

Towards Reliable Storage Systems

Haryadi S. Gunawi
Department of Computer Sciences,
University of Wisconsin-Madison

The 2009 departmental Best Thesis Award
The 2009 ACM Doctoral Dissertation Award Honorable Mention

Full Paper: PDF   BibTeX


Users are storing increasingly massive amounts of data. Storage software complexity is growing. The use of cheap and less reliable hardware is increasing. The combination of these trends presents us with a terrific challenge: How can we promise users that storage systems work robustly in spite of the complex failures that can arise?

In the first part of this dissertation, we respond to this question with our analysis of three reliability components present in many modern file systems: the file system checker (fsck), failure detection and recovery policies (failure policy), and journaling. We find that these subsystems are deficient in handling partial disk failures. In the fsck analysis, we find that some repairs are buggy (leaving the repaired file system more corrupted than before) and some repairs are missing (leaving some corruptions unattended). In the failure policy analysis, we observe a major problem of diffused fault handling, which causes policies to be inconsistent, buggy, and inflexible to change. In the journaling analysis, we find that current journaling frameworks cannot recover from checkpoint write failures, and hence write failures are intentionally ignored. The results of our analysis suggest that managing failures is hard (as also hinted by the developer's comment), and hence call for novel solutions for building more reliable storage systems.

In the second part of this dissertation, we present our solutions to the problems above. First, we re-architect the file system checker by introducing SQCK, a robust file system checker that employs a declarative query language. By writing hundreds of checks and repairs in a query language (e.g., SQL), the high-level intent of the checker can be specified in a clear and compact manner. We show that SQCK provides the same functionality as the Linux ext2/3 checker with elegant and compact queries.
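
To illustrate what a declarative check and repair might look like, here is a minimal sketch using Python's built-in sqlite3 module. The `block_ptrs` table, its columns, and the function names are hypothetical and chosen only for illustration; they are not SQCK's actual schema or queries. The check is one SELECT that finds block pointers outside the valid range, and the repair is one UPDATE that clears them.

```python
import sqlite3

# Hypothetical range predicate shared by the check and the repair.
OUT_OF_RANGE = "block != 0 AND (block < ? OR block >= ?)"


def make_toy_image():
    """Build an in-memory table standing in for loaded file system metadata."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE block_ptrs (inode INTEGER, slot INTEGER, block INTEGER)"
    )
    conn.executemany(
        "INSERT INTO block_ptrs VALUES (?, ?, ?)",
        [(2, 0, 100), (2, 1, 5000), (11, 0, 3), (12, 0, 0)],
    )
    return conn


def find_bad_block_pointers(conn, first_data_block, total_blocks):
    """Check: list (inode, slot, block) rows whose pointer is out of range."""
    query = (
        "SELECT inode, slot, block FROM block_ptrs WHERE "
        + OUT_OF_RANGE
        + " ORDER BY inode, slot"
    )
    return conn.execute(query, (first_data_block, total_blocks)).fetchall()


def clear_bad_block_pointers(conn, first_data_block, total_blocks):
    """Repair: zero every out-of-range pointer and return how many changed."""
    cur = conn.execute(
        "UPDATE block_ptrs SET block = 0 WHERE " + OUT_OF_RANGE,
        (first_data_block, total_blocks),
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    image = make_toy_image()
    print(find_bad_block_pointers(image, 10, 1024))
    print(clear_bad_block_pointers(image, 10, 1024))
```

Because the check and the repair share one predicate, the intent ("a pointer must be zero or inside the data region") is stated once rather than spread across procedural code.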

Second, we present EDP, a static analysis tool that shows how error codes flow through file systems and storage drivers. We observe that low-level errors are sometimes lost as they travel through the many layers of the storage subsystem: out of the 9022 function calls through which the analyzed error codes propagate, we find that 1153 calls (13%) do not correctly save the propagated error codes. Our detailed analysis shows that many violations are not corner-case mistakes; the return codes of some functions are consistently ignored.
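
The kind of violation described above can be sketched with a toy analysis. This is not EDP itself, which analyzes file system and driver code; as a stand-in, the sketch below uses Python's ast module on Python source and flags calls to error-returning functions whose result is discarded. The function names `write_block` and `flush_cache` and the sample code are hypothetical.

```python
import ast

# Hypothetical code under analysis: flush_cache's result is silently dropped.
SAMPLE = """
def sync_inode(inode):
    err = write_block(inode)
    if err:
        return err
    flush_cache(inode)
    return 0
"""


def find_unsaved_calls(source, error_funcs):
    """Return (function name, line number) for each discarded error result.

    A call counts as a violation when it appears as a bare expression
    statement, so its return value is neither stored nor checked.
    """
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Expr) and isinstance(node.value, ast.Call):
            func = node.value.func
            if isinstance(func, ast.Name) and func.id in error_funcs:
                violations.append((func.id, node.lineno))
    return sorted(violations, key=lambda item: item[1])


if __name__ == "__main__":
    for name, line in find_unsaved_calls(SAMPLE, {"write_block", "flush_cache"}):
        print(f"line {line}: return value of {name}() is ignored")
```

A real propagation analysis must also follow error codes across assignments and through callers; this sketch covers only the simplest case of a dropped return value.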

Finally, we present I/O shepherding, a new reliability infrastructure for file systems. With I/O shepherding, the reliability policies of a file system are well-defined, easy to understand, and simple to tailor to environment and workload. As part of this framework, we also introduce chained transactions, a novel and more powerful transactional model for checkpoint recoveries. We show that I/O shepherding enables simple, powerful, and correctly implemented reliability policies by implementing an increasingly complex set of policies.
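
One way to picture a reliability policy that is defined in a single place is as a composition of small building blocks. The sketch below is only an illustration under that assumption: `DiskError`, `retry`, `fallback`, and `make_flaky_read` are hypothetical names, not the interface of I/O shepherding, and chained transactions are not modeled.

```python
class DiskError(Exception):
    """Hypothetical error raised when a block operation fails."""


def retry(op, attempts):
    """Wrap op so that a failing call is tried up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    def policy(block):
        last_error = None
        for _ in range(attempts):
            try:
                return op(block)
            except DiskError as err:
                last_error = err
        raise last_error

    return policy


def fallback(primary, secondary):
    """Wrap two ops so that secondary runs only if primary fails."""

    def policy(block):
        try:
            return primary(block)
        except DiskError:
            return secondary(block)

    return policy


def make_flaky_read(failures, data):
    """Simulated read that fails `failures` times, then returns `data`."""
    remaining = {"count": failures}

    def read(block):
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise DiskError(f"read of block {block} failed")
        return data

    return read


if __name__ == "__main__":
    # Policy: retry the primary copy three times, then read the replica.
    read_policy = fallback(
        retry(make_flaky_read(5, b"primary"), 3),
        make_flaky_read(0, b"replica"),
    )
    print(read_policy(7))
```

Changing the policy here, for example retrying the replica as well, means editing one expression instead of touching every call site that performs I/O.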
