The ADvanced Systems Laboratory (ADSL)
Publication abstract

IRON File Systems

Vijayan Prabhakaran
Department of Computer Sciences, University of Wisconsin-Madison


Disk drives are widely used as a primary medium for storing information. While commodity file systems trust disks to either work or fail completely, modern disks exhibit complex failure modes such as latent sector faults and block corruptions, where only portions of a disk fail.

In this thesis, we focus on understanding the failure policies of file systems and improving their robustness to disk failures. We suggest a new fail-partial failure model for disks, which incorporates realistic localized faults such as latent sector faults and block corruption. We then develop and apply a novel semantic failure analysis technique, which uses file system block type knowledge and transactional semantics, to inject interesting faults and investigate how commodity file systems react to a range of more realistic disk failures.
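To illustrate the idea of injecting faults by block type, the following is a toy in-memory sketch, not the thesis's actual tool. The class name `FaultInjectingDisk`, the fault names, and the block-type labels are all hypothetical; a real injector would sit beneath the file system and infer block types from on-disk structures.

```python
class FaultInjectingDisk:
    """Toy disk that fails reads for every block of a chosen type (hypothetical)."""

    FAULTS = {"read_error", "corruption"}

    def __init__(self, blocks, block_types):
        self.blocks = dict(blocks)            # block number -> bytes
        self.block_types = dict(block_types)  # block number -> type label
        self.armed = {}                       # type label -> fault kind

    def inject(self, block_type, fault):
        """Arm a fault for all blocks of the given type."""
        if fault not in self.FAULTS:
            raise ValueError(f"unknown fault: {fault}")
        self.armed[block_type] = fault

    def read(self, block_no):
        data = self.blocks[block_no]
        fault = self.armed.get(self.block_types.get(block_no))
        if fault == "read_error":
            # Models a latent sector fault: the read fails outright.
            raise OSError(f"injected read error on block {block_no}")
        if fault == "corruption":
            # Models silent corruption: the read succeeds with wrong contents.
            return bytes(b ^ 0xFF for b in data)
        return data
```

With such a layer in place, one could arm a fault for, say, inode blocks only, run a workload, and observe whether the file system detects the problem and how it recovers.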

We apply our technique to five important journaling file systems: Linux ext3, ReiserFS, JFS, XFS, and Windows NTFS. We classify their failure policies in a new taxonomy that measures their Internal RObustNess (IRON), which includes both failure detection and recovery techniques. Our analysis results show that commodity file systems store little or no redundant information, and contain failure policies that are often inconsistent, sometimes buggy, and generally inadequate in their ability to recover from partial disk failures.

We remedy the reliability shortcomings in commodity file systems by addressing two issues. First, we design new low-level redundancy techniques that a file system can use to handle disk faults. We begin by qualitatively and quantitatively evaluating various forms of redundant information, such as checksums, parity, and replicas. Then, in order to account for spatially correlated faults, we propose a new probabilistic model that can be used to construct redundancy sets. Finally, we describe two update strategies, an overwrite approach and a no-overwrite approach, that a file system can use to update its data and parity blocks atomically without NVRAM support. Overall, we show that low-level redundant information can greatly enhance file system robustness while incurring modest time and space overheads.
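As a rough illustration of how checksums and parity complement each other, the sketch below detects a bad block with a per-block checksum and rebuilds it from XOR parity. It is a minimal example under stated assumptions, not the thesis's design: CRC32 stands in for whatever checksum a file system might use, all blocks are assumed to be the same length, and the function names are hypothetical.

```python
import zlib
from functools import reduce


def checksum(block: bytes) -> int:
    # CRC32 is an assumption made for this sketch.
    return zlib.crc32(block)


def xor_blocks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def parity(blocks):
    """XOR parity over a redundancy set of equal-sized blocks."""
    return reduce(xor_blocks, blocks)


def recover(blocks, checksums, parity_block):
    """Return the blocks with at most one bad block rebuilt from parity.

    A block is bad if it is None (unreadable) or fails its checksum.
    """
    bad = [i for i, b in enumerate(blocks)
           if b is None or checksum(b) != checksums[i]]
    if not bad:
        return list(blocks)
    if len(bad) > 1:
        raise ValueError("single parity cannot repair more than one block")
    survivors = [b for i, b in enumerate(blocks) if i != bad[0]]
    rebuilt = reduce(xor_blocks, survivors, parity_block)
    if checksum(rebuilt) != checksums[bad[0]]:
        raise ValueError("rebuilt block fails its checksum")
    repaired = list(blocks)
    repaired[bad[0]] = rebuilt
    return repaired
```

The checksum handles detection, including silent corruption that the disk itself does not report, while parity handles recovery. A single parity block can repair only one failure per redundancy set, which is one reason the placement of set members matters when faults are spatially correlated.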

Second, to remedy the problem of failure handling diffusion, we develop a modified ext3 that unifies all failure handling in a Centralized Failure Handler (CFH). We then showcase the power of centralized failure handling in ext3c, a modified IRON version of ext3 that uses the CFH, by demonstrating its support for flexible, consistent, and fine-grained policies. By carefully separating policy from mechanism, ext3c demonstrates how a file system can provide a thorough, comprehensive, and easily understandable failure-handling policy.
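The separation of policy from mechanism can be pictured with a small sketch. This is not the ext3c implementation; the enum values, the class interface, and the block-type labels are hypothetical and chosen only to show how a single handler can consult one policy table for every failure.

```python
from enum import Enum


class Fault(Enum):
    READ_ERROR = "read_error"
    WRITE_ERROR = "write_error"
    CORRUPTION = "corruption"


class Action(Enum):
    RETRY = "retry"
    USE_REPLICA = "use_replica"
    REMOUNT_READ_ONLY = "remount_read_only"
    STOP = "stop"


class CentralizedFailureHandler:
    """All failures are routed here; the policy lives in one table."""

    def __init__(self, default: Action):
        self.default = default
        self.policy = {}   # (block type, fault) -> action
        self.log = []      # every failure is recorded in one place

    def set_policy(self, block_type: str, fault: Fault, action: Action):
        self.policy[(block_type, fault)] = action

    def handle(self, block_type: str, fault: Fault) -> Action:
        action = self.policy.get((block_type, fault), self.default)
        self.log.append((block_type, fault, action))
        return action
```

Because every code path calls the same `handle` method, changing how a particular block type reacts to a particular fault is a one-line change to the table instead of an edit scattered across the file system.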
