Improving File System Reliability with I/O Shepherding
Haryadi S. Gunawi,
Vijayan Prabhakaran*,
Swetha Krishnan,
Andrea C. Arpaci-Dusseau,
Remzi H. Arpaci-Dusseau
Department of Computer Sciences,
University of Wisconsin - Madison
*Microsoft Research - Silicon Valley
Abstract:
We introduce a new reliability infrastructure for file systems called
I/O shepherding. I/O shepherding allows a file system developer to
craft nuanced reliability policies to detect and recover from a wide
range of storage system failures. We incorporate shepherding into the
Linux ext3 file system through a set of changes to the consistency
management subsystem, layout engine, disk scheduler, and buffer
cache. The resulting file system, CrookFS, enables a broad class of
policies to be easily and correctly specified. We implement numerous
policies, incorporating data protection techniques such as retry,
parity, mirrors, checksums, sanity checks, and data structure repairs;
even complex policies can be implemented in fewer than 100 lines of
code, confirming the power and simplicity of the shepherding
framework. We also demonstrate that shepherding is properly
integrated, adding less than 5% overhead to the I/O path.
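
To make the idea of a composed reliability policy concrete, here is a minimal sketch of a read policy that combines three of the techniques named above: retry, checksums, and mirrors. It is an illustration only, not CrookFS's actual interface. The names FlakyDisk and shepherd_read, the in-memory block store, and the use of CRC32 as the checksum are all assumptions made for this example.

```python
import zlib


class FlakyDisk:
    """Toy in-memory block device (hypothetical; stands in for a real disk)."""

    def __init__(self, blocks, fail_reads=0):
        self.blocks = dict(blocks)
        self.fail_reads = fail_reads  # number of upcoming reads that will fail

    def read(self, block_no):
        if self.fail_reads > 0:
            self.fail_reads -= 1
            raise IOError(f"transient read failure on block {block_no}")
        return self.blocks[block_no]

    def write(self, block_no, data):
        self.blocks[block_no] = data


def shepherd_read(block_no, primary, mirror, checksums, max_retries=2):
    """Hypothetical read policy: retry, verify checksum, fall back to mirror."""
    for _ in range(1 + max_retries):
        try:
            data = primary.read(block_no)
        except IOError:
            continue  # treat as transient and retry the primary copy
        if zlib.crc32(data) == checksums[block_no]:
            return data
        break  # corruption detected: rereading the same copy is unlikely to help

    # The primary is unreadable or corrupt, so consult the mirror copy.
    data = mirror.read(block_no)
    if zlib.crc32(data) != checksums[block_no]:
        raise IOError(f"block {block_no}: primary and mirror both unusable")
    primary.write(block_no, data)  # repair the primary from the good copy
    return data
```

In this sketch the order of the steps is itself the policy: a transient failure is retried, a checksum mismatch skips straight to the mirror, and a successful mirror read repairs the primary copy.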