UW-Madison Logo

The ADvanced Systems Laboratory (ADSL)
Overview

Performance Analysis and Benchmarking of File and Storage Systems

One theme that underlies our work is measurement-driven systems research. We strongly believe that carefully measuring and understanding the current state of the art serves as an excellent catalyst for research and educational innovation. We have developed an array of techniques to bring out surprisingly detailed characterizations of the system under test, showing that even subtle behaviors can be teased out through careful experimentation. Specifically, we have developed new techniques to understand RAID arrays, journaling file systems, distributed storage systems (EMC Centera), Apple Desktop Applications, Facebook Messages, and tail latencies within local file systems.
  • Deconstructing Storage Arrays (ASPLOS '04). We develop a tool, Shear, that determines detailed properties of RAID systems, including the number of disks, chunk size, exact layout and redundancy scheme, through a clever series of microbenchmarks. By accessing sets of disk blocks and timing those accesses, Shear can detect which blocks are located on the same disks and thus infer basic properties of block layout; intuitively, sets of reads that are "slow" are assumed to be on the same disk, while sets of reads that are "fast" are assumed to be on different disks. We show how such knowledge can be utilized for failure detection, performance enhancements, and more detailed benchmarking of underlying single disks.
  • Analysis and Evolution of Journaling File Systems (USENIX '05). We introduce semantic block-level analysis to characterize file system performance, and apply it to modern journaling file systems (the first such detailed study). Specifically, we are able to infer the events that trigger a journaling file system to flush data to disk (e.g., when a timer expires or when the journal becomes full).
  • Deconstructing Commodity Storage Clusters (ISCA '05). We perform a deep dive into the behavior of a complex distributed storage system, the EMC Centera. When analyzing this cluster, we not only observe the network and disk traffic, but we delay specific network packets as well; by observing which subsequent packets are also delayed, we are able to definitively infer dependencies across messages. With time dilation, we are able to infer its caching, load-balancing, and replication policies. Our findings led to many performance enhancements and bug fixes in the shipping product.
  • A File is Not a File: Understanding the I/O Behavior of Apple Desktop Applications (SOSP '11). We perform the first characterization of modern desktop applications, finding numerous performance problems in how such applications are constructed.
  • Analysis of HDFS Under HBase: A Facebook Messages Case Study (FAST '14). Our detailed study focuses on the architecture of distributed storage behind Facebook's Messages service, showing how its many layered storage system is wildly inefficient; we suggest methods to improve performance significantly.
  • Reducing File System Tail Latencies with Chopper (FAST '15). We introduce new statistical methods to find performance problems -- specifically, excessive tail latencies -- that plague modern file systems. Four detailed case studies showcase potential problems in real systems as well as their remedies.
An offshoot of our performance research has been in file system benchmarking.
  • Generating Realistic Impressions for File-System Benchmarking (FAST '09). One fundamental flaw with many measurements of file-system behavior is controlling the state of the system: how to initialize the file system so that its starting state is realistic, and do so quickly and accurately? We introduce a tool, Impressions, that does exactly this, using sound statistical techniques and a wide-ranging study of file system images.
  • Emulating Goliath Storage Systems with David (FAST '11). Another critical question in storage evaluation is how to emulate future devices (that may be larger and faster) using existing storage technology. Our tool, David, develops a key technique (data elision) to do so: by ommitting application data blocks, a large number of future devices can be accurately emulated.
  • ROOT: Replaying Multithreaded Traces with Resource-Oriented Ordering (SOSP '13). One final contribution in this methodological space is in the tool ROOT, which introduces novel techniques to replay I/O traces in a multi-threaded environment. As the first tool to do so, ROOT shows how important it is to take multi-threading into account during I/O trace replay.