The ADvanced Systems Laboratory (ADSL)
Publication abstract

Deconstructing Commodity Storage Clusters

Haryadi S. Gunawi, Nitin Agrawal, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Jiri Schindler*
Department of Computer Sciences, University of Wisconsin-Madison
*EMC Corp.


The traditional approach to characterizing complex systems is to run standard workloads and measure the resulting performance as seen by the end user. However, unique opportunities exist when the system is itself constructed from standardized components: one can also look inside the system by instrumenting each component. In this paper, we show how such "intra-box" instrumentation can help one understand the behavior of a large-scale storage cluster, the EMC Centera.
In our analysis, we leverage standard tools for tracing both the disk and network traffic emanating from each node of the cluster. By correlating this traffic with the running workload, we are able to infer the structure of the software system (e.g., its write update protocol) as well as its policies (e.g., how it performs caching, replication, and load-balancing). Further, by imposing variable intra-box delays on network and disk traffic, we can confirm the causal relationships between network and disk events. Thus, we are able to infer the semantics of the messages between nodes without examining a single line of source code.
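
The delay-based causality check described above can be illustrated with a minimal sketch. It is not the paper's tooling: the function name, the tolerance, and the measurements are hypothetical and chosen only for illustration. The idea is that if a disk event depends on a network message, delaying that message by d seconds should shift the disk event by roughly d as well; an independent event should not move.

```python
from typing import List, Tuple


def depends_on_delayed_event(
    runs: List[Tuple[float, float]],
    tolerance_s: float = 0.005,
) -> bool:
    """Return True if the observed event shifts one-for-one with the injected delay.

    Each run is (injected_delay_s, observed_time_s), where observed_time_s is
    when the downstream event (e.g. a disk write) was seen, measured from the
    start of the request. Runs with zero injected delay form the baseline.
    """
    baselines = [t for d, t in runs if d == 0.0]
    if not baselines:
        raise ValueError("need at least one run with zero injected delay")
    baseline = sum(baselines) / len(baselines)

    delayed = [(d, t) for d, t in runs if d > 0.0]
    if not delayed:
        raise ValueError("need at least one run with a nonzero injected delay")

    # A dependent event moves by about the injected delay in every delayed run.
    return all(abs((t - baseline) - d) <= tolerance_s for d, t in delayed)


# Hypothetical measurements: one network message is delayed by 0, 50, and 100 ms.
dependent_write = [(0.0, 0.012), (0.05, 0.063), (0.10, 0.111)]
independent_write = [(0.0, 0.012), (0.05, 0.013), (0.10, 0.011)]

print(depends_on_delayed_event(dependent_write))
print(depends_on_delayed_event(independent_write))
```

In this sketch the first series tracks the injected delay and the second does not, so only the first would be treated as causally downstream of the delayed message. Real traces would need repeated runs and a tolerance matched to the measurement noise.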

Available as: Postscript, PDF, BibTeX
Talk Slides: PowerPoint