Tech Report TR1489

The Interaction of Failure and Performance in a Migratory File Service

John Bent, Douglas Thain, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau, Miron Livny
2003

We present the design, implementation, and evaluation of a Migratory File Service (MFS), a system designed to exploit semantic knowledge of workloads and user expectations to improve performance and handle failures effectively in wide-area batch scheduling systems. We discuss Hawk, a prototype MFS with two novel components: migratory proxies, which cache data at remote clusters, and a workflow manager, which manages the workflow of the system. Hawk integrates aggressive caching and I/O filtering to reduce wide-area traffic, proactively replicates data to avoid regeneration due to failure, and performs fine-grained rollback and recovery to minimize the effort required to recover from failure. Through a case study of data-intensive applications, we demonstrate the benefits of Hawk over traditional approaches: it delivers an increase in performance of two to three orders of magnitude for jobs that are deployed across a wide-area batch scheduling environment.

