The ADvanced Systems Laboratory (ADSL)

Scientific Workflows

We have also studied the I/O behavior of scientific (HPC) workloads and the systems support they need; this work led to both performance studies and new systems that better support HPC applications.
  • Flexibility, Manageability, and Performance in a Grid Storage Appliance (HPDC '02). We introduce a migrating storage appliance that serves HPC application workloads. A key mechanism we introduce is a storage reservation, which ensures proper resource sharing and guarantees progress in batch-scheduled environments.
  • Pipeline and Batch Sharing in Grid Workloads (HPDC '03). We are the first to show how data is shared across complex HPC workflows, and we carefully characterize each type of I/O and how real applications use it.
  • Explicit Control in a Batch-Aware Distributed File System (NSDI '04). We introduce the novel idea of giving the workflow scheduler control over replication decisions within the distributed file system. By doing so, the system can explore a new space of trade-offs, including whether to replicate data or simply re-run a piece of the workflow in order to avoid data loss. The ideas in BAD-FS have since proven useful in current systems such as Alluxio (formerly Tachyon).
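
The replicate-or-re-run trade-off described for BAD-FS can be illustrated with a small cost comparison. The sketch below is not the BAD-FS policy itself; it is a minimal illustration under assumed inputs. The names `StageOutput`, `should_replicate`, `loss_probability`, and `transfer_gb_per_hour` are hypothetical, and the cost model (replication time versus expected re-run time) is an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass
class StageOutput:
    """Intermediate data produced by one workflow stage (hypothetical model)."""
    name: str
    size_gb: float      # size of the data that would need to be copied
    rerun_hours: float  # time to regenerate the data by re-running the stage


def should_replicate(out: StageOutput,
                     loss_probability: float,
                     transfer_gb_per_hour: float) -> bool:
    """Return True if copying the data is cheaper than the expected re-run.

    Assumed cost model, for illustration only:
      replication cost   = size / bandwidth
      expected re-run    = P(loss) * re-run time
    """
    replicate_hours = out.size_gb / transfer_gb_per_hour
    expected_rerun_hours = loss_probability * out.rerun_hours
    return replicate_hours < expected_rerun_hours


# A large output that is quick to regenerate: re-running is the cheaper choice.
big = StageOutput("simulation-dump", size_gb=100.0, rerun_hours=10.0)
# A small output that is slow to regenerate: replication is the cheaper choice.
small = StageOutput("calibration-table", size_gb=1.0, rerun_hours=10.0)

decide_big = should_replicate(big, loss_probability=0.1, transfer_gb_per_hour=50.0)
decide_small = should_replicate(small, loss_probability=0.1, transfer_gb_per_hour=50.0)
```

A scheduler that can see the whole workflow is in a position to make this kind of comparison per output, which a file system acting alone cannot do because it does not know how expensive the data is to regenerate.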