Date
Topic and Speaker
Monday
February 12th
4:00 PM
2310 CS&S
A Five-Year Study of File-System Metadata

This is a practice talk for FAST 2007.

For five years, we collected annual snapshots of file-system metadata from over 60,000 Windows PC file systems in a large corporation. These systems contain 4 billion files totaling 700 TB of file data. In this paper, we use these snapshots to study temporal changes in file size, file age, file-type frequency, directory size, namespace structure, file-system population, storage capacity and consumption, and degree of file modification. We present a generative model that explains the namespace structure and the distribution of directory sizes. We find significant temporal trends relating to the popularity of certain file types, the origin of file content, the way the namespace is used, and the degree of variation among file systems, as well as more pedestrian changes in sizes and capacities. We give examples of consequent lessons for designers of file systems and related software.

Monday
February 26th
4:00 PM
2310 CS&S
VMM-based Hidden Process Detection and Identification using Lycosid

Use of stealth rootkit techniques to hide long-lived malicious processes is a current and alarming security issue. In this paper, we describe, implement, and evaluate a novel VMM-based hidden process detection and identification service called Lycosid that is based on the cross-view validation principle. Like previous VMM-based security services, Lycosid benefits from its protected location. In contrast to previous VMM-based hidden process detectors, Lycosid uses a true VMM-level trusted view that reduces its susceptibility to guest evasion attacks. Lycosid uses implicit information about guest state and activity that decouples it from a specific guest operating system version and patch level, but which can be noisy and unreliable. Statistical inference techniques like hypothesis testing and linear regression allow Lycosid to trade time for accuracy. Despite low-quality inputs, Lycosid provides a robust, highly accurate service usable even in security environments where the consequences for wrong decisions can be high.
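
As a rough illustration of cross-view validation with noisy inputs, the sketch below compares repeated process counts from a trusted view against the counts the guest reports, and applies a simple one-sided paired test to the differences. This is not Lycosid's actual procedure: the function name, the sample data, and the threshold of 2.0 are all assumptions made for the example.

```python
import math
import statistics

def hidden_process_suspected(trusted_counts, guest_counts, threshold=2.0):
    """Return True if the trusted view consistently sees more processes
    than the guest reports. Hypothetical helper, for illustration only."""
    if len(trusted_counts) != len(guest_counts) or len(trusted_counts) < 2:
        raise ValueError("need at least two paired samples")
    # Per-sample difference between the two views of the same system.
    diffs = [t - g for t, g in zip(trusted_counts, guest_counts)]
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0:
        # No noise at all: any positive difference is a discrepancy.
        return mean > 0
    # One-sided t statistic for "mean difference is greater than zero".
    t_stat = mean / (sd / math.sqrt(len(diffs)))
    return t_stat > threshold

# Assumed sample data: six paired measurements taken over time.
noisy_trusted = [31, 30, 32, 31, 31, 30]
guest_reported = [30, 30, 30, 30, 30, 30]
suspicious = hidden_process_suspected(noisy_trusted, guest_reported)

clean_trusted = [30, 31, 29, 30, 30, 30]
clean = hidden_process_suspected(clean_trusted, guest_reported)
```

Collecting more paired samples shrinks the standard error, which is one simple way to picture the trade of time for accuracy mentioned in the abstract.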

Thursday
March 22nd
4:00 PM
2310 CS&S
Stack Trace Analysis for Large Scale Debugging

We present the Stack Trace Analysis Tool (STAT) to aid in debugging extreme-scale applications. STAT can reduce problem exploration spaces from thousands of processes to a few by sampling stack traces to form process equivalence classes, groups of processes exhibiting similar behavior. We can then use full-featured debuggers on representatives from these behavior classes for root cause analysis.

STAT scalably collects stack traces over a sampling period to assemble a profile of the application's behavior. STAT routines process the samples to form a call graph prefix tree that encodes common behavior classes over the program's process space and time. STAT leverages MRNet, an infrastructure for tool control and data analyses, to overcome scalability barriers faced by heavy-weight debuggers.

We present STAT's design and an evaluation that shows STAT gathers informative process traces from thousands of processes with sub-second latencies, a significant improvement over existing tools. Our case studies of production codes verify that STAT supports the quick identification of errors that were previously difficult to locate.
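
The prefix-tree idea in the abstract above can be pictured with a small sketch. It merges one stack trace per process into a tree keyed by call path, then groups processes with identical traces into equivalence classes and picks one representative per class. It is a minimal single-machine illustration, not STAT's implementation: the function names and sample traces are assumptions, and it ignores sampling over time and the MRNet-based collection.

```python
from collections import defaultdict

def build_prefix_tree(traces):
    """Merge traces (rank -> list of frames, outermost first) into a
    prefix tree. Each node records which ranks passed through it."""
    root = {"ranks": set(), "children": {}}
    for rank, frames in traces.items():
        node = root
        node["ranks"].add(rank)
        for frame in frames:
            node = node["children"].setdefault(
                frame, {"ranks": set(), "children": {}}
            )
            node["ranks"].add(rank)
    return root

def equivalence_classes(traces):
    """Group ranks whose full call paths are identical."""
    classes = defaultdict(list)
    for rank, frames in sorted(traces.items()):
        classes[tuple(frames)].append(rank)
    return dict(classes)

# Assumed sample data: four processes, three stuck in a barrier.
traces = {
    0: ["main", "solve", "MPI_Barrier"],
    1: ["main", "solve", "MPI_Barrier"],
    2: ["main", "solve", "compute"],
    3: ["main", "solve", "MPI_Barrier"],
}

tree = build_prefix_tree(traces)
classes = equivalence_classes(traces)
# One representative per class is enough to attach a full debugger to.
representatives = {path: ranks[0] for path, ranks in classes.items()}
```

In this sketch a debugger would be attached only to ranks 0 and 2, one per behavior class, rather than to all four processes.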

(This is a practice talk for IPDPS '07)

Monday
March 26th
4:00 PM
2310 CS&S
A discussion of Boxwood: Abstractions as the Foundation for Storage Infrastructure by John MacCormick, Nick Murphy, Marc Najork, Chandramohan A. Thekkath, and Lidong Zhou

We will discuss Boxwood, presented at OSDI 04.

Writers of complex storage applications such as distributed file systems and databases are faced with the challenges of building complex abstractions over simple storage devices like disks. These challenges are exacerbated due to the additional requirements for fault-tolerance and scaling. This paper explores the premise that high-level, fault-tolerant abstractions supported directly by the storage infrastructure can ameliorate these problems. We have built a system called Boxwood to explore the feasibility and utility of providing high-level abstractions or data structures as the fundamental storage infrastructure. Boxwood currently runs on a small cluster of eight machines. The Boxwood abstractions perform very close to the limits imposed by the processor, disk, and the native networking subsystem. Using these abstractions directly, we have implemented an NFSv2 file service that demonstrates the promise of our approach.

Monday
April 16th
4:00 PM
2310 CS&S
A discussion of "Virtualization Aware File Systems: Getting Beyond the Limitations of Virtual Disks" by Ben Pfaff, Tal Garfinkel, and Mendel Rosenblum

We will discuss Ventana, from NSDI 06.

Virtual disks are the main form of storage in today's virtual machine environments. They offer many attractive features, including whole system versioning, isolation, and mobility, that are absent from current file systems. Unfortunately, the low-level interface of virtual disks is very coarse-grained, forcing all-or-nothing whole system rollback, and opaque, offering no practical means of sharing. These problems impose serious limitations on virtual disks' usability, security, and ease of management.

To overcome these limitations, we offer Ventana, a virtualization aware file system. Ventana combines the file-based storage and sharing benefits of a conventional distributed file system with the versioning, mobility, and access control features that make virtual disks so compelling.

Monday
April 23rd
4:00 PM
2310 CS&S
SOSP Presentations

We will have 10-minute presentations of recent systems conference submissions from UW, including:

1. The current approach to file system benchmarking is largely ad hoc. Running widely accepted benchmarks (such as Postmark) provides little insight into the system under test. We discuss our initial results in characterizing benchmark workloads and propose one possible technique to better understand how the benchmark is stressing the system under test.

2. Previous work has demonstrated the deficiency of failure-handling in file systems. In this work we attempt to solve the problem by constructing a file system that robustly handles failures in the desired manner.

3. High-performance hardware implementations of transactional memory aim to augment today's widespread multicore architectures with support for software transactions. Transactions offer an alternative to locks for synchronizing access to shared data and thereby simplify multithreaded programming. This talk focuses on the virtualization problem faced by hardware transactional memory systems (how transactions survive context switches and paging) and presents an integrated hardware and software solution that virtualizes transactional memory in a novel way.

Monday
April 30th
4:00 PM
2310 CS&S
Microdrivers: A new architecture for device drivers

Commodity operating systems achieve good performance by running device drivers in-kernel. Unfortunately, this architecture offers poor fault isolation. This paper introduces microdrivers, which reduce the amount of driver code running in the kernel by splitting driver functionality between a small kernel-mode component and a larger user-mode component. This paper presents the microdriver architecture and techniques to refactor existing device drivers into microdrivers, achieving most of the benefits of user-mode drivers with the performance of kernel-mode drivers. Experiments on a network driver show that 75% of its code can be removed from the kernel without affecting common-case performance.

This is joint work with Arini Balakrishnan, Michael Swift, and Somesh Jha, and is a practice talk for the Workshop on Hot Topics in Operating Systems (HotOS).

Monday
May 7th
4:00 PM
2310 CS&S
A discussion of "Pip: Detecting the Unexpected in Distributed Systems"
Slides from discussion as Powerpoint or PDF

We will discuss Pip, from NSDI 06.

Bugs in distributed systems are often hard to find. Many bugs reflect discrepancies between a system's behavior and the programmer's assumptions about that behavior. We present Pip, an infrastructure for comparing actual behavior and expected behavior to expose structural errors and performance problems in distributed systems. Pip allows programmers to express, in a declarative language, expectations about the system's communications structure, timing, and resource consumption. Pip includes system instrumentation and annotation tools to log actual system behavior, and visualization and query tools for exploring expected and unexpected behavior. Pip allows a developer to quickly understand and debug both familiar and unfamiliar systems.

We applied Pip to several applications, including FAB, SplitStream, Bullet, and RanSub. We generated most of the instrumentation for all four applications automatically. We found the needed expectations easy to write, starting in each case with automatically generated expectations. Pip found unexpected behavior in each application, and helped to isolate the causes of poor performance and incorrect behavior.
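
To make the idea of checking logged behavior against expectations concrete, here is a minimal sketch. It does not use Pip's declarative language or tools: the `Event` record, the `check_path` function, and the sample log are hypothetical, and the expectations are written as plain Python checks on communication structure, timing, and required tasks.

```python
from dataclasses import dataclass

@dataclass
class Event:
    kind: str            # "send", "recv", or "task" (assumed categories)
    name: str
    duration_ms: float = 0.0

def check_path(events, max_total_ms, required_tasks):
    """Compare one logged request path with simple expectations and
    return a list of human-readable violations (empty if none)."""
    problems = []
    # Structure: every message sent should also be received.
    sends = sum(1 for e in events if e.kind == "send")
    recvs = sum(1 for e in events if e.kind == "recv")
    if sends != recvs:
        problems.append(f"{sends} sends but {recvs} receives")
    # Timing: the whole path should finish within the budget.
    total = sum(e.duration_ms for e in events)
    if total > max_total_ms:
        problems.append(f"took {total} ms, expected at most {max_total_ms} ms")
    # Required work: each expected task must appear in the log.
    seen = {e.name for e in events if e.kind == "task"}
    for task in required_tasks:
        if task not in seen:
            problems.append(f"missing task {task}")
    return problems

# Assumed sample log for one request: the reply is never received.
log = [
    Event("task", "parse_request", 2.0),
    Event("send", "lookup"),
    Event("recv", "lookup"),
    Event("task", "read_block", 40.0),
    Event("send", "reply"),
]
violations = check_path(log, max_total_ms=30.0,
                        required_tasks=["parse_request", "write_log"])
```

Paths that produce an empty list match the expectations. The remaining paths are the unexpected behavior a developer would inspect first.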