Monday
September 13
4:00 PM
2310 CS
Exploiting Locality to Scale Flash-based Solid State Drives
Bhuvan Urgaonkar
Assistant Professor of Computer Science and Engineering, The Pennsylvania State University

Abstract: NAND Flash-based solid-state drives (SSDs) have gained significant commercial acceptance lately and have begun to play an important role within the storage hierarchy, either as complements to magnetic hard disks or as complete replacements for them. Unlike magnetic hard disks, SSDs have no mechanical moving parts, no seek or rotational delays, and lower power consumption. However, certain internal idiosyncrasies of flash technology (erase-before-write and wear caused by erases, among others) restrict the performance and reliability of these devices. In this talk, I will describe our efforts towards scaling SSDs along these dimensions. First, I will present an address translation scheme that exploits temporal locality within workloads to reduce garbage collection overhead. Second, I will present a data storage approach based on content addressability that exploits value locality within workloads to improve SSD performance and lifetime. Finally, I will outline preliminary ideas on a heterogeneous flash device (consisting of multiple page/block sizes) that promises further improvements by exploiting spatial and temporal locality within workloads. While our techniques work within the Flash Translation Layer on an SSD, applying them at other storage layers raises interesting challenges and opportunities.
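
To make "value locality" concrete, here is a minimal, hypothetical Python sketch of a content-addressable mapping layer (the names, data structures, and simplifications are my own assumptions, not the speaker's FTL design): pages whose contents hash to the same fingerprint share a single physical flash page, so duplicate writes cost no additional flash programs.

    import hashlib

    class ContentAddressableFTL:
        """Toy flash translation layer that deduplicates writes by content hash.

        Hypothetical sketch: page-granularity writes only, no garbage
        collection, no crash consistency. It only illustrates how value
        locality lets identical page contents share one physical page.
        """

        def __init__(self):
            self.logical_to_hash = {}    # logical page number -> content fingerprint
            self.hash_to_physical = {}   # content fingerprint -> physical page number
            self.refcount = {}           # physical page number -> number of logical refs
            self.next_free_page = 0
            self.flash_writes = 0        # pages actually programmed to flash

        def write(self, lpn, data):
            fp = hashlib.sha1(data).hexdigest()
            self._unlink(lpn)                      # drop any old mapping for this LPN
            if fp not in self.hash_to_physical:    # new content: program one flash page
                self.hash_to_physical[fp] = self.next_free_page
                self.refcount[self.next_free_page] = 0
                self.next_free_page += 1
                self.flash_writes += 1
            ppn = self.hash_to_physical[fp]
            self.refcount[ppn] += 1
            self.logical_to_hash[lpn] = fp

        def _unlink(self, lpn):
            fp = self.logical_to_hash.pop(lpn, None)
            if fp is not None:
                self.refcount[self.hash_to_physical[fp]] -= 1
                # a real FTL would reclaim the page once its refcount hits zero

    # usage: two logical pages with identical contents cost only one flash write
    ftl = ContentAddressableFTL()
    ftl.write(0, b"A" * 4096)
    ftl.write(7, b"A" * 4096)
    print(ftl.flash_writes)   # 1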

Bio: Bhuvan Urgaonkar is an Assistant Professor of Computer Science and Engineering at the Pennsylvania State University. He received his bachelor's degree in computer science and engineering from the Indian Institute of Technology Kharagpur in 1999 and his Ph.D. in computer science from the University of Massachusetts in 2005. His general research interests are in the modeling, implementation, and evaluation of distributed systems, operating systems, and storage systems. He is a recipient of the NSF CAREER award and a co-author of two papers that received best student paper awards.

Monday
September 20
4:00 PM
1221 CS
Scaling Memcache at Facebook
Venkat Venkataramani
Infrastructure Engineering Manager, Facebook, Inc.

Facebook runs the largest production installation of memcache, serving trillions of keys stored in hundreds of terabytes of RAM and processing hundreds of millions of requests every second. We faced many challenges while scaling memcache at Facebook, some of them unique given the interconnected nature of the underlying data. The average Facebook user has 130 friend connections, and as we scale we must be able to quickly pull data across all of our servers, wherever it is stored. This talk will cover some of the key steps in transforming memcache from a classic single-threaded Unix daemon into a high-performance service capable of processing more than 300,000 requests per second on a single host. We will also briefly discuss upcoming areas for development and research.
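
As a rough illustration of how clients pull data across many servers, wherever it is stored, the Python sketch below combines consistent hashing (to route a key to one of many cache hosts) with the common look-aside, demand-fill caching pattern. The ring, host names, and the caches/db interfaces are hypothetical; the abstract does not describe Facebook's actual key routing or client code.

    import hashlib
    from bisect import bisect

    class ConsistentHashRing:
        """Toy consistent-hash ring for spreading keys across cache hosts."""

        def __init__(self, hosts, replicas=100):
            # place several virtual points per host on the ring for balance
            self._ring = sorted(
                (self._hash(f"{host}#{i}"), host)
                for host in hosts
                for i in range(replicas)
            )
            self._points = [point for point, _ in self._ring]

        @staticmethod
        def _hash(s):
            return int(hashlib.md5(s.encode()).hexdigest(), 16)

        def host_for(self, key):
            # first ring point at or after the key's hash, wrapping around
            idx = bisect(self._points, self._hash(key)) % len(self._ring)
            return self._ring[idx][1]

    def get_user(user_id, ring, caches, db):
        """Look-aside read: try the cache host first, fall back to the database.

        `caches` (host -> client with get/set) and `db` (with load_user) are
        hypothetical interfaces used only to show the pattern.
        """
        key = f"user:{user_id}"
        host = ring.host_for(key)
        value = caches[host].get(key)
        if value is None:                       # cache miss: read DB, then populate
            value = db.load_user(user_id)
            caches[host].set(key, value)
        return value

    # usage: route a key to one of three cache hosts
    ring = ConsistentHashRing(["mc01", "mc02", "mc03"])
    print(ring.host_for("user:1234"))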

Bio: Venkat Venkataramani is a manager of Infrastructure Engineering at Facebook, and a Wisconsin alum (MS '02).

Monday
October 11
4:00 PM
2310 CS
Scalable Middleware for Large (Really Large) Scale Systems

I will discuss the problem of developing tools for large-scale parallel environments. We are especially interested in systems, both leadership-class parallel computers and clusters, that have hundreds of thousands or even millions of processors. The infrastructure that we have developed to address this problem is called MRNet, the Multicast/Reduction Network. MRNet's approach to scale is to structure control and data flow in a tree-based overlay network (TBON) that allows for efficient request distribution and flexible data reductions.

The second part of this talk will present an overview of the MRNet design, architecture, and computational model, and then discuss several applications of MRNet. These include scalable automated performance analysis in Paradyn, a vision clustering application, and, most recently, an effort to develop our first petascale tool, STAT, a scalable stack trace analyzer currently running on hundreds of thousands of processors.
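
The Python sketch below illustrates the TBON idea behind MRNet rather than the MRNet API itself: requests are multicast down a tree of processes, and results are combined (reduced) on the way back up, so no node ever handles more than its direct children's data. The node structure and the toy sum reduction are my own assumptions for illustration.

    from dataclasses import dataclass, field

    @dataclass
    class TreeNode:
        """One process in a toy tree-based overlay network (TBON)."""
        name: str
        children: list = field(default_factory=list)
        local_value: int = 0

        def multicast(self, request):
            # deliver the request locally, then fan it out to the subtree
            print(f"{self.name}: handling {request}")
            for child in self.children:
                child.multicast(request)

        def reduce(self, combine):
            # combine this node's value with the reduced values of its children
            result = self.local_value
            for child in self.children:
                result = combine(result, child.reduce(combine))
            return result

    # usage: a two-level tree summing per-leaf sample counts
    leaves = [TreeNode(f"leaf{i}", local_value=i) for i in range(4)]
    interior = [TreeNode("mid0", children=leaves[:2]),
                TreeNode("mid1", children=leaves[2:])]
    root = TreeNode("root", children=interior)
    root.multicast("collect-samples")
    print(root.reduce(lambda a, b: a + b))   # 0 + 1 + 2 + 3 = 6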

Monday
November 8
4:00 PM
2310 CS
Towards a Consolidated and Sustainable Future Internet - Performance Issues in Using Network Virtualization and Network Federation for Vertical and Horizontal Integration
Prof. Dr. Kurt Tutschku
Chair of Future Communication, Faculty of Computer Science, University of Vienna

The Future Internet (FI) is expected to continue the success of today's Internet and to provide improved features and usability in daily life to individuals and businesses. FI applications are expected to originate from areas such as health, the energy grid, utilities, and transportation, and are expected to form overlays. Tight economic constraints, however, require the Future Internet to consolidate application-specific overlays efficiently into a homogeneous, and if possible single, physical system, so that the system is sustainable in both technical and economic terms.

Network Virtualization (NV) and Network Federation (NF) are new vertical and horizontal integration concepts: NV achieves vertical integration (i.e., the parallel operation of overlays), while NF accomplishes horizontal convergence across diverse technical and administrative domains.

In this talk, we discuss how NV and NF can achieve sustainable integration, using LTE mobile core networks and Transport Virtualization as examples. In addition, we discuss the key NV performance metrics for isolation (CPU and network capacity) and synchronization (consistency; temporal alignment of traffic flows), with the aim of reaching a better understanding of how to operate NV/NF-based networks.

Monday
November 12
11:00 AM
2310 CS
Operating System Abstractions for GPU Programming
Emmett Witchel
Associate Professor of Computer Science, The University of Texas at Austin

GPGPU frameworks such as CUDA improve programmability, but GPU parallelism remains inaccessible in many application domains. We argue that poor OS support causes this problem. OSes do not provide the kind of high-level abstractions for GPUs that applications expect for other resources such as CPUs and file systems. We advocate reorganizing kernel abstractions to support GPUs as first-class computing resources, with traditional guarantees such as fairness and isolation. We quantify some shortcomings in Windows 7 GPU support, and show preliminary evidence that better OS abstractions can accelerate interactive workloads such as gesture recognition by a factor of 10 over a CUDA implementation.

Bio: Emmett Witchel is an associate professor of computer science at The University of Texas at Austin. He and his group are interested in operating systems, architecture, and security. He received his doctorate from MIT in 2004.