Fourth International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures (ADMS'13)
www.adms-conf.org
Riva del Garda, Trento, Italy
Monday, August 26, 2013

Advance Program
8.40 am - 5.00 pm, Room 300

8.40-8.45 am: Welcome Comments

8.45-10.15 am: Keynote by Milind Bhandarkar, Chief Scientist, Machine Learning Platforms, Pivotal Inc.
Title: Hadoop: Past, Present, and (possibly) Future
Abstract: Apache Hadoop has rapidly become the de facto data processing platform, and is often mentioned synonymously with "Big Data". Hadoop started as a project within Apache Lucene and Nutch to scale the content backend for a web search engine. However, it is currently being used in the majority of Fortune 500 companies, in many other application domains, such as fraud detection at credit card companies, healthcare analytics, and churn detection and prevention at telecom companies. In this talk, we will reminisce about the early days of Hadoop at Yahoo, and lessons learned in scaling this platform from a 20-node prototype to a datacenter-wide production deployment. We will give an overview of the current state of the Hadoop ecosystem, and present some prominent patterns and use cases of this platform. We will also discuss how Hadoop is evolving, and its future as a platform for "Big Data" processing.

10.15-10.30 am: Coffee Break

10.30-12.00 pm: Session 1: Compute Optimizations
10.30-11.00 am "Vectorizing Database Column Scans with Complex Predicates", Thomas Willhalm (Intel), Ismail Oukid (Intel and SAP AG), Ingo Muller (Karlsruhe Institute of Technology and SAP AG) and Franz Faerber (SAP AG).
11.00-11.30 am "High-Performance XML Twig Filtering using GPUs", Ildar Absalyamov (UC Riverside), Roger Moussalli (IBM T. J.
Watson Research Center), Vassilis Tsotras and Walid Najjar (UC Riverside).
11.30-12.00 pm "Skew Handling in Aggregate Streaming Queries on GPUs", Georgios Koutsoumpakis, Iakovos Koutsoumpakis (Uppsala University) and Anastasios Gounaris (Aristotle University of Thessaloniki).

12.00-1.30 pm: Lunch

1.30-3.00 pm: Session 2:
1.30-2.00 pm "Task Scheduling for Highly Concurrent Analytical and Transactional Main-Memory Workloads", Iraklis Psaroudakis (EPFL), Tobias Scheuer (SAP AG), Norman May (SAP AG) and Anastasia Ailamaki (EPFL).
2.00-2.30 pm "Modularizing B+-trees: Three-Level B+-trees Work Fine", Shigero Sasaki and Takuya Araki (NEC Corporation).
2.30-3.00 pm "FBARC: I/O Asymmetry Aware Buffer Replacement Strategy", Ilia Petrov (Reutlingen University), Robert Gottstein and Alejandro Buchmann (TU Darmstadt).

3.00-3.45 pm: Coffee Break

3.45-5.00 pm: Keynote by Blake Fitch, Senior Technical Staff Member, IBM T. J. Watson Research Center
Title: Active Storage: Exploring a Scalable, Compute-In-Storage model by extending the Blue Gene/Q architecture with Integrated Non-volatile Memory
Abstract: Emerging storage class memories offer a set of challenges and opportunities in system architecture, programming models, and application design. We are exploring the close integration of emerging solid-state storage technologies in conjunction with high-performance networks and integrated processing capability. Specifically, we consider the extension of the Blue Gene/Q architecture by integrating Flash into the node to enable a scalable, data-centric computing platform. We are using BG/Q as a rapid prototyping platform, allowing us to build a research system based on an infrastructure with proven scalability to thousands of nodes. Our work also involves enabling a Linux environment with standard network interfaces on the BG/Q hardware.
We plan to explore applications of this system architecture, including existing file systems and middleware as well as more aggressive compute-in-storage approaches. Compute-in-storage is intended to enable the use of high-performance computing (HPC) programming techniques (MPI) to implement data-centric algorithms (e.g. sort, join, graph) that execute on processing elements embedded within a storage system. This presentation will review the architectural extension to BG/Q, present a progress report on the project, and describe some early results.