DEEM'20
The 4th Workshop on Data Management for End-to-End Machine Learning
Sunday, 14th of June, 2020
http://deem-workshop.org

Held in conjunction with ACM SIGMOD/PODS 2020
Portland, OR, June 14th - June 19th, 2020
https://sigmod2020.org/

------------------------
CALL FOR PARTICIPATION
------------------------

We cordially invite you to the 4th workshop on "Data Management for End-to-End Machine Learning" at ACM SIGMOD on Sunday.

http://deem-workshop.org/#schedule

The workshop will be run as an online event via Zoom, starting at 8am PDT, and can be attended via the following link:

https://nyu.zoom.us/j/9174708199?pwd=YVhYWVEzRVBEQkxlQm9rOWVhcG4xZz09
pw: 123456

The workshop will feature four keynotes and invited talks:

* Bill Howe (University of Washington): Integrative Data Equity Systems
* Amit Sabne (Google Brain): XLA - Compiling Machine Learning for Peak Performance
* Manasi Vartak (verta.ai): Can you do impactful MLSys work outside of large companies?
* Matthias Boehm (Graz University of Technology): Apache SystemDS: An ML System for the End-to-End Data Science Lifecycle

Additionally, our eight accepted papers will be presented, which include updates on the latest work for ML-related data management from Google, Microsoft, Amazon and Databricks.

We would be very happy to have you participate!

Best,
Sebastian, Steven and Julia

----------
WORKSHOP
----------

Applying Machine Learning (ML) in real-world scenarios is a challenging task. In recent years, the main focus of the database community has been on creating systems and abstractions for the efficient training of ML models on large datasets. However, model training is only one of many steps in an end-to-end ML application, and a number of orthogonal data management problems arise from the large-scale use of ML, which require the attention of the data management community.
Additionally, the importance of incorporating ethics and legal compliance into machine-assisted decision-making is being broadly recognized. Critical opportunities for improving data quality and representativeness, controlling for bias, and allowing humans to oversee and impact computational processes are missed if we do not consider the lifecycle stages upstream from model training and deployment. DEEM welcomes research on providing system-level support to data scientists who wish to develop and deploy responsible machine learning methods.

DEEM aims to bring together researchers and practitioners at the intersection of applied machine learning, data management and systems research, with the goal of discussing the data management issues that arise in ML application scenarios.

The workshop solicits regular research papers describing preliminary and ongoing research results. In addition, the workshop encourages the submission of industrial experience reports of end-to-end ML deployments.

-----------------
ACCEPTED PAPERS
-----------------

- From Data to Models and Back
  [Mike Dreves; Gene Huang; Zhuo Peng; Neoklis Polyzotis; Evan Rosen; Paul Suganthan G. C. (Google)]

- Amazon SageMaker Autopilot: a White Box AutoML Solution at Scale
  [Piali Das; Nikita Ivkin; Tanya Bansal; Laurence Rouesnel; Philip Gautier; Zohar Karnin; Leo Dirac; Lakshmi Ramakrishnan; Andre Perunicic; Iaroslav Shcherbatyi; Wilton Wu; Aida Zolic; Huibin Shen; Amr Ahmed; Fela Winkelmolen; Miroslav Miladinovic; Cedric Archembeau; Alex Tang; Bhaskar Dutt; Patricia Grao; Kumar Venkateswar (Amazon)]

- MLOS: an Infrastructure for Automated Performance Engineering
  [Carlo Curino; Neha Gowdal; Brian Kroth; Sergiy Kuryata; Greg Lapinski; Siqi Liu; Slava Oks; Olga Poppe; Adam Smiechowski; Ed Thayer; Markus Weimer; Yiwen Zhu (Microsoft)]

- Resilient Neural Forecasting Systems
  [Michael Bohlke-Schneider; Shubham Kapoor; Tim Januschowski (Amazon)]

- Developments in MLflow: A System to Accelerate the Machine Learning Lifecycle
  [Corey Zumar; Matei Zaharia; Andrew Chen; Aaron Davidson; Arjun DCunha; Clemens Mewald; Sue Ann Hong; Andy Konwinski; Siddharth Murching; Tomas Nykodym; Richard Zang; Paul Ogilvie; Mani Parkhe; Avesh Singh; Fen Xie; Juntai Zheng; Max Allen; Apurva Koti; Ankit Mathur (Databricks)]

- Causality-based Explanation of Classification Outcomes
  [Leopoldo Bertossi (Universidad Adolfo Ibanez and RelationalAI Inc.); Dan Suciu; Maximilian Schleich (University of Washington); Jordan Li (Carleton University); Zografoula Vagena (RelationalAI Inc.)]

- IntegratedML: Every SQL Developer is a Data Scientist
  [Benjamin De Boe; Thomas Dyar; Tom Woodfin (InterSystems Corporation)]

- A Vision on Accelerating Enterprise IT System 2.0
  [Rekha Singhal (TCS)]