Call for Papers

Journal: Frontiers in Big Data (Data Mining and Management)

Research Topic - Scalable Data Science: From Theory to Practice


Data science is the next frontier for data-driven decision making in domains such as e-commerce, healthcare, manufacturing, defense, government, and education. It is an interdisciplinary field that combines principles, concepts, and techniques from mathematics, statistics, computer science, and information science. One of the key goals in data science is to automatically extract meaningful insights and knowledge from structured, semi-structured, and unstructured data. Turning raw data into insights requires several tasks, including data collection, data storage and retrieval, data wrangling, data analysis using statistical techniques and machine learning, and data visualization.

It is predicted that by 2024 there will be nearly 150 zettabytes of data. The data deluge continues to challenge us. Large amounts of data are produced on the Web (e.g., social media). Enterprise data lakes and electronic health record systems contain massive amounts of sensitive customer and patient data. Sensors and Internet of Things (IoT) devices produce enormous amounts of data at a very high rate. As the price of whole genome sequencing continues to drop, healthcare systems will face the challenge of managing massive amounts of genomic data in the near future. In recent years, machine learning (ML), deep learning (DL), and natural language processing (NLP) have become ubiquitous in commercial applications and services such as search, recommendation, image understanding, and speech recognition.

However, the explosion in the volume of structured (e.g., relational databases), semi-structured (e.g., graphs), and unstructured data (e.g., web pages, images, videos) poses serious technical challenges for data science research and applications. This Research Topic focuses on novel approaches and techniques, including scalable algorithms, models, and systems, for data science tasks on large, complex datasets.

The scope of this Research Topic includes theoretical advances, systems design, and algorithmic contributions in data science. We seek high-quality contributions of the following types: Original Research, Methods, Technology and Code, and Data Report. Topics of interest include, but are not limited to:

  - Scalable approaches for data collection and data wrangling
  - Scalable approaches for storage and retrieval of structured, semi-structured, and unstructured data
  - Scalable ML/DL/NLP techniques for data science tasks
  - Scalable data visualization techniques

Topic Editors
Praveen Rao, University of Missouri-Columbia, USA
Haridimos Kondylakis, ICS-FORTH, Greece
Sanjay Madria, Missouri University of Science and Technology, USA
Kostas Stefanidis, Tampere University, Finland
Bongki Moon, Seoul National University, South Korea

New Submission Deadline
Manuscript: 01 July 2021

Submission Guidelines

Contact Information
In case of questions, please contact Prof. Praveen Rao (