DBWorld Message

___________________________________________________________
 
                CALL FOR PAPERS
 
    ACM Journal of Data and Information Quality
 Special Issue on Deep Learning for Data Quality
 
___________________________________________________________

* Guest Editors:
-Paolo Papotti, EURECOM (France)
-Donatello Santoro, Università degli Studi della Basilicata (Italy)
-Saravanan Thirumuruganathan, QCRI (Qatar)

* Context:
Deep learning (DL) has recently been used successfully for monitoring 
and improving data quality (DQ). Examples include data integration 
tasks such as entity resolution and schema matching, data cleaning 
tasks such as error detection and repair, and data curation in general. 
The data curation community has successfully leveraged deep learning 
techniques spanning from word embeddings to transformers to achieve 
state-of-the-art performance on well-established data quality benchmarks. 
Nevertheless, there is still an open debate on which technical solution 
performs best for relational data and under which setting. 

Despite a promising start, deep learning for data quality has a long 
way to go before reaching the human-level performance that DL has achieved 
in domains such as computer vision, natural language processing, and 
speech recognition. While there have been some substantial improvements 
in specific tasks such as entity resolution and data repair/imputation, 
many of the other data quality tasks (such as data discovery, data 
profiling, data integration, record fusion) are yet to fully benefit from 
the DL revolution. Also, it is not clear how to push DL techniques to 
get the same level of adaptation achieved by more traditional logic-based 
methods. For example, the interpretability of the models is a key stumbling block. 
How can one develop DQ explanations that non-experts can consume? 
Should an explanation be generated individually for each error, 
or can explanations be summarized so that the user gets a high-level overview? 
Finally, DL data quality tools need novel explanation algorithms, which 
are not a priority for DL researchers because the architecture is quite specific.

This special issue focuses on deep learning used for assessing and improving 
the quality of data. The issue is therefore addressed to members of the 
data science community who propose novel methods, architectures, and algorithms 
capable of integrating, cleaning, and profiling relational data sources 
with supervised and unsupervised approaches. 

* Topics:
The goal of this special issue is to collect recent advances, innovations, 
and practices in ML, data and software engineering for building techniques, 
solutions, and systems that support users in assessing and improving 
relational data quality. The topics of interest are inspired by the 
themes above and include, but are not limited to:
 - Deep learning methods for data integration and data cleaning 
 - Deep learning methods for metadata discovery/profiling, including constraint discovery
 - Making deep learning methods for data quality interpretable 
 - Experimental studies of deep learning methods for data quality
 - Deep learning methods for curating data in domain-specific applications
 - Scalability of deep learning methods for data quality (speeding up DL for DQ using GPUs)
 - Characterization of data quality tasks that are more amenable to deep learning
 - Reducing the need for large amounts of training data in supervised approaches 
   (weak- and self-supervision for data quality)
 - Combination of logic-based and DL-based methods for data quality

* Expected contributions:
We welcome four types of research contributions:
 - Full research papers describing a novel contribution to the field (up to 25 pages)
 - Experience papers discussing important lessons learned (up to 20 pages)
 - Vision and Challenge papers (up to 7 pages)
 - Survey papers (up to 30 pages)

* Submission Format:
JDIQ welcomes manuscripts that extend prior published work, provided they 
contain at least 30% new material, and that the significant new contributions 
are clearly identified in the introduction. 

Submission guidelines with LaTeX (preferred) or Word templates are available at:
http://jdiq.acm.org/authors.cfm#subm

To submit, select the paper type
"SI: Deep Learning for Data Quality"

* Important Dates:
- Submission deadline: March 1, 2021
- First notification: May 15, 2021
- Revised manuscripts deadline: July 15, 2021
- Final notification: September 15, 2021
- Camera-ready manuscripts: October 15, 2021
- Estimated publication date: January 2022