- Paolo Papotti (Arizona State University, USA)
- Shazia Sadiq (The University of Queensland, Australia)
- Date & Time: Thursday May 19, 2016, 11:00am – 12:30pm.
- Panel Description:
The growing reliance on data-driven enterprise has led to unprecedented
investment in big data initiatives. As companies intensify their efforts to get
value from big data, the growth in the amount of data being managed continues
at an exponential rate, leaving organizations with a massive footprint of
unexplored, unfamiliar datasets. On February 8th, 2015, a group of global
thought leaders from the database research community outlined the grand
challenges in getting value from big data [Stoyanovich and Suchanek, 2015].
The key message was the need to develop the capacity to
"understand how the quality of data affects the quality of the insight
we derive from it". At the same time, data quality discovery and repair
techniques are highly contextual: their success depends on how well they fit
both the data quality dimension (e.g., completeness, consistency, timeliness,
accuracy, reliability) and the type of data (e.g., structured/relational,
text, spatial, time series, social/graph, RDF/web). The techniques include
those driven by logic, such as data dependency constraints and integrity
rules; those driven by numerical and statistical approaches; and several
others, such as probabilistic, learning, and empirical methods.
In this session, the panelists will debate the fitness of these techniques in
diverse settings. Attendees can expect to gain a broad understanding of key
approaches to data quality and to become more familiar with emerging
techniques and lessons learnt in different settings.
- Felix Naumann (Hasso-Plattner-Institut, Germany)
- Tamraparni Dasu (AT&T Labs Research, USA)
- Juliana Freire (New York University Tandon School of Engineering, USA)
- Ihab F. Ilyas (University of Waterloo, Canada)
- Eric Simon (SAP, France)