
Management of data quality related problems: exploiting operational knowledge

Publication of Creating 010

Jan van Dijk, M.S. Bargh, R. Choenni | Article | Publication date: 01 July 2016
Dealing with data quality related problems is an important issue that all organizations face in realizing and sustaining data-intensive advanced applications. Upon detecting these problems in datasets, data analysts often register them in issue tracking systems so that they can be addressed later, categorically and collectively. Because there is no standard format for registering these problems, data analysts often describe them in natural language and subsequently rely on ad hoc, non-systematic, and expensive solutions to categorize and resolve the registered problems. In this contribution we present a formal description of an innovative architecture for resolving data quality problems, which semantically and dynamically maps the descriptions of data quality related problems to data quality attributes. This mapping reduces complexity, because the dimensionality of data quality attributes is far smaller than that of the natural language space, and enables data analysts to directly use the methods and tools.
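To make the idea of mapping free-text issue descriptions onto a small set of data quality attributes concrete, the following is a minimal sketch. It is not the architecture presented in the article, which maps descriptions semantically and dynamically; here a plain keyword lookup stands in for that step. The attribute names, the keyword lists, and the function `map_issue_to_attributes` are all illustrative assumptions.

```python
import re
from collections import Counter

# Hypothetical keyword lists. The attribute names are commonly cited data
# quality dimensions, not necessarily the set used by the authors.
ATTRIBUTE_KEYWORDS = {
    "completeness": {"missing", "empty", "null", "blank"},
    "accuracy": {"wrong", "incorrect", "invalid", "typo"},
    "consistency": {"inconsistent", "mismatch", "conflicting"},
    "timeliness": {"outdated", "stale", "late", "old"},
    "uniqueness": {"duplicate", "duplicated", "double"},
}


def map_issue_to_attributes(description):
    """Return the attributes whose keywords occur in the description,
    ordered by number of keyword hits (most hits first)."""
    tokens = re.findall(r"[a-z]+", description.lower())
    hits = Counter()
    for attribute, keywords in ATTRIBUTE_KEYWORDS.items():
        count = sum(1 for token in tokens if token in keywords)
        if count:
            hits[attribute] = count
    return [attribute for attribute, _ in hits.most_common()]


if __name__ == "__main__":
    issue = "Birth date is missing for many records and several rows are duplicated"
    print(map_issue_to_attributes(issue))
```

However the mapping is implemented, the point it illustrates is the reduction in dimensionality: an open-ended natural language description is reduced to a handful of attributes, each of which can then be handled with attribute-specific methods.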

Author(s) - affiliated with Rotterdam University of Applied Sciences
