Please use this identifier to cite or link to this item: http://hdl.handle.net/10451/14171
Title: Explicitly Involving the User in a Data Cleaning Process
Other titles: Support for User Involvement in Data Cleaning Applications (original title)
Authors: Galhardas, Helena
Lopes, Antónia
Santos, Emanuel
Keywords: Data Cleaning
User feedback
Data Transformation
Issue date: 23-Jul-2010
Series/Report no.: 2010;3
Abstract: Data cleaning and Extract-Transform-Load processes are usually modeled as graphs of data transformations. These graphs typically involve a large number of data transformations and must handle large amounts of data. The involvement of the users responsible for executing the corresponding programs over real data is important for tuning data transformations and for manually correcting data items that cannot be treated automatically. In this paper, we extend the notion of data cleaning graph to better support user involvement in data cleaning processes. We propose that data cleaning graphs include: (i) data quality constraints, to help users identify the points of the graph and the records that need their attention, and (ii) manual data repairs, to represent the way users can provide the feedback required to manually clean some data items. We provide preliminary experimental results that show, for a real-world data cleaning process, the significant gains obtained with our approach in terms of the quality of the data produced and the cost incurred by users in data visualization and updating tasks.
Description: Reviewed by Mário Silva
URI: http://hdl.handle.net/10451/14171
Appears in collections: FC-DI - Technical Reports

Files in this item:
File: TR-2010-03.pdf
Size: 775.45 kB
Format: Adobe PDF

