Please use this identifier to cite or link to this item: http://hdl.handle.net/10451/14158
Title: Semantic Similarity Match for Data Quality
Author: Martins, Fernando
Falcão, André
Couto, Francisco M.
Keywords: semantic similarity
data cleaning
data quality
wordnet
similarity match
Date: Oct-2007
Publisher: Department of Informatics, University of Lisbon
Report Series No.: di-fcul-tr-07-25
Abstract: Data quality is a critical aspect of applications that support business operations. Often entities are represented more than once in data repositories. Since duplicate records do not share a common key, they are hard to detect. Duplicate detection over text is usually performed using lexical approaches, which do not capture text sense. The difficulties increase when the duplicate detection must be performed using the text sense. This work presents a semantic similarity approach, based on a text sense matching mechanism, that performs the detection of text units which are similar in sense. The goal of the proposed semantic similarity approach is therefore to perform the duplicate detection task in a data quality process.
URI: http://hdl.handle.net/10451/14158
http://repositorio.ul.pt/handle/10455/3050
Appears in collections: FC-DI - Technical Reports

Files in this item:
File        Description    Size         Format
07-25.pdf                  215.96 kB    Adobe PDF



All items in the repository are protected by copyright law, with all rights reserved.