Object Identification Quality

Authors: 
Neiling, M; Jurk, S; Lenz, HJ; Naumann, F
Year: 
2003
Venue: 
Proc. DQCIS Workshop, 2003
URL: 
http://cis.cs.tu-berlin.de/~mneiling/publications/Neiling_et_al@DQCIS2003.pdf
Citations: 
19
Citations range: 
10 - 49
Attachment: 
Neiling2003ObjectIdentification.pdf (171.68 KB)

Research and industry have tackled the object identification
problem of data integration in many different ways.
This paper presents a framework that allows the evaluation of
competing approaches. To this end, complexity measures and
data characteristics are introduced, which reflect the hardness
of a given object identification problem. All characteristics can be
estimated using simple SQL queries and simple calculations.
Following the principle of benchmark definitions, we specify a test
framework. It consists of a test database and its characteristics,
quality criteria, and a test specification. Adequate measures
needed for the correctness criterion of the benchmark are given.
A running example of the Berlin Online Apartment-Advertisements
database (BOA) illustrates the approach. The BOA database is
freely available at www.wiwiss.fu-berlin.de/lenz/boa/.
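
To illustrate the claim that such characteristics can be estimated with simple SQL queries and simple calculations, here is a minimal sketch. The abstract does not name the paper's actual measures, so the characteristic used here (the ratio of distinct to non-null values in a column) is only an assumed stand-in. The table `ads`, its columns, and the rows are hypothetical toy data, not taken from the BOA database.

```python
import sqlite3


def distinct_ratio(conn, table, column):
    """Return distinct / non-null value count for a column (1.0 = key-like).

    Assumed stand-in for a data characteristic; not the paper's own measure.
    """
    # Identifiers cannot be bound as SQL parameters, so only pass trusted names.
    sql = f'SELECT COUNT(DISTINCT "{column}"), COUNT("{column}") FROM "{table}"'
    distinct, total = conn.execute(sql).fetchone()
    return distinct / total if total else 0.0


def build_demo_db():
    """Create an in-memory table with made-up apartment advertisements."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ads ("
        "id INTEGER PRIMARY KEY, district TEXT, rooms INTEGER, rent REAL)"
    )
    conn.executemany(
        "INSERT INTO ads (district, rooms, rent) VALUES (?, ?, ?)",
        [
            ("Mitte", 2, 650.0),
            ("Mitte", 2, 650.0),   # likely duplicate of the previous ad
            ("Pankow", 3, 800.0),
            ("Mitte", 1, 400.0),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    for col in ("id", "district", "rooms"):
        print(col, distinct_ratio(conn, "ads", col))
```

In this sketch, a column with a low ratio says little about which records describe the same object, while a ratio near 1.0 suggests a key-like attribute. How such figures relate to the hardness of an identification problem is defined in the paper itself, not here.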