The Object Identification Framework

Neiling, Mattis; Jurk, Steffen
KDD03 Workshop on Data Cleaning, Record Linkage, and Object Consolidation

The object identification problem in databases has been
tackled in many different ways, e.g. Record Linkage or the
Sorted Neighborhood Method. We present a framework that
allows the evaluation of the competing approaches. Appropriate experiments on a real-world database have been conducted.

Object Identification Quality

Neiling, M; Jurk, S; Lenz, HJ; Naumann, F
Proc. DQCIS Workshop, 2003

Research and industry have tackled the object identification
problem of data integration in many different ways.
This paper presents a framework that allows the evaluation of
competing approaches. To this end, complexity measures and
data characteristics are introduced, which reflect the hardness
of a given object identification problem. All characteristics can be
estimated using simple SQL queries and basic calculations.
Following the principle of benchmark definitions, we specify a test
framework. It consists of a test database and its characteristics,

