Key Concepts in the ChoiceMaker 2 Record Matching System

Borthwick, A; Buechi, M; Goldberg, A
Proceedings of the First Workshop on Data Cleaning

We describe ChoiceMaker 2, an innovative record matching system
that we developed at ChoiceMaker Technologies (CMT). First, we
explain how we use a machine learning technique known as
maximum entropy modeling to tune the system to the problem at
hand. Second, we present the ClueMaker™ programming language,
which is used to express record matching characteristics. Third,
we outline our method for testing record matching systems and
show how our IDE facilitates this process.

