Fifth IEEE International Conference on Data Mining (ICDM'05)
Houston, Texas
November 27-November 30
ISBN: 0-7695-2278-5
Wensheng Wu, University of Illinois at Urbana-Champaign
AnHai Doan, University of Illinois at Urbana-Champaign
Clement Yu, University of Illinois at Chicago
We consider the problem of integrating a large number of interface schemas over the Deep Web. The scale of the problem and the diversity of the sources present serious challenges to the conventional manual or rule-based approaches to schema integration. To address these challenges, we propose a novel formulation of schema integration as an optimization problem, with the objective of maximally satisfying the constraints given by individual schemas. Since the optimization problem can be shown to be NP-complete, we develop a novel approximation algorithm, LMax, which builds the unified schema via recursive applications of clustering aggregation. We further extend LMax to handle the irregularities frequently occurring among the interface schemas. Extensive evaluation on real-world data sets shows the effectiveness of our approach.
Wensheng Wu, AnHai Doan, Clement Yu, "Merging Interface Schemas on the Deep Web via Clustering Aggregation," Fifth IEEE International Conference on Data Mining (ICDM'05), pp. 801-804, 2005