Matching hierarchies using shared objects

Ikeda, R; Zhao, K; Garcia-Molina, H
Proc. ECDL

One of the main challenges in integrating two hierarchies (e.g., of books or web pages) is determining the correspondence between the edges of each hierarchy. Traditionally, this process, which we call hierarchy matching, is done by comparing the text associated with each edge. In this paper, we instead use the placement of objects present in both hierarchies to infer how the hierarchies relate. We present two algorithms that, given a hierarchy with known facets (attribute-value pairs that define which objects are placed under an edge), determine feasible facets for a second hierarchy based on the shared objects. One algorithm is rule-based and the other is statistics-based. In the experimental section, we compare the results of the two algorithms and examine how their performance varies with the amount of noise in the hierarchies.
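The idea of inferring facets from shared objects can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not specify. It assumes a simplified model in which each edge of the first hierarchy carries exactly one facet, and each edge of the second hierarchy is just a named set of objects. The names `object_facets`, `infer_facets`, and `min_support` are hypothetical. With `min_support=1.0` the sketch behaves like a strict rule (a facet is feasible only if every shared object carries it); lower values give a simple frequency-based variant that tolerates some noise.

```python
from collections import Counter


def object_facets(h1_edges):
    """Map each object to the set of facets it carries in the first hierarchy.

    h1_edges: dict mapping a facet (attribute, value) to the set of objects
    placed under the edge with that facet (assumed model, one facet per edge).
    """
    facets = {}
    for facet, objects in h1_edges.items():
        for obj in objects:
            facets.setdefault(obj, set()).add(facet)
    return facets


def infer_facets(h1_edges, h2_edges, min_support=1.0):
    """Return, for each edge of the second hierarchy, the facets carried by at
    least a `min_support` fraction of the objects it shares with the first."""
    obj_facets = object_facets(h1_edges)
    result = {}
    for edge, objects in h2_edges.items():
        # Only objects present in both hierarchies carry any evidence.
        shared = [o for o in objects if o in obj_facets]
        counts = Counter(f for o in shared for f in obj_facets[o])
        result[edge] = {
            f for f, n in counts.items() if n / len(shared) >= min_support
        }
    return result


# Hypothetical toy data: book identifiers placed in two hierarchies.
h1 = {
    ("genre", "fiction"): {"b1", "b2", "b3"},
    ("lang", "en"): {"b1", "b2"},
    ("genre", "history"): {"b4"},
}
h2 = {"Novels": {"b1", "b2", "b3", "b9"}, "Past": {"b4"}}

strict = infer_facets(h1, h2)                      # rule-like: all shared objects agree
relaxed = infer_facets(h1, h2, min_support=0.6)    # frequency-based: tolerate noise
```

In this toy example, the strict setting assigns only `("genre", "fiction")` to `Novels`, while the relaxed threshold also admits `("lang", "en")`, since two of the three shared books carry it.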