WebIQ: Learning from the Web to Match Query Interfaces on the Deep Web

Authors: 
Wu, W.; Doan, A.; Yu, C.
Year: 
2006
Venue: 
ICDE, 2006
URL: 
http://www.dit.unitn.it/~p2p/RelatedWork/Matching/icde06-webiq.pdf

Integrating Deep Web sources requires highly accurate semantic matches between the attributes of the source query interfaces. These matches are usually established by comparing the similarities of the attributes' labels and instances. However, attributes on query interfaces often have few or no data instances. This pervasive lack of instances seriously reduces the accuracy of current matching techniques. To address this problem, we describe WebIQ, a solution that learns from both the Surface Web and the Deep Web to automatically discover instances for interface attributes. For this purpose, WebIQ extends question answering techniques commonly used in the AI community. We describe how to incorporate WebIQ into current interface matching systems. Extensive experiments over five real-world domains show the utility of WebIQ. In particular, the results show that acquired instances help improve matching accuracy from 89.5% F-1 to 97.5%, at only a modest runtime overhead.
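
The abstract's point that matching relies on both labels and instances, and suffers when instances are missing, can be illustrated with a minimal sketch. This is not WebIQ's algorithm: the Jaccard measure, the equal weighting, the function names, and the sample attributes are all assumptions chosen only for illustration.

```python
from typing import Iterable, Set


def tokens(label: str) -> Set[str]:
    """Lowercase a label and split it into a set of word tokens."""
    return set(label.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets; defined as 0.0 when either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def attribute_similarity(
    label_a: str,
    instances_a: Iterable[str],
    label_b: str,
    instances_b: Iterable[str],
    label_weight: float = 0.5,  # hypothetical weighting, not from the paper
) -> float:
    """Combine label similarity and instance similarity into one score."""
    label_sim = jaccard(tokens(label_a), tokens(label_b))
    inst_sim = jaccard(
        {i.lower() for i in instances_a},
        {i.lower() for i in instances_b},
    )
    return label_weight * label_sim + (1.0 - label_weight) * inst_sim


# Hypothetical airfare attributes: the labels share no words, so only
# instances can reveal that the two attributes mean the same thing.
departure_cities = ["Chicago", "Boston", "Denver"]

# No instances on the first interface: nothing to compare.
without_instances = attribute_similarity(
    "From city", [], "Departure", departure_cities
)

# Instances acquired for the first interface (e.g. from the Web).
with_instances = attribute_similarity(
    "From city", ["Chicago", "Boston", "Seattle"], "Departure", departure_cities
)

print(without_instances, with_instances)
```

In this toy setting the score is zero while the first attribute has no instances and becomes positive once instances are supplied, which is the kind of gap that instance acquisition is meant to close.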