Name Disambiguation Using Web Connection

Authors: 
Lu, Y; Nie, Z; Cheng, T; Gao, Y; Wen, JR
Year: 
2007
Venue: 
Proceedings of AAAI 2007 Workshop on Information Integration ...
URL: 
https://www.aaai.org/Papers/Workshops/2007/WS-07-14/WS07-14-010.pdf
Citations: 
4
Citations range: 
1 - 9

Abstract: 
Name disambiguation is an important challenge in data cleaning. In this paper, we focus on the problem that multiple real-world objects (e.g., authors, actors) in a dataset share the same name. We show that Web corpora can be exploited to significantly improve the accuracy (i.e. precision and recall) of name disambiguation. We introduce a novel approach called WebNaD (Web-based Name Disambiguation) to effectively measure and use the Web connection between different object appearances of the same name in the local dataset. Our empirical study done in the context of Libra, an academic search engine that indexes 1 million papers, shows the effectiveness of our approach.

... to the same person, it is very likely that they share some coauthors, references, or are indirectly related by a chain of relationships.

Figure 1. Three "Lei Zhang" are found in DBLP.