Nelder, J.A.; Mead, R. A simplex method for function minimization |
1965 |
16651 |
Sep06 |
Kohavi, R.; John, G.H. Wrappers for Feature Subset Selection |
1997 |
4115 |
Sep06 |
Fellegi, I.P.; Sunter, A.B. A Theory for Record Linkage |
1969 |
1444 |
Oct06 |
Navarro, G. A guided tour to approximate string matching |
2001 |
1369 |
May07 |
Cohen, WW; Ravikumar, P; Fienberg, SE A comparison of string distance metrics for name-matching tasks |
2003 |
1091 |
Sep06 |
Elmagarmid, Ahmed; Ipeirotis, Panagiotis; Verykios, Vassilios Duplicate Record Detection: A Survey |
2007 |
785 |
Oct06 |
Rahm, Erhard; Do, Hong Hai Data Cleaning: Problems and Current Approaches |
2000 |
778 |
Aug06 |
Hernandez, M.A.; Stolfo, S.J. The merge/purge problem for large databases |
1995 |
751 |
Sep06 |
Winkler, W.E. The state of record linkage and current research problems |
1999 |
634 |
Oct06 |
Hernandez, MA; Stolfo, S. Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem |
1998 |
604 |
Sep06 |
Bilenko, M; Mooney, RJ Adaptive duplicate detection using learnable string similarity measures |
2003 |
573 |
Sep06 |
McCallum, A; Nigam, K; Ungar, LH Efficient clustering of high-dimensional data sets with application to reference matching |
2000 |
550 |
Apr07 |
Ristad, ES; Yianilos, PN Learning string-edit distance |
1998 |
498 |
Oct06 |
Monge, A.; Elkan, C. The field matching problem: Algorithms and applications |
1996 |
443 |
Oct06 |
Cohen, WW Integration of heterogeneous databases without common domains using queries based on textual similarity |
1998 |
438 |
Sep06 |
Gravano, L.; Ipeirotis, P.G.; Jagadish, H.V.; Koudas, N.; Muthukrishnan, S.; Srivastava, D. Approximate string joins in a database (almost) for free |
2001 |
411 |
Oct06 |
Dong, X.; Halevy, A.; Madhavan, J. Reference reconciliation in complex information spaces |
2005 |
380 |
Sep06 |
Chaudhuri, S.; Ganjam, K.; Ganti, V.; Motwani, R. Robust and efficient fuzzy match for online data cleaning |
2003 |
378 |
Sep06 |
Monge, A.E.; Elkan, C. An efficient domain-independent algorithm for detecting approximately duplicate database records |
1997 |
364 |
Oct06 |
Bilenko, M; Mooney, R; Cohen, W; Ravikumar, P; Fienberg, S Adaptive name matching in information integration |
2003 |
339 |
Nov07 |
Guarino, N; Welty, CA An overview of OntoClean |
2009 |
337 |
Jul11 |
Ananthakrishna, R; Chaudhuri, S; Ganti, V Eliminating fuzzy duplicates in data warehouses |
2002 |
334 |
Sep06 |
Galhardas, H; Florescu, D; Shasha, D; Simon, E; Saita, C. Declarative data cleaning: Language, model, and algorithms |
2001 |
323 |
Sep06 |
Mann, GS; Yarowsky, D Unsupervised Personal Name Disambiguation |
2003 |
283 |
Apr07 |
Cohen, William; Richman, Jacob Learning to match and cluster large high-dimensional data sets for data integration |
2002 |
274 |
Oct06 |
Pasula, H; Marthi, B; Milch, B; Russell, S; Shpitser, I Identity uncertainty and citation matching |
2003 |
267 |
Apr07 |
Hjaltason, G.R.; Samet, H. Incremental distance join algorithms for spatial databases |
1998 |
250 |
Oct06 |
Naumann, F; Leser, U; Freytag, J Quality-driven Integration of Heterogeneous Information Systems |
1999 |
246 |
Sep06 |
Bhattacharya, I.; Getoor, L. Collective Entity Resolution in Relational Data |
2007 |
238 |
Apr07 |
Tejada, S; Knoblock, CA; Minton, S Learning object identification rules for information integration |
2001 |
219 |
May08 |
Tejada, S Learning Object Identification Rules for Information Integration |
2002 |
219 |
Oct06 |
Bitton, D.; DeWitt, D.J. Duplicate record elimination in large data files |
1983 |
208 |
Oct06 |
Han, H; Giles, L; Zha, H; Li, C; Tsioutsiouliklis, K Two supervised learning approaches for name disambiguation in author citations |
2004 |
203 |
Feb09 |
Baxter, R; Christen, P; Churches, T A comparison of fast blocking methods for record linkage |
2003 |
203 |
Apr07 |
Tejada, S; Knoblock, CA; Minton, S Learning domain-independent string transformation weights for high accuracy object identification |
2002 |
202 |
Sep06 |
Chaudhuri, S.; Ganti, V.; Kaushik, R. A Primitive Operator for Similarity Joins in Data Cleaning |
2006 |
201 |
Oct06 |
Winkler, W.E. Advanced methods for record linkage |
1994 |
196 |
Oct06 |
Cohen, W.W. Data integration using similarity joins and a word-based information representation language |
2000 |
195 |
Oct06 |
Bleiholder, J; Naumann, F Data fusion |
2008 |
185 |
Mar09 |
Galhardas, H; Florescu, D; Shasha, D; Simon, E AJAX: an extensible data cleaning tool |
2000 |
175 |
Sep06 |
Cohen, W.W.; Hirsh, H. Joins that generalize: text classification using Whirl |
1998 |
161 |
Sep06 |
Jin, L.; Li, C.; Mehrotra, S. Efficient record linkage in large data sets |
2003 |
154 |
Oct06 |
Volz, J; Bizer, C; Gaedke, M; Kobilarov, G Silk - a link discovery framework for the web of data |
2009 |
151 |
May10 |
Lee, M.L.; Ling, T.W.; Low, W.L. IntelliClean: a knowledge-based intelligent data cleaner |
2000 |
149 |
Oct06 |
Bhattacharya, I.; Getoor, L. A Latent Dirichlet Model for Unsupervised Entity Resolution |
2006 |
144 |
Apr07 |
Chaudhuri, Surajit; Ganti, Venkatesh; Motwani, Rajeev Robust Identification of Fuzzy Duplicates |
2005 |
140 |
Aug06 |
Maletic, J.I.; Marcus, A. Data Cleansing: Beyond Integrity Analysis |
2000 |
138 |
Sep06 |
Gu, L; Baxter, R; Vickers, D; Rainsford, C Record linkage: Current practice and future directions |
2003 |
137 |
Sep06 |
Gravano, L.; Ipeirotis, P.G.; Koudas, N.; Srivastava, D. Text joins in an RDBMS for web data integration |
2003 |
129 |
Oct06 |
Bohannon, Philip; Fan, Wenfei; Geerts, Floris; Jia, Xibei; Kementsietsidis, Anastasios Conditional Functional Dependencies for Data Cleaning |
2007 |
122 |
Jan08 |