Author/Title	Year	Citations	Added
Nelder, J.A.; Mead, R. A simplex method for function minimization	1965	16651	Sep06
Kohavi, R.; John, G.H. Wrappers for Feature Subset Selection	1997	4115	Sep06
Fellegi, I.P.; Sunter, A.B. A Theory for Record Linkage	1969	1444	Oct06
Navarro, G A guided tour to approximate string matching	2001	1369	May07
Cohen, WW; Ravikumar, P; Fienberg, SE A comparison of string distance metrics for name-matching tasks	2003	1091	Sep06
Elmagarmid, Ahmed; Ipeirotis, Panagiotis; Verykios, Vassilios Duplicate Record Detection: A Survey	2007	785	Oct06
Rahm, Erhard; Do, Hong Hai Data Cleaning: Problems and Current Approaches	2000	778	Aug06
Hernandez, M.A.; Stolfo, S.J. The merge/purge problem for large databases	1995	751	Sep06
Winkler, W.E. The state of record linkage and current research problems	1999	634	Oct06
Hernandez, MA; Stolfo, S. Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem	1998	604	Sep06
Bilenko, M; Mooney, RJ Adaptive duplicate detection using learnable string similarity measures	2003	573	Sep06
McCallum, A; Nigam, K; Ungar, LH Efficient clustering of high-dimensional data sets with application to reference matching	2000	550	Apr07
Ristad, ES; Yianilos, PN Learning string-edit distance	1998	498	Oct06
Monge, A.; Elkan, C. The field matching problem: Algorithms and applications	1996	443	Oct06
Cohen, WW Integration of heterogeneous databases without common domains using queries based on textual similarity	1998	438	Sep06
Gravano, L.; Ipeirotis, P.G.; Jagadish, H.V.; Koudas, N.; Muthukrishnan, S.; Srivastava, D. Approximate string joins in a database (almost) for free	2001	411	Oct06
Dong, X.; Halevy, A.; Madhavan, J. Reference reconciliation in complex information spaces	2005	380	Sep06
Chaudhuri, S.; Ganjam, K.; Ganti, V.; Motwani, R. Robust and efficient fuzzy match for online data cleaning	2003	378	Sep06
Monge, A.E.; Elkan, C. An efficient domain-independent algorithm for detecting approximately duplicate database records	1997	364	Oct06
Bilenko, M; Mooney, R; Cohen, W; Ravikumar, P; Fienberg, S Adaptive name matching in information integration	2003	339	Nov07
Guarino, N; Welty, CA An overview of OntoClean	2009	337	Jul11
Ananthakrishna, R; Chaudhuri, S; Ganti, V Eliminating fuzzy duplicates in data warehouses	2002	334	Sep06
Galhardas, H; Florescu, D; Shasha, D; Simon, E; Saita, C. Declarative data cleaning: Language, model, and algorithms	2001	323	Sep06
Mann, GS; Yarowsky, D Unsupervised Personal Name Disambiguation	2003	283	Apr07
Cohen, William; Richman, Jacob Learning to match and cluster large high-dimensional data sets for data integration	2002	274	Oct06
Pasula, H; Marthi, B; Milch, B; Russell, S; Shpitser, I Identity uncertainty and citation matching	2003	267	Apr07
Hjaltason, G.R.; Samet, H. Incremental distance join algorithms for spatial databases	1998	250	Oct06
Naumann, F; Leser, U; Freytag, J Quality-driven Integration of Heterogeneous Information Systems	1999	246	Sep06
Bhattacharya, I.; Getoor, L. Collective Entity Resolution in Relational Data	2007	238	Apr07
Tejada, S; Knoblock, CA; Minton, S Learning object identification rules for information integration	2001	219	May08
Tejada, S Learning Object Identification Rules for Information Integration	2002	219	Oct06
Bitton, D.; DeWitt, D.J. Duplicate record elimination in large data files	1983	208	Oct06
Han, H; Giles, L; Zha, H; Li, C; Tsioutsiouliklis, K Two supervised learning approaches for name disambiguation in author citations	2004	203	Feb09
Baxter, R; Christen, P; Churches, T A comparison of fast blocking methods for record linkage	2003	203	Apr07
Tejada, S; Knoblock, CA; Minton, S Learning domain-independent string transformation weights for high accuracy object identification	2002	202	Sep06
Chaudhuri, S.; Ganti, V.; Kaushik, R. A Primitive Operator for Similarity Joins in Data Cleaning	2006	201	Oct06
Winkler, W.E. Advanced methods for record linkage	1994	196	Oct06
Cohen, W.W. Data integration using similarity joins and a word-based information representation language	2000	195	Oct06
Bleiholder, J; Naumann, F Data fusion	2008	185	Mar09
Galhardas, H; Florescu, D; Shasha, D; Simon, E AJAX: an extensible data cleaning tool	2000	175	Sep06
Cohen, W.W.; Hirsh, H. Joins that generalize: text classification using Whirl	1998	161	Sep06
Jin, L.; Li, C.; Mehrotra, S. Efficient record linkage in large data sets	2003	154	Oct06
Volz, J; Bizer, C; Gaedke, M; Kobilarov, G Silk - a link discovery framework for the web of data	2009	151	May10
Lee, M.L.; Ling, T.W.; Low, W.L. IntelliClean: a knowledge-based intelligent data cleaner	2000	149	Oct06
Bhattacharya, I.; Getoor, L. A Latent Dirichlet Model for Unsupervised Entity Resolution	2006	144	Apr07
Chaudhuri, Surajit; Ganti, Venkatesh; Motwani, Rajeev Robust Identification of Fuzzy Duplicates	2005	140	Aug06
Maletic, J.I.; Marcus, A. Data Cleansing: Beyond Integrity Analysis	2000	138	Sep06
Gu, L; Baxter, R; Vickers, D; Rainsford, C Record linkage: Current practice and future directions	2003	137	Sep06
Gravano, L.; Ipeirotis, P.G.; Koudas, N.; Srivastava, D. Text joins in an RDBMS for web data integration	2003	129	Oct06
Bohannon, Philip; Fan, Wenfei; Geerts, Floris; Jia, Xibei; Kementsietsidis, Anastasios Conditional Functional Dependencies for Data Cleaning	2007	122	Jan08

Data Cleaning publication categorizer
