Search: no paper type

Author/Title	Year	Citations	Added
Nelder, J.A.; Mead, R. A simplex method for function minimization	1965	16651	Sep06
Kohavi, R.; John, G.H. Wrappers for Feature Subset Selection	1997	4115	Sep06
Fellegi, I.P.; Sunter, A.B. A Theory for Record Linkage	1969	1444	Oct06
Hernandez, M.A.; Stolfo, S.J. The merge/purge problem for large databases	1995	751	Sep06
Hernandez, MA; Stolfo, S. Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem	1998	604	Sep06
McCallum, A; Nigam, K; Ungar, LH Efficient clustering of high-dimensional data sets with application to reference matching	2000	550	Apr07
Ristad, ES; Yianilos, PN Learning string-edit distance	1998	498	Oct06
Monge, A.; Elkan, C. The field matching problem: Algorithms and applications	1996	443	Oct06
Cohen, WW Integration of heterogeneous databases without common domains using queries based on textual similarity	1998	438	Sep06
Chaudhuri, S.; Ganjam, K.; Ganti, V.; Motwani, R. Robust and efficient fuzzy match for online data cleaning	2003	378	Sep06
Monge, A.E.; Elkan, C. An efficient domain-independent algorithm for detecting approximately duplicate database records	1997	364	Oct06
Bilenko, M; Mooney, R; Cohen, W; Ravikumar, P; Fienberg, S Adaptive name matching in information integration	2003	339	Nov07
Cohen, William; Richman, Jacob Learning to match and cluster large high-dimensional data sets for data integration	2002	274	Oct06
Hjaltason, G.R.; Samet, H. Incremental distance join algorithms for spatial databases	1998	250	Oct06
Bhattacharya, I.; Getoor, L. Collective Entity Resolution in Relational Data	2007	238	Apr07
Tejada, S Learning Object Identification Rules for Information Integration	2002	219	Oct06
Bitton, D.; DeWitt, D.J. Duplicate record elimination in large data files	1983	208	Oct06
Cohen, W.W. Data integration using similarity joins and a word-based information representation language	2000	195	Oct06
Bhattacharya, I.; Getoor, L. A Latent Dirichlet Model for Unsupervised Entity Resolution	2006	144	Apr07
Kalashnikov, D.V.; Mehrotra, S.; Chen, Z. Exploiting relationships for domain-independent data cleaning	2005	111	Oct06
Kalashnikov, DV; Mehrotra, S Domain-independent data cleaning via analysis of entity-relationship graph	2006	98	Apr07
Monge, AE Matching Algorithms within a Duplicate Detection System	2000	91	Apr07
Low, WL; Lee, ML; Ling, TW A knowledge-based approach for duplicate elimination in data cleaning	2001	88	Apr07
Song, Y; Huang, J; Councill, IG; Li, J; Giles, CL Efficient topic-based unsupervised name disambiguation	2007	76	Nov07
Xi, W; Fox, EA; Fan, W; Zhang, B; Chen, Z; Yan, J; Zhuang, D SimFusion: measuring similarity using unified relationship matrix	2005	75	Oct06
Karger, DR; Jones, W Data unification in personal information management	2006	71	Apr07
Borgman, CL; Siegfried, SL Getty's Synoname and its cousins: A survey of applications of personal name-matching algorithms	1992	69	Apr07
Doan, AnHai; Lu, Ying; Lee, Yoonkyong; Han, Jiawei Object Matching for Information Integration: A Profiler-Based Approach	2003	68	Sep06
Chen, Z; Kalashnikov, DV; Mehrotra, S Exploiting relationships for object consolidation	2005	68	Sep06
Verykios, V. S.; Moustakides, G. V.; Elfeky, M. G. A Bayesian decision model for cost optimal record matching	2003	66	Oct06
Tan, YF; Kan, MY; Lee, D Search engine driven author disambiguation	2006	63	Apr07
Lee, Dongwon; On, Byung-Won; Kang, Jaewoo; Park, Sanghyun Effective and scalable solutions for mixed and split citation problems in digital libraries	2005	61	Oct06
Bhattacharya, Indrajit; Getoor, Lise Relational clustering for multi-type entity resolution	2005	49	Oct06
Lee, M.L.; Hsu, W.; Kothari, V. Cleaning the spurious links in data	2004	40	Sep06
Schallehn, E; Sattler, KU; Saake, G Efficient similarity-based operations for data integration	2004	39	Apr07
Bhattacharya, I; Getoor, L; Licamele, L Query-time entity resolution	2006	39	Sep06
Chua, CEH; Chiang, RHL; Lim, EP Instance-based attribute identification in database integration	2003	36	Sep06
Aizawa, A; Oyama, K A Fast Linkage Detection Scheme for Multi-Source Information Integration	2005	35	Nov07
Benjelloun, O.; Garcia-Molina, H.; Gong, H.; Kawai, H; Larson, T.E.; Menestrina, D.; Thavisomboon, S. D-Swoosh: A Family of Algorithms for Generic, Distributed Entity Resolution	2007	32	Aug07
Herbert, KG; Gehani, NH; Piel, WH; Wang, JTL; Wu, CH BIO-AJAX: an extensible framework for biological data cleaning	2004	30	Mar07
Galhardas, H; Florescu, D; Shasha, D; Simon, E; Saita, CA Improving data cleaning quality using a data lineage facility	2001	30	Oct06
Quass, D.; Starkey, P. Record linkage for genealogical databases	2003	24	Sep06
Michalowski, M; Thakkar, S; Knoblock, CA Exploiting secondary sources for automatic object consolidation	2003	23	Apr07
Ganesh, M.; Srivastava, J.; Richardson, T. Mining entity-identification rules for database integration	1996	21	Sep06
Verykios, VS; Elfeky, MG; Elmagarmid, AK; et al. On The Accuracy and Completeness of The Record Matching Process	2000	16	Apr07
Kalashnikov, DV; Mehrotra, S A probabilistic model for entity disambiguation using relationships	2005	16	Sep06
Nuray-Turan, R; Kalashnikov, DV; Mehrotra, S Self-tuning in graph-based reference disambiguation	2007	13	Nov07
Goyal, P Duplicate record identification in bibliographic databases	1987	12	Apr07
Snae, C; Diaz, BM An interface for mining genealogical nominal data using the concept of linkage and a hybrid name matching algorithm	2002	9	Apr07
Jakoniene, V; Rundqvist, D; Lambrix, P A method for similarity-based grouping of biological data	2006	8	Mar07

Data Cleaning publication categorizer
