CiteSeerX: an Architecture and Web Service Design for an Academic Document Search Engine

Authors: 
Li, Huajing; Councill, Isaac; Lee, Wang-Chien; Giles, C. Lee
Year: 
2006
Venue: 
15th International World Wide Web Conference (WWW 2006), poster, 2006
URL: 
http://clgiles.ist.psu.edu/papers/www2006-citeseerx.pdf
Citations: 
29
Citations range: 
10 - 49
Attachment: 
Li2006CiteSeerXanArchitectureand.pdf (224.19 KB)

CiteSeer is a scientific literature digital library and search engine that automatically crawls and indexes scientific documents in the field of computer and information science. After serving as a public search engine for nearly ten years, CiteSeer is starting to have scaling problems in handling more documents, adding new features, and supporting more users. Its monolithic architecture prevents it from effectively making use of new web technologies and providing new services. After analyzing the problems of the current system, we propose a new architecture and data model, CiteSeerX, that will overcome the existing problems and provide scalability, better performance, new services, and new system features.