id: en-wikipedia-org-3628
author:
title: CiteSeerX - Wikipedia
date:
pages:
extension: .html
mime: text/html
words: 1895
sentences: 184
flesch: 61
summary: CiteSeerX (originally called CiteSeer) is a public search engine and digital library for scientific and academic papers, primarily in the fields of computer and information science. CiteSeer is considered a predecessor of academic search tools such as Google Scholar and Microsoft Academic Search.[citation needed] CiteSeer-like engines and archives usually harvest documents only from publicly available websites and do not crawl publisher websites. CiteSeerX[2] is a public search engine, digital library, and repository for scientific and academic papers, with a primary focus on computer and information science.[2] However, CiteSeerX has recently been expanding into other scholarly domains such as economics and physics. CiteSeerX also shares its software, data, databases, and metadata with other researchers, currently via Amazon S3 and rsync.[5] Its modular open source architecture and software (previously available on SourceForge, now on GitHub) are built on Apache Solr and other Apache and open source tools, which allows it to serve as a testbed for new algorithms in document harvesting, ranking, indexing, and information extraction.
cache: ./cache/en-wikipedia-org-3628.html
txt: ./txt/en-wikipedia-org-3628.txt