id: lawlesst-github-io-9159
author:
title: Automatically extracting keyphrases from text
date:
pages:
extension: .html
mime: text/html
words: 191
sentences: 15
flesch: 61
cache: ./cache/lawlesst-github-io-9159.html
txt: ./txt/lawlesst-github-io-9159.txt
summary:

Automatically extracting keyphrases from text

I've posted an explainer/guide to how we are automatically extracting keyphrases for Constellate, a new text analytics service from JSTOR and Portico. We define keyphrases as phrases of up to three words that are key, or important, to the overall subject matter of the document. "Keyphrase" is often used interchangeably with "keyword", but we are opting to use the former since it's more descriptive.

We did a fair amount of reading to grasp prior art in this area, since extracting keyphrases is a long-standing research topic in information retrieval and natural language processing. We ended up developing a custom solution based on term frequency in the Constellate corpus.

If you are interested in this work generally, and not just the Constellate implementation, Burton DeWilde has published an excellent primer on automated keyphrase extraction. More information about Constellate can be found here.

Disclaimer: this is a work-related post.
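
The post says only that the Constellate approach is based on term frequency in the corpus and that keyphrases are limited to three words; it gives no scoring details. The following is therefore a minimal sketch of one common term-frequency technique (a TF-IDF-style weighting), not the Constellate implementation. The function names, the tiny stopword list, and the smoothing in the score are all illustrative assumptions.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the post).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "for", "on", "from", "with"}


def candidate_phrases(text, max_words=3):
    """Yield lowercase phrases of 1..max_words consecutive words that contain no stopword."""
    tokens = re.findall(r"[a-z]+", text.lower())
    for n in range(1, max_words + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if not any(tok in STOPWORDS for tok in gram):
                yield " ".join(gram)


def extract_keyphrases(doc, corpus, top_k=5, max_words=3):
    """Rank the phrases in `doc` by in-document frequency, weighted by rarity in `corpus`."""
    # Document frequency: in how many corpus documents does each phrase occur?
    doc_freq = Counter()
    for other in corpus:
        doc_freq.update(set(candidate_phrases(other, max_words)))

    # Term frequency of each candidate phrase within the target document.
    term_freq = Counter(candidate_phrases(doc, max_words))

    n_docs = len(corpus)
    scores = {
        # Smoothed inverse document frequency (hypothetical choice of smoothing).
        phrase: count * math.log((1 + n_docs) / (1 + doc_freq[phrase]))
        for phrase, count in term_freq.items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_k]
```

For example, `extract_keyphrases(doc, corpus)` with `corpus` as a list of document strings returns up to five phrases from `doc`. Phrases that occur in every corpus document score zero and sink to the bottom of the ranking.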