Creator | emorgan |
Date created | 2022-10-13 |
Number of items | 35 |
Number of words | 4067181 |
Average readability score | 57 |
Bibliographics | plain text; HTML; JSON |
Other files | stopwords; entire corpus |
unigrams |
bigrams |
nouns |
proper nouns |
pronouns |
verbs |
adjectives |
adverbs |
any entity |
persons |
geo-political entities |
organizations |
The next step is for you to ask yourself some sort of question, and apply it to this data set. There are quite a number of ways to do this.
The Distant Reader and the Distant Reader Toolbox take an almost arbitrary amount of text as input and output data sets -- affectionately known as "study carrels". The contents of this page were created from a study carrel.
Each study carrel is made up of the same set of folders and files. These folders and files contain "features" of the original documents such as parts-of-speech, named entities, and statistically significant keywords. For example, all of the original documents have been saved in the cache folder, and all of the plain-text versions of the original documents have been saved in the txt directory. Since almost all of the files in a study carrel are either plain-text files or tab-delimited files, Distant Reader study carrels can be accessed and used by almost any text editor, word processor, spreadsheet, database, or analysis application. The following folders contain information of particular interest:
There are a few files of note:
There are quite a number of graphical user interface (GUI) applications you can apply to a carrel's content:
Finally, if you have Python installed, then you can install the Reader Toolbox (pip install reader-toolbox), and use the rdr command from the command line to do many of the things the GUI applications do and more. There is also a set of Jupyter Notebooks demonstrating how the Toolbox can be extended and used in conjunction with other Python modules (like Pandas, SQLite, WordNet, etc.).
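Because a carrel is mostly plain text, you can also explore one with nothing but the Python standard library. The following is a minimal sketch, not part of the Reader Toolbox: it assumes the plain-text documents live in a txt directory inside the carrel (as described above) and that each file ends in .txt. The function name count_unigrams, the simple tokenizing pattern, and the optional stopword set are all illustrative choices of this sketch, and the counts it produces will not necessarily match the carrel's own unigram reports.

```python
import re
import sys
from collections import Counter
from pathlib import Path

# Hypothetical tokenizer: runs of ASCII letters, with an optional apostrophe part.
WORD = re.compile(r"[a-z]+(?:'[a-z]+)?")


def count_unigrams(carrel, stopwords=frozenset()):
    """Count lowercased words across every *.txt file in <carrel>/txt."""
    counts = Counter()
    for path in sorted(Path(carrel).joinpath("txt").glob("*.txt")):
        # Replace undecodable bytes rather than stopping on one bad file.
        text = path.read_text(encoding="utf-8", errors="replace").lower()
        counts.update(w for w in WORD.findall(text) if w not in stopwords)
    return counts


if __name__ == "__main__":
    # Usage (the carrel path is whatever folder you unpacked):
    #   python unigrams.py path/to/carrel
    carrel = sys.argv[1] if len(sys.argv) > 1 else "."
    for word, n in count_unigrams(carrel).most_common(10):
        print(f"{word}\t{n}")
```

Saved under a name such as unigrams.py and pointed at a carrel, the script prints the ten most frequent words as tab-delimited lines, which can then be opened in a spreadsheet like the carrel's own tab-delimited files. Passing a set of stopwords to count_unigrams filters out common function words before counting.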
For more information, please see the complete manual.
Happy reading!
Eric Lease Morgan <emorgan@nd.edu>
Navari Family Center for Digital Scholarship
University of Notre Dame