DH Blog @ Notre Dame
Learning about human expression through the use of computers

Tiny list of part-of-speech taggers
This is a tiny list of part-of-speech (POS) taggers, tools that mark which words in a sentence are nouns, verbs, adjectives, etc. Once parts of speech are marked, a reader can begin to analyze a text on a …

Quick And Dirty Website Analysis
This posting describes a quick and dirty way to begin doing website content analysis. A student here at Notre Dame wants to use computing and text mining to analyze a set of websites. After a bit of discussion and investigation, I …

Beth Plale, Yiming Sun, and the HathiTrust Research Center
Beth Plale and Yiming Sun, both from the HathiTrust Research Center, came to Notre Dame on Tuesday (May 7) to give the digital humanities group an update on some of the things happening at the Center. This posting documents some …

JSTOR Tool — A Programmatic sketch
JSTOR Tool is a “programmatic sketch”: a simple and rudimentary investigation of what might be done with datasets dumped from JSTOR's Data For Research. More specifically, a search was done against JSTOR for English language articles dealing with …

Matt Sag and copyright
Matt Sag (Loyola University Chicago) came to visit Notre Dame on Friday, April 12 (2013). His talk was on copyright and the digital humanities.
In his words, “I will explain how practices such as text mining present a fundamental challenge …

Copyright And The Digital Humanities
This Friday (April 12) the Notre Dame Digital Humanities group will be sponsoring a lunchtime presentation by Matthew Sag called Copyright And The Digital Humanities: I will explain how practices such as text mining present a fundamental challenge to our …

Digital humanities and the liberal arts
The abundance of freely available full text, combined with ubiquitous desktop and cloud computing, provides a means to inquire into the human condition in ways not previously possible. Such an environment offers a huge number of opportunities for libraries and …

Introduction to text mining
Text mining is a process for analyzing textual information. It can be used to find both patterns and anomalies in a corpus of one or more documents. Sometimes this process is called “distant reading”. It is very important to understand …

Genderizing names
I was wondering what percentage of subscribers to the Code4Lib mailing list were male and what percentage were female, and consequently I wrote a hack. This posting describes that hack: genderizing names. I own/moderate a mailing list called Code4Lib. …

Visualization and GIS
The latest “digital humanities” lunch presentations were on the topics of visualization and GIS. Kristina Davis (Center for Research Computing) gave our lunchtime crowd a tour of online resources for visualization. “Visualization is about transforming data into visual representations in …