mv: 'input-file.zip' and './input-file.zip' are the same file
Creating study carrel named subject-tories-freebo
Initializing database
Unzipping
Archive: input-file.zip
inflating: ./tmp/input/A65681.xml
inflating: ./tmp/input/xml2htm.xsl
inflating: ./tmp/input/metadata.csv
inflating: ./tmp/input/A55123.xml
inflating: ./tmp/input/A53021.xml
caution: excluded filename not matched: *MACOSX*
=== DIRECTORIES: ./tmp/input
=== DIRECTORY:
=== metadata file: ./tmp/input/metadata.csv
=== found metadata file
=== updating bibliographic database
Building study carrel named subject-tories-freebo
May 25, 2021 12:37:57 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed. See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io for optional dependencies.
May 25, 2021 12:37:57 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: Tesseract OCR is installed and will be automatically applied to image files unless you've excluded the TesseractOCRParser from the default parser. Tesseract may dramatically slow down content extraction (TIKA-2359). As of Tika 1.15 (and prior versions), Tesseract is automatically called. In future versions of Tika, users may need to turn the TesseractOCRParser on via TikaConfig.
May 25, 2021 12:37:57 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded. Please provide the jar on your classpath to parse sqlite files. See tika-parsers/pom.xml for the correct version.
INFO Starting Apache Tika 1.24.1 server
INFO Setting the server's publish address to be http://localhost:9998/
INFO Logging initialized @3603ms to org.eclipse.jetty.util.log.Slf4jLog
INFO jetty-9.4.27.v20200227; built: 2020-02-27T18:37:21.340Z; git: a304fd9f351f337e7c0e2a7c28878dd536149c6c; jvm 1.8.0_281-b09
INFO Started ServerConnector@3e74829{HTTP/1.1, (http/1.1)}{localhost:9998}
INFO Started @3708ms
WARN Empty contextPath
INFO Started o.e.j.s.h.ContextHandler@62010f5c{/,null,AVAILABLE}
INFO Started Apache Tika server at http://localhost:9998/
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
FILE: cache/A65681.xml OUTPUT: txt/A65681.txt
FILE: cache/A53021.xml OUTPUT: txt/A53021.txt
FILE: cache/A55123.xml OUTPUT: txt/A55123.txt
=== file2bib.sh ===
INFO Detecting media type for Filename: b'A53021.xml'
INFO Detecting media type for Filename: b'A65681.xml'
INFO Detecting media type for Filename: b'A55123.xml'
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
A65681 txt/../pos/A65681.pos
A53021 txt/../pos/A53021.pos
A65681 txt/../ent/A65681.ent
A53021 txt/../wrd/A53021.wrd
A53021 txt/../ent/A53021.ent
A65681 txt/../wrd/A65681.wrd
A55123 txt/../pos/A55123.pos
=== file2bib.sh ===
id: A53021
author: Honest trimmer.
title: A new-years-gift to the Tories, or, A few sober queries concerning them by an honest trimmer.
date: 1683
pages:
extension: .xml
txt: ./txt/A53021.txt
cache: ./cache/A53021.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 10
resourceName b'A53021.xml'
=== file2bib.sh ===
id: A65681
author: Colledge, Stephen, 1635?-1681.
title: A letter from Mr. Edward Whitaker to the Protestant joyner upon his bill being sent to Oxford
date: 1681
pages:
extension: .xml
txt: ./txt/A65681.txt
cache: ./cache/A65681.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 11
resourceName b'A65681.xml'
A55123 txt/../ent/A55123.ent
A55123 txt/../wrd/A55123.wrd
=== file2bib.sh ===
id: A55123
author: Phillips, John, 1631-1706.
title: A pleasant conference upon the Observator and Heraclitus together with a brief relation of the present posture of the French affairs.
date: 1682
pages:
extension: .xml
txt: ./txt/A55123.txt
cache: ./cache/A55123.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 33
resourceName b'A55123.xml'
Done mapping.
Reducing subject-tories-freebo
=== reduce.pl bib ===
id = A53021
author = Honest trimmer.
title = A new-years-gift to the Tories, or, A few sober queries concerning them by an honest trimmer.
date = 1683
pages =
extension = .xml
mime = application/xml
words = 1535
sentences = 268
flesch = 84
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A new-years-gift to the Tories, or, A few sober queries concerning them by an honest trimmer. A new-years-gift to the Tories, or, A few sober queries concerning them by an honest trimmer. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
cache = ./cache/A53021.xml
txt = ./txt/A53021.txt
=== reduce.pl bib ===
id = A55123
author = Phillips, John, 1631-1706.
title = A pleasant conference upon the Observator and Heraclitus together with a brief relation of the present posture of the French affairs.
date = 1682
pages =
extension = .xml
mime = application/xml
words = 14992
sentences = 4540
flesch = 95
summary = A pleasant conference upon the Observator and Heraclitus together with a brief relation of the present posture of the French affairs. A pleasant conference upon the Observator and Heraclitus together with a brief relation of the present posture of the French affairs. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). In general, first editions of a works in English were prioritized, although there are a number of works in other languages, notably Latin and Welsh, included and sometimes a second or later edition of a work was chosen if there was a compelling reason to do so.
cache = ./cache/A55123.xml
txt = ./txt/A55123.txt
=== reduce.pl bib ===
id = A65681
author = Colledge, Stephen, 1635?-1681.
title = A letter from Mr. Edward Whitaker to the Protestant joyner upon his bill being sent to Oxford
date = 1681
pages =
extension = .xml
mime = application/xml
words = 1840
sentences = 354
flesch = 89
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A letter from Mr. Edward Whitaker to the Protestant joyner upon his bill being sent to Oxford A letter from Mr. Edward Whitaker to the Protestant joyner upon his bill being sent to Oxford EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
cache = ./cache/A65681.xml
txt = ./txt/A65681.txt
Building ./etc/reader.txt
A55123 A65681 A53021 A65681 A55123 A53021
number of items: 3
sum of words: 18,367
average size in words: 6,122
average readability score: 89
nouns: governour; t; time; man; text; thing; men; texts; characters; work; world; religion; nothing; way; landers; friends; xml; part; books; place; interest; images; business; ▪; works; word; things; money; image; hath; design; care; reason; project; page; one; keying; encoding; elements; eebo; edition; data; years; subjects; self; power; people; land; day; title
verbs: be; is; have; are; was; been; had; do; were; ''s; make; made; take; know; being; did; say; done; am; tell; sent; go; give; encoded; think; set; pray; has; said; keep; find; come; does; hear; called; call; according; seems; having; brought; live; based; serve; see; get; believe; thought; suppose; put; let
adjectives: other; great; such; own; good; true; much; more; little; early; whole; honest; english; old; better; many; first; general; available; young; same; present; late; certain; very; sure; next; new; few; best; able; several; possible; least; last; impossible; illegible; full; french; false; considerable; worth; short; secure; right; large; fine; common; worse; wide
adverbs: not; so; then; now; up; well; more; very; as; never; most; too; therefore; only; here; out; much; in; ever; enough; yet; once; again; sometimes; long; still; over; all; presently; together; rather; no; far; down; on; off; else; always; away; thus; perhaps; online; just; indeed; besides; there; sooner; onely; n''t; first
pronouns: i; he; you; they; his; it; your; their; my; him; our; we; them; ''em; me; her; themselves; himself; us; she; em; yours; mine; its; yourself; u; thee; ours; ''s
proper nouns: monsieur; belfagor; sir; pluto; tcp; master; ●; english; island; governour; oxford; tory; tories; text; tei; le; england; eebo; masters; plotters; observator; forty; whigg; princes; heraclitus; senate; 〉; whiggs; prince; popish; plot; law; proquest; pope; phase; partnership; king; jesuits; highness; creation; church; 〈; ◊; ye; religion; protestant; london; de; d.; country
keywords: tcp; tories; sir; monsieur; master; island; highness; heraclitus; governour; friends; country; belfagor
one topic; one dimension: governour
file(s): ./cache/A65681.xml
titles(s): A letter from Mr. Edward Whitaker to the Protestant joyner upon his bill being sent to Oxford
three topics; one dimension: governour; tcp; 01
file(s): ./cache/A55123.xml, ./cache/A65681.xml, ./cache/A53021.xml
titles(s): A pleasant conference upon the Observator and Heraclitus together with a brief relation of the present posture of the French affairs. | A letter from Mr. Edward Whitaker to the Protestant joyner upon his bill being sent to Oxford | A new-years-gift to the Tories, or, A few sober queries concerning them by an honest trimmer.
five topics; three dimensions: governour monsieur belfagor; text tcp english; 2008 trimmer established; 2008 trimmer established; 2008 trimmer established
file(s): ./cache/A55123.xml, ./cache/A65681.xml, ./cache/A53021.xml, ./cache/A53021.xml, ./cache/A53021.xml
titles(s): A pleasant conference upon the Observator and Heraclitus together with a brief relation of the present posture of the French affairs. | A letter from Mr. Edward Whitaker to the Protestant joyner upon his bill being sent to Oxford | A new-years-gift to the Tories, or, A few sober queries concerning them by an honest trimmer. | A new-years-gift to the Tories, or, A few sober queries concerning them by an honest trimmer. | A new-years-gift to the Tories, or, A few sober queries concerning them by an honest trimmer.
Type: zip2carrel
title: subject-tories-freebo
date: 2021-05-25
time: 12:28
username: emorgan
patron: Eric Morgan
email: emorgan@nd.edu
input: input-file.zip
==== make-pages.sh htm files
==== make-pages.sh complex files
==== make-pages.sh named enities
==== making bibliographics
id: A65681
author: Colledge, Stephen, 1635?-1681.
title: A letter from Mr. Edward Whitaker to the Protestant joyner upon his bill being sent to Oxford
date: 1681
words: 1840
sentences: 354
pages:
flesch: 89
cache: ./cache/A65681.xml
txt: ./txt/A65681.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A letter from Mr. Edward Whitaker to the Protestant joyner upon his bill being sent to Oxford A letter from Mr. Edward Whitaker to the Protestant joyner upon his bill being sent to Oxford EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
id: A53021
author: Honest trimmer.
title: A new-years-gift to the Tories, or, A few sober queries concerning them by an honest trimmer.
date: 1683
words: 1535
sentences: 268
pages:
flesch: 84
cache: ./cache/A53021.xml
txt: ./txt/A53021.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A new-years-gift to the Tories, or, A few sober queries concerning them by an honest trimmer. A new-years-gift to the Tories, or, A few sober queries concerning them by an honest trimmer. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
id: A55123
author: Phillips, John, 1631-1706.
title: A pleasant conference upon the Observator and Heraclitus together with a brief relation of the present posture of the French affairs.
date: 1682
words: 14992
sentences: 4540
pages:
flesch: 95
cache: ./cache/A55123.xml
txt: ./txt/A55123.txt
summary: A pleasant conference upon the Observator and Heraclitus together with a brief relation of the present posture of the French affairs. A pleasant conference upon the Observator and Heraclitus together with a brief relation of the present posture of the French affairs. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). In general, first editions of a works in English were prioritized, although there are a number of works in other languages, notably Latin and Welsh, included and sometimes a second or later edition of a work was chosen if there was a compelling reason to do so.
==== make-pages.sh questions
==== make-pages.sh search
==== make-pages.sh topic modeling corpus
Zipping study carrel