mv: 'input-file.zip' and './input-file.zip' are the same file
Creating study carrel named author-horace-freebo
Initializing database
Unzipping
Archive: input-file.zip
  inflating: ./tmp/input/xml2htm.xsl
  inflating: ./tmp/input/A45579.xml
  inflating: ./tmp/input/metadata.csv
  inflating: ./tmp/input/A36014.xml
  inflating: ./tmp/input/A44478.xml
  inflating: ./tmp/input/A44464.xml
  inflating: ./tmp/input/A44471.xml
caution: excluded filename not matched: *MACOSX*
=== DIRECTORIES: ./tmp/input
=== DIRECTORY:
=== metadata file: ./tmp/input/metadata.csv
=== found metadata file
=== updating bibliographic database
Building study carrel named author-horace-freebo
May 24, 2021 9:33:36 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed. See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io for optional dependencies.
May 24, 2021 9:33:36 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: Tesseract OCR is installed and will be automatically applied to image files unless you've excluded the TesseractOCRParser from the default parser. Tesseract may dramatically slow down content extraction (TIKA-2359). As of Tika 1.15 (and prior versions), Tesseract is automatically called. In future versions of Tika, users may need to turn the TesseractOCRParser on via TikaConfig.
May 24, 2021 9:33:36 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded. Please provide the jar on your classpath to parse sqlite files. See tika-parsers/pom.xml for the correct version.
INFO Starting Apache Tika 1.24.1 server
INFO Setting the server's publish address to be http://localhost:9998/
INFO Logging initialized @3475ms to org.eclipse.jetty.util.log.Slf4jLog
INFO jetty-9.4.27.v20200227; built: 2020-02-27T18:37:21.340Z; git: a304fd9f351f337e7c0e2a7c28878dd536149c6c; jvm 1.8.0_281-b09
INFO Started ServerConnector@3e74829{HTTP/1.1, (http/1.1)}{localhost:9998}
INFO Started @3622ms
WARN Empty contextPath
INFO Started o.e.j.s.h.ContextHandler@51fadaff{/,null,AVAILABLE}
INFO Started Apache Tika server at http://localhost:9998/
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
FILE: cache/A44464.xml OUTPUT: txt/A44464.txt
FILE: cache/A36014.xml OUTPUT: txt/A36014.txt
FILE: cache/A45579.xml OUTPUT: txt/A45579.txt
FILE: cache/A44471.xml OUTPUT: txt/A44471.txt
FILE: cache/A44478.xml OUTPUT: txt/A44478.txt
=== file2bib.sh ===
INFO Detecting media type for Filename: b'A44464.xml'
INFO Detecting media type for Filename: b'A36014.xml'
INFO Detecting media type for Filename: b'A45579.xml'
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO Detecting media type for Filename: b'A44471.xml'
INFO Detecting media type for Filename: b'A44478.xml'
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
A36014 txt/../pos/A36014.pos
A44464 txt/../pos/A44464.pos
A36014 txt/../wrd/A36014.wrd
A44464 txt/../ent/A44464.ent
=== file2bib.sh ===
id: A36014
author: Horace.
title: XXV select allusions to several places of Horace, Martial, Anacreon and Petron. Arbitr. Part I written by Mr. Dilke.
date: 1698
pages:
extension: .xml
txt: ./txt/A36014.txt
cache: ./cache/A36014.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 52
resourceName b'A36014.xml'
A44464 txt/../wrd/A44464.wrd
A36014 txt/../ent/A36014.ent
=== file2bib.sh ===
id: A44464
author: Horace.
title: Horace's Art of poetry made English by the Right Honourable the Earl of Roscommon.
date: 1680
pages:
extension: .xml
txt: ./txt/A44464.txt
cache: ./cache/A44464.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 74
resourceName b'A44464.xml'
A45579 txt/../pos/A45579.pos
A45579 txt/../wrd/A45579.wrd
A45579 txt/../ent/A45579.ent
A44471 txt/../pos/A44471.pos
=== file2bib.sh ===
id: A45579
author: Horace.
title: A banquet of essayes, fetcht out of famous Owens confectionary, disht out, and served up at the table of Mecoenas by Henry Harflete ...
date: 1653
pages:
extension: .xml
txt: ./txt/A45579.txt
cache: ./cache/A45579.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 87
resourceName b'A45579.xml'
A44478 txt/../pos/A44478.pos
A44471 txt/../ent/A44471.ent
A44471 txt/../wrd/A44471.wrd
A44478 txt/../wrd/A44478.wrd
A44478 txt/../ent/A44478.ent
=== file2bib.sh ===
id: A44471
author: Horace.
title: The Odes, Satyrs, and Epistles of Horace Done into English.
date: 1684
pages:
extension: .xml
txt: ./txt/A44471.txt
cache: ./cache/A44471.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 246
resourceName b'A44471.xml'
=== file2bib.sh ===
id: A44478
author: Horace.
title: The poems of Horace consisting of odes, satyres, and epistles / rendred in English verse by several persons.
date: 1666
pages:
extension: .xml
txt: ./txt/A44478.txt
cache: ./cache/A44478.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 256
resourceName b'A44478.xml'
Done mapping.
Reducing author-horace-freebo
=== reduce.pl bib ===
id = A44464
author = Horace.
title = Horace's Art of poetry made English by the Right Honourable the Earl of Roscommon.
date = 1680
pages =
extension = .xml
mime = application/xml
words = 5700
sentences = 1681
flesch = 95
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. Horace's Art of poetry made English by the Right Honourable the Earl of Roscommon. Horace's Art of poetry made English by the Right Honourable the Earl of Roscommon. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A44464.xml
txt = ./txt/A44464.txt
=== reduce.pl bib ===
id = A44478
author = Horace.
title = The poems of Horace consisting of odes, satyres, and epistles / rendred in English verse by several persons.
date = 1666
pages =
extension = .xml
mime = application/xml
words = 88004
sentences = 29068
flesch = 102
summary = The poems of Horace consisting of odes, satyres, and epistles / rendred in English verse by several persons. The poems of Horace consisting of odes, satyres, and epistles / rendred in English verse by several persons. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A44478.xml
txt = ./txt/A44478.txt
=== reduce.pl bib ===
id = A45579
author = Horace.
title = A banquet of essayes, fetcht out of famous Owens confectionary, disht out, and served up at the table of Mecoenas by Henry Harflete ...
date = 1653
pages =
extension = .xml
mime = application/xml
words = 15730
sentences = 4714
flesch = 90
summary = This text is an enriched version of the TCP digital transcription A45579 of text R3351 in the English Short Title Catalog (Wing H766). Textual changes and metadata enrichments aim at making the text more computationally tractable, easier to read, and suitable for network-based collaborative curation by amateur and professional end users from many walks of life. This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A banquet of essayes, fetcht out of famous Owens confectionary, disht out, and served up at the table of Mecoenas by Henry Harflete ... A banquet of essayes, fetcht out of famous Owens confectionary, disht out, and served up at the table of Mecoenas by Henry Harflete ... civilwar no A banquet of essayes, fetcht out of famous Owens confectionary, disht out, and serv'd up at the table of Mecoenas.
cache = ./cache/A45579.xml
txt = ./txt/A45579.txt
=== reduce.pl bib ===
id = A36014
author = Horace.
title = XXV select allusions to several places of Horace, Martial, Anacreon and Petron. Arbitr. Part I written by Mr. Dilke.
date = 1698
pages =
extension = .xml
mime = application/xml
words = 4125
sentences = 1372
flesch = 100
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. XXV select allusions to several places of Horace, Martial, Anacreon and Petron. XXV select allusions to several places of Horace, Martial, Anacreon and Petron.
EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A36014.xml
txt = ./txt/A36014.txt
=== reduce.pl bib ===
id = A44471
author = Horace.
title = The Odes, Satyrs, and Epistles of Horace Done into English.
date = 1684
pages =
extension = .xml
mime = application/xml
words = 81830
sentences = 28500
flesch = 105
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. printed for Jacob Tonson, and sold by Tim. Goodwin at the Maiden-head against St. Dunstans Church in Fleetstreet, EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A44471.xml
txt = ./txt/A44471.txt
Building ./etc/reader.txt
A44471 A44478 A45579 A44478 A44471 A44464
number of items: 5
sum of words: 195,389
average size in words: 39,077
average readability score: 98
nouns: t; man; men; things; friend; self; mind; life; doth; day; love; art; time; way; praise; words; thy; death; care; none; blood; wealth; hand; name; thing; rage; night; friends; ▪; head; place; gods; others; face; force; fear; ode; heart; l; part; gold; ease; store; nothing; age; fire; eyes; years; wine; hath
verbs: is; be; are; was; have; do; did; ''s; let; make; were; had; take; see; come; made; go; live; give; know; does; say; think; write; tell; being; am; hear; please; makes; been; done; leave; keep; said; bear; use; run; bring; has; get; came; sing; fear; love; raise; call; gone; dost; appear
adjectives: good; great; more; roman; such; own; non; old; -; little; free; vain; poor; true; other; much; rich; same; mad; new; happy; small; better; wise; first; fair; best; sweet; noble; many; high; last; long; full; short; young; soft; ill; greater; fit; fierce; doth; mighty; proud; common; bad; equal; worthy; greatest; strong
adverbs: not; so; then; now; too; still; more; as; well; yet; up; out; thus; never; here; away; again; there; just; first; much; once; down; only; no; long; n''t; sometimes; else; on; ever; soon; in; far; alone; therefore; very; most; all; off; back; onely; often; enough; rather; forth; quickly; freely; better; before
pronouns: i; he; his; my; you; their; me; it; thy; they; him; your; her; our; we; them; she; thee; us; himself; its; ''s; themselves; mine; one; ''em; yours; theirs; vvith; l; beg''d; thou; us''d; t''you; ours; ye; unconcern''d; ts; nay; itself; dy''d; can''st; your''e; y''ve; wr; wax; unprovok''t; toss; th; t''ane
proper nouns: thou; sir; thee; rome; t; ode; le; god; thy; e''re; muse; fate; ●; town; lord; 〉; caesar; man; wine; hath; vertue; sea; love; father; ◊; gods; horace; poet; verse; 〈; th; age; men; king; heaven; wit; r.; fortune; son; nature; feast; fame; tcp; hor; venus; f.; book; jove; world; doth
keywords: tcp; ode; muse; men; world; wine; town; thou; thee; sir; roman; nature; man; love; gods; god; friend; estate; country; youth; work; wit; wealth; wars; vertue; verse; thy; thing; sun; stage; soul; son; seas; sea; rome; reader; praise; poets; poet; non; mr.; mind; mea; master; maecenas; lord; live; like; life; lib
one topic; one dimension: thy
file(s): ./cache/A36014.xml
titles(s): XXV select allusions to several places of Horace, Martial, Anacreon and Petron. Arbitr. Part I written by Mr. Dilke.
three topics; one dimension: thy; non; tcp
file(s): ./cache/A44478.xml, ./cache/A45579.xml, ./cache/A44464.xml
titles(s): The poems of Horace consisting of odes, satyres, and epistles / rendred in English verse by several persons. | A banquet of essayes, fetcht out of famous Owens confectionary, disht out, and served up at the table of Mecoenas by Henry Harflete ... | Horace''s Art of poetry made English by the Right Honourable the Earl of Roscommon.
five topics; three dimensions: thy thou thee; non roman man; men words things; extent usually creating; extent usually creating
file(s): ./cache/A44478.xml, ./cache/A45579.xml, ./cache/A44464.xml, ./cache/A36014.xml, ./cache/A36014.xml
titles(s): The poems of Horace consisting of odes, satyres, and epistles / rendred in English verse by several persons. | A banquet of essayes, fetcht out of famous Owens confectionary, disht out, and served up at the table of Mecoenas by Henry Harflete ... | Horace''s Art of poetry made English by the Right Honourable the Earl of Roscommon. | XXV select allusions to several places of Horace, Martial, Anacreon and Petron. Arbitr. Part I written by Mr. Dilke. | XXV select allusions to several places of Horace, Martial, Anacreon and Petron. Arbitr. Part I written by Mr. Dilke.
Type: zip2carrel
title: author-horace-freebo
date: 2021-05-24
time: 09:12
username: emorgan
patron: Eric Morgan
email: emorgan@nd.edu
input: input-file.zip
==== make-pages.sh htm files
==== make-pages.sh complex files
==== make-pages.sh named enities
==== making bibliographics
id: A45579
author: Horace.
title: A banquet of essayes, fetcht out of famous Owens confectionary, disht out, and served up at the table of Mecoenas by Henry Harflete ...
date: 1653
words: 15730
sentences: 4714
pages:
flesch: 90
cache: ./cache/A45579.xml
txt: ./txt/A45579.txt
summary: This text is an enriched version of the TCP digital transcription A45579 of text R3351 in the English Short Title Catalog (Wing H766). Textual changes and metadata enrichments aim at making the text more computationally tractable, easier to read, and suitable for network-based collaborative curation by amateur and professional end users from many walks of life. This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A banquet of essayes, fetcht out of famous Owens confectionary, disht out, and served up at the table of Mecoenas by Henry Harflete ... A banquet of essayes, fetcht out of famous Owens confectionary, disht out, and served up at the table of Mecoenas by Henry Harflete ... civilwar no A banquet of essayes, fetcht out of famous Owens confectionary, disht out, and serv''d up at the table of Mecoenas.
id: A44478
author: Horace.
title: The poems of Horace consisting of odes, satyres, and epistles / rendred in English verse by several persons.
date: 1666
words: 88004
sentences: 29068
pages:
flesch: 102
cache: ./cache/A44478.xml
txt: ./txt/A44478.txt
summary: The poems of Horace consisting of odes, satyres, and epistles / rendred in English verse by several persons.
The poems of Horace consisting of odes, satyres, and epistles / rendred in English verse by several persons. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
id: A44464
author: Horace.
title: Horace''s Art of poetry made English by the Right Honourable the Earl of Roscommon.
date: 1680
words: 5700
sentences: 1681
pages:
flesch: 95
cache: ./cache/A44464.xml
txt: ./txt/A44464.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. Horace''s Art of poetry made English by the Right Honourable the Earl of Roscommon. Horace''s Art of poetry made English by the Right Honourable the Earl of Roscommon. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
id: A44471
author: Horace.
title: The Odes, Satyrs, and Epistles of Horace Done into English.
date: 1684
words: 81830
sentences: 28500
pages:
flesch: 105
cache: ./cache/A44471.xml
txt: ./txt/A44471.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. printed for Jacob Tonson, and sold by Tim. Goodwin at the Maiden-head against St. Dunstans Church in Fleetstreet, EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
id: A36014
author: Horace.
title: XXV select allusions to several places of Horace, Martial, Anacreon and Petron. Arbitr. Part I written by Mr. Dilke.
date: 1698
words: 4125
sentences: 1372
pages:
flesch: 100
cache: ./cache/A36014.xml
txt: ./txt/A36014.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. XXV select allusions to several places of Horace, Martial, Anacreon and Petron. XXV select allusions to several places of Horace, Martial, Anacreon and Petron. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
==== make-pages.sh questions
==== make-pages.sh search
==== make-pages.sh topic modeling corpus
Zipping study carrel
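
The corpus totals reported in the log (number of items, sum of words, average size in words, average readability score) can be cross-checked against the per-item `words` and `flesch` values in the `reduce.pl bib` records. The sketch below is not the carrel tool's own code: the `items` dictionary and the `summarize` function are hypothetical names introduced here, and the use of integer truncation for the averages is an assumption, chosen because it is consistent with the logged figure of 39,077.

```python
# Per-item values copied from the "reduce.pl bib" records in this log.
items = {
    "A44464": {"words": 5700, "flesch": 95},
    "A44478": {"words": 88004, "flesch": 102},
    "A45579": {"words": 15730, "flesch": 90},
    "A36014": {"words": 4125, "flesch": 100},
    "A44471": {"words": 81830, "flesch": 105},
}


def summarize(records):
    """Recompute corpus-level totals from per-item records.

    Assumption: averages are truncated to whole numbers; the log does
    not say how the tool rounds.
    """
    count = len(records)
    total_words = sum(r["words"] for r in records.values())
    total_flesch = sum(r["flesch"] for r in records.values())
    return {
        "items": count,
        "sum_words": total_words,
        "avg_words": total_words // count,
        "avg_flesch": total_flesch // count,
    }


if __name__ == "__main__":
    for name, value in summarize(items).items():
        print(f"{name}: {value:,}")
```

With the values above, the recomputed figures match the logged summary: 5 items, 195,389 words, an average of 39,077 words, and an average readability score of 98.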