mv: 'input-file.zip' and './input-file.zip' are the same file
Creating study carrel named subject-monsters-freebo
Initializing database
Unzipping
Archive: input-file.zip
inflating: ./tmp/input/A10066.xml
inflating: ./tmp/input/A94009.xml
inflating: ./tmp/input/xml2htm.xsl
inflating: ./tmp/input/metadata.csv
inflating: ./tmp/input/A30133.xml
inflating: ./tmp/input/A08949.xml
caution: excluded filename not matched: *MACOSX*
=== DIRECTORIES: ./tmp/input
=== DIRECTORY:
=== metadata file: ./tmp/input/metadata.csv
=== found metadata file
=== updating bibliographic database
Building study carrel named subject-monsters-freebo
May 24, 2021 7:34:50 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed. See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io for optional dependencies.
May 24, 2021 7:34:51 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: Tesseract OCR is installed and will be automatically applied to image files unless you've excluded the TesseractOCRParser from the default parser. Tesseract may dramatically slow down content extraction (TIKA-2359). As of Tika 1.15 (and prior versions), Tesseract is automatically called. In future versions of Tika, users may need to turn the TesseractOCRParser on via TikaConfig.
May 24, 2021 7:34:51 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded. Please provide the jar on your classpath to parse sqlite files. See tika-parsers/pom.xml for the correct version.
INFO Starting Apache Tika 1.24.1 server
INFO Setting the server's publish address to be http://localhost:9998/
INFO Logging initialized @3698ms to org.eclipse.jetty.util.log.Slf4jLog
INFO jetty-9.4.27.v20200227; built: 2020-02-27T18:37:21.340Z; git: a304fd9f351f337e7c0e2a7c28878dd536149c6c; jvm 1.8.0_281-b09
INFO Started ServerConnector@3e74829{HTTP/1.1, (http/1.1)}{localhost:9998}
INFO Started @3815ms
WARN Empty contextPath
INFO Started o.e.j.s.h.ContextHandler@62010f5c{/,null,AVAILABLE}
INFO Started Apache Tika server at http://localhost:9998/
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
FILE: cache/A08949.xml OUTPUT: txt/A08949.txt
FILE: cache/A94009.xml OUTPUT: txt/A94009.txt
FILE: cache/A10066.xml OUTPUT: txt/A10066.txt
FILE: cache/A30133.xml OUTPUT: txt/A30133.txt
=== file2bib.sh ===
INFO Detecting media type for Filename: b'A10066.xml'
INFO Detecting media type for Filename: b'A30133.xml'
INFO Detecting media type for Filename: b'A08949.xml'
INFO Detecting media type for Filename: b'A94009.xml'
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
A94009 txt/../pos/A94009.pos
A08949 txt/../pos/A08949.pos
A94009 txt/../ent/A94009.ent
A94009 txt/../wrd/A94009.wrd
A10066 txt/../wrd/A10066.wrd
A30133 txt/../wrd/A30133.wrd
A30133 txt/../pos/A30133.pos
A10066 txt/../pos/A10066.pos
A08949 txt/../wrd/A08949.wrd
A10066 txt/../ent/A10066.ent
A30133 txt/../ent/A30133.ent
A08949 txt/../ent/A08949.ent
=== file2bib.sh ===
id: A94009
author: Davie, John.
title: Strange news from Ireland, or, A true and perfect relation of a famous fish taken at Kingsale the manner of its taking, and description of its horrible shapes / as it was certified in a letter from one Mr. Robinson, living in Kingsale, (an eye-witness) to Mr. John Davie a relation of his, living in Westminster.
date: 1677 pages: extension: .xml txt: ./txt/A94009.txt cache: ./cache/A94009.xml Content-Type application/xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 22 resourceName b'A94009.xml' === file2bib.sh === id: A10066 author: L.P. (Lawrence Price), fl. 1625-1680?. title: A monstrous shape. Or a shapelesse monster A description of a female creature borne in Holland, compleat in every p[arte] save only a head like a swine, who hath travailed into many parts, and is now to be seene in London, ... To the tune of the Spanish Pavin. date: 1639 pages: extension: .xml txt: ./txt/A10066.txt cache: ./cache/A10066.xml Content-Type application/xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 21 resourceName b'A10066.xml' === file2bib.sh === id: A08949 author: M. P. (Martin Parker), d. 1656? title: A description of a strange (and miraculous) fish cast upon the sands in the meads, in the hundred of Worwell, in the county Palatine of Chester, (or Chesshiere. The certainty whereof is here related concerning the said most monstrous fish. To the tune of Bragandary. date: 1635 pages: extension: .xml txt: ./txt/A08949.txt cache: ./cache/A08949.xml Content-Type application/xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 23 resourceName b'A08949.xml' === file2bib.sh === id: A30133 author: E. B. title: Strange and wonderful news of the birth of a monstrous child with two heads, and three arms which was lately born at Attenree, in the county of Meath, in Ireland. 
date: 1685 pages: extension: .xml txt: ./txt/A30133.txt cache: ./cache/A30133.xml Content-Type application/xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 21 resourceName b'A30133.xml' Done mapping. Reducing subject-monsters-freebo === reduce.pl bib === id = A10066 author = L.P. (Lawrence Price), fl. 1625-1680?. title = A monstrous shape. Or a shapelesse monster A description of a female creature borne in Holland, compleat in every p[arte] save only a head like a swine, who hath travailed into many parts, and is now to be seene in London, ... To the tune of the Spanish Pavin. date = 1639 pages = extension = .xml mime = application/xml words = 1594 sentences = 318 flesch = 91 summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. Or a shapelesse monster A description of a female creature borne in Holland, compleat in every p[arte] save only a head like a swine, who hath travailed into many parts, and is now to be seene in London, ... Or a shapelesse monster A description of a female creature borne in Holland, compleat in every p[arte] save only a head like a swine, who hath travailed into many parts, and is now to be seene in London, ... F[lesher] for Tho: Lambert, and are to be sold at the signe of the Horse shooe in Smithfield, EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). cache = ./cache/A10066.xml txt = ./txt/A10066.txt === reduce.pl bib === id = A94009 author = Davie, John. 
title = Strange news from Ireland, or, A true and perfect relation of a famous fish taken at Kingsale the manner of its taking, and description of its horrible shapes / as it was certified in a letter from one Mr. Robinson, living in Kingsale, (an eye-witness) to Mr. John Davie a relation of his, living in Westminster. date = 1677 pages = extension = .xml mime = application/xml words = 1454 sentences = 234 flesch = 83 summary = Strange news from Ireland, or, A true and perfect relation of a famous fish taken at Kingsale the manner of its taking, and description of its horrible shapes / as it was certified in a letter from one Mr. Robinson, living in Kingsale, (an eye-witness) to Mr. John Davie a relation of his, living in Westminster. Strange news from Ireland, or, A true and perfect relation of a famous fish taken at Kingsale the manner of its taking, and description of its horrible shapes / as it was certified in a letter from one Mr. Robinson, living in Kingsale, (an eye-witness) to Mr. John Davie a relation of his, living in Westminster. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). cache = ./cache/A94009.xml txt = ./txt/A94009.txt === reduce.pl bib === id = A30133 author = E. B. title = Strange and wonderful news of the birth of a monstrous child with two heads, and three arms which was lately born at Attenree, in the county of Meath, in Ireland. date = 1685 pages = extension = .xml mime = application/xml words = 1432 sentences = 234 flesch = 86 summary = Strange and wonderful news of the birth of a monstrous child with two heads, and three arms which was lately born at Attenree, in the county of Meath, in Ireland. 
Strange and wonderful news of the birth of a monstrous child with two heads, and three arms which was lately born at Attenree, in the county of Meath, in Ireland. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). cache = ./cache/A30133.xml txt = ./txt/A30133.txt === reduce.pl bib === id = A08949 author = M. P. (Martin Parker), d. 1656? title = A description of a strange (and miraculous) fish cast upon the sands in the meads, in the hundred of Worwell, in the county Palatine of Chester, (or Chesshiere. The certainty whereof is here related concerning the said most monstrous fish. To the tune of Bragandary. date = 1635 pages = extension = .xml mime = application/xml words = 1668 sentences = 342 flesch = 89 summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A description of a strange (and miraculous) fish cast upon the sands in the meads, in the hundred of Worwell, in the county Palatine of Chester, (or Chesshiere. A description of a strange (and miraculous) fish cast upon the sands in the meads, in the hundred of Worwell, in the county Palatine of Chester, (or Chesshiere. The certainty whereof is here related concerning the said most monstrous fish. The certainty whereof is here related concerning the said most monstrous fish. 
EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). cache = ./cache/A08949.xml txt = ./txt/A08949.txt Building ./etc/reader.txt A94009 A30133 A10066 A94009 A30133 A10066 number of items: 4 sum of words: 6,148 average size in words: 1,537 average readability score: 87 nouns: text; texts; characters; works; xml; books; images; fish; project; image; work; page; keying; encoding; elements; eebo; edition; data; description; heads; title; parts; like; compare; child; users; tune; time; sets; selection; schema; purposes; markup; instances; guidelines; editions; t; relation; head; shape; sands; part; news; nature; credit; reason; monster; man; letter; length verbs: is; was; be; were; are; have; encoded; been; based; nere; made; had; did; represented; published; marked; hath; created; create; corrected; -; living; transformed; cast; born; bear; taken; returned; make; given; do; creating; being; using; use; understanding; transcribed; take; stand; simplify; sent; scanned; said; reviewed; request; remaining; remain; released; reflect; range adjectives: early; english; rare; available; monstrous; other; first; true; strange; many; such; illegible; general; large; second; same; possible; perfect; long; good; usual; original; wide; textual; syntactic; subject; structural; readable; quality; public; own; overall; much; monographic; lossless; light; later; keyboarded; greater; financial; female; famous; external; eligible; editorial; due; displayable; diplomatic; critical; compelling adverbs: not; now; so; very; then; therefore; online; as; out; in; early; well; variously; usually; thus; sometimes; respectfully; over; notably; never; much; most; mainly; long; lately; here; even; also; accurately; above; yet; still; proofread; only; away; again; too; there; 
soon; somewhat; neatly; forth; down; all; up; though; swéetly; straight; since; rather pronouns: it; i; her; she; their; they; its; his; you; we; he; my; him; our; me; them; your; hers; ''s proper nouns: tcp; text; tei; eebo; english; oxford; england; proquest; phase; partnership; creation; mr.; london; utf-8; unicode; transcribed; p5; online; ncbel; michigan; ireland; kingsale; c.; printed; mona; m.; logarbo; robinson; p.; john; january; welsh; university; universities; universal; uk; tiff; stc; smithfield; sgml; sdata; sampled; qc; northwestern; new; nebraska; mnemonic; mi; meath; literature keywords: tcp; mr.; heads; early one topic; one dimension: tcp file(s): ./cache/A94009.xml titles(s): Strange news from Ireland, or, A true and perfect relation of a famous fish taken at Kingsale the manner of its taking, and description of its horrible shapes / as it was certified in a letter from one Mr. Robinson, living in Kingsale, (an eye-witness) to Mr. John Davie a relation of his, living in Westminster. three topics; one dimension: tcp; tcp; longer file(s): ./cache/A08949.xml, ./cache/A10066.xml, ./cache/A30133.xml titles(s): A description of a strange (and miraculous) fish cast upon the sands in the meads, in the hundred of Worwell, in the county Palatine of Chester, (or Chesshiere. The certainty whereof is here related concerning the said most monstrous fish. To the tune of Bragandary. | A monstrous shape. Or a shapelesse monster A description of a female creature borne in Holland, compleat in every p[arte] save only a head like a swine, who hath travailed into many parts, and is now to be seene in London, ... To the tune of the Spanish Pavin. | Strange and wonderful news of the birth of a monstrous child with two heads, and three arms which was lately born at Attenree, in the county of Meath, in Ireland. 
five topics; three dimensions: tcp text eebo; text tcp mr; appear hair soon; appear hair soon; appear hair soon file(s): ./cache/A08949.xml, ./cache/A94009.xml, ./cache/A30133.xml, ./cache/A30133.xml, ./cache/A30133.xml titles(s): A description of a strange (and miraculous) fish cast upon the sands in the meads, in the hundred of Worwell, in the county Palatine of Chester, (or Chesshiere. The certainty whereof is here related concerning the said most monstrous fish. To the tune of Bragandary. | Strange news from Ireland, or, A true and perfect relation of a famous fish taken at Kingsale the manner of its taking, and description of its horrible shapes / as it was certified in a letter from one Mr. Robinson, living in Kingsale, (an eye-witness) to Mr. John Davie a relation of his, living in Westminster. | Strange and wonderful news of the birth of a monstrous child with two heads, and three arms which was lately born at Attenree, in the county of Meath, in Ireland. | Strange and wonderful news of the birth of a monstrous child with two heads, and three arms which was lately born at Attenree, in the county of Meath, in Ireland. | Strange and wonderful news of the birth of a monstrous child with two heads, and three arms which was lately born at Attenree, in the county of Meath, in Ireland. Type: zip2carrel title: subject-monsters-freebo date: 2021-05-24 time: 19:33 username: emorgan patron: Eric Morgan email: emorgan@nd.edu input: input-file.zip ==== make-pages.sh htm files ==== make-pages.sh complex files ==== make-pages.sh named enities ==== making bibliographics id: A94009 author: Davie, John. title: Strange news from Ireland, or, A true and perfect relation of a famous fish taken at Kingsale the manner of its taking, and description of its horrible shapes / as it was certified in a letter from one Mr. Robinson, living in Kingsale, (an eye-witness) to Mr. John Davie a relation of his, living in Westminster. 
date: 1677 words: 1454 sentences: 234 pages: flesch: 83 cache: ./cache/A94009.xml txt: ./txt/A94009.txt summary: Strange news from Ireland, or, A true and perfect relation of a famous fish taken at Kingsale the manner of its taking, and description of its horrible shapes / as it was certified in a letter from one Mr. Robinson, living in Kingsale, (an eye-witness) to Mr. John Davie a relation of his, living in Westminster. Strange news from Ireland, or, A true and perfect relation of a famous fish taken at Kingsale the manner of its taking, and description of its horrible shapes / as it was certified in a letter from one Mr. Robinson, living in Kingsale, (an eye-witness) to Mr. John Davie a relation of his, living in Westminster. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). id: A30133 author: E. B. title: Strange and wonderful news of the birth of a monstrous child with two heads, and three arms which was lately born at Attenree, in the county of Meath, in Ireland. date: 1685 words: 1432 sentences: 234 pages: flesch: 86 cache: ./cache/A30133.xml txt: ./txt/A30133.txt summary: Strange and wonderful news of the birth of a monstrous child with two heads, and three arms which was lately born at Attenree, in the county of Meath, in Ireland. Strange and wonderful news of the birth of a monstrous child with two heads, and three arms which was lately born at Attenree, in the county of Meath, in Ireland. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). 
EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). id: A10066 author: L.P. (Lawrence Price), fl. 1625-1680?. title: A monstrous shape. Or a shapelesse monster A description of a female creature borne in Holland, compleat in every p[arte] save only a head like a swine, who hath travailed into many parts, and is now to be seene in London, ... To the tune of the Spanish Pavin. date: 1639 words: 1594 sentences: 318 pages: flesch: 91 cache: ./cache/A10066.xml txt: ./txt/A10066.txt summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. Or a shapelesse monster A description of a female creature borne in Holland, compleat in every p[arte] save only a head like a swine, who hath travailed into many parts, and is now to be seene in London, ... Or a shapelesse monster A description of a female creature borne in Holland, compleat in every p[arte] save only a head like a swine, who hath travailed into many parts, and is now to be seene in London, ... F[lesher] for Tho: Lambert, and are to be sold at the signe of the Horse shooe in Smithfield, EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). id: A08949 author: M. P. (Martin Parker), d. 1656? title: A description of a strange (and miraculous) fish cast upon the sands in the meads, in the hundred of Worwell, in the county Palatine of Chester, (or Chesshiere. 
The certainty whereof is here related concerning the said most monstrous fish. To the tune of Bragandary. date: 1635 words: 1668 sentences: 342 pages: flesch: 89 cache: ./cache/A08949.xml txt: ./txt/A08949.txt summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A description of a strange (and miraculous) fish cast upon the sands in the meads, in the hundred of Worwell, in the county Palatine of Chester, (or Chesshiere. A description of a strange (and miraculous) fish cast upon the sands in the meads, in the hundred of Worwell, in the county Palatine of Chester, (or Chesshiere. The certainty whereof is here related concerning the said most monstrous fish. The certainty whereof is here related concerning the said most monstrous fish. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). ==== make-pages.sh questions ==== make-pages.sh search ==== make-pages.sh topic modeling corpus Zipping study carrel