mv: 'input-file.zip' and './input-file.zip' are the same file
Creating study carrel named subject-coachDrivers-freebo
Initializing database
Unzipping
Archive: input-file.zip
inflating: ./tmp/input/B02601.xml
inflating: ./tmp/input/xml2htm.xsl
inflating: ./tmp/input/metadata.csv
inflating: ./tmp/input/A81286.xml
inflating: ./tmp/input/A81287.xml
inflating: ./tmp/input/A33493.xml
caution: excluded filename not matched: *MACOSX*
=== DIRECTORIES: ./tmp/input
=== DIRECTORY:
=== metadata file: ./tmp/input/metadata.csv
=== found metadata file
=== updating bibliographic database
Building study carrel named subject-coachDrivers-freebo
May 24, 2021 5:00:28 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed. See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io for optional dependencies.
May 24, 2021 5:00:28 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: Tesseract OCR is installed and will be automatically applied to image files unless you've excluded the TesseractOCRParser from the default parser. Tesseract may dramatically slow down content extraction (TIKA-2359). As of Tika 1.15 (and prior versions), Tesseract is automatically called. In future versions of Tika, users may need to turn the TesseractOCRParser on via TikaConfig.
May 24, 2021 5:00:28 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded. Please provide the jar on your classpath to parse sqlite files. See tika-parsers/pom.xml for the correct version.
INFO Starting Apache Tika 1.24.1 server
INFO Setting the server's publish address to be http://localhost:9998/
INFO Logging initialized @1913ms to org.eclipse.jetty.util.log.Slf4jLog
INFO jetty-9.4.27.v20200227; built: 2020-02-27T18:37:21.340Z; git: a304fd9f351f337e7c0e2a7c28878dd536149c6c; jvm 1.8.0_281-b09
INFO Started ServerConnector@3e74829{HTTP/1.1, (http/1.1)}{localhost:9998}
INFO Started @1987ms
WARN Empty contextPath
INFO Started o.e.j.s.h.ContextHandler@62010f5c{/,null,AVAILABLE}
INFO Started Apache Tika server at http://localhost:9998/
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
FILE: cache/B02601.xml OUTPUT: txt/B02601.txt
FILE: cache/A33493.xml OUTPUT: txt/A33493.txt
FILE: cache/A81286.xml OUTPUT: txt/A81286.txt
FILE: cache/A81287.xml OUTPUT: txt/A81287.txt
=== file2bib.sh ===
INFO Detecting media type for Filename: b'B02601.xml'
INFO Detecting media type for Filename: b'A33493.xml'
INFO Detecting media type for Filename: b'A81287.xml'
INFO Detecting media type for Filename: b'A81286.xml'
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
A81286 txt/../wrd/A81286.wrd
A81286 txt/../pos/A81286.pos
B02601 txt/../ent/B02601.ent
A33493 txt/../wrd/A33493.wrd
B02601 txt/../wrd/B02601.wrd
A81287 txt/../wrd/A81287.wrd
A33493 txt/../ent/A33493.ent
A81286 txt/../ent/A81286.ent
A81287 txt/../pos/A81287.pos
A33493 txt/../pos/A33493.pos
A81287 txt/../ent/A81287.ent
B02601 txt/../pos/B02601.pos
=== file2bib.sh ===
id: B02601
author: Gee, Richard.
title: The case of R. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament.
date: 1695.0
pages:
extension: .xml
txt: ./txt/B02601.txt
cache: ./cache/B02601.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 3
resourceName b'B02601.xml'
=== file2bib.sh ===
id: A81287
author: Cadman, Thomas.
title: The case of several of His Majesties loyal subjects, very much oppressed, contrary to the laws of this land as they are advised, humbly represented to the consideration of the right honourable the knights, citizens, and burgesses in Parliament assembled.
date: nan
pages:
extension: .xml
txt: ./txt/A81287.txt
cache: ./cache/A81287.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 6
resourceName b'A81287.xml'
=== file2bib.sh ===
id: A33493
author: Cadman, Thomas.
title: The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey and his adherents, to the utter ruin of many families, for his and his accomplices private interest
date: nan
pages:
extension: .xml
txt: ./txt/A33493.txt
cache: ./cache/A33493.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 6
resourceName b'A33493.xml'
=== file2bib.sh ===
id: A81286
author: Cadman, Thomas.
title: The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey, and his adherents, to the utter ruin of many families, for his and his accomplices private interest.
date: nan
pages:
extension: .xml
txt: ./txt/A81286.txt
cache: ./cache/A81286.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 6
resourceName b'A81286.xml'
Done mapping.
Reducing subject-coachDrivers-freebo
=== reduce.pl bib ===
id = A33493
author = Cadman, Thomas.
title = The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey and his adherents, to the utter ruin of many families, for his and his accomplices private interest
date = nan
pages =
extension = .xml
mime = application/xml
words = 1676
sentences = 291
flesch = 82
summary = The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey and his adherents, to the utter ruin of many families, for his and his accomplices private interest The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey and his adherents, to the utter ruin of many families, for his and his accomplices private interest EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
cache = ./cache/A33493.xml
txt = ./txt/A33493.txt
=== reduce.pl bib ===
id = B02601
author = Gee, Richard.
title = The case of R. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament.
date = 1695.0
pages =
extension = .xml
mime = application/xml
words = 1246
sentences = 183
flesch = 77
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). The general aim of EEBO-TCP is to encode one copy (usually the first edition) of every monographic English-language title published between 1473 and 1700 available in EEBO. EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
cache = ./cache/B02601.xml
txt = ./txt/B02601.txt
=== reduce.pl bib ===
id = A81286
author = Cadman, Thomas.
title = The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey, and his adherents, to the utter ruin of many families, for his and his accomplices private interest.
date = nan
pages =
extension = .xml
mime = application/xml
words = 1685
sentences = 297
flesch = 82
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership.
The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey, and his adherents, to the utter ruin of many families, for his and his accomplices private interest. The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey, and his adherents, to the utter ruin of many families, for his and his accomplices private interest. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
cache = ./cache/A81286.xml
txt = ./txt/A81286.txt
=== reduce.pl bib ===
id = A81287
author = Cadman, Thomas.
title = The case of several of His Majesties loyal subjects, very much oppressed, contrary to the laws of this land as they are advised, humbly represented to the consideration of the right honourable the knights, citizens, and burgesses in Parliament assembled.
date = nan
pages =
extension = .xml
mime = application/xml
words = 1542
sentences = 249
flesch = 80
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. The case of several of His Majesties loyal subjects, very much oppressed, contrary to the laws of this land as they are advised, humbly represented to the consideration of the right honourable the knights, citizens, and burgesses in Parliament assembled. The case of several of His Majesties loyal subjects, very much oppressed, contrary to the laws of this land as they are advised, humbly represented to the consideration of the right honourable the knights, citizens, and burgesses in Parliament assembled.
EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
cache = ./cache/A81287.xml
txt = ./txt/A81287.txt
Building ./etc/reader.txt
B02601 A81287 A81286 B02601 A81287 A81286
number of items: 4
sum of words: 6,149
average size in words: 1,537
average readability score: 80
nouns: text; texts; characters; works; xml; image; books; data; work; project; page; keying; images; encoding; elements; eebo; edition; case; others; title; users; time; sets; selection; schema; purposes; money; markup; interest; instances; guidelines; editions; accomplices; consideration; number; coaches; coach; adherents; hackney; families; drivers; wing; web; variety; user; transcriptions; transcription; touch; terms; teams
verbs: said; was; be; were; is; have; encoded; are; did; been; made; based; represented; take; had; published; marked; created; create; corrected; -; put; make; having; occasioned; meet; given; get; coachmen; assembled; according; ply; known; keep; carried; being; using; use; understanding; transformed; transcribed; suffered; simplify; sent; send; scanned; ruined; reviewed; returned; request
adjectives: early; several; english; other; available; own; many; first; private; illegible; general; utter; more; honourable; good; clear; true; such; same; poor; original; aforesaid; wide; usual; textual; syntactic; subject; structural; sensible; second; readable; quality; public; possible; pleased; overall; next; monographic; lossless; light; later; large; keyboarded; greater; financial; external; eligible; editorial; due; displayable
adverbs: then; not; humbly; therefore; so; in; online; now; very; early; never; above; sometimes; out; ever; variously; usually; respectfully; over; only; notably; most; mainly; even; accurately; up; much; formerly; accordingly; thereupon; thereof; thereby; off; last; immediately; credably; also; afterwards; together; no; longer; lastly; contrary
pronouns: his; their; we; our; they; your; them; him; it; he; us; himself; i; themselves
proper nouns: tcp; coachmen; hackney; murrey; london; act; text; tei; eebo; cadman; thomas; english; oxford; england; parliament; robert; proquest; phase; partnership; creation; gee; westminster; commons; coaches; utf-8; unicode; transcribed; regulating; r.; p5; online; ncbel; michigan; john; house; honour; coach; petition; mona; men; logarbo; king; common; books; accomplices; year; university; lincoln; law; commissioners
keywords: tcp; parliament; murrey; coachmen
one topic; one dimension: said
file(s): ./cache/B02601.xml
titles(s): The case of R. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament.
three topics; one dimension: said; does; does
file(s): ./cache/A81286.xml, ./cache/B02601.xml, ./cache/B02601.xml
titles(s): The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey, and his adherents, to the utter ruin of many families, for his and his accomplices private interest. | The case of R. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament. | The case of R. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament.
five topics; three dimensions: said coachmen text; tcp text gee; suffer suggested following; suffer suggested following; suffer suggested following
file(s): ./cache/A81286.xml, ./cache/B02601.xml, ./cache/B02601.xml, ./cache/B02601.xml, ./cache/B02601.xml
titles(s): The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey, and his adherents, to the utter ruin of many families, for his and his accomplices private interest. | The case of R. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament. | The case of R. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament. | The case of R. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament. | The case of R. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament.
Type: zip2carrel
title: subject-coachDrivers-freebo
date: 2021-05-24
time: 17:00
username: emorgan
patron: Eric Morgan
email: emorgan@nd.edu
input: input-file.zip
==== make-pages.sh htm files
==== make-pages.sh complex files
==== make-pages.sh named enities
==== making bibliographics
id: A33493
author: Cadman, Thomas.
title: The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey and his adherents, to the utter ruin of many families, for his and his accomplices private interest
date: nan
words: 1676
sentences: 291
pages:
flesch: 82
cache: ./cache/A33493.xml
txt: ./txt/A33493.txt
summary: The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey and his adherents, to the utter ruin of many families, for his and his accomplices private interest The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey and his adherents, to the utter ruin of many families, for his and his accomplices private interest EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
id: A81286
author: Cadman, Thomas.
title: The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey, and his adherents, to the utter ruin of many families, for his and his accomplices private interest.
date: nan
words: 1685
sentences: 297
pages:
flesch: 82
cache: ./cache/A81286.xml
txt: ./txt/A81286.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey, and his adherents, to the utter ruin of many families, for his and his accomplices private interest. The case of several hackney-coachmen in and about the cities of London and Westminster and the suburbs, occasioned by one Robert Murrey, and his adherents, to the utter ruin of many families, for his and his accomplices private interest. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
id: A81287
author: Cadman, Thomas.
title: The case of several of His Majesties loyal subjects, very much oppressed, contrary to the laws of this land as they are advised, humbly represented to the consideration of the right honourable the knights, citizens, and burgesses in Parliament assembled.
date: nan
words: 1542
sentences: 249
pages:
flesch: 80
cache: ./cache/A81287.xml
txt: ./txt/A81287.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. The case of several of His Majesties loyal subjects, very much oppressed, contrary to the laws of this land as they are advised, humbly represented to the consideration of the right honourable the knights, citizens, and burgesses in Parliament assembled. The case of several of His Majesties loyal subjects, very much oppressed, contrary to the laws of this land as they are advised, humbly represented to the consideration of the right honourable the knights, citizens, and burgesses in Parliament assembled. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
id: B02601
author: Gee, Richard.
title: The case of R. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament.
date: 1695.0
words: 1246
sentences: 183
pages:
flesch: 77
cache: ./cache/B02601.xml
txt: ./txt/B02601.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament. Gee, Esq; Humbly recommended to the Commons of England, assembled in Parliament. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
The general aim of EEBO-TCP is to encode one copy (usually the first edition) of every monographic English-language title published between 1473 and 1700 available in EEBO. EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
==== make-pages.sh questions
==== make-pages.sh search
==== make-pages.sh topic modeling corpus
Zipping study carrel