mv: 'input-file.zip' and './input-file.zip' are the same file
Creating study carrel named subject-coronations-freebo
Initializing database
Unzipping
Archive: input-file.zip
  inflating: ./tmp/input/A79342.xml
  inflating: ./tmp/input/A26748.xml
  inflating: ./tmp/input/xml2htm.xsl
  inflating: ./tmp/input/A38804.xml
  inflating: ./tmp/input/metadata.csv
  inflating: ./tmp/input/A48000.xml
  inflating: ./tmp/input/A94390.xml
caution: excluded filename not matched: *MACOSX*
=== DIRECTORIES: ./tmp/input
=== DIRECTORY:
=== metadata file: ./tmp/input/metadata.csv
=== found metadata file
=== updating bibliographic database
Building study carrel named subject-coronations-freebo
May 24, 2021 5:07:39 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed. See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io for optional dependencies.
May 24, 2021 5:07:39 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: Tesseract OCR is installed and will be automatically applied to image files unless you've excluded the TesseractOCRParser from the default parser. Tesseract may dramatically slow down content extraction (TIKA-2359). As of Tika 1.15 (and prior versions), Tesseract is automatically called. In future versions of Tika, users may need to turn the TesseractOCRParser on via TikaConfig.
May 24, 2021 5:07:39 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded. Please provide the jar on your classpath to parse sqlite files. See tika-parsers/pom.xml for the correct version.
INFO Starting Apache Tika 1.24.1 server
INFO Setting the server's publish address to be http://localhost:9998/
INFO Logging initialized @2367ms to org.eclipse.jetty.util.log.Slf4jLog
INFO jetty-9.4.27.v20200227; built: 2020-02-27T18:37:21.340Z; git: a304fd9f351f337e7c0e2a7c28878dd536149c6c; jvm 1.8.0_281-b09
INFO Started ServerConnector@3e74829{HTTP/1.1, (http/1.1)}{localhost:9998}
INFO Started @2448ms
WARN Empty contextPath
INFO Started o.e.j.s.h.ContextHandler@70f02c32{/,null,AVAILABLE}
INFO Started Apache Tika server at http://localhost:9998/
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
FILE: cache/A94390.xml OUTPUT: txt/A94390.txt
FILE: cache/A79342.xml OUTPUT: txt/A79342.txt
FILE: cache/A26748.xml OUTPUT: txt/A26748.txt
FILE: cache/A48000.xml OUTPUT: txt/A48000.txt
FILE: cache/A38804.xml OUTPUT: txt/A38804.txt
=== file2bib.sh ===
INFO Detecting media type for Filename: b'A94390.xml'
INFO Detecting media type for Filename: b'A26748.xml'
INFO rmeta/text (autodetecting type)
INFO Detecting media type for Filename: b'A48000.xml'
INFO rmeta/text (autodetecting type)
INFO Detecting media type for Filename: b'A38804.xml'
INFO rmeta/text (autodetecting type)
INFO Detecting media type for Filename: b'A79342.xml'
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
A79342 txt/../wrd/A79342.wrd
A48000 txt/../wrd/A48000.wrd
A26748 txt/../wrd/A26748.wrd
A48000 txt/../pos/A48000.pos
A79342 txt/../pos/A79342.pos
A26748 txt/../ent/A26748.ent
A94390 txt/../ent/A94390.ent
A94390 txt/../wrd/A94390.wrd
A94390 txt/../pos/A94390.pos
A26748 txt/../pos/A26748.pos
A48000 txt/../ent/A48000.ent
=== file2bib.sh ===
id: A94390
author: Throckmorton, William.
title: To all and every the constables of St. Clements Danes of the Dutchy Liberty, of Covent-garden, and St. Martins in the Fields
date: 1661
pages:
extension: .xml
txt: ./txt/A94390.txt
cache: ./cache/A94390.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 12
resourceName b'A94390.xml'
=== file2bib.sh ===
id: A79342
author: Charles II, King of England, 1630-1685.
title: By the King. A proclamation for the better regulating His Majesties royal proceeding from the Tower of London to His palace at Whitehall the 22th day of April next, being the day before His Majesties coronation.
date: 1661
pages:
extension: .xml
txt: ./txt/A79342.txt
cache: ./cache/A79342.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 14
resourceName b'A79342.xml'
A79342 txt/../ent/A79342.ent
=== file2bib.sh ===
id: A48000
author: Gentleman in the country.
title: A letter from a gentleman in the country to his correspondent in the city, concerning the coronation medal, distributed April 11, 1689
date: 1689
pages:
extension: .xml
txt: ./txt/A48000.txt
cache: ./cache/A48000.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 25
resourceName b'A48000.xml'
A38804 txt/../ent/A38804.ent
A38804 txt/../pos/A38804.pos
=== file2bib.sh ===
id: A26748
author: Basset, William, 1644-1695.
title: A panegyrick on the coronation of King James the II and His Royal Consort Queen Mary on April 23, 1685 / by the author of the plea for succession, in opposition to popular exclusion.
date: 1685
pages:
extension: .xml
txt: ./txt/A26748.txt
cache: ./cache/A26748.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 35
resourceName b'A26748.xml'
A38804 txt/../wrd/A38804.wrd
=== file2bib.sh ===
id: A38804
author: Evelyn, John, 1620-1706.
title: A panegyric to Charles the Second presented to His Majestie the xxxiii. [sic] of April, being the day of his coronation, MDCLXI.
date: 1661
pages:
extension: .xml
txt: ./txt/A38804.txt
cache: ./cache/A38804.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 28
resourceName b'A38804.xml'
Done mapping.
Reducing subject-coronations-freebo
=== reduce.pl bib ===
id = A38804
author = Evelyn, John, 1620-1706.
title = A panegyric to Charles the Second presented to His Majestie the xxxiii. [sic] of April, being the day of his coronation, MDCLXI.
date = 1661
pages =
extension = .xml
mime = application/xml
words = 8523
sentences = 2452
flesch = 89
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). The general aim of EEBO-TCP is to encode one copy (usually the first edition) of every monographic English-language title published between 1473 and 1700 available in EEBO. EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A38804.xml
txt = ./txt/A38804.txt
=== reduce.pl bib ===
id = A48000
author = Gentleman in the country.
title = A letter from a gentleman in the country to his correspondent in the city, concerning the coronation medal, distributed April 11, 1689
date = 1689
pages =
extension = .xml
mime = application/xml
words = 2469
sentences = 543
flesch = 86
summary = A letter from a gentleman in the country to his correspondent in the city, concerning the coronation medal, distributed April 11, 1689 A letter from a gentleman in the country to his correspondent in the city, concerning the coronation medal, distributed April 11, 1689 EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). The general aim of EEBO-TCP is to encode one copy (usually the first edition) of every monographic English-language title published between 1473 and 1700 available in EEBO. Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A48000.xml
txt = ./txt/A48000.txt
=== reduce.pl bib ===
id = A26748
author = Basset, William, 1644-1695.
title = A panegyrick on the coronation of King James the II and His Royal Consort Queen Mary on April 23, 1685 / by the author of the plea for succession, in opposition to popular exclusion.
date = 1685
pages =
extension = .xml
mime = application/xml
words = 2499
sentences = 578
flesch = 91
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A panegyrick on the coronation of King James the II and His Royal Consort Queen Mary on April 23, 1685 / by the author of the plea for succession, in opposition to popular exclusion. A panegyrick on the coronation of King James the II and His Royal Consort Queen Mary on April 23, 1685 / by the author of the plea for succession, in opposition to popular exclusion. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
cache = ./cache/A26748.xml
txt = ./txt/A26748.txt
=== reduce.pl bib ===
id = A79342
author = Charles II, King of England, 1630-1685.
title = By the King. A proclamation for the better regulating His Majesties royal proceeding from the Tower of London to His palace at Whitehall the 22th day of April next, being the day before His Majesties coronation.
date = 1661
pages =
extension = .xml
mime = application/xml
words = 1420
sentences = 202
flesch = 81
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A proclamation for the better regulating His Majesties royal proceeding from the Tower of London to His palace at Whitehall the 22th day of April next, being the day before His Majesties coronation. A proclamation for the better regulating His Majesties royal proceeding from the Tower of London to His palace at Whitehall the 22th day of April next, being the day before His Majesties coronation. printed by Iohn Bill, printer to the King's most excellent Majesty, 1661. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
cache = ./cache/A79342.xml
txt = ./txt/A79342.txt
=== reduce.pl bib ===
id = A94390
author = Throckmorton, William.
title = To all and every the constables of St. Clements Danes of the Dutchy Liberty, of Covent-garden, and St. Martins in the Fields
date = 1661
pages =
extension = .xml
mime = application/xml
words = 1245
sentences = 167
flesch = 80
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. To all and every the constables of St. Clements Danes of the Dutchy Liberty, of Covent-garden, and St. Martins in the Fields To all and every the constables of St. Clements Danes of the Dutchy Liberty, of Covent-garden, and St. Martins in the Fields Printed by John Bill, Printer to the King's most Excellent Majesty, Dated and signed at end: Whitehall, by the authority above named, the eight day of April, one thousand six hundred sixty one. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
cache = ./cache/A94390.xml
txt = ./txt/A94390.txt
Building ./etc/reader.txt
A38804 A26748 A94390 A79342 A48000 A38804
number of items: 5
sum of words: 16,156
average size in words: 3,231
average readability score: 85
nouns: day; text; texts; characters; xml; works; coronation; books; images; image; people; self; work; project; page; others; keying; encoding; elements; eebo; edition; data; things; subjects; king; title; men; instances; users; sets; selection; schema; purposes; markup; guidelines; editions; virtues; time; re; nothing; none; nature; mind; felicity; enemies; author; persons; period; rest; person
verbs: be; is; was; have; were; are; been; had; do; encoded; did; make; made; let; has; based; said; -; see; given; represented; come; published; being; produce; marked; give; done; created; create; corrected; set; known; having; tell; go; distributed; use; take; sent; meet; bear; according; understanding; seem; say; ride; returned; render; remain
adjectives: early; great; such; own; english; other; more; many; available; good; glorious; first; happy; very; sacred; general; due; illegible; true; greater; secure; second; new; much; last; best; present; possible; original; excellent; worthy; wide; subject; old; most; light; external; clear; usual; universal; textual; syntactic; structural; ready; readable; quality; public; overall; monographic; lossless
adverbs: not; so; then; now; even; more; only; as; yet; never; up; therefore; well; very; much; indeed; again; sometimes; out; online; once; too; there; long; just; truly; over; most; here; above; in; ever; down; alone; usually; rather; far; early; back; already; variously; thus; respectfully; notably; mainly; hardly; also; accurately; sooner; lately
pronouns: your; you; his; it; their; i; our; we; they; he; us; them; him; me; my; yours; thy; themselves; her; its; himself; she; ours; turn''d; one; itself; ''em
proper nouns: tcp; majesty; king; majesties; text; tei; eebo; prince; english; england; oxford; april; charles; proquest; princes; phase; partnership; ii; heaven; god; creation; royal; london; james; queen; majestie; 〉; ◊; utf-8; unicode; transcribed; tower; p5; online; ncbel; michigan; 〈; william; sir; father; whitehall; st.; john; world; universal; thee; printed; head; gentleman; emblem
keywords: tcp; majesty; majesties; king; subjects; queen; prince; john; great; good; day
one topic; one dimension: text
file(s): ./cache/A94390.xml
titles(s): To all and every the constables of St. Clements Danes of the Dutchy Liberty, of Covent-garden, and St. Martins in the Fields
three topics; one dimension: majesty; text; gentleman
file(s): ./cache/A38804.xml, ./cache/A26748.xml, ./cache/A48000.xml
titles(s): A panegyric to Charles the Second presented to His Majestie the xxxiii. [sic] of April, being the day of his coronation, MDCLXI. | A panegyrick on the coronation of King James the II and His Royal Consort Queen Mary on April 23, 1685 / by the author of the plea for succession, in opposition to popular exclusion. | A letter from a gentleman in the country to his correspondent in the city, concerning the coronation medal, distributed April 11, 1689
five topics; three dimensions: majesty day great; king text tcp; duty painted kept; duty painted kept; duty painted kept
file(s): ./cache/A38804.xml, ./cache/A48000.xml, ./cache/A94390.xml, ./cache/A94390.xml, ./cache/A94390.xml
titles(s): A panegyric to Charles the Second presented to His Majestie the xxxiii. [sic] of April, being the day of his coronation, MDCLXI. | A letter from a gentleman in the country to his correspondent in the city, concerning the coronation medal, distributed April 11, 1689 | To all and every the constables of St. Clements Danes of the Dutchy Liberty, of Covent-garden, and St. Martins in the Fields | To all and every the constables of St. Clements Danes of the Dutchy Liberty, of Covent-garden, and St. Martins in the Fields | To all and every the constables of St. Clements Danes of the Dutchy Liberty, of Covent-garden, and St. Martins in the Fields
Type: zip2carrel
title: subject-coronations-freebo
date: 2021-05-24
time: 17:07
username: emorgan
patron: Eric Morgan
email: emorgan@nd.edu
input: input-file.zip
==== make-pages.sh htm files
==== make-pages.sh complex files
==== make-pages.sh named enities
==== making bibliographics
id: A26748
author: Basset, William, 1644-1695.
title: A panegyrick on the coronation of King James the II and His Royal Consort Queen Mary on April 23, 1685 / by the author of the plea for succession, in opposition to popular exclusion.
date: 1685
words: 2499
sentences: 578
pages:
flesch: 91
cache: ./cache/A26748.xml
txt: ./txt/A26748.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A panegyrick on the coronation of King James the II and His Royal Consort Queen Mary on April 23, 1685 / by the author of the plea for succession, in opposition to popular exclusion. A panegyrick on the coronation of King James the II and His Royal Consort Queen Mary on April 23, 1685 / by the author of the plea for succession, in opposition to popular exclusion. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
id: A79342
author: Charles II, King of England, 1630-1685.
title: By the King. A proclamation for the better regulating His Majesties royal proceeding from the Tower of London to His palace at Whitehall the 22th day of April next, being the day before His Majesties coronation.
date: 1661
words: 1420
sentences: 202
pages:
flesch: 81
cache: ./cache/A79342.xml
txt: ./txt/A79342.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A proclamation for the better regulating His Majesties royal proceeding from the Tower of London to His palace at Whitehall the 22th day of April next, being the day before His Majesties coronation. A proclamation for the better regulating His Majesties royal proceeding from the Tower of London to His palace at Whitehall the 22th day of April next, being the day before His Majesties coronation. printed by Iohn Bill, printer to the King''s most excellent Majesty, 1661. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
id: A38804
author: Evelyn, John, 1620-1706.
title: A panegyric to Charles the Second presented to His Majestie the xxxiii. [sic] of April, being the day of his coronation, MDCLXI.
date: 1661
words: 8523
sentences: 2452
pages:
flesch: 89
cache: ./cache/A38804.xml
txt: ./txt/A38804.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). The general aim of EEBO-TCP is to encode one copy (usually the first edition) of every monographic English-language title published between 1473 and 1700 available in EEBO. EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
id: A48000
author: Gentleman in the country.
title: A letter from a gentleman in the country to his correspondent in the city, concerning the coronation medal, distributed April 11, 1689
date: 1689
words: 2469
sentences: 543
pages:
flesch: 86
cache: ./cache/A48000.xml
txt: ./txt/A48000.txt
summary: A letter from a gentleman in the country to his correspondent in the city, concerning the coronation medal, distributed April 11, 1689 A letter from a gentleman in the country to his correspondent in the city, concerning the coronation medal, distributed April 11, 1689 EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). The general aim of EEBO-TCP is to encode one copy (usually the first edition) of every monographic English-language title published between 1473 and 1700 available in EEBO. Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
id: A94390
author: Throckmorton, William.
title: To all and every the constables of St. Clements Danes of the Dutchy Liberty, of Covent-garden, and St. Martins in the Fields
date: 1661
words: 1245
sentences: 167
pages:
flesch: 80
cache: ./cache/A94390.xml
txt: ./txt/A94390.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. To all and every the constables of St. Clements Danes of the Dutchy Liberty, of Covent-garden, and St. Martins in the Fields To all and every the constables of St. Clements Danes of the Dutchy Liberty, of Covent-garden, and St. Martins in the Fields Printed by John Bill, Printer to the King''s most Excellent Majesty, Dated and signed at end: Whitehall, by the authority above named, the eight day of April, one thousand six hundred sixty one. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
==== make-pages.sh questions
==== make-pages.sh search
==== make-pages.sh topic modeling corpus
Zipping study carrel
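
Note on the corpus totals: the log reports number of items: 5, sum of words: 16,156, average size in words: 3,231, and average readability score: 85. These figures appear to follow from the per-item words and flesch values in the reduce.pl bib records. The Python sketch below is only a cross-check, not the carrel's own code: the `ITEMS` table and the `summarize` helper are hypothetical names introduced here, and rounding to the nearest integer is an assumption (truncation would give the same result for these numbers).

```python
# Per-item values copied from the "reduce.pl bib" records in the log.
# ITEMS and summarize() are illustrative names, not part of the toolchain.
ITEMS = {
    "A38804": {"words": 8523, "flesch": 89},
    "A48000": {"words": 2469, "flesch": 86},
    "A26748": {"words": 2499, "flesch": 91},
    "A79342": {"words": 1420, "flesch": 81},
    "A94390": {"words": 1245, "flesch": 80},
}


def summarize(items):
    """Recompute the corpus-level totals from per-item counts."""
    count = len(items)
    total_words = sum(record["words"] for record in items.values())
    total_flesch = sum(record["flesch"] for record in items.values())
    return {
        "number of items": count,
        "sum of words": total_words,
        # Assumption: averages are rounded to the nearest integer.
        "average size in words": round(total_words / count),
        "average readability score": round(total_flesch / count),
    }


if __name__ == "__main__":
    for label, value in summarize(ITEMS).items():
        print(f"{label}: {value:,}")
```

Recomputing the totals this way is a quick sanity check that no item was dropped or counted twice between the mapping and reducing steps.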