mv: ‘./input-file.zip’ and ‘./input-file.zip’ are the same file
Creating study carrel named subject-illyria-gutenberg
Initializing database
Unzipping
Archive: input-file.zip
creating: ./tmp/input/input-file/
inflating: ./tmp/input/input-file/1123.txt
inflating: ./tmp/input/input-file/1527.txt
inflating: ./tmp/input/input-file/2247.txt
inflating: ./tmp/input/input-file/38901.txt
inflating: ./tmp/input/input-file/metadata.csv
caution: excluded filename not matched: *MACOSX*
=== DIRECTORIES: ./tmp/input
=== DIRECTORY: ./tmp/input/input-file
=== metadata file: ./tmp/input/input-file/metadata.csv
=== found metadata file
=== updating bibliographic database
Building study carrel named subject-illyria-gutenberg
FILE: cache/2247.txt
OUTPUT: txt/2247.txt
FILE: cache/1527.txt
OUTPUT: txt/1527.txt
FILE: cache/1123.txt
OUTPUT: txt/1123.txt
FILE: cache/38901.txt
OUTPUT: txt/38901.txt
=== file2bib.sh ===
id: 1527
author: Shakespeare, William
title: Twelfth Night; Or, What You Will
date:
pages:
extension: .txt
txt: ./txt/1527.txt
cache: ./cache/1527.txt
Content-Encoding ISO-8859-1
Content-Type text/plain; charset=ISO-8859-1
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.csv.TextAndCSVParser']
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 1
resourceName b'1527.txt'
Traceback (most recent call last):
  File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in <module>
    text = textacy.preprocessing.normalize.normalize_quotation_marks( text )
  File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks
    return text.translate(QUOTE_TRANSLATION_TABLE)
AttributeError: 'NoneType' object has no attribute 'translate'
=== file2bib.sh ===
id: 2247
author: Shakespeare, William
title: Twelfth Night
date:
pages:
extension: .txt
txt: ./txt/2247.txt
cache: ./cache/2247.txt
Content-Encoding ISO-8859-1
Content-Type text/plain; charset=ISO-8859-1
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.csv.TextAndCSVParser']
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 2
resourceName b'2247.txt'
Traceback (most recent call last):
  File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in <module>
    text = textacy.preprocessing.normalize.normalize_quotation_marks( text )
  File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks
    return text.translate(QUOTE_TRANSLATION_TABLE)
AttributeError: 'NoneType' object has no attribute 'translate'
1123 txt/../pos/1123.pos
2247 txt/../wrd/2247.wrd
Traceback (most recent call last):
  File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in <module>
    for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) :
  File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake
    word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words)
  File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores
    freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw)
  File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean
    raise StatisticsError('mean requires at least one data point')
statistics.StatisticsError: mean requires at least one data point
1123 txt/../ent/1123.ent
1527 txt/../pos/1527.pos
1527 txt/../wrd/1527.wrd
Traceback (most recent call last):
  File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in <module>
    for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) :
  File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake
    word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words)
  File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores
    freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw)
  File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean
    raise StatisticsError('mean requires at least one data point')
statistics.StatisticsError: mean requires at least one data point
1123 txt/../wrd/1123.wrd
1527 txt/../ent/1527.ent
2247 txt/../pos/2247.pos
2247 txt/../ent/2247.ent
=== file2bib.sh ===
id: 1123
author: Shakespeare, William
title: Twelfth Night; Or, What You Will
date:
pages:
extension: .txt
txt: ./txt/1123.txt
cache: ./cache/1123.txt
Content-Encoding UTF-8
Content-Type text/plain; charset=UTF-8
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.csv.TextAndCSVParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 2
resourceName b'1123.txt'
38901 txt/../wrd/38901.wrd
38901 txt/../pos/38901.pos
=== file2bib.sh ===
id: 38901
author: Kemble, John Philip
title: Twelfth Night; or, What You Will
date:
pages:
extension: .txt
txt: ./txt/38901.txt
cache: ./cache/38901.txt
Content-Encoding UTF-8
Content-Type text/plain; charset=UTF-8
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.csv.TextAndCSVParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 5
resourceName b'38901.txt'
38901 txt/../ent/38901.ent
Done mapping.
Reducing subject-illyria-gutenberg
=== reduce.pl bib ===
id = 1123
author = Shakespeare, William
title = Twelfth Night; Or, What You Will
date =
pages =
extension = .txt
mime = text/plain
words = 40
sentences = 10
flesch = 88
summary = THIS EBOOK WAS ONE OF PROJECT GUTENBERG'S EARLY FILES PRODUCED AT A TIME WHEN PROOFING METHODS AND TOOLS WERE NOT WELL DEVELOPED. IS AN IMPROVED EDITION OF THIS TITLE WHICH MAY BE VIEWED AS EBOOK (#38901) at https://www.gutenberg.org/ebooks/38901
cache = ./cache/1123.txt
txt = ./txt/1123.txt
=== reduce.pl bib ===
=== reduce.pl bib ===
id = 38901
author = Kemble, John Philip
title = Twelfth Night; or, What You Will
date =
pages =
extension = .txt
mime = text/plain
words = 18314
sentences = 3610
flesch = 102
summary = _Mar._ By my troth, Sir Toby, you must come in earlier o' nights; turn o' the toe like a parish-top--See, here comes Sir Andrew Ague-face. _Sir To._ Art thou good at these kick-shaws, knight? _Clo._ Good Sir Toby,---_Sir And._ Begin, fool: it begins,--[_Sings._] _Hold thy peace._ _Mar._ Nay, good Sir Toby. _Sir To._ He shall think, by the letters that thou wilt drop, that _Sir To._ Let's to bed, knight.--Thou hadst need send for more _Duke._ Come hither, boy:--If ever thou shalt love, _Vio._ But, if she cannot love you, sir? _Sir To._ Come thy ways, Signior Fabian. _Vio._ Art not thou the Lady Olivia's fool? _Clo._ No, indeed, sir; the Lady Olivia has no folly: she will keep _Fab._ [_Parts them._] O good Sir Toby, hold; here come the _Sir To._ What, man!--Come on. _Fab._ Hold, good Sir Toby, hold:--my lady here! _Vio._ Here comes the man, sir, that did rescue me.
cache = ./cache/38901.txt
txt = ./txt/38901.txt
=== reduce.pl bib ===
Building ./etc/reader.txt
/data-disk/reader-compute/reader-classic/bin/topic-model.py:68: UserWarning: The handle has a label of '_sir sir thou' which cannot be automatically added to the legend.
  axis.legend( title = "Topics", labels = df[ 'words' ] )
38901 2247 1527 38901 2247 1527
number of items: 4
sum of words: 18,354
average size in words: 9,177
average readability score: 95
nouns: sir; lady; man; love; fool; hand; youth; time; letter; heart; gentleman; lord; thy; peace; niece; day; none; knight; brother; name; way; soul; nothing; master; house; bed; reason; matter; life; fellow; world; wit; thing; eye; woman; night; men; art; sword; rain; money; hath; faith; face; eyes; years; word; thee; servant; o
verbs: is; be; have; do; ''s; am; let; are; come; know; was; make; go; did; see; were; think; say; give; had; comes; tell; take; speak; does; love; put; has; keep; hold; done; call; been; set; pray; hear; find; being; saw; made; believe; live; takes; smile; hurt; heard; draw; bring; write; told
adjectives: good; more; great; mad; sweet; much; little; dear; true; fair; better; own; young; other; yellow; such; poor; old; excellent; wise; very; noble; foolish; worse; full; dry; bloody; best; worth; strong; strange; same; notable; mine; many; alone; willing; troth; third; sick; sad; ill; happy; gentle; gartered; fresh; free; exquisite; drunken; dishonest
adverbs: not; so; now; here; well; then; as; never; too; very; most; away; yet; up; again; more; ever; therefore; still; much; even; thus; on; there; rather; out; no; indeed; enough; off; once; in; late; first; else; down; longer; long; better; together; perhaps; is; further; before; ago; that; sometimes; quickly; perchance; over
pronouns: i; you; my; me; him; it; he; your; his; her; she; thy; thee; we; they; our; them; us; myself; himself; yourself; yours; their; mine; thyself; themselves; one; itself; ''em; yourselves; you.--here; ourselves; on''t; o; is''t; herself; ay; aloof.--cesario; ''s
proper nouns: _; sir; to; oli; thou; vio; mal; clo; duke; enter; toby; fab; mar.; malvolio; exit; olivia; andrew; maria; madam; seb; fabian; exeunt; ant; viola; antonio; orsino; sebastian; clown; .; scene; nay; mr; lord; hath; come; heaven; cesario; ay; art; topas; sings; pr''ythee; good; rob; marry; house; thee; room; madonna; illyria
keywords: vio; toby; sir; oli; march; mal; exit; enter; ebook; duke; clo
one topic; one dimension: _sir
file(s): ./cache/1123.txt
titles(s): Twelfth Night; Or, What You Will
three topics; one dimension: _sir; 38901; early
file(s): ./cache/38901.txt, ./cache/1123.txt,
titles(s): Twelfth Night; or, What You Will | Twelfth Night; Or, What You Will | Twelfth Night; Or, What You Will
five topics; three dimensions: _sir sir thou; 38901 ebook gutenberg; early time www; early time www; early time www
file(s): ./cache/38901.txt, ./cache/1123.txt, , ,
titles(s): Twelfth Night; or, What You Will | Twelfth Night; Or, What You Will | Twelfth Night; Or, What You Will | Twelfth Night; Or, What You Will | Twelfth Night; Or, What You Will
Type: gutenberg
title: subject-illyria-gutenberg
date: 2021-06-06
time: 18:06
username: emorgan
patron: Eric Morgan
email: emorgan@nd.edu
input: facet_subject:"Illyria"
==== make-pages.sh htm files
==== make-pages.sh complex files
==== make-pages.sh named enities
==== making bibliographics
id: 38901
author: Kemble, John Philip
title: Twelfth Night; or, What You Will
date:
words: 18314.0
sentences: 3610.0
pages:
flesch: 102.0
cache: ./cache/38901.txt
txt: ./txt/38901.txt
summary: _Mar._ By my troth, Sir Toby, you must come in earlier o'' nights; turn o'' the toe like a parish-top--See, here comes Sir Andrew Ague-face. _Sir To._ Art thou good at these kick-shaws, knight? _Clo._ Good Sir Toby,---_Sir And._ Begin, fool: it begins,--[_Sings._] _Hold thy peace._ _Mar._ Nay, good Sir Toby. _Sir To._ He shall think, by the letters that thou wilt drop, that _Sir To._ Let''s to bed, knight.--Thou hadst need send for more _Duke._ Come hither, boy:--If ever thou shalt love, _Vio._ But, if she cannot love you, sir? _Sir To._ Come thy ways, Signior Fabian. _Vio._ Art not thou the Lady Olivia''s fool? _Clo._ No, indeed, sir; the Lady Olivia has no folly: she will keep _Fab._ [_Parts them._] O good Sir Toby, hold; here come the _Sir To._ What, man!--Come on. _Fab._ Hold, good Sir Toby, hold:--my lady here! _Vio._ Here comes the man, sir, that did rescue me.
id: 1123
author: Shakespeare, William
title: Twelfth Night; Or, What You Will
date:
words: 40.0
sentences: 10.0
pages:
flesch: 88.0
cache: ./cache/1123.txt
txt: ./txt/1123.txt
summary: THIS EBOOK WAS ONE OF PROJECT GUTENBERG''S EARLY FILES PRODUCED AT A TIME WHEN PROOFING METHODS AND TOOLS WERE NOT WELL DEVELOPED. IS AN IMPROVED EDITION OF THIS TITLE WHICH MAY BE VIEWED AS EBOOK (#38901) at https://www.gutenberg.org/ebooks/38901
id: 1527
author: Shakespeare, William
title: Twelfth Night; Or, What You Will
date:
words: nan
sentences: nan
pages:
flesch: nan
cache:
txt:
summary:
id: 2247
author: Shakespeare, William
title: Twelfth Night
date:
words: nan
sentences: nan
pages:
flesch: nan
cache:
txt:
summary:
==== make-pages.sh questions
==== make-pages.sh search
==== make-pages.sh topic modeling corpus
Zipping study carrel