Creating study carrel named planet-infomotions Initializing database Getting URL (http://planet.infomotions.com) and saving it (/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/tmp/planet-infomotions) --2020-11-28 14:29:30-- http://planet.infomotions.com/ Resolving planet.infomotions.com... 162.243.47.94 Connecting to planet.infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 109730 (107K) [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/tmp/planet-infomotions’ 0K .......... .......... .......... .......... .......... 46% 3.45M 0s 50K .......... .......... .......... .......... .......... 93% 6.88M 0s 100K ....... 100% 129M=0.02s 2020-11-28 14:29:30 (4.91 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/tmp/planet-infomotions’ saved [109730/109730] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/tmp/planet-infomotions... 2-10 Converted links in 1 files in 0.001 seconds. Extracting URLs (/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/tmp/planet-infomotions) and saving (/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/tmp/planet-infomotions.txt) Processing each URL in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/tmp/planet-infomotions.txt Warning: Unprocessed MIME-type(text/css) . Ignorning. Call Eric? 403 Forbidden at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 404 Not Found at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. Warning: Unprocessed MIME-type(text/css) . Ignorning. Call Eric? 404 Not Found at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-7836.gif --2020-11-28 14:29:31-- http://infomotions.com/logo.gif Resolving infomotions.com... 
162.243.47.94 Connecting to infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 1447 (1.4K) [image/gif] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-7836.gif’ 0K . 100% 287M=0s 2020-11-28 14:29:31 (287 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-7836.gif’ saved [1447/1447] Warning: Unprocessed MIME-type(text/css) . Ignorning. Call Eric? Warning: Unprocessed MIME-type(text/css) . Ignorning. Call Eric? filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-3637.html --2020-11-28 14:29:31-- http://infomotions.com/alex/ Resolving infomotions.com... 162.243.47.94 Connecting to infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-3637.html’ 0K ...... 399M=0s 2020-11-28 14:29:31 (399 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-3637.html’ saved [6897] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-3637.html... 3-10 Converted links in 1 files in 0.003 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-8963.png --2020-11-28 14:29:31-- http://planet.infomotions.com/images/feed-icon-10x10.png Resolving planet.infomotions.com... 162.243.47.94 Connecting to planet.infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 
200 OK Length: 469 [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-8963.png’ 0K 100% 104M=0s 2020-11-28 14:29:31 (104 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-8963.png’ saved [469/469] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-172.html --2020-11-28 14:29:31-- http://infomotions.com/sandbox/ Resolving infomotions.com... 162.243.47.94 Connecting to infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-172.html’ 0K ...... 112M=0s 2020-11-28 14:29:31 (112 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-172.html’ saved [6296] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-172.html... 2-55 Converted links in 1 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-3852.html --2020-11-28 14:29:31-- http://infomotions.com/ Resolving infomotions.com... 162.243.47.94 Connecting to infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 5308 (5.2K) [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-3852.html’ 0K ..... 100% 973M=0s 2020-11-28 14:29:31 (973 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-3852.html’ saved [5308/5308] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-3852.html... 2-19 Converted links in 1 files in 0 seconds. 
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-7919.xml --2020-11-28 14:29:31-- http://planet.infomotions.com/foafroll.xml Resolving planet.infomotions.com... 162.243.47.94 Connecting to planet.infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 3627 (3.5K) [text/xml] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-7919.xml’ 0K ... 100% 736M=0s 2020-11-28 14:29:31 (736 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-7919.xml’ saved [3627/3627]
404 Not Found at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82.
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-3359.xml --2020-11-28 14:29:31-- http://planet.infomotions.com/rss10.xml Resolving planet.infomotions.com... 162.243.47.94 Connecting to planet.infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 2990206 (2.9M) [text/xml] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-3359.xml’ 2900K .......... .......... 100% 391M=0.06s 2020-11-28 14:29:31 (51.4 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-3359.xml’ saved [2990206/2990206]
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-8900.xml --2020-11-28 14:29:31-- http://planet.infomotions.com/rss20.xml Resolving planet.infomotions.com... 162.243.47.94 Connecting to planet.infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 2942902 (2.8M) [text/xml] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-8900.xml’ 2850K .......... .......... ... 100% 485M=0.05s 2020-11-28 14:29:31 (57.6 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-8900.xml’ saved [2942902/2942902]
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-9545.xml --2020-11-28 14:29:31-- http://planet.infomotions.com/atom.xml Resolving planet.infomotions.com... 162.243.47.94 Connecting to planet.infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 3315897 (3.2M) [text/xml] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-9545.xml’ 3200K .......... .......... .......... ........ 100% 483M=0.05s 2020-11-28 14:29:31 (61.1 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-9545.xml’ saved [3315897/3315897]
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/serials-infomotions-com-5908.html --2020-11-28 14:29:31-- http://serials.infomotions.com/ Resolving serials.infomotions.com... 162.243.47.94 Connecting to serials.infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/serials-infomotions-com-5908.html’ 0K ......... 141M=0s 2020-11-28 14:29:31 (141 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/serials-infomotions-com-5908.html’ saved [10161] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/serials-infomotions-com-5908.html... 3-75 Converted links in 1 files in 0.001 seconds.
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-4104.html --2020-11-28 14:29:31-- http://planet.infomotions.com/timeline/ Resolving planet.infomotions.com... 162.243.47.94 Connecting to planet.infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 2653 (2.6K) [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-4104.html’ 0K .. 100% 506M=0s 2020-11-28 14:29:31 (506 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-4104.html’ saved [2653/2653] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/planet-infomotions-com-4104.html... 1-0 Converted links in 1 files in 0 seconds.
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-9318.xml --2020-11-28 14:29:31-- http://infomotions.com/musings/musings.rss Resolving infomotions.com... 162.243.47.94 Connecting to infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 149783 (146K) [application/rss+xml] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-9318.xml’ 0K .......... .......... .......... .......... .......... 34% 3.97M 0s 50K .......... .......... .......... .......... .......... 68% 8.08M 0s 100K .......... .......... .......... .......... ...... 100% 190M=0.02s 2020-11-28 14:29:31 (7.69 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-9318.xml’ saved [149783/149783] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-555.html --2020-11-28 14:29:31-- http://infomotions.com/water/index.xml Resolving infomotions.com... 162.243.47.94 Connecting to infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 302 Found Location: http://infomotions.com/water/water.htm [following] --2020-11-28 14:29:31-- http://infomotions.com/water/water.htm Reusing existing connection to infomotions.com:80. HTTP request sent, awaiting response... 200 OK Length: 1322 (1.3K) [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-555.html’ 0K . 100% 350M=0s 2020-11-28 14:29:31 (350 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-555.html’ saved [1322/1322] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-555.html... 0-2 Converted links in 1 files in 0 seconds. 403 Forbidden at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 
405 Method Not Allowed at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 
500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/mallet-cs-umass-edu-3654.html --2020-11-28 14:29:31-- http://mallet.cs.umass.edu/ Resolving mallet.cs.umass.edu... 128.119.246.70 Connecting to mallet.cs.umass.edu|128.119.246.70|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 4699 (4.6K) [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/mallet-cs-umass-edu-3654.html’ 0K .... 100% 72.9M=0s 2020-11-28 14:29:31 (72.9 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/mallet-cs-umass-edu-3654.html’ saved [4699/4699] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/mallet-cs-umass-edu-3654.html... 2-29 Converted links in 1 files in 0 seconds. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 
500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/bit-ly-5230.html --2020-11-28 14:29:31-- http://bit.ly/2mgTKsp Resolving bit.ly... 67.199.248.10, 67.199.248.11 Connecting to bit.ly|67.199.248.10|:80... connected. HTTP request sent, awaiting response... 301 Moved Permanently Location: https://chrome.google.com/webstore/detail/link-grabber/caodelkhipncidmoebgbbeemedohcdma [following] --2020-11-28 14:29:31-- https://chrome.google.com/webstore/detail/link-grabber/caodelkhipncidmoebgbbeemedohcdma Resolving chrome.google.com... 172.217.164.174, 2607:f8b0:4004:804::200e Connecting to chrome.google.com|172.217.164.174|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/bit-ly-5230.html’ 0K .......... .......... .......... .......... .......... 15.1M 50K .......... .. 98.9M=0.003s 2020-11-28 14:29:31 (18.3 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/bit-ly-5230.html’ saved [64228] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/bit-ly-5230.html... 1-8 Converted links in 1 files in 0.001 seconds. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 
500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 
500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/bit-ly-8913.html --2020-11-28 14:29:31-- http://bit.ly/distantreader-slack Resolving bit.ly... 
67.199.248.10, 67.199.248.11 Connecting to bit.ly|67.199.248.10|:80... connected. HTTP request sent, awaiting response... 301 Moved Permanently Location: https://join.slack.com/t/distantreader/shared_invite/enQtODY4MDE4MjQ1NTczLTIxNWM4NThlYzdmM2E0MWJkZjRjNmZhYWNiYWQ4N2NkOGFmMzc4MmMxMWExZGRkNGE1ZGRkYzczMjQ5NWNlYmU [following] --2020-11-28 14:29:31-- https://join.slack.com/t/distantreader/shared_invite/enQtODY4MDE4MjQ1NTczLTIxNWM4NThlYzdmM2E0MWJkZjRjNmZhYWNiYWQ4N2NkOGFmMzc4MmMxMWExZGRkNGE1ZGRkYzczMjQ5NWNlYmU Resolving join.slack.com... 54.211.89.16, 18.214.242.166, 54.87.197.95 Connecting to join.slack.com|54.211.89.16|:443... connected. HTTP request sent, awaiting response... 302 Found Location: https://distantreader.slack.com/join/shared_invite/enQtODY4MDE4MjQ1NTczLTIxNWM4NThlYzdmM2E0MWJkZjRjNmZhYWNiYWQ4N2NkOGFmMzc4MmMxMWExZGRkNGE1ZGRkYzczMjQ5NWNlYmU [following] --2020-11-28 14:29:31-- https://distantreader.slack.com/join/shared_invite/enQtODY4MDE4MjQ1NTczLTIxNWM4NThlYzdmM2E0MWJkZjRjNmZhYWNiYWQ4N2NkOGFmMzc4MmMxMWExZGRkNGE1ZGRkYzczMjQ5NWNlYmU Resolving distantreader.slack.com... 54.87.197.95, 54.211.89.16, 18.214.242.166 Connecting to distantreader.slack.com|54.87.197.95|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/bit-ly-8913.html’ 0K .......... .......... .......... ... 10.6M=0.003s 2020-11-28 14:29:31 (10.6 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/bit-ly-8913.html’ saved [34133] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/bit-ly-8913.html... 1-0 Converted links in 1 files in 0 seconds. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 
500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 404 Not Found at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 
500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/2020-code4lib-org-5785.html --2020-11-28 14:29:31-- https://2020.code4lib.org/workshops/Using-hacking-the-Distant-Reader Resolving 2020.code4lib.org... 185.199.108.153, 185.199.109.153, 185.199.110.153, ... Connecting to 2020.code4lib.org|185.199.108.153|:443... 
connected. HTTP request sent, awaiting response... 200 OK Length: 11520 (11K) [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/2020-code4lib-org-5785.html’ 0K .......... . 100% 108M=0s 2020-11-28 14:29:32 (108 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/2020-code4lib-org-5785.html’ saved [11520/11520] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/2020-code4lib-org-5785.html... 1-25 Converted links in 1 files in 0 seconds. 500 Can't connect to carrels.distantreader.org:443 (certificate verify failed) at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 400 URL must be absolute at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 No Host option provided at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/twitter-com-9838.html --2020-11-28 14:29:31-- http://twitter.com/readerdistant Resolving twitter.com... 104.244.42.65, 104.244.42.193 Connecting to twitter.com|104.244.42.65|:80... connected. HTTP request sent, awaiting response... 301 Moved Permanently Location: https://twitter.com/readerdistant [following] --2020-11-28 14:29:31-- https://twitter.com/readerdistant Connecting to twitter.com|104.244.42.65|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/twitter-com-9838.html’ 0K .......... .......... .......... .......... ... 1.11M=0.04s 2020-11-28 14:29:32 (1.11 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/twitter-com-9838.html’ saved [44755] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/twitter-com-9838.html... 3-13 Converted links in 1 files in 0.001 seconds. 
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-tufts-edu-6731.xml --2020-11-28 14:29:32-- http://sites.tufts.edu/liam/feed/ Resolving sites.tufts.edu... 130.64.212.136 Connecting to sites.tufts.edu|130.64.212.136|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 733 [application/rss+xml] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-tufts-edu-6731.xml’ 0K 100% 125M=0s 2020-11-28 14:29:32 (125 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-tufts-edu-6731.xml’ saved [733/733] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/stedolan-github-io-4569.html --2020-11-28 14:29:32-- https://stedolan.github.io/jq/ Resolving stedolan.github.io... 185.199.108.153, 185.199.109.153, 185.199.110.153, ... Connecting to stedolan.github.io|185.199.108.153|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 7984 (7.8K) [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/stedolan-github-io-4569.html’ 0K ....... 100% 105M=0s 2020-11-28 14:29:32 (105 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/stedolan-github-io-4569.html’ saved [7984/7984] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/stedolan-github-io-4569.html... 3-11 Converted links in 1 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/tika-apache-org-2948.html --2020-11-28 14:29:31-- http://tika.apache.org/ Resolving tika.apache.org... 95.216.24.32, 40.79.78.1, 95.216.26.30, ... Connecting to tika.apache.org|95.216.24.32|:80... connected. HTTP request sent, awaiting response... 
200 OK Length: 41947 (41K) [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/tika-apache-org-2948.html’ 0K .......... .......... .......... .......... 100% 364K=0.1s 2020-11-28 14:29:32 (364 KB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/tika-apache-org-2948.html’ saved [41947/41947] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/tika-apache-org-2948.html... 0-39 Converted links in 1 files in 0.001 seconds. 404 Not Found at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/dh-crc-nd-edu-7757.html --2020-11-28 14:29:31-- http://dh.crc.nd.edu/sandbox/gutenberg/cgi-bin/search.cgi Resolving dh.crc.nd.edu... 129.74.246.86 Connecting to dh.crc.nd.edu|129.74.246.86|:80... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/dh-crc-nd-edu-7757.html’ 0K 161M=0s 2020-11-28 14:29:32 (161 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/dh-crc-nd-edu-7757.html’ saved [979] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/dh-crc-nd-edu-7757.html... 2-2 Converted links in 1 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/www-gnu-org-8892.html --2020-11-28 14:29:32-- https://www.gnu.org/software/parallel/ Resolving www.gnu.org... 209.51.188.148, 2001:470:142:3::a Connecting to www.gnu.org|209.51.188.148|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/www-gnu-org-8892.html’ 0K .......... .......... .. 
1.77M=0.01s 2020-11-28 14:29:32 (1.77 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/www-gnu-org-8892.html’ saved [22995] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/www-gnu-org-8892.html... 5-77 Converted links in 1 files in 0.001 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/www-gutenberg-org-6207.html --2020-11-28 14:29:32-- https://www.gutenberg.org/ Resolving www.gutenberg.org... 152.19.134.47, 2610:28:3090:3000:0:bad:cafe:47 Connecting to www.gutenberg.org|152.19.134.47|:443... connected. ERROR: The certificate of ‘www.gutenberg.org’ is not trusted. ERROR: The certificate of ‘www.gutenberg.org’ has expired. Converted links in 0 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/ucla-zoom-us-1408.html --2020-11-28 14:29:32-- https://ucla.zoom.us/j/3107947789 Resolving ucla.zoom.us... 52.202.62.202 Connecting to ucla.zoom.us|52.202.62.202|:443... connected. HTTP request sent, awaiting response... 200 Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/ucla-zoom-us-1408.html’ 0K ........ 95.4M=0s 2020-11-28 14:29:32 (95.4 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/ucla-zoom-us-1408.html’ saved [9101] no-follow attribute found in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/ucla-zoom-us-1408.html. Will not follow any links on this page Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/ucla-zoom-us-1408.html... 0-2 Converted links in 1 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/www-gutenberg-org-941.html --2020-11-28 14:29:32-- https://www.gutenberg.org/wiki/Gutenberg:Feeds Resolving www.gutenberg.org... 
152.19.134.47, 2610:28:3090:3000:0:bad:cafe:47 Connecting to www.gutenberg.org|152.19.134.47|:443... connected. ERROR: The certificate of ‘www.gutenberg.org’ is not trusted. ERROR: The certificate of ‘www.gutenberg.org’ has expired. Converted links in 0 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-953.html --2020-11-28 14:29:32-- http://infomotions.com/musings/ Resolving infomotions.com... 162.243.47.94 Connecting to infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-953.html’ 0K ..... 574M=0s 2020-11-28 14:29:32 (574 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-953.html’ saved [6016] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-953.html... 5-15 Converted links in 1 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-7801.html --2020-11-28 14:29:32-- https://github.com/ericleasemorgan/reader-lite Resolving github.com... 140.82.113.3 Connecting to github.com|140.82.113.3|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-7801.html’ 0K .......... .......... .......... .......... .......... 13.6M 50K .......... .......... .......... .......... .......... 22.5M 100K .......... .......... 33.5M=0.006s 2020-11-28 14:29:32 (18.5 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-7801.html’ saved [123508] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-7801.html... 
15-71 Converted links in 1 files in 0.001 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/distantreader-org-7009.html --2020-11-28 14:29:32-- https://distantreader.org/ Resolving distantreader.org... 156.56.104.84 Connecting to distantreader.org|156.56.104.84|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 10242 (10K) [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/distantreader-org-7009.html’ 0K .......... 100% 189M=0s 2020-11-28 14:29:32 (189 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/distantreader-org-7009.html’ saved [10242/10242] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/distantreader-org-7009.html... 0-18 Converted links in 1 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/distantreader-org-6471.html --2020-11-28 14:29:32-- https://distantreader.org/ Resolving distantreader.org... 156.56.104.84 Connecting to distantreader.org|156.56.104.84|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 10242 (10K) [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/distantreader-org-6471.html’ 0K .......... 100% 178M=0s 2020-11-28 14:29:32 (178 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/distantreader-org-6471.html’ saved [10242/10242] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/distantreader-org-6471.html... 0-18 Converted links in 1 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-8326.html --2020-11-28 14:29:32-- https://github.com/ericleasemorgan/htid2books Resolving github.com... 140.82.113.3 Connecting to github.com|140.82.113.3|:443... connected. 
HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-8326.html’ 0K .......... .......... .......... .......... .......... 22.4M 50K .......... .......... .......... .......... .......... 45.2M 100K .......... .......... .......... .......... ... 70.1M=0.004s 2020-11-28 14:29:32 (36.3 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-8326.html’ saved [146993] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-8326.html... 27-81 Converted links in 1 files in 0.001 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-2983.html --2020-11-28 14:29:32-- https://github.com/ericleasemorgan/reader-extras Resolving github.com... 140.82.113.3 Connecting to github.com|140.82.113.3|:443... connected. HTTP request sent, awaiting response... 301 Moved Permanently Location: https://github.com/ericleasemorgan/reader-toolbox [following] --2020-11-28 14:29:32-- https://github.com/ericleasemorgan/reader-toolbox Reusing existing connection to github.com:443. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-2983.html’ 0K .......... .......... .......... .......... .......... 15.0M 50K .......... .......... .......... .......... .......... 29.4M 100K .......... .......... .......... 57.7M=0.005s 2020-11-28 14:29:32 (23.4 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-2983.html’ saved [133758] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-2983.html... 15-80 Converted links in 1 files in 0.001 seconds. 
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/dh-crc-nd-edu-1806.xml --2020-11-28 14:29:32-- http://dh.crc.nd.edu/blog/feed/ Resolving dh.crc.nd.edu... 129.74.246.86 Connecting to dh.crc.nd.edu|129.74.246.86|:80... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [application/rss+xml] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/dh-crc-nd-edu-1806.xml’ 0K .......... .......... .......... .......... .... 367K=0.1s 2020-11-28 14:29:32 (367 KB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/dh-crc-nd-edu-1806.xml’ saved [46036] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/www-xsede-org-5929.html --2020-11-28 14:29:32-- https://www.xsede.org/ Resolving www.xsede.org... 129.114.97.122 Connecting to www.xsede.org|129.114.97.122|:443... connected. ERROR: The certificate of ‘www.xsede.org’ is not trusted. ERROR: The certificate of ‘www.xsede.org’ has expired. Converted links in 0 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/docs-pkp-sfu-ca-7101.html --2020-11-28 14:29:32-- https://docs.pkp.sfu.ca/dev/api/ojs/3.1 Resolving docs.pkp.sfu.ca... 204.187.13.158 Connecting to docs.pkp.sfu.ca|204.187.13.158|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 4194 (4.1K) [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/docs-pkp-sfu-ca-7101.html’ 0K .... 100% 339M=0s 2020-11-28 14:29:33 (339 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/docs-pkp-sfu-ca-7101.html’ saved [4194/4194] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/docs-pkp-sfu-ca-7101.html... 0-9 Converted links in 1 files in 0 seconds. 
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-8025.html --2020-11-28 14:29:32-- https://github.com/ericleasemorgan/ojs-toolbox Resolving github.com... 140.82.113.3 Connecting to github.com|140.82.113.3|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-8025.html’ 0K .......... .......... .......... .......... .......... 12.0M 50K .......... .......... .......... .......... .......... 13.7M 100K .......... .......... ... 29.4M=0.008s 2020-11-28 14:29:33 (14.3 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-8025.html’ saved [126515] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-8025.html... 16-69 Converted links in 1 files in 0.001 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-8202.html --2020-11-28 14:29:32-- https://github.com/senderle/topic-modeling-tool Resolving github.com... 140.82.113.3 Connecting to github.com|140.82.113.3|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-8202.html’ 0K .......... .......... .......... .......... .......... 10.1M 50K .......... .......... .......... .......... .......... 38.9M 100K .......... .......... .......... .......... .... 78.2M=0.007s 2020-11-28 14:29:33 (21.3 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-8202.html’ saved [148036] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-8202.html... 23-87 Converted links in 1 files in 0.002 seconds. 
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-379.html --2020-11-28 14:29:32-- https://github.com/ericleasemorgan/gutenberg-index Resolving github.com... 140.82.113.3 Connecting to github.com|140.82.113.3|:443... connected. HTTP request sent, awaiting response... 301 Moved Permanently Location: https://github.com/ericleasemorgan/reader-gutenberg [following] --2020-11-28 14:29:33-- https://github.com/ericleasemorgan/reader-gutenberg Reusing existing connection to github.com:443. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-379.html’ 0K .......... .......... .......... .......... .......... 22.7M 50K .......... .......... .......... .......... .......... 26.2M 100K .......... .......... ..... 39.1M=0.005s 2020-11-28 14:29:33 (26.3 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-379.html’ saved [128372] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-379.html... 15-76 Converted links in 1 files in 0.001 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/www-laurenceanthony-net-8779.html --2020-11-28 14:29:32-- https://www.laurenceanthony.net/software/antconc/ Resolving www.laurenceanthony.net... 173.254.104.55 Connecting to www.laurenceanthony.net|173.254.104.55|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/www-laurenceanthony-net-8779.html’ 0K .......... .......... 
304K=0.07s 2020-11-28 14:29:33 (304 KB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/www-laurenceanthony-net-8779.html’ saved [20673] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/www-laurenceanthony-net-8779.html... 2-94 Converted links in 1 files in 0.001 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/youtu-be-1944.html --2020-11-28 14:29:32-- https://youtu.be/eeoBbSN9Esg Resolving youtu.be... 172.217.164.142, 2607:f8b0:4004:802::200e Connecting to youtu.be|172.217.164.142|:443... connected. HTTP request sent, awaiting response... 302 Found Location: https://www.youtube.com/watch?v=eeoBbSN9Esg&feature=youtu.be [following] --2020-11-28 14:29:32-- https://www.youtube.com/watch?v=eeoBbSN9Esg&feature=youtu.be Resolving www.youtube.com... 172.217.15.110, 172.217.2.110, 142.250.73.238, ... Connecting to www.youtube.com|172.217.15.110|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/youtu-be-1944.html’ 0K .......... .......... .......... .......... .......... 476K 50K .......... .......... .......... .......... .......... 85.0M 100K .......... .......... .......... .......... .......... 163K 150K .......... .......... .......... .......... .......... 71.6M 200K .......... .......... .......... .......... .......... 151M 250K .......... .......... .......... .......... .......... 125M 300K .......... .......... .......... .......... .......... 145M 350K .......... .......... .......... .......... 137M=0.4s 2020-11-28 14:29:33 (941 KB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/youtu-be-1944.html’ saved [399713] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/youtu-be-1944.html... 1-11 Converted links in 1 files in 0.001 seconds. 
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-9780.html --2020-11-28 14:29:32-- https://github.com/ericleasemorgan/reader Resolving github.com... 140.82.113.4 Connecting to github.com|140.82.113.4|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-9780.html’ 0K .......... .......... .......... .......... .......... 12.2M 50K .......... .......... .......... .......... .......... 31.5M 100K .......... .......... .......... .......... ........ 13.0M=0.009s 2020-11-28 14:29:33 (15.8 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-9780.html’ saved [152483] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/github-com-9780.html... 16-96 Converted links in 1 files in 0.001 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/dh-crc-nd-edu-9558.html --2020-11-28 14:29:32-- http://dh.crc.nd.edu/blog Resolving dh.crc.nd.edu... 129.74.246.86 Connecting to dh.crc.nd.edu|129.74.246.86|:80... connected. HTTP request sent, awaiting response... 301 Moved Permanently Location: http://dh.crc.nd.edu/blog/ [following] --2020-11-28 14:29:32-- http://dh.crc.nd.edu/blog/ Reusing existing connection to dh.crc.nd.edu:80. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/dh-crc-nd-edu-9558.html’ 0K .......... .......... .......... .......... .......... 500K 50K .......... .. 212K=0.2s 2020-11-28 14:29:33 (394 KB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/dh-crc-nd-edu-9558.html’ saved [63776] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/dh-crc-nd-edu-9558.html... 
6-4 Converted links in 1 files in 0.001 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/curl-haxx-se-8721.html --2020-11-28 14:29:32-- https://curl.haxx.se/ Resolving curl.haxx.se... 151.101.210.49, 2a04:4e42:3b::561 Connecting to curl.haxx.se|151.101.210.49|:443... connected. HTTP request sent, awaiting response... 301 Moved Permanently Location: https://curl.se/ [following] --2020-11-28 14:29:33-- https://curl.se/ Resolving curl.se... 151.101.66.49, 151.101.2.49, 151.101.130.49, ... Connecting to curl.se|151.101.66.49|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 8646 (8.4K) [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/curl-haxx-se-8721.html’ 0K ........ 100% 109M=0s 2020-11-28 14:29:33 (109 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/curl-haxx-se-8721.html’ saved [8646/8646] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/curl-haxx-se-8721.html... 0-7 Converted links in 1 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-9966.html --2020-11-28 14:29:32-- http://infomotions.com/blog/2010/03/michael-hart-in-roanoke-indiana/ Resolving infomotions.com... 162.243.47.94 Connecting to infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-9966.html’ 0K .......... .......... .......... .......... .......... 134K 50K . 2926G=0.4s 2020-11-28 14:29:33 (138 KB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-9966.html’ saved [52771] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-9966.html... 
8-2 Converted links in 1 files in 0.001 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-3769.html --2020-11-28 14:29:32-- http://infomotions.com/blog/2011/05/fun-with-rss-and-the-rss-aggregator-called-planet/ Resolving infomotions.com... 162.243.47.94 Connecting to infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-3769.html’ 0K .......... .......... .......... .......... 149K=0.3s 2020-11-28 14:29:33 (149 KB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-3769.html’ saved [41331] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-3769.html... 3-2 Converted links in 1 files in 0.001 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-2987.html --2020-11-28 14:29:32-- http://infomotions.com/blog/ Resolving infomotions.com... 162.243.47.94 Connecting to infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-2987.html’ 0K .......... .......... .......... .......... .......... 490K 50K .......... .......... .......... .......... .......... 599K 100K .......... .......... .......... .......... .......... 492K 150K .......... .......... .......... .......... .......... 1.47M 200K .......... .......... .......... .......... .......... 332K 250K .......... .......... .......... .......... .......... 497K 300K .......... .......... .......... .......... .......... 293K 350K .......... .......... .......... .......... .......... 7.33M 400K .......... .......... .......... .......... .......... 
772K 450K .......... .......... .......... .......... .......... 496K 500K .......... .......... ......... 500K=1.0s 2020-11-28 14:29:34 (544 KB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-2987.html’ saved [542214] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-2987.html... 40-2 Converted links in 1 files in 0.005 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/pkp-sfu-ca-4628.html --2020-11-28 14:29:32-- https://pkp.sfu.ca/ojs/ Resolving pkp.sfu.ca... 204.187.13.80 Connecting to pkp.sfu.ca|204.187.13.80|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/pkp-sfu-ca-4628.html’ 0K .......... .......... .......... .......... .......... 49.0K 50K ..... 10712G=1.0s 2020-11-28 14:29:34 (54.6 KB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/pkp-sfu-ca-4628.html’ saved [56951] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/pkp-sfu-ca-4628.html... 3-16 Converted links in 1 files in 0.001 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-9504.html --2020-11-28 14:29:33-- http://infomotions.com/blog Resolving infomotions.com... 162.243.47.94 Connecting to infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 301 Moved Permanently Location: http://infomotions.com/blog/ [following] --2020-11-28 14:29:33-- http://infomotions.com/blog/ Reusing existing connection to infomotions.com:80. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-9504.html’ 0K .......... .......... .......... .......... 
.......... 326K 50K .......... .......... .......... .......... .......... 735K 100K .......... .......... .......... .......... .......... 980K 150K .......... .......... .......... .......... .......... 2.95M 200K .......... .......... .......... .......... .......... 697K 250K .......... .......... .......... .......... .......... 921K 300K .......... .......... .......... .......... .......... 569K 350K .......... .......... .......... .......... .......... 352M 400K .......... .......... .......... .......... .......... 1.34M 450K .......... .......... .......... .......... .......... 891K 500K .......... .......... ......... 600K=0.6s 2020-11-28 14:29:34 (821 KB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-9504.html’ saved [542214] Converting links in /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-9504.html... 40-2 Converted links in 1 files in 0.005 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-6757.xml --2020-11-28 14:29:32-- http://infomotions.com/blog/feed/ Resolving infomotions.com... 162.243.47.94 Connecting to infomotions.com|162.243.47.94|:80... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [application/rss+xml] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-6757.xml’ 0K .......... .......... .......... .......... .......... 637K 50K .......... .......... .......... .......... .......... 662K 100K .......... .......... .......... .......... .......... 1.10M 150K .......... .......... .......... .......... .......... 723K 200K .......... .......... .......... .......... .......... 593K 250K .......... .......... .......... .......... .......... 819K 300K .......... .......... .......... .......... .......... 331K 350K .......... .......... .......... .......... .......... 7.97M 400K .......... .......... 
.......... .......... .......... 748K 450K .......... .......... .......... .......... .......... 516K 500K .......... .......... .......... .......... .......... 953K 550K .......... .......... .......... .......... .......... 552K 600K .......... .......... .......... .......... .......... 617K 650K .......... .......... .......... .......... .......... 753K 700K .......... .......... .......... .......... .......... 558K 750K .......... .......... .......... .......... .......... 502K 800K .......... .......... .......... .......... .......... 659K 850K .......... .......... .......... .......... .......... 660K 900K .......... .......... .......... .......... .......... 562K 950K .......... .......... .......... .......... .......... 667K 1000K .......... .......... .......... .......... .......... 703K 1050K .......... .......... .......... .......... .......... 376K 1100K .......... .......... .......... .......... .......... 776K 1150K .......... .......... .......... .......... .......... 994K 1200K .......... .......... .......... .......... .......... 848K 1250K .......... .......... .......... .......... .......... 674K 1300K .......... .......... .......... .......... .......... 549K 1350K .......... .......... .......... .......... .......... 697K 1400K .......... .......... .......... .......... .......... 673K 1450K .......... .......... .......... 434K=2.3s 2020-11-28 14:29:36 (647 KB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/infomotions-com-6757.xml’ saved [1515806] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-7928.jpg --2020-11-28 14:29:31-- http://sites.nd.edu/emorgan/files/2019/12/reader.jpg Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 
200 OK Length: 107739 (105K) [image/jpeg] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-7928.jpg’ 0K .......... .......... .......... .......... .......... 47% 13.9M 0s 50K .......... .......... .......... .......... .......... 95% 50.6M 0s 100K ..... 100% 5.73M=0.005s 2020-11-28 14:29:36 (19.2 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-7928.jpg’ saved [107739/107739] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2497.png --2020-11-28 14:29:31-- http://sites.nd.edu/emorgan/files/2019/11/search.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 1051334 (1.0M) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2497.png’ 0K .......... .......... .......... .......... .......... 4% 3.78M 0s 50K .......... .......... .......... .......... .......... 9% 52.5M 0s 100K .......... .......... .......... .......... .......... 14% 24.9M 0s 150K .......... .......... .......... .......... .......... 19% 49.2M 0s 200K .......... .......... .......... .......... .......... 24% 24.9M 0s 250K .......... .......... .......... .......... .......... 29% 47.3M 0s 300K .......... .......... .......... .......... .......... 34% 46.9M 0s 350K .......... .......... .......... .......... .......... 38% 47.9M 0s 400K .......... .......... .......... .......... .......... 43% 48.5M 0s 450K .......... .......... .......... .......... .......... 48% 48.1M 0s 500K .......... .......... .......... .......... .......... 53% 47.4M 0s 550K .......... .......... .......... .......... .......... 58% 49.9M 0s 600K .......... .......... .......... .......... .......... 63% 47.6M 0s 650K .......... .......... .......... .......... .......... 68% 49.7M 0s 700K .......... .......... 
.......... .......... .......... 73% 48.8M 0s 750K .......... .......... .......... .......... .......... 77% 48.0M 0s 800K .......... .......... .......... .......... .......... 82% 539M 0s 850K .......... .......... .......... .......... .......... 87% 49.2M 0s 900K .......... .......... .......... .......... .......... 92% 49.9M 0s 950K .......... .......... .......... .......... .......... 97% 56.5M 0s 1000K .......... .......... ...... 100% 364M=0.03s 2020-11-28 14:29:36 (30.5 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2497.png’ saved [1051334/1051334] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6245.jpg --2020-11-28 14:29:31-- http://sites.nd.edu/emorgan/files/2019/10/wall-paper-01-300x225.jpg Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 23842 (23K) [image/jpeg] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6245.jpg’ 0K .......... .......... ... 100% 31.3M=0.001s 2020-11-28 14:29:36 (31.3 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6245.jpg’ saved [23842/23842] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6582.png --2020-11-28 14:29:34-- http://sites.nd.edu/emorgan/files/2020/01/antconc-dispersion-300x208.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 32914 (32K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6582.png’ 0K .......... .......... .......... .. 
100% 45.2M=0.001s 2020-11-28 14:29:36 (45.2 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6582.png’ saved [32914/32914] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2178.jpg --2020-11-28 14:29:32-- http://sites.nd.edu/emorgan/files/2019/12/mobile.jpg Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 159142 (155K) [image/jpeg] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2178.jpg’ 0K .......... .......... .......... .......... .......... 32% 4.94M 0s 50K .......... .......... .......... .......... .......... 64% 64.4M 0s 100K .......... .......... .......... .......... .......... 96% 380M 0s 150K ..... 100% 302M=0.01s 2020-11-28 14:29:36 (14.1 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2178.jpg’ saved [159142/159142] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6875.jpg --2020-11-28 14:29:34-- http://sites.nd.edu/emorgan/files/2019/12/wall-paper.jpeg Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 47082 (46K) [image/jpeg] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6875.jpg’ 0K .......... .......... .......... .......... ..... 100% 4.40M=0.01s 2020-11-28 14:29:36 (4.40 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6875.jpg’ saved [47082/47082] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3940.png --2020-11-28 14:29:32-- http://sites.nd.edu/emorgan/files/2020/01/antconc-occurences.png Resolving sites.nd.edu... 
54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 245164 (239K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3940.png’ 0K .......... .......... .......... .......... .......... 20% 2.69M 0s 50K .......... .......... .......... .......... .......... 41% 52.5M 0s 100K .......... .......... .......... .......... .......... 62% 31.0M 0s 150K .......... .......... .......... .......... .......... 83% 50.3M 0s 200K .......... .......... .......... ......... 100% 28.2M=0.02s 2020-11-28 14:29:36 (10.2 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3940.png’ saved [245164/245164] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-393.png --2020-11-28 14:29:31-- http://sites.nd.edu/emorgan/files/2019/11/search-300x249.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 45712 (45K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-393.png’ 0K .......... .......... .......... .......... .... 100% 4.18M=0.01s 2020-11-28 14:29:36 (4.18 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-393.png’ saved [45712/45712] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3471.jpg --2020-11-28 14:29:33-- http://sites.nd.edu/emorgan/files/2019/10/IMG_0902-300x225.jpeg Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 26478 (26K) [image/jpeg] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3471.jpg’ 0K .......... .......... ..... 
100% 32.5M=0.001s 2020-11-28 14:29:37 (32.5 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3471.jpg’ saved [26478/26478] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3574.png --2020-11-28 14:29:32-- http://sites.nd.edu/emorgan/files/2019/11/urls.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 1189621 (1.1M) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3574.png’ 0K .......... .......... .......... .......... .......... 4% 1.85M 1s 50K .......... .......... .......... .......... .......... 8% 54.7M 0s 100K .......... .......... .......... .......... .......... 12% 31.3M 0s 150K .......... .......... .......... .......... .......... 17% 16.5M 0s 200K .......... .......... .......... .......... .......... 21% 62.0M 0s 250K .......... .......... .......... .......... .......... 25% 30.2M 0s 300K .......... .......... .......... .......... .......... 30% 33.4M 0s 350K .......... .......... .......... .......... .......... 34% 57.3M 0s 400K .......... .......... .......... .......... .......... 38% 56.7M 0s 450K .......... .......... .......... .......... .......... 43% 54.7M 0s 500K .......... .......... .......... .......... .......... 47% 63.0M 0s 550K .......... .......... .......... .......... .......... 51% 53.4M 0s 600K .......... .......... .......... .......... .......... 55% 61.9M 0s 650K .......... .......... .......... .......... .......... 60% 54.2M 0s 700K .......... .......... .......... .......... .......... 64% 247M 0s 750K .......... .......... .......... .......... .......... 68% 68.1M 0s 800K .......... .......... .......... .......... .......... 73% 42.0M 0s 850K .......... .......... .......... .......... .......... 77% 70.3M 0s 900K .......... .......... .......... .......... 
.......... 81% 270M 0s 950K .......... .......... .......... .......... .......... 86% 64.8M 0s 1000K .......... .......... .......... .......... .......... 90% 55.8M 0s 1050K .......... .......... .......... .......... .......... 94% 64.3M 0s 1100K .......... .......... .......... .......... .......... 98% 254M 0s 1150K .......... . 100% 666M=0.05s 2020-11-28 14:29:37 (24.1 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3574.png’ saved [1189621/1189621] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8707.png --2020-11-28 14:29:34-- http://sites.nd.edu/emorgan/files/2020/01/wordle-interesting-300x259.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 62553 (61K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8707.png’ 0K .......... .......... .......... .......... .......... 81% 18.4M 0s 50K .......... . 100% 338M=0.003s 2020-11-28 14:29:37 (22.2 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8707.png’ saved [62553/62553] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2246.png --2020-11-28 14:29:34-- http://sites.nd.edu/emorgan/files/2020/01/wordle-words-300x259.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 71918 (70K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2246.png’ 0K .......... .......... .......... .......... .......... 71% 2.70M 0s 50K .......... .......... 
100% 306M=0.02s 2020-11-28 14:29:37 (3.78 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2246.png’ saved [71918/71918] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6066.jpg --2020-11-28 14:29:34-- http://sites.nd.edu/emorgan/files/2019/12/eric-and-marlon.jpg Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 99839 (97K) [image/jpeg] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6066.jpg’ 0K .......... .......... .......... .......... .......... 51% 20.7M 0s 50K .......... .......... .......... .......... ....... 100% 81.2M=0.003s 2020-11-28 14:29:37 (32.6 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6066.jpg’ saved [99839/99839] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1179.jpg --2020-11-28 14:29:34-- http://sites.nd.edu/emorgan/files/2019/11/wall-paper-1-300x225.jpg Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 23377 (23K) [image/jpeg] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1179.jpg’ 0K .......... .......... .. 100% 31.6M=0.001s 2020-11-28 14:29:37 (31.6 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1179.jpg’ saved [23377/23377] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1886.png --2020-11-28 14:29:34-- http://sites.nd.edu/emorgan/files/2020/01/antconc-significant.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 
200 OK Length: 203032 (198K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1886.png’ 0K .......... .......... .......... .......... .......... 25% 32.6M 0s 50K .......... .......... .......... .......... .......... 50% 42.5M 0s 100K .......... .......... .......... .......... .......... 75% 40.7M 0s 150K .......... .......... .......... .......... ........ 100% 34.3M=0.005s 2020-11-28 14:29:37 (37.1 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1886.png’ saved [203032/203032] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3721.png --2020-11-28 14:29:34-- http://sites.nd.edu/emorgan/files/2019/11/urls-300x249.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 54788 (54K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3721.png’ 0K .......... .......... .......... .......... .......... 93% 13.5M 0s 50K ... 100% 6683G=0.004s 2020-11-28 14:29:37 (14.5 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3721.png’ saved [54788/54788] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2910.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/02/model-chapters-300x204.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 33299 (33K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2910.png’ 0K .......... .......... .......... .. 
100% 42.3M=0.001s 2020-11-28 14:29:37 (42.3 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2910.png’ saved [33299/33299] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3585.jpg --2020-11-28 14:29:34-- http://sites.nd.edu/emorgan/files/2019/11/wall-paper.jpg Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 128220 (125K) [image/jpeg] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3585.jpg’ 0K .......... .......... .......... .......... .......... 39% 7.82M 0s 50K .......... .......... .......... .......... .......... 79% 64.1M 0s 100K .......... .......... ..... 100% 409M=0.007s 2020-11-28 14:29:37 (17.3 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3585.jpg’ saved [128220/128220] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6302.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/01/antconc-occurences-300x204.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 57705 (56K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6302.png’ 0K .......... .......... .......... .......... .......... 88% 14.4M 0s 50K ...... 100% 12117G=0.003s 2020-11-28 14:29:37 (16.2 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6302.png’ saved [57705/57705] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-5464.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/01/worlde-nouns-300x259.png Resolving sites.nd.edu... 
54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 65653 (64K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-5464.png’ 0K .......... .......... .......... .......... .......... 77% 8.81M 0s 50K .......... .... 100% 291M=0.006s 2020-11-28 14:29:37 (11.2 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-5464.png’ saved [65653/65653] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8419.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/01/antconc-dispersion.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 221048 (216K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8419.png’ 0K .......... .......... .......... .......... .......... 23% 13.0M 0s 50K .......... .......... .......... .......... .......... 46% 50.5M 0s 100K .......... .......... .......... .......... .......... 69% 27.9M 0s 150K .......... .......... .......... .......... .......... 92% 10.3M 0s 200K .......... ..... 100% 274M=0.01s 2020-11-28 14:29:38 (18.7 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8419.png’ saved [221048/221048] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8489.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/02/model-chapters.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 385002 (376K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8489.png’ 0K .......... .......... 
.......... .......... .......... 13% 3.06M 0s 50K .......... .......... .......... .......... .......... 26% 54.2M 0s 100K .......... .......... .......... .......... .......... 39% 28.3M 0s 150K .......... .......... .......... .......... .......... 53% 49.2M 0s 200K .......... .......... .......... .......... .......... 66% 27.9M 0s 250K .......... .......... .......... .......... .......... 79% 53.8M 0s 300K .......... .......... .......... .......... .......... 93% 50.6M 0s 350K .......... .......... ..... 100% 29.3M=0.02s 2020-11-28 14:29:38 (15.3 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8489.png’ saved [385002/385002] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1918.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/01/worlde-nouns.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 287617 (281K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1918.png’ 0K .......... .......... .......... .......... .......... 17% 2.23M 0s 50K .......... .......... .......... .......... .......... 35% 46.2M 0s 100K .......... .......... .......... .......... .......... 53% 28.9M 0s 150K .......... .......... .......... .......... .......... 71% 27.1M 0s 200K .......... .......... .......... .......... .......... 89% 47.6M 0s 250K .......... .......... .......... 100% 32.0M=0.03s 2020-11-28 14:29:38 (9.67 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1918.png’ saved [287617/287617] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3187.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/02/model-books-300x204.png Resolving sites.nd.edu... 
54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 7680 (7.5K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3187.png’ 0K ....... 100% 951M=0s 2020-11-28 14:29:38 (951 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3187.png’ saved [7680/7680] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2573.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/02/model-books.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 146687 (143K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2573.png’ 0K .......... .......... .......... .......... .......... 34% 1.73M 0s 50K .......... .......... .......... .......... .......... 69% 66.2M 0s 100K .......... .......... .......... .......... ... 100% 31.8M=0.03s 2020-11-28 14:29:38 (4.63 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2573.png’ saved [146687/146687] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8762.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/02/model-topics-01-284x300.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 41318 (40K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8762.png’ 0K .......... .......... .......... .......... 
100% 35.2M=0.001s 2020-11-28 14:29:38 (35.2 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8762.png’ saved [41318/41318] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-7840.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/01/wordle-words.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 271112 (265K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-7840.png’ 0K .......... .......... .......... .......... .......... 18% 4.82M 0s 50K .......... .......... .......... .......... .......... 37% 58.2M 0s 100K .......... .......... .......... .......... .......... 56% 25.9M 0s 150K .......... .......... .......... .......... .......... 75% 48.8M 0s 200K .......... .......... .......... .......... .......... 94% 25.3M 0s 250K .......... .... 100% 406M=0.02s 2020-11-28 14:29:38 (16.3 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-7840.png’ saved [271112/271112] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8691.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/02/model-topics-02-250x300.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 20482 (20K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8691.png’ 0K .......... .......... 
100% 27.6M=0.001s 2020-11-28 14:29:38 (27.6 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8691.png’ saved [20482/20482] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8089.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/02/openrefine-entities.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 375122 (366K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8089.png’ 0K .......... .......... .......... .......... .......... 13% 7.85M 0s 50K .......... .......... .......... .......... .......... 27% 67.0M 0s 100K .......... .......... .......... .......... .......... 40% 252M 0s 150K .......... .......... .......... .......... .......... 54% 568M 0s 200K .......... .......... .......... .......... .......... 68% 5.82M 0s 250K .......... .......... .......... .......... .......... 81% 387M 0s 300K .......... .......... .......... .......... .......... 95% 74.1M 0s 350K .......... ...... 100% 212M=0.02s 2020-11-28 14:29:38 (21.7 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8089.png’ saved [375122/375122] 503 Service Temporarily Unavailable at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-9146.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/01/antconc-significant-300x204.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 37744 (37K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-9146.png’ 0K .......... .......... .......... ...... 
100% 50.5M=0.001s 2020-11-28 14:29:38 (50.5 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-9146.png’ saved [37744/37744] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-7631.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/02/openrefine-entities-300x227.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 44133 (43K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-7631.png’ 0K .......... .......... .......... .......... ... 100% 8.13M=0.005s 2020-11-28 14:29:38 (8.13 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-7631.png’ saved [44133/44133] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1522.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/02/model-topics-01.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 159817 (156K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1522.png’ 0K .......... .......... .......... .......... .......... 32% 4.72M 0s 50K .......... .......... .......... .......... .......... 64% 47.5M 0s 100K .......... .......... .......... .......... .......... 96% 25.9M 0s 150K ...... 100% 11580G=0.01s 2020-11-28 14:29:39 (11.5 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1522.png’ saved [159817/159817] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8448.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/01/wordle-interesting.png Resolving sites.nd.edu... 
54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 293737 (287K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8448.png’ 0K .......... .......... .......... .......... .......... 17% 4.04M 0s 50K .......... .......... .......... .......... .......... 34% 66.5M 0s 100K .......... .......... .......... .......... .......... 52% 34.0M 0s 150K .......... .......... .......... .......... .......... 69% 67.4M 0s 200K .......... .......... .......... .......... .......... 87% 34.4M 0s 250K .......... .......... .......... ...... 100% 52.8M=0.02s 2020-11-28 14:29:39 (16.4 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-8448.png’ saved [293737/293737] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-9996.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/02/openrefine-faceted-persons-300x227.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 46950 (46K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-9996.png’ 0K .......... .......... .......... .......... ..... 100% 13.4M=0.003s 2020-11-28 14:29:39 (13.4 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-9996.png’ saved [46950/46950] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-460.png --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/files/2020/02/openrefine-lemmatized-nouns-300x227.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 
200 OK Length: 45643 (45K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-460.png’ 0K .......... .......... .......... .......... .... 100% 2.80M=0.02s 2020-11-28 14:29:39 (2.80 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-460.png’ saved [45643/45643] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2908.png --2020-11-28 14:29:36-- http://sites.nd.edu/emorgan/files/2020/02/openrefine-faceted-persons.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 399913 (391K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2908.png’ 0K .......... .......... .......... .......... .......... 12% 1.79M 0s 50K .......... .......... .......... .......... .......... 25% 53.9M 0s 100K .......... .......... .......... .......... .......... 38% 29.2M 0s 150K .......... .......... .......... .......... .......... 51% 54.9M 0s 200K .......... .......... .......... .......... .......... 64% 29.4M 0s 250K .......... .......... .......... .......... .......... 76% 58.7M 0s 300K .......... .......... .......... .......... .......... 89% 54.8M 0s 350K .......... .......... .......... .......... 100% 41.6M=0.04s 2020-11-28 14:29:39 (10.9 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2908.png’ saved [399913/399913] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-755.png --2020-11-28 14:29:36-- http://sites.nd.edu/emorgan/files/2020/02/openrefine-pos-300x227.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 
200 OK Length: 43437 (42K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-755.png’ 0K .......... .......... .......... .......... .. 100% 11.9M=0.003s 2020-11-28 14:29:39 (11.9 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-755.png’ saved [43437/43437] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6181.png --2020-11-28 14:29:36-- http://sites.nd.edu/emorgan/files/2020/02/model-topics-02.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 112212 (110K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6181.png’ 0K .......... .......... .......... .......... .......... 45% 4.21M 0s 50K .......... .......... .......... .......... .......... 91% 59.5M 0s 100K ......... 100% 12.5M=0.01s 2020-11-28 14:29:39 (8.13 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6181.png’ saved [112212/112212] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6432.png --2020-11-28 14:29:36-- http://sites.nd.edu/emorgan/files/2020/02/openrefine-lemmatized-nouns.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 369678 (361K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6432.png’ 0K .......... .......... .......... .......... .......... 13% 4.48M 0s 50K .......... .......... .......... .......... .......... 27% 60.7M 0s 100K .......... .......... .......... .......... .......... 41% 26.5M 0s 150K .......... .......... .......... .......... .......... 55% 21.4M 0s 200K .......... .......... .......... 
.......... .......... 69% 60.4M 0s 250K .......... .......... .......... .......... .......... 83% 34.5M 0s 300K .......... .......... .......... .......... .......... 96% 54.7M 0s 350K .......... . 100% 404M=0.02s 2020-11-28 14:29:39 (18.6 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6432.png’ saved [369678/369678] filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1664.png --2020-11-28 14:29:36-- http://sites.nd.edu/emorgan/files/2020/02/openrefine-pos.png Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 360547 (352K) [image/png] Saving to: ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1664.png’ 0K .......... .......... .......... .......... .......... 14% 6.69M 0s 50K .......... .......... .......... .......... .......... 28% 45.4M 0s 100K .......... .......... .......... .......... .......... 42% 27.5M 0s 150K .......... .......... .......... .......... .......... 56% 51.1M 0s 200K .......... .......... .......... .......... .......... 71% 24.4M 0s 250K .......... .......... .......... .......... .......... 85% 56.0M 0s 300K .......... .......... .......... .......... .......... 99% 45.8M 0s 350K .. 100% 3999G=0.02s 2020-11-28 14:29:39 (22.9 MB/s) - ‘/data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1664.png’ saved [360547/360547] 503 Service Unavailable at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 read timeout at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 500 read timeout at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2650.html --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/2019/10/ojs-toolbox/ Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 503 Service Temporarily Unavailable 2020-11-28 14:29:46 ERROR 503: Service Temporarily Unavailable. Converted links in 0 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-6366.html --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/2019/10/dr-inputs/ Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 503 Service Temporarily Unavailable 2020-11-28 14:29:46 ERROR 503: Service Temporarily Unavailable. Converted links in 0 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3469.html --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/2020/02/topic-modeling/ Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 503 Service Temporarily Unavailable 2020-11-28 14:29:46 ERROR 503: Service Temporarily Unavailable. Converted links in 0 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-5154.html --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/2019/11/pg-dr/ Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 503 Service Temporarily Unavailable 2020-11-28 14:29:46 ERROR 503: Service Temporarily Unavailable. Converted links in 0 files in 0 seconds. 
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-2818.html --2020-11-28 14:29:35-- http://sites.nd.edu/emorgan/2019/11/reader/ Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 503 Service Temporarily Unavailable 2020-11-28 14:29:46 ERROR 503: Service Temporarily Unavailable. Converted links in 0 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-1720.html --2020-11-28 14:29:38-- http://sites.nd.edu/emorgan/category/distant-reader/ Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 503 Service Unavailable 2020-11-28 14:29:46 ERROR 503: Service Unavailable. Converted links in 0 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-9191.html --2020-11-28 14:29:36-- http://sites.nd.edu/emorgan/2020/01/wordle/ Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 503 Service Temporarily Unavailable 2020-11-28 14:29:47 ERROR 503: Service Temporarily Unavailable. Converted links in 0 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3678.html --2020-11-28 14:29:36-- http://sites.nd.edu/emorgan/2019/12/reader-manifest/ Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 503 Service Temporarily Unavailable 2020-11-28 14:29:47 ERROR 503: Service Temporarily Unavailable. Converted links in 0 files in 0 seconds. 500 read timeout at /data-disk/reader-compute/reader-classic/bin/urls2cache.pl line 82. 
filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3073.html --2020-11-28 14:29:37-- http://sites.nd.edu/emorgan/2020/01/dr-ucla/ Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 503 Service Temporarily Unavailable 2020-11-28 14:29:47 ERROR 503: Service Temporarily Unavailable. Converted links in 0 files in 0 seconds. filename: /data-disk/reader-compute/reader-classic/carrels/planet-infomotions/cache/sites-nd-edu-3118.html --2020-11-28 14:29:37-- http://sites.nd.edu/emorgan/2019/12/bloomington/ Resolving sites.nd.edu... 54.173.74.173 Connecting to sites.nd.edu|54.173.74.173|:80... connected. HTTP request sent, awaiting response... 503 Service Temporarily Unavailable 2020-11-28 14:29:48 ERROR 503: Service Temporarily Unavailable. Converted links in 0 files in 0 seconds. Building study carrel named planet-infomotions FILE: cache/infomotions-com-7836.gif OUTPUT: txt/infomotions-com-7836.txt FILE: cache/planet-infomotions-com-8963.png OUTPUT: txt/planet-infomotions-com-8963.txt FILE: cache/sites-nd-edu-9996.png OUTPUT: txt/sites-nd-edu-9996.txt FILE: cache/sites-nd-edu-460.png OUTPUT: txt/sites-nd-edu-460.txt FILE: cache/planet-infomotions-com-7919.xml OUTPUT: txt/planet-infomotions-com-7919.txt FILE: cache/sites-nd-edu-1522.png OUTPUT: txt/sites-nd-edu-1522.txt FILE: cache/serials-infomotions-com-5908.html OUTPUT: txt/serials-infomotions-com-5908.txt FILE: cache/infomotions-com-3852.html OUTPUT: txt/infomotions-com-3852.txt FILE: cache/infomotions-com-3637.html OUTPUT: txt/infomotions-com-3637.txt FILE: cache/sites-nd-edu-8448.png OUTPUT: txt/sites-nd-edu-8448.txt FILE: cache/planet-infomotions-com-4104.html OUTPUT: txt/planet-infomotions-com-4104.txt FILE: cache/infomotions-com-172.html OUTPUT: txt/infomotions-com-172.txt FILE: cache/bit-ly-5230.html OUTPUT: txt/bit-ly-5230.txt FILE: cache/infomotions-com-9318.xml OUTPUT: 
txt/infomotions-com-9318.txt FILE: cache/sites-nd-edu-2650.html OUTPUT: txt/sites-nd-edu-2650.txt FILE: cache/infomotions-com-9504.html OUTPUT: txt/infomotions-com-9504.txt FILE: cache/infomotions-com-555.html OUTPUT: txt/infomotions-com-555.txt FILE: cache/bit-ly-8913.html OUTPUT: txt/bit-ly-8913.txt FILE: cache/sites-nd-edu-6582.png OUTPUT: txt/sites-nd-edu-6582.txt FILE: cache/sites-nd-edu-2908.png OUTPUT: txt/sites-nd-edu-2908.txt FILE: cache/mallet-cs-umass-edu-3654.html OUTPUT: txt/mallet-cs-umass-edu-3654.txt FILE: cache/sites-nd-edu-2497.png OUTPUT: txt/sites-nd-edu-2497.txt FILE: cache/twitter-com-9838.html OUTPUT: txt/twitter-com-9838.txt FILE: cache/sites-nd-edu-9191.html OUTPUT: txt/sites-nd-edu-9191.txt FILE: cache/2020-code4lib-org-5785.html OUTPUT: txt/2020-code4lib-org-5785.txt FILE: cache/sites-nd-edu-755.png OUTPUT: txt/sites-nd-edu-755.txt FILE: cache/sites-nd-edu-3940.png OUTPUT: txt/sites-nd-edu-3940.txt FILE: cache/planet-infomotions-com-9545.xml OUTPUT: txt/planet-infomotions-com-9545.txt FILE: cache/sites-nd-edu-393.png OUTPUT: txt/sites-nd-edu-393.txt FILE: cache/dh-crc-nd-edu-7757.html OUTPUT: txt/dh-crc-nd-edu-7757.txt FILE: cache/www-gutenberg-org-941.html OUTPUT: txt/www-gutenberg-org-941.txt FILE: cache/sites-tufts-edu-6731.xml OUTPUT: txt/sites-tufts-edu-6731.txt FILE: cache/tika-apache-org-2948.html OUTPUT: txt/tika-apache-org-2948.txt FILE: cache/planet-infomotions-com-3359.xml OUTPUT: txt/planet-infomotions-com-3359.txt FILE: cache/stedolan-github-io-4569.html OUTPUT: txt/stedolan-github-io-4569.txt FILE: cache/www-gutenberg-org-6207.html OUTPUT: txt/www-gutenberg-org-6207.txt FILE: cache/dh-crc-nd-edu-1806.xml OUTPUT: txt/dh-crc-nd-edu-1806.txt FILE: cache/sites-nd-edu-6245.jpg OUTPUT: txt/sites-nd-edu-6245.txt FILE: cache/ucla-zoom-us-1408.html OUTPUT: txt/ucla-zoom-us-1408.txt FILE: cache/sites-nd-edu-3721.png OUTPUT: txt/sites-nd-edu-3721.txt FILE: cache/sites-nd-edu-7928.jpg OUTPUT: txt/sites-nd-edu-7928.txt FILE: 
cache/sites-nd-edu-1179.jpg OUTPUT: txt/sites-nd-edu-1179.txt FILE: cache/sites-nd-edu-6181.png OUTPUT: txt/sites-nd-edu-6181.txt FILE: cache/www-gnu-org-8892.html OUTPUT: txt/www-gnu-org-8892.txt FILE: cache/sites-nd-edu-3585.jpg OUTPUT: txt/sites-nd-edu-3585.txt FILE: cache/sites-nd-edu-2178.jpg OUTPUT: txt/sites-nd-edu-2178.txt FILE: cache/sites-nd-edu-6302.png OUTPUT: txt/sites-nd-edu-6302.txt FILE: cache/sites-nd-edu-6432.png OUTPUT: txt/sites-nd-edu-6432.txt FILE: cache/sites-nd-edu-3574.png OUTPUT: txt/sites-nd-edu-3574.txt FILE: cache/curl-haxx-se-8721.html OUTPUT: txt/curl-haxx-se-8721.txt FILE: cache/sites-nd-edu-1886.png OUTPUT: txt/sites-nd-edu-1886.txt FILE: cache/infomotions-com-953.html OUTPUT: txt/infomotions-com-953.txt FILE: cache/distantreader-org-7009.html OUTPUT: txt/distantreader-org-7009.txt FILE: cache/sites-nd-edu-6875.jpg OUTPUT: txt/sites-nd-edu-6875.txt FILE: cache/pkp-sfu-ca-4628.html OUTPUT: txt/pkp-sfu-ca-4628.txt FILE: cache/infomotions-com-9966.html OUTPUT: txt/infomotions-com-9966.txt FILE: cache/sites-nd-edu-2910.png OUTPUT: txt/sites-nd-edu-2910.txt FILE: cache/planet-infomotions-com-8900.xml OUTPUT: txt/planet-infomotions-com-8900.txt FILE: cache/www-xsede-org-5929.html OUTPUT: txt/www-xsede-org-5929.txt FILE: cache/github-com-2983.html OUTPUT: txt/github-com-2983.txt FILE: cache/sites-nd-edu-5464.png OUTPUT: txt/sites-nd-edu-5464.txt FILE: cache/sites-nd-edu-1664.png OUTPUT: txt/sites-nd-edu-1664.txt FILE: cache/distantreader-org-6471.html OUTPUT: txt/distantreader-org-6471.txt FILE: cache/github-com-8326.html OUTPUT: txt/github-com-8326.txt FILE: cache/sites-nd-edu-3678.html OUTPUT: txt/sites-nd-edu-3678.txt FILE: cache/sites-nd-edu-2246.png OUTPUT: txt/sites-nd-edu-2246.txt FILE: cache/infomotions-com-3769.html OUTPUT: txt/infomotions-com-3769.txt FILE: cache/github-com-7801.html OUTPUT: txt/github-com-7801.txt FILE: cache/sites-nd-edu-8419.png OUTPUT: txt/sites-nd-edu-8419.txt FILE: cache/youtu-be-1944.html OUTPUT: 
txt/youtu-be-1944.txt FILE: cache/sites-nd-edu-8707.png OUTPUT: txt/sites-nd-edu-8707.txt FILE: cache/github-com-379.html OUTPUT: txt/github-com-379.txt FILE: cache/docs-pkp-sfu-ca-7101.html OUTPUT: txt/docs-pkp-sfu-ca-7101.txt FILE: cache/infomotions-com-6757.xml OUTPUT: txt/infomotions-com-6757.txt FILE: cache/infomotions-com-2987.html OUTPUT: txt/infomotions-com-2987.txt FILE: cache/www-laurenceanthony-net-8779.html OUTPUT: txt/www-laurenceanthony-net-8779.txt FILE: cache/sites-nd-edu-1918.png OUTPUT: txt/sites-nd-edu-1918.txt FILE: cache/github-com-8202.html OUTPUT: txt/github-com-8202.txt FILE: cache/github-com-8025.html OUTPUT: txt/github-com-8025.txt FILE: cache/sites-nd-edu-7840.png OUTPUT: txt/sites-nd-edu-7840.txt FILE: cache/sites-nd-edu-3471.jpg OUTPUT: txt/sites-nd-edu-3471.txt FILE: cache/sites-nd-edu-8762.png OUTPUT: txt/sites-nd-edu-8762.txt FILE: cache/sites-nd-edu-3187.png OUTPUT: txt/sites-nd-edu-3187.txt FILE: cache/dh-crc-nd-edu-9558.html OUTPUT: txt/dh-crc-nd-edu-9558.txt FILE: cache/sites-nd-edu-3469.html OUTPUT: txt/sites-nd-edu-3469.txt FILE: cache/sites-nd-edu-2573.png OUTPUT: txt/sites-nd-edu-2573.txt FILE: cache/sites-nd-edu-8089.png OUTPUT: txt/sites-nd-edu-8089.txt FILE: cache/sites-nd-edu-7631.png OUTPUT: txt/sites-nd-edu-7631.txt FILE: cache/sites-nd-edu-3073.html OUTPUT: txt/sites-nd-edu-3073.txt FILE: cache/github-com-9780.html OUTPUT: txt/github-com-9780.txt FILE: cache/sites-nd-edu-8489.png OUTPUT: txt/sites-nd-edu-8489.txt FILE: cache/sites-nd-edu-6066.jpg OUTPUT: txt/sites-nd-edu-6066.txt FILE: cache/sites-nd-edu-5154.html OUTPUT: txt/sites-nd-edu-5154.txt FILE: cache/sites-nd-edu-8691.png OUTPUT: txt/sites-nd-edu-8691.txt FILE: cache/sites-nd-edu-2818.html OUTPUT: txt/sites-nd-edu-2818.txt FILE: cache/sites-nd-edu-6366.html OUTPUT: txt/sites-nd-edu-6366.txt FILE: cache/sites-nd-edu-1720.html OUTPUT: txt/sites-nd-edu-1720.txt FILE: cache/sites-nd-edu-9146.png OUTPUT: txt/sites-nd-edu-9146.txt FILE: cache/sites-nd-edu-3118.html 
OUTPUT: txt/sites-nd-edu-3118.txt === file2bib.sh === id: planet-infomotions-com-8963 author: title: planet-infomotions-com-8963 date: pages: extension: .png txt: ./txt/planet-infomotions-com-8963.txt cache: ./cache/planet-infomotions-com-8963.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=10, height=10, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=Software, value=Adobe ImageReady, encoding=ISO-8859-1, compression=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 gAMA 45000 height 10 resourceName b'planet-infomotions-com-8963.png' tEXt tEXtEntry keyword=Software, value=Adobe ImageReady tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 10 tiff:ImageWidth 10 width 10 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-3585 author: title: sites-nd-edu-3585 date: 2019-09-16 pages: extension: .jpg txt: ./txt/sites-nd-edu-3585.txt cache: ./cache/sites-nd-edu-3585.jpg Application Record Version 2 Caption Digest 37 26 105 78 35 39 7 109 73 151 16 252 134 26 150 242 Coded Character Set UTF-8 Component 1 Y component: 
Quantization table 0, Sampling factors 2 horiz/2 vert Component 2 Cb component: Quantization table 1, Sampling factors 1 horiz/1 vert Component 3 Cr component: Quantization table 1, Sampling factors 1 horiz/1 vert Compression Type Baseline Content-Type image/jpeg Creation-Date 2019-09-16T21:20:26 Data Precision 8 bits Date Created 2019:09:16 Digital Date Created 2019:09:16 Digital Time Created 21:20:26 Exif IFD0:Date/Time 2019:09:16 21:20:26 Exif IFD0:Make Apple Exif IFD0:Model iPhone SE Exif IFD0:Orientation Top, left side (Horizontal / normal) Exif IFD0:Resolution Unit Inch Exif IFD0:Software 12.4.1 Exif IFD0:X Resolution 72 dots per inch Exif IFD0:Y Resolution 72 dots per inch Exif SubIFD:Aperture Value f/2.2 Exif SubIFD:Brightness Value 2.387 Exif SubIFD:Color Space sRGB Exif SubIFD:Components Configuration YCbCr Exif SubIFD:Date/Time Digitized 2019:09:16 21:20:26 Exif SubIFD:Date/Time Original 2019:09:16 21:20:26 Exif SubIFD:Exif Image Height 480 pixels Exif SubIFD:Exif Image Width 640 pixels Exif SubIFD:Exif Version 2.21 Exif SubIFD:Exposure Bias Value 0 EV Exif SubIFD:Exposure Mode Auto exposure Exif SubIFD:Exposure Program Program normal Exif SubIFD:Exposure Time 1/30 sec Exif SubIFD:F-Number f/2.2 Exif SubIFD:Flash Flash did not fire Exif SubIFD:FlashPix Version 1.00 Exif SubIFD:Focal Length 4.2 mm Exif SubIFD:Focal Length 35 29 mm Exif SubIFD:ISO Speed Ratings 125 Exif SubIFD:Lens Make Apple Exif SubIFD:Lens Model iPhone SE back camera 4.15mm f/2.2 Exif SubIFD:Lens Specification 2175733/524273mm f/2.2 Exif SubIFD:Metering Mode Spot Exif SubIFD:Scene Capture Type Standard Exif SubIFD:Scene Type Directly photographed image Exif SubIFD:Sensing Method One-chip color area sensor Exif SubIFD:Shutter Speed Value 1/30 sec Exif SubIFD:Sub-Sec Time Digitized 448 Exif SubIFD:Sub-Sec Time Original 448 Exif SubIFD:Subject Location 2206 1515 753 756 Exif SubIFD:White Balance Mode Auto white balance File Modified Date Sat Nov 28 14:29:55 +00:00 2020 File Name 
apache-tika-6433934366015418722.tmp File Size 128220 bytes GPS:GPS Altitude 223.65 metres GPS:GPS Altitude Ref Sea level GPS:GPS Date Stamp 2019:09:17 GPS:GPS Dest Bearing 97.67 degrees GPS:GPS Dest Bearing Ref True direction GPS:GPS H Positioning Error 65 metres GPS:GPS Img Direction 97.67 degrees GPS:GPS Img Direction Ref True direction GPS:GPS Latitude 41° 41' 25.13" GPS:GPS Latitude Ref N GPS:GPS Longitude -86° 14' 56.98" GPS:GPS Longitude Ref W GPS:GPS Speed 0 km/h GPS:GPS Speed Ref km/h GPS:GPS Time-Stamp 01:20:20.000 UTC Image Height 480 pixels Image Width 640 pixels Last-Modified 2019-09-16T21:20:26 Last-Save-Date 2019-09-16T21:20:26 Number of Components 3 Number of Tables 4 Huffman tables Resolution Units none Run Time [104 values] Thumbnail Height Pixels 0 Thumbnail Width Pixels 0 Time Created 21:20:26 Unknown tag (0x0001) 10 Unknown tag (0x0002) [558 values] Unknown tag (0x0004) 1 Unknown tag (0x0005) 218 Unknown tag (0x0006) 212 Unknown tag (0x0007) 1 Unknown tag (0x0008) 1638/100253 -15329/15478 -10465/215882 Unknown tag (0x0009) 4371 Unknown tag (0x000e) 0 Unknown tag (0x0014) 4 Unknown tag (0x0017) 0 Unknown tag (0x0019) 0 Unknown tag (0x001f) 0 Unknown tag (0x0025) 0 Unknown tag (0x0026) 0 Unknown tag (0x0027) 0 X Resolution 72 dots X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.jpeg.JpegParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 7 XMP Value Count 4 Y Resolution 72 dots date 2019-09-16T21:20:26 dcterms:created 2019-09-16T21:20:26 dcterms:modified 2019-09-16T21:20:26 exif:DateTimeOriginal 2019-09-16T21:20:26 exif:ExposureTime 0.03333333333333333 exif:FNumber 2.2 exif:Flash false exif:FocalLength 4.15 exif:IsoSpeedRatings 125 geo:lat 41.690314 geo:long -86.249161 meta:creation-date 2019-09-16T21:20:26 meta:save-date 2019-09-16T21:20:26 modified 2019-09-16T21:20:26 resourceName b'sites-nd-edu-3585.jpg' tiff:BitsPerSample 8 tiff:ImageLength 480 tiff:ImageWidth 640 tiff:Make Apple tiff:Model iPhone SE 
tiff:Orientation 1 tiff:ResolutionUnit Inch tiff:Software 12.4.1 tiff:XResolution 72.0 tiff:YResolution 72.0 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-2908 author: title: sites-nd-edu-2908 date: pages: extension: .png txt: ./txt/sites-nd-edu-2908.txt cache: ./cache/sites-nd-edu-2908.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=1245, height=940, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 1245 Screenshot 940 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 940 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 1245 Screenshot 940 resourceName b'sites-nd-edu-2908.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 940 tiff:ImageWidth 1245 width 1245 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = 
textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: infomotions-com-7836 author: title: infomotions-com-7836 date: pages: extension: .gif txt: ./txt/infomotions-com-7836.txt cache: ./cache/infomotions-com-7836.gif Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName lzw Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/gif Data SampleFormat Index Dimension HorizontalPixelOffset 0 Dimension ImageOrientation Normal Dimension VerticalPixelOffset 0 GraphicControlExtension disposalMethod=none, userInputFlag=false, transparentColorFlag=true, delayTime=0, transparentColorIndex=0 ImageDescriptor imageLeftPosition=0, imageTopPosition=0, imageWidth=85, imageHeight=88, interlaceFlag=false Transparency TransparentIndex 0 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 height 88 resourceName b'infomotions-com-7836.gif' tiff:ImageLength 88 tiff:ImageWidth 85 width 85 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-8448 author: title: sites-nd-edu-8448 date: pages: extension: .png txt: ./txt/sites-nd-edu-8448.txt cache: ./cache/sites-nd-edu-8448.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression 
CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=935, height=807, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 935 Screenshot 807 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 807 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 935 Screenshot 807 resourceName b'sites-nd-edu-8448.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 807 tiff:ImageWidth 935 width 935 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-9996 author: title: sites-nd-edu-9996 date: pages: extension: .png txt: ./txt/sites-nd-edu-9996.txt cache: ./cache/sites-nd-edu-9996.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data 
SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=300, height=227, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 227 pHYs pixelsPerUnitXAxis=72, pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName b'sites-nd-edu-9996.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 227 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-460 author: title: sites-nd-edu-460 date: pages: extension: .png txt: ./txt/sites-nd-edu-460.txt cache: ./cache/sites-nd-edu-460.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=300, height=227, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled 
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 227 pHYs pixelsPerUnitXAxis=72, pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName b'sites-nd-edu-460.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 227 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-6245 author: title: sites-nd-edu-6245 date: 2019-09-25 pages: extension: .jpg txt: ./txt/sites-nd-edu-6245.txt cache: ./cache/sites-nd-edu-6245.jpg Application Record Version 2 Coded Character Set UTF-8 Component 1 Y component: Quantization table 0, Sampling factors 2 horiz/2 vert Component 2 Cb component: Quantization table 1, Sampling factors 1 horiz/1 vert Component 3 Cr component: Quantization table 1, Sampling factors 1 horiz/1 vert Compression Type Baseline Content-Type image/jpeg Creation-Date 2019-09-25T09:29:54 Data Precision 8 bits Date Created 2019:09:25 Digital Date Created 2019:09:25 Digital Time Created 09:29:54 Exif IFD0:Date/Time 2019:09:25 09:29:54 Exif IFD0:Make Apple Exif IFD0:Model iPad Pro Exif IFD0:Orientation Top, left side (Horizontal / normal) Exif IFD0:Resolution Unit Inch Exif IFD0:Software 12.4.1 Exif IFD0:X Resolution 72 dots per inch Exif IFD0:Y Resolution 72 dots per inch Exif SubIFD:Aperture Value f/2.4 Exif SubIFD:Brightness Value 1.542 Exif SubIFD:Color 
Space sRGB Exif SubIFD:Components Configuration YCbCr Exif SubIFD:Date/Time Digitized 2019:09:25 09:29:54 Exif SubIFD:Date/Time Original 2019:09:25 09:29:54 Exif SubIFD:Exif Image Height 480 pixels Exif SubIFD:Exif Image Width 640 pixels Exif SubIFD:Exif Version 2.21 Exif SubIFD:Exposure Bias Value 0 EV Exif SubIFD:Exposure Mode Auto exposure Exif SubIFD:Exposure Program Program normal Exif SubIFD:Exposure Time 1/24 sec Exif SubIFD:F-Number f/2.4 Exif SubIFD:Flash Flash did not fire Exif SubIFD:FlashPix Version 1.00 Exif SubIFD:Focal Length 3.3 mm Exif SubIFD:Focal Length 35 31 mm Exif SubIFD:ISO Speed Ratings 250 Exif SubIFD:Lens Make Apple Exif SubIFD:Lens Model iPad Pro back camera 3.3mm f/2.4 Exif SubIFD:Lens Specification 3.3mm f/2.4 Exif SubIFD:Metering Mode Spot Exif SubIFD:Scene Capture Type Standard Exif SubIFD:Scene Type Directly photographed image Exif SubIFD:Sensing Method One-chip color area sensor Exif SubIFD:Shutter Speed Value 1/24 sec Exif SubIFD:Sub-Sec Time Digitized 799 Exif SubIFD:Sub-Sec Time Original 799 Exif SubIFD:Subject Location 1626 1294 610 612 Exif SubIFD:White Balance Mode Auto white balance File Modified Date Sat Nov 28 14:29:55 +00:00 2020 File Name apache-tika-7339202503970258145.tmp File Size 23842 bytes Image Height 225 pixels Image Width 300 pixels Last-Modified 2019-09-25T09:29:54 Last-Save-Date 2019-09-25T09:29:54 Number of Components 3 Number of Tables 4 Huffman tables Resolution Units inch Run Time [104 values] Thumbnail Height Pixels 0 Thumbnail Width Pixels 0 Time Created 09:29:54 Unknown tag (0x0001) 10 Unknown tag (0x0002) [558 values] Unknown tag (0x0004) 1 Unknown tag (0x0005) 195 Unknown tag (0x0006) 200 Unknown tag (0x0007) 1 Unknown tag (0x0008) 4463/198304 -17117/17062 -8765/95787 Unknown tag (0x0009) 4371 Unknown tag (0x000e) 4 Unknown tag (0x0014) 4 Unknown tag (0x0017) 0 Unknown tag (0x0019) 0 Unknown tag (0x001f) 0 Unknown tag (0x0025) 0 Unknown tag (0x0026) 0 Unknown tag (0x0027) 0 X Resolution 72 dots 
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.jpeg.JpegParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 19 XMP Value Count 4 Y Resolution 72 dots date 2019-09-25T09:29:54 dcterms:created 2019-09-25T09:29:54 dcterms:modified 2019-09-25T09:29:54 exif:DateTimeOriginal 2019-09-25T09:29:54 exif:ExposureTime 0.041666666666666664 exif:FNumber 2.4 exif:Flash false exif:FocalLength 3.3 exif:IsoSpeedRatings 250 meta:creation-date 2019-09-25T09:29:54 meta:save-date 2019-09-25T09:29:54 modified 2019-09-25T09:29:54 resourceName b'sites-nd-edu-6245.jpg' tiff:BitsPerSample 8 tiff:ImageLength 480 tiff:ImageWidth 640 tiff:Make Apple tiff:Model iPad Pro tiff:Orientation 1 tiff:ResolutionUnit Inch tiff:Software 12.4.1 tiff:XResolution 72.0 tiff:YResolution 72.0 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-755 author: title: sites-nd-edu-755 date: pages: extension: .png txt: ./txt/sites-nd-edu-755.txt cache: ./cache/sites-nd-edu-755.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=300, height=227, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By 
['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 227 pHYs pixelsPerUnitXAxis=72, pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName b'sites-nd-edu-755.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 227 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-7928 author: title: sites-nd-edu-7928 date: 2019-12-16 pages: extension: .jpg txt: ./txt/sites-nd-edu-7928.txt cache: ./cache/sites-nd-edu-7928.jpg Application Record Version 2 Caption Digest 110 134 221 126 28 6 143 244 56 174 43 71 231 249 254 252 Coded Character Set UTF-8 Component 1 Y component: Quantization table 0, Sampling factors 2 horiz/2 vert Component 2 Cb component: Quantization table 1, Sampling factors 1 horiz/1 vert Component 3 Cr component: Quantization table 1, Sampling factors 1 horiz/1 vert Compression Type Baseline Content-Type image/jpeg Creation-Date 2019-12-16T11:32:22 Data Precision 8 bits Date Created 2019:12:16 Digital Date Created 2019:12:16 Digital Time Created 11:32:22 Exif IFD0:Date/Time 2019:12:16 11:32:22 Exif IFD0:Make Apple Exif IFD0:Model iPhone SE Exif IFD0:Orientation Top, left side (Horizontal / normal) Exif IFD0:Resolution Unit Inch Exif IFD0:Software 13.1.3 Exif IFD0:X Resolution 72 dots per inch Exif IFD0:Y Resolution 72 dots per inch Exif SubIFD:Aperture 
Value f/2.2 Exif SubIFD:Brightness Value 0.982 Exif SubIFD:Color Space sRGB Exif SubIFD:Components Configuration YCbCr Exif SubIFD:Date/Time Digitized 2019:12:16 11:32:22 Exif SubIFD:Date/Time Original 2019:12:16 11:32:22 Exif SubIFD:Exif Image Height 640 pixels Exif SubIFD:Exif Image Width 480 pixels Exif SubIFD:Exif Version 2.31 Exif SubIFD:Exposure Bias Value 0 EV Exif SubIFD:Exposure Mode Auto exposure Exif SubIFD:Exposure Program Program normal Exif SubIFD:Exposure Time 1/30 sec Exif SubIFD:F-Number f/2.2 Exif SubIFD:Flash Flash did not fire Exif SubIFD:FlashPix Version 1.00 Exif SubIFD:Focal Length 4.2 mm Exif SubIFD:Focal Length 35 29 mm Exif SubIFD:ISO Speed Ratings 200 Exif SubIFD:Lens Make Apple Exif SubIFD:Lens Model iPhone SE back camera 4.15mm f/2.2 Exif SubIFD:Lens Specification 2175733/524273mm f/2.2 Exif SubIFD:Metering Mode Multi-segment Exif SubIFD:Scene Capture Type Standard Exif SubIFD:Scene Type Directly photographed image Exif SubIFD:Sensing Method One-chip color area sensor Exif SubIFD:Shutter Speed Value 1/30 sec Exif SubIFD:Sub-Sec Time Digitized 683 Exif SubIFD:Sub-Sec Time Original 683 Exif SubIFD:Subject Location 2015 1511 2217 1330 Exif SubIFD:White Balance Mode Auto white balance File Modified Date Sat Nov 28 14:29:56 +00:00 2020 File Name apache-tika-8950349326505332683.tmp File Size 107739 bytes GPS:GPS Altitude 262.01 metres GPS:GPS Altitude Ref Sea level GPS:GPS Dest Bearing 10.54 degrees GPS:GPS Dest Bearing Ref True direction GPS:GPS H Positioning Error 50 metres GPS:GPS Img Direction 10.54 degrees GPS:GPS Img Direction Ref True direction GPS:GPS Latitude 39° 10' 25.74" GPS:GPS Latitude Ref N GPS:GPS Longitude -86° 30' 1.9" GPS:GPS Longitude Ref W GPS:GPS Speed 0.42 km/h GPS:GPS Speed Ref km/h Image Height 640 pixels Image Width 480 pixels Last-Modified 2019-12-16T11:32:22 Last-Save-Date 2019-12-16T11:32:22 Number of Components 3 Number of Tables 4 Huffman tables Resolution Units none Run Time [104 values] Thumbnail Height Pixels 
0 Thumbnail Width Pixels 0 Time Created 11:32:22 Unknown tag (0x0001) 11 Unknown tag (0x0002) [558 values] Unknown tag (0x0004) 1 Unknown tag (0x0005) 128 Unknown tag (0x0006) 117 Unknown tag (0x0007) 1 Unknown tag (0x0008) 12891/237131 -114343/126955 -18727/41351 Unknown tag (0x0009) 4371 Unknown tag (0x000e) 0 Unknown tag (0x0014) 4 Unknown tag (0x0017) 0 Unknown tag (0x0019) 0 Unknown tag (0x001f) 0 Unknown tag (0x0020) 186D4B29-3197-48BC-B2E7-13DC4DF43B5E Unknown tag (0x0025) 0 Unknown tag (0x0026) 0 Unknown tag (0x0027) 0 Unknown tag (0x002b) EB03E3E1-C119-46AA-8E43-E7FB50AA64CE X Resolution 72 dots X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.jpeg.JpegParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 7 XMP Value Count 4 Y Resolution 72 dots date 2019-12-16T11:32:22 dcterms:created 2019-12-16T11:32:22 dcterms:modified 2019-12-16T11:32:22 exif:DateTimeOriginal 2019-12-16T11:32:22 exif:ExposureTime 0.03333333333333333 exif:FNumber 2.2 exif:Flash false exif:FocalLength 4.15 exif:IsoSpeedRatings 200 geo:lat 39.173817 geo:long -86.500528 meta:creation-date 2019-12-16T11:32:22 meta:save-date 2019-12-16T11:32:22 modified 2019-12-16T11:32:22 resourceName b'sites-nd-edu-7928.jpg' tiff:BitsPerSample 8 tiff:ImageLength 640 tiff:ImageWidth 480 tiff:Make Apple tiff:Model iPhone SE tiff:Orientation 1 tiff:ResolutionUnit Inch tiff:Software 13.1.3 tiff:XResolution 72.0 tiff:YResolution 72.0 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-2497 author: title: sites-nd-edu-2497 date: pages: extension: .png txt: 
./txt/sites-nd-edu-2497.txt cache: ./cache/sites-nd-edu-2497.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension HorizontalPixelSize 0.17639795 Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 Dimension VerticalPixelSize 0.17639795 IHDR width=2520, height=2094, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 2094 2520 1 , language=, compression=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 2094 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 2094 2520 1 pHYs pixelsPerUnitXAxis=5669, pixelsPerUnitYAxis=5669, unitSpecifier=meter resourceName b'sites-nd-edu-2497.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 2094 tiff:ImageWidth 2520 width 2520 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-2650 author: title: sites-nd-edu-2650 date: pages: extension: .html txt: ./txt/sites-nd-edu-2650.txt cache: ./cache/sites-nd-edu-2650.html Content-Type application/octet-stream X-TIKA:EXCEPTION:runtime 
org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122) at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:233) at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409) at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:147) at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:123) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 1 resourceName b'sites-nd-edu-2650.html' Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File 
"/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-6582 author: title: sites-nd-edu-6582 date: pages: extension: .png txt: ./txt/sites-nd-edu-6582.txt cache: ./cache/sites-nd-edu-6582.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=300, height=208, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 208 pHYs pixelsPerUnitXAxis=72, pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName b'sites-nd-edu-6582.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 208 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: 
www-gutenberg-org-6207 author: title: www-gutenberg-org-6207 date: pages: extension: .html txt: ./txt/www-gutenberg-org-6207.txt cache: ./cache/www-gutenberg-org-6207.html Content-Type application/octet-stream X-TIKA:EXCEPTION:runtime org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122) at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:233) at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409) at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:147) at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:123) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) X-TIKA:embedded_depth 0 
X-TIKA:parse_time_millis 1 resourceName b'www-gutenberg-org-6207.html' Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-6181 author: title: sites-nd-edu-6181 date: pages: extension: .png txt: ./txt/sites-nd-edu-6181.txt cache: ./cache/sites-nd-edu-6181.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=685, height=821, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 685 Screenshot 821 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 821 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 685 Screenshot 821 resourceName b'sites-nd-edu-6181.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 821 tiff:ImageWidth 685 width 685 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File 
"/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-9191 author: title: sites-nd-edu-9191 date: pages: extension: .html txt: ./txt/sites-nd-edu-9191.txt cache: ./cache/sites-nd-edu-9191.html Content-Type application/octet-stream X-TIKA:EXCEPTION:runtime org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122) at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:233) at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409) at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:147) at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:123) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at 
org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 1 resourceName b'sites-nd-edu-9191.html' Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: bit-ly-8913 author: title: Create Account | Slack date: pages: extension: .html txt: ./txt/bit-ly-8913.txt cache: ./cache/bit-ly-8913.html Content-Encoding UTF-8 Content-Language en-US Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 4 author Slack dc:title Create Account | Slack description Slack is a new way to communicate with your team. It’s faster, better organized, and more secure than email. keywords og:description Slack is a new way to communicate with your team. It’s faster, better organized, and more secure than email. 
og:image https://a.slack-edge.com/80588/marketing/img/meta/slack_hash_256.png og:site_name Slack og:title Create Account og:type website og:url https://slack.com/join/shared_invite/enQtODY4MDE4MjQ1NTczLTIxNWM4NThlYzdmM2E0MWJkZjRjNmZhYWNiYWQ4N2NkOGFmMzc4MmMxMWExZGRkNGE1ZGRkYzczMjQ5NWNlYmU referrer no-referrer refresh 0; URL=bit-ly-8913.html resourceName b'bit-ly-8913.html' superfish nofish title Create Account | Slack Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 112, in summary = summarize( text, word_count=COUNT, split=False ) File "/data-disk/python/lib/python3.8/site-packages/gensim/summarization/summarizer.py", line 428, in summarize raise ValueError("input must have more than one sentence") ValueError: input must have more than one sentence sites-nd-edu-1522 txt/../ent/sites-nd-edu-1522.ent === file2bib.sh === id: sites-nd-edu-1522 author: title: sites-nd-edu-1522 date: pages: extension: .png txt: ./txt/sites-nd-edu-1522.txt cache: ./cache/sites-nd-edu-1522.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=671, height=709, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= Screenshot 671 709 1 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 709 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, 
compressionMethod=0, languageTag=, translatedKeyword=, text= Screenshot 671 709 1 resourceName b'sites-nd-edu-1522.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 709 tiff:ImageWidth 671 width 671 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-3940 author: title: sites-nd-edu-3940 date: pages: extension: .png txt: ./txt/sites-nd-edu-3940.txt cache: ./cache/sites-nd-edu-3940.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=1046, height=712, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 1046 Screenshot 712 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 712 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 1046 Screenshot 712 resourceName b'sites-nd-edu-3940.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 712 tiff:ImageWidth 1046 width 1046 Traceback (most recent call last): File 
"/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' infomotions-com-7836 txt/../ent/infomotions-com-7836.ent === file2bib.sh === id: sites-nd-edu-1179 author: title: sites-nd-edu-1179 date: 2019-09-16 pages: extension: .jpg txt: ./txt/sites-nd-edu-1179.txt cache: ./cache/sites-nd-edu-1179.jpg Application Record Version 2 Coded Character Set UTF-8 Component 1 Y component: Quantization table 0, Sampling factors 2 horiz/2 vert Component 2 Cb component: Quantization table 1, Sampling factors 1 horiz/1 vert Component 3 Cr component: Quantization table 1, Sampling factors 1 horiz/1 vert Compression Type Baseline Content-Type image/jpeg Creation-Date 2019-09-16T21:20:46 Data Precision 8 bits Date Created 2019:09:16 Digital Date Created 2019:09:16 Digital Time Created 21:20:46 Exif IFD0:Date/Time 2019:09:16 21:20:46 Exif IFD0:Make Apple Exif IFD0:Model iPhone SE Exif IFD0:Orientation Top, left side (Horizontal / normal) Exif IFD0:Resolution Unit Inch Exif IFD0:Software 12.4.1 Exif IFD0:X Resolution 72 dots per inch Exif IFD0:Y Resolution 72 dots per inch Exif SubIFD:Aperture Value f/2.2 Exif SubIFD:Brightness Value 4.024 Exif SubIFD:Color Space sRGB Exif SubIFD:Components Configuration YCbCr Exif SubIFD:Date/Time Digitized 2019:09:16 21:20:46 Exif SubIFD:Date/Time Original 2019:09:16 21:20:46 Exif SubIFD:Exif Image Height 480 pixels Exif SubIFD:Exif Image Width 640 pixels Exif SubIFD:Exif Version 2.21 Exif SubIFD:Exposure Bias Value 0 EV Exif SubIFD:Exposure Mode Auto exposure Exif SubIFD:Exposure Program Program normal Exif SubIFD:Exposure Time 1/30 sec Exif SubIFD:F-Number f/2.2 Exif SubIFD:Flash Flash did not fire Exif SubIFD:FlashPix Version 
1.00 Exif SubIFD:Focal Length 4.2 mm Exif SubIFD:Focal Length 35 29 mm Exif SubIFD:ISO Speed Ratings 40 Exif SubIFD:Lens Make Apple Exif SubIFD:Lens Model iPhone SE back camera 4.15mm f/2.2 Exif SubIFD:Lens Specification 2175733/524273mm f/2.2 Exif SubIFD:Metering Mode Spot Exif SubIFD:Scene Capture Type Standard Exif SubIFD:Scene Type Directly photographed image Exif SubIFD:Sensing Method One-chip color area sensor Exif SubIFD:Shutter Speed Value 1/30 sec Exif SubIFD:Sub-Sec Time Digitized 316 Exif SubIFD:Sub-Sec Time Original 316 Exif SubIFD:Subject Location 2617 1321 753 756 Exif SubIFD:White Balance Mode Auto white balance File Modified Date Sat Nov 28 14:29:57 +00:00 2020 File Name apache-tika-5771209030791206067.tmp File Size 23377 bytes GPS:GPS Altitude 223.65 metres GPS:GPS Altitude Ref Sea level GPS:GPS Date Stamp 2019:09:17 GPS:GPS Dest Bearing 101.42 degrees GPS:GPS Dest Bearing Ref True direction GPS:GPS H Positioning Error 65 metres GPS:GPS Img Direction 101.42 degrees GPS:GPS Img Direction Ref True direction GPS:GPS Latitude 41° 41' 25.13" GPS:GPS Latitude Ref N GPS:GPS Longitude -86° 14' 56.98" GPS:GPS Longitude Ref W GPS:GPS Speed 0 km/h GPS:GPS Speed Ref km/h GPS:GPS Time-Stamp 01:20:45.000 UTC Image Height 225 pixels Image Width 300 pixels Last-Modified 2019-09-16T21:20:46 Last-Save-Date 2019-09-16T21:20:46 Number of Components 3 Number of Tables 4 Huffman tables Resolution Units inch Run Time [104 values] Thumbnail Height Pixels 0 Thumbnail Width Pixels 0 Time Created 21:20:46 Unknown tag (0x0001) 10 Unknown tag (0x0002) [558 values] Unknown tag (0x0004) 1 Unknown tag (0x0005) 221 Unknown tag (0x0006) 213 Unknown tag (0x0007) 1 Unknown tag (0x0008) 3725/273464 -14474/14553 -8128/109105 Unknown tag (0x0009) 4371 Unknown tag (0x000e) 0 Unknown tag (0x0014) 4 Unknown tag (0x0017) 0 Unknown tag (0x0019) 0 Unknown tag (0x001f) 0 Unknown tag (0x0025) 0 Unknown tag (0x0026) 0 Unknown tag (0x0027) 0 X Resolution 72 dots X-Parsed-By 
['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.jpeg.JpegParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 7 XMP Value Count 4 Y Resolution 72 dots date 2019-09-16T21:20:46 dcterms:created 2019-09-16T21:20:46 dcterms:modified 2019-09-16T21:20:46 exif:DateTimeOriginal 2019-09-16T21:20:46 exif:ExposureTime 0.03333333333333333 exif:FNumber 2.2 exif:Flash false exif:FocalLength 4.15 exif:IsoSpeedRatings 40 geo:lat 41.690314 geo:long -86.249161 meta:creation-date 2019-09-16T21:20:46 meta:save-date 2019-09-16T21:20:46 modified 2019-09-16T21:20:46 resourceName b'sites-nd-edu-1179.jpg' tiff:BitsPerSample 8 tiff:ImageLength 480 tiff:ImageWidth 640 tiff:Make Apple tiff:Model iPhone SE tiff:Orientation 1 tiff:ResolutionUnit Inch tiff:Software 12.4.1 tiff:XResolution 72.0 tiff:YResolution 72.0 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-3721 author: title: sites-nd-edu-3721 date: pages: extension: .png txt: ./txt/sites-nd-edu-3721.txt cache: ./cache/sites-nd-edu-3721.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension HorizontalPixelSize 0.17639795 Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 Dimension VerticalPixelSize 0.17639795 IHDR width=300, height=249, bitDepth=8, colorType=RGBAlpha, 
compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 249 pHYs pixelsPerUnitXAxis=5669, pixelsPerUnitYAxis=5669, unitSpecifier=meter resourceName b'sites-nd-edu-3721.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 249 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' infomotions-com-3637 txt/../ent/infomotions-com-3637.ent === file2bib.sh === id: sites-nd-edu-393 author: title: sites-nd-edu-393 date: pages: extension: .png txt: ./txt/sites-nd-edu-393.txt cache: ./cache/sites-nd-edu-393.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension HorizontalPixelSize 0.17639795 Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 Dimension VerticalPixelSize 0.17639795 IHDR width=300, height=249, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By 
['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 249 pHYs pixelsPerUnitXAxis=5669, pixelsPerUnitYAxis=5669, unitSpecifier=meter resourceName b'sites-nd-edu-393.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 249 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-393 txt/../ent/sites-nd-edu-393.ent infomotions-com-7836 txt/../pos/infomotions-com-7836.pos planet-infomotions-com-8963 txt/../ent/planet-infomotions-com-8963.ent sites-nd-edu-8448 txt/../pos/sites-nd-edu-8448.pos sites-nd-edu-9191 txt/../ent/sites-nd-edu-9191.ent sites-nd-edu-2650 txt/../ent/sites-nd-edu-2650.ent sites-nd-edu-1179 txt/../ent/sites-nd-edu-1179.ent sites-nd-edu-460 txt/../ent/sites-nd-edu-460.ent sites-nd-edu-1522 txt/../pos/sites-nd-edu-1522.pos sites-nd-edu-3585 txt/../pos/sites-nd-edu-3585.pos sites-nd-edu-2908 txt/../ent/sites-nd-edu-2908.ent sites-nd-edu-9996 txt/../pos/sites-nd-edu-9996.pos planet-infomotions-com-4104 txt/../ent/planet-infomotions-com-4104.ent sites-nd-edu-7928 txt/../ent/sites-nd-edu-7928.ent dh-crc-nd-edu-7757 txt/../ent/dh-crc-nd-edu-7757.ent === file2bib.sh === id: sites-nd-edu-2178 author: title: sites-nd-edu-2178 date: 2019-12-16 pages: extension: .jpg txt: ./txt/sites-nd-edu-2178.txt cache: ./cache/sites-nd-edu-2178.jpg Application Record Version 2 Caption Digest 50 10 161 
203 1 162 179 108 219 210 252 149 77 65 185 68 Coded Character Set UTF-8 Component 1 Y component: Quantization table 0, Sampling factors 2 horiz/2 vert Component 2 Cb component: Quantization table 1, Sampling factors 1 horiz/1 vert Component 3 Cr component: Quantization table 1, Sampling factors 1 horiz/1 vert Compression Type Baseline Content-Type image/jpeg Creation-Date 2019-12-16T13:58:58 Data Precision 8 bits Date Created 2019:12:16 Digital Date Created 2019:12:16 Digital Time Created 13:58:58 Exif IFD0:Date/Time 2019:12:16 13:58:58 Exif IFD0:Make Apple Exif IFD0:Model iPhone SE Exif IFD0:Orientation Top, left side (Horizontal / normal) Exif IFD0:Resolution Unit Inch Exif IFD0:Software 13.1.3 Exif IFD0:X Resolution 72 dots per inch Exif IFD0:Y Resolution 72 dots per inch Exif SubIFD:Aperture Value f/2.2 Exif SubIFD:Brightness Value 5.779 Exif SubIFD:Color Space sRGB Exif SubIFD:Components Configuration YCbCr Exif SubIFD:Date/Time Digitized 2019:12:16 13:58:58 Exif SubIFD:Date/Time Original 2019:12:16 13:58:58 Exif SubIFD:Exif Image Height 640 pixels Exif SubIFD:Exif Image Width 480 pixels Exif SubIFD:Exif Version 2.31 Exif SubIFD:Exposure Bias Value 0 EV Exif SubIFD:Exposure Mode Auto exposure Exif SubIFD:Exposure Program Program normal Exif SubIFD:Exposure Time 1/120 sec Exif SubIFD:F-Number f/2.2 Exif SubIFD:Flash Flash did not fire Exif SubIFD:FlashPix Version 1.00 Exif SubIFD:Focal Length 4.2 mm Exif SubIFD:Focal Length 35 29 mm Exif SubIFD:ISO Speed Ratings 25 Exif SubIFD:Lens Make Apple Exif SubIFD:Lens Model iPhone SE back camera 4.15mm f/2.2 Exif SubIFD:Lens Specification 2175733/524273mm f/2.2 Exif SubIFD:Metering Mode Multi-segment Exif SubIFD:Scene Capture Type Standard Exif SubIFD:Scene Type Directly photographed image Exif SubIFD:Sensing Method One-chip color area sensor Exif SubIFD:Shutter Speed Value 1/120 sec Exif SubIFD:Sub-Sec Time Digitized 632 Exif SubIFD:Sub-Sec Time Original 632 Exif SubIFD:Subject Location 2015 1511 2217 1330 Exif 
SubIFD:White Balance Mode Auto white balance File Modified Date Sat Nov 28 14:29:58 +00:00 2020 File Name apache-tika-3084762349667601297.tmp File Size 159142 bytes GPS:GPS Altitude 250.16 metres GPS:GPS Altitude Ref Sea level GPS:GPS Dest Bearing 184.69 degrees GPS:GPS Dest Bearing Ref True direction GPS:GPS H Positioning Error 65 metres GPS:GPS Img Direction 184.69 degrees GPS:GPS Img Direction Ref True direction GPS:GPS Latitude 39° 10' 22.54" GPS:GPS Latitude Ref N GPS:GPS Longitude -86° 31' 23.08" GPS:GPS Longitude Ref W GPS:GPS Speed 0 km/h GPS:GPS Speed Ref km/h Image Height 640 pixels Image Width 480 pixels Last-Modified 2019-12-16T13:58:58 Last-Save-Date 2019-12-16T13:58:58 Number of Components 3 Number of Tables 4 Huffman tables Resolution Units none Run Time [104 values] Thumbnail Height Pixels 0 Thumbnail Width Pixels 0 Time Created 13:58:58 Unknown tag (0x0001) 11 Unknown tag (0x0002) [558 values] Unknown tag (0x0004) 1 Unknown tag (0x0005) 128 Unknown tag (0x0006) 113 Unknown tag (0x0007) 1 Unknown tag (0x0008) 2488/138669 -37447/54367 20065/28232 Unknown tag (0x000e) 0 Unknown tag (0x0014) 1 Unknown tag (0x0017) 0 Unknown tag (0x0019) 0 Unknown tag (0x001f) 0 Unknown tag (0x0020) 51B96B09-3E10-4DF6-A8CF-BB72ACC3CABE Unknown tag (0x0025) 0 Unknown tag (0x0026) 0 Unknown tag (0x0027) 0 Unknown tag (0x002b) 41DC6045-2483-47F7-B038-0FC2594CFBE0 X Resolution 72 dots X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.jpeg.JpegParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 7 XMP Value Count 4 Y Resolution 72 dots date 2019-12-16T13:58:58 dcterms:created 2019-12-16T13:58:58 dcterms:modified 2019-12-16T13:58:58 exif:DateTimeOriginal 2019-12-16T13:58:58 exif:ExposureTime 0.008333333333333333 exif:FNumber 2.2 exif:Flash false exif:FocalLength 4.15 exif:IsoSpeedRatings 25 geo:lat 39.172928 geo:long -86.523078 meta:creation-date 2019-12-16T13:58:58 meta:save-date 2019-12-16T13:58:58 modified 2019-12-16T13:58:58 resourceName 
b'sites-nd-edu-2178.jpg' tiff:BitsPerSample 8 tiff:ImageLength 640 tiff:ImageWidth 480 tiff:Make Apple tiff:Model iPhone SE tiff:Orientation 1 tiff:ResolutionUnit Inch tiff:Software 13.1.3 tiff:XResolution 72.0 tiff:YResolution 72.0 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: www-xsede-org-5929 author: title: www-xsede-org-5929 date: pages: extension: .html txt: ./txt/www-xsede-org-5929.txt cache: ./cache/www-xsede-org-5929.html Content-Type application/octet-stream X-TIKA:EXCEPTION:runtime org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122) at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:233) at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409) at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:147) at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:123) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) at 
org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 1 resourceName b'www-xsede-org-5929.html' Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' infomotions-com-555 txt/../ent/infomotions-com-555.ent === file2bib.sh === id: sites-nd-edu-6432 author: title: sites-nd-edu-6432 date: pages: extension: .png txt: ./txt/sites-nd-edu-6432.txt cache: ./cache/sites-nd-edu-6432.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=1245, height=940, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 1245 Screenshot 940 , language=, 
compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 940 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 1245 Screenshot 940 resourceName b'sites-nd-edu-6432.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 940 tiff:ImageWidth 1245 width 1245 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-2497 txt/../ent/sites-nd-edu-2497.ent planet-infomotions-com-8963 txt/../pos/planet-infomotions-com-8963.pos === file2bib.sh === id: www-gutenberg-org-941 author: title: www-gutenberg-org-941 date: pages: extension: .html txt: ./txt/www-gutenberg-org-941.txt cache: ./cache/www-gutenberg-org-941.html Content-Type application/octet-stream X-TIKA:EXCEPTION:runtime org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122) at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:233) at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409) at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:147) at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:123) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at 
org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 0 resourceName b'www-gutenberg-org-941.html' Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-3585 txt/../ent/sites-nd-edu-3585.ent sites-nd-edu-8448 txt/../ent/sites-nd-edu-8448.ent twitter-com-9838 txt/../ent/twitter-com-9838.ent sites-nd-edu-6582 txt/../ent/sites-nd-edu-6582.ent === file2bib.sh === id: ucla-zoom-us-1408 author: title: Launch Meeting - Zoom date: pages: extension: .html txt: ./txt/ucla-zoom-us-1408.txt 
cache: ./cache/ucla-zoom-us-1408.html Content-Encoding UTF-8 Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 X-UA-Compatible IE=edge,Chrome=1 dc:title Launch Meeting - Zoom description Zoom is the leader in modern enterprise video communications, with an easy, reliable cloud platform for video and audio conferencing, chat, and webinars across mobile, desktop, and room systems. Zoom Rooms is the original software-based conference room solution used around the world in board, conference, huddle, and training rooms, as well as executive offices and classrooms. Founded in 2011, Zoom helps businesses and organizations bring their teams together in a frictionless environment to get more done. Zoom is a publicly traded company headquartered in San Jose, CA. fb:app_id 113289095462482 keywords zoom, zoom.us, video conferencing, video conference, online meetings, web meeting, video meeting, cloud meeting, cloud video, group video call, group video chat, screen share, application share, mobility, mobile collaboration, desktop share, video collaboration, group messaging og:description Zoom is the leader in modern enterprise video communications, with an easy, reliable cloud platform for video and audio conferencing, chat, and webinars across mobile, desktop, and room systems. Zoom Rooms is the original software-based conference room solution used around the world in board, conference, huddle, and training rooms, as well as executive offices and classrooms. Founded in 2011, Zoom helps businesses and organizations bring their teams together in a frictionless environment to get more done. Zoom is a publicly traded company headquartered in San Jose, CA. 
og:site_name Zoom Video og:title Join our Cloud HD Video Meeting og:type activity og:url https://ucla.zoom.us/j/3107947789 referrer origin-when-cross-origin resourceName b'ucla-zoom-us-1408.html' robots noindex,nofollow title Launch Meeting - Zoom twitter:account_id 522701657 viewport width=device-width,initial-scale=1,minimum-scale=1.0 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 112, in summary = summarize( text, word_count=COUNT, split=False ) File "/data-disk/python/lib/python3.8/site-packages/gensim/summarization/summarizer.py", line 428, in summarize raise ValueError("input must have more than one sentence") ValueError: input must have more than one sentence sites-nd-edu-9996 txt/../ent/sites-nd-edu-9996.ent planet-infomotions-com-7919 txt/../wrd/planet-infomotions-com-7919.wrd === file2bib.sh === id: sites-nd-edu-2246 author: title: sites-nd-edu-2246 date: pages: extension: .png txt: ./txt/sites-nd-edu-2246.txt cache: ./cache/sites-nd-edu-2246.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=300, height=259, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 259 pHYs pixelsPerUnitXAxis=72, 
pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName b'sites-nd-edu-2246.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 259 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-6181 txt/../ent/sites-nd-edu-6181.ent === file2bib.sh === id: sites-nd-edu-6875 author: title: sites-nd-edu-6875 date: 2019-09-25 pages: extension: .jpg txt: ./txt/sites-nd-edu-6875.txt cache: ./cache/sites-nd-edu-6875.jpg Application Record Version 2 Caption Digest 68 61 121 199 192 185 128 102 112 101 164 23 205 71 91 108 Coded Character Set UTF-8 Component 1 Y component: Quantization table 0, Sampling factors 2 horiz/2 vert Component 2 Cb component: Quantization table 1, Sampling factors 1 horiz/1 vert Component 3 Cr component: Quantization table 1, Sampling factors 1 horiz/1 vert Compression Type Baseline Content-Type image/jpeg Creation-Date 2019-09-25T09:30:03 Data Precision 8 bits Date Created 2019:09:25 Digital Date Created 2019:09:25 Digital Time Created 09:30:03 Exif IFD0:Date/Time 2019:09:25 09:30:03 Exif IFD0:Make Apple Exif IFD0:Model iPad Pro Exif IFD0:Orientation Top, left side (Horizontal / normal) Exif IFD0:Resolution Unit Inch Exif IFD0:Software 12.4.1 Exif IFD0:X Resolution 72 dots per inch Exif IFD0:Y Resolution 72 dots per inch Exif SubIFD:Aperture Value f/2.4 Exif SubIFD:Brightness Value 3.715 Exif SubIFD:Color Space sRGB Exif SubIFD:Components Configuration YCbCr Exif SubIFD:Date/Time Digitized 2019:09:25 09:30:03 Exif SubIFD:Date/Time Original 2019:09:25 09:30:03 Exif SubIFD:Exif Image Height 225 pixels Exif SubIFD:Exif Image Width 300 
pixels Exif SubIFD:Exif Version 2.21 Exif SubIFD:Exposure Bias Value 0 EV Exif SubIFD:Exposure Mode Auto exposure Exif SubIFD:Exposure Program Program normal Exif SubIFD:Exposure Time 1/30 sec Exif SubIFD:F-Number f/2.4 Exif SubIFD:Flash Flash did not fire Exif SubIFD:FlashPix Version 1.00 Exif SubIFD:Focal Length 3.3 mm Exif SubIFD:Focal Length 35 31 mm Exif SubIFD:ISO Speed Ratings 64 Exif SubIFD:Lens Make Apple Exif SubIFD:Lens Model iPad Pro back camera 3.3mm f/2.4 Exif SubIFD:Lens Specification 3.3mm f/2.4 Exif SubIFD:Metering Mode Spot Exif SubIFD:Scene Capture Type Standard Exif SubIFD:Scene Type Directly photographed image Exif SubIFD:Sensing Method One-chip color area sensor Exif SubIFD:Shutter Speed Value 1/30 sec Exif SubIFD:Sub-Sec Time Digitized 837 Exif SubIFD:Sub-Sec Time Original 837 Exif SubIFD:Subject Location 742 1258 610 612 Exif SubIFD:White Balance Mode Auto white balance File Modified Date Sat Nov 28 14:29:59 +00:00 2020 File Name apache-tika-3868271603042430116.tmp File Size 47082 bytes Image Height 225 pixels Image Width 300 pixels Last-Modified 2019-09-25T09:30:03 Last-Save-Date 2019-09-25T09:30:03 Number of Components 3 Number of Tables 4 Huffman tables Resolution Units none Run Time [104 values] Thumbnail Height Pixels 0 Thumbnail Width Pixels 0 Time Created 09:30:03 Unknown tag (0x0001) 10 Unknown tag (0x0002) [558 values] Unknown tag (0x0004) 1 Unknown tag (0x0005) 204 Unknown tag (0x0006) 209 Unknown tag (0x0007) 1 Unknown tag (0x0008) 127005/126233 1892/73863 7082/91717 Unknown tag (0x0009) 4371 Unknown tag (0x000e) 4 Unknown tag (0x0014) 4 Unknown tag (0x0017) 0 Unknown tag (0x0019) 0 Unknown tag (0x001f) 0 Unknown tag (0x0025) 0 Unknown tag (0x0026) 0 Unknown tag (0x0027) 0 X Resolution 72 dots X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.jpeg.JpegParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 5 Y Resolution 72 dots date 2019-09-25T09:30:03 dcterms:created 2019-09-25T09:30:03 
dcterms:modified 2019-09-25T09:30:03 exif:DateTimeOriginal 2019-09-25T09:30:03 exif:ExposureTime 0.03333333333333333 exif:FNumber 2.4 exif:Flash false exif:FocalLength 3.3 exif:IsoSpeedRatings 64 meta:creation-date 2019-09-25T09:30:03 meta:save-date 2019-09-25T09:30:03 modified 2019-09-25T09:30:03 resourceName b'sites-nd-edu-6875.jpg' tiff:BitsPerSample 8 tiff:ImageLength 225 tiff:ImageWidth 300 tiff:Make Apple tiff:Model iPad Pro tiff:Orientation 1 tiff:ResolutionUnit Inch tiff:Software 12.4.1 tiff:XResolution 72.0 tiff:YResolution 72.0 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-6302 author: title: sites-nd-edu-6302 date: pages: extension: .png txt: ./txt/sites-nd-edu-6302.txt cache: ./cache/sites-nd-edu-6302.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=300, height=204, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, 
redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 204 pHYs pixelsPerUnitXAxis=72, pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName b'sites-nd-edu-6302.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 204 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' infomotions-com-3637 txt/../pos/infomotions-com-3637.pos sites-nd-edu-7928 txt/../wrd/sites-nd-edu-7928.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-9191 txt/../wrd/sites-nd-edu-9191.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File 
"/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point === file2bib.sh === id: sites-nd-edu-8419 author: title: sites-nd-edu-8419 date: pages: extension: .png txt: ./txt/sites-nd-edu-8419.txt cache: ./cache/sites-nd-edu-8419.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=1090, height=756, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 1090 Screenshot 756 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 756 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 1090 Screenshot 756 resourceName b'sites-nd-edu-8419.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 756 tiff:ImageWidth 1090 width 1090 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in 
normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' infomotions-com-172 txt/../ent/infomotions-com-172.ent === file2bib.sh === id: sites-nd-edu-1886 author: title: sites-nd-edu-1886 date: pages: extension: .png txt: ./txt/sites-nd-edu-1886.txt cache: ./cache/sites-nd-edu-1886.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=1046, height=712, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 1046 Screenshot 712 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 712 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 1046 Screenshot 712 resourceName b'sites-nd-edu-1886.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 712 tiff:ImageWidth 1046 width 1046 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-3721 txt/../ent/sites-nd-edu-3721.ent 
stedolan-github-io-4569 txt/../ent/stedolan-github-io-4569.ent sites-nd-edu-9996 txt/../wrd/sites-nd-edu-9996.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-3574 txt/../ent/sites-nd-edu-3574.ent planet-infomotions-com-7919 txt/../pos/planet-infomotions-com-7919.pos sites-nd-edu-6875 txt/../ent/sites-nd-edu-6875.ent === file2bib.sh === id: sites-nd-edu-7840 author: title: sites-nd-edu-7840 date: pages: extension: .png txt: ./txt/sites-nd-edu-7840.txt cache: ./cache/sites-nd-edu-7840.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=935, height=807, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 935 Screenshot 807 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 
height 807 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 935 Screenshot 807 resourceName b'sites-nd-edu-7840.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 807 tiff:ImageWidth 935 width 935 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-2650 txt/../pos/sites-nd-edu-2650.pos === file2bib.sh === id: sites-nd-edu-3678 author: title: sites-nd-edu-3678 date: pages: extension: .html txt: ./txt/sites-nd-edu-3678.txt cache: ./cache/sites-nd-edu-3678.html Content-Type application/octet-stream X-TIKA:EXCEPTION:runtime org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122) at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:233) at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409) at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:147) at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:123) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96) at 
org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at 
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 0 resourceName b'sites-nd-edu-3678.html' Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-3940 txt/../pos/sites-nd-edu-3940.pos === file2bib.sh === id: sites-nd-edu-3471 author: title: sites-nd-edu-3471 date: 2019-09-25 pages: extension: .jpg txt: ./txt/sites-nd-edu-3471.txt cache: ./cache/sites-nd-edu-3471.jpg Application Record Version 2 Coded Character Set UTF-8 Component 1 Y component: Quantization table 0, Sampling factors 2 horiz/2 vert Component 2 Cb component: Quantization table 1, Sampling factors 1 horiz/1 vert Component 3 Cr component: Quantization table 1, Sampling factors 1 horiz/1 vert Compression Type Baseline Content-Type image/jpeg Creation-Date 2019-09-25T09:30:03 Data Precision 8 bits Date Created 2019:09:25 
Digital Date Created 2019:09:25 Digital Time Created 09:30:03 Exif IFD0:Date/Time 2019:09:25 09:30:03 Exif IFD0:Make Apple Exif IFD0:Model iPad Pro Exif IFD0:Orientation Top, left side (Horizontal / normal) Exif IFD0:Resolution Unit Inch Exif IFD0:Software 12.4.1 Exif IFD0:X Resolution 72 dots per inch Exif IFD0:Y Resolution 72 dots per inch Exif SubIFD:Aperture Value f/2.4 Exif SubIFD:Brightness Value 3.715 Exif SubIFD:Color Space sRGB Exif SubIFD:Components Configuration YCbCr Exif SubIFD:Date/Time Digitized 2019:09:25 09:30:03 Exif SubIFD:Date/Time Original 2019:09:25 09:30:03 Exif SubIFD:Exif Image Height 225 pixels Exif SubIFD:Exif Image Width 300 pixels Exif SubIFD:Exif Version 2.21 Exif SubIFD:Exposure Bias Value 0 EV Exif SubIFD:Exposure Mode Auto exposure Exif SubIFD:Exposure Program Program normal Exif SubIFD:Exposure Time 1/30 sec Exif SubIFD:F-Number f/2.4 Exif SubIFD:Flash Flash did not fire Exif SubIFD:FlashPix Version 1.00 Exif SubIFD:Focal Length 3.3 mm Exif SubIFD:Focal Length 35 31 mm Exif SubIFD:ISO Speed Ratings 64 Exif SubIFD:Lens Make Apple Exif SubIFD:Lens Model iPad Pro back camera 3.3mm f/2.4 Exif SubIFD:Lens Specification 3.3mm f/2.4 Exif SubIFD:Metering Mode Spot Exif SubIFD:Scene Capture Type Standard Exif SubIFD:Scene Type Directly photographed image Exif SubIFD:Sensing Method One-chip color area sensor Exif SubIFD:Shutter Speed Value 1/30 sec Exif SubIFD:Sub-Sec Time Digitized 837 Exif SubIFD:Sub-Sec Time Original 837 Exif SubIFD:Subject Location 742 1258 610 612 Exif SubIFD:White Balance Mode Auto white balance File Modified Date Sat Nov 28 14:29:59 +00:00 2020 File Name apache-tika-283383317004029301.tmp File Size 26478 bytes Image Height 225 pixels Image Width 300 pixels Last-Modified 2019-09-25T09:30:03 Last-Save-Date 2019-09-25T09:30:03 Number of Components 3 Number of Tables 4 Huffman tables Resolution Units inch Run Time [104 values] Thumbnail Height Pixels 0 Thumbnail Width Pixels 0 Time Created 09:30:03 Unknown tag (0x0001) 10 
Unknown tag (0x0002) [558 values] Unknown tag (0x0004) 1 Unknown tag (0x0005) 204 Unknown tag (0x0006) 209 Unknown tag (0x0007) 1 Unknown tag (0x0008) 127005/126233 1892/73863 7082/91717 Unknown tag (0x0009) 4371 Unknown tag (0x000e) 4 Unknown tag (0x0014) 4 Unknown tag (0x0017) 0 Unknown tag (0x0019) 0 Unknown tag (0x001f) 0 Unknown tag (0x0025) 0 Unknown tag (0x0026) 0 Unknown tag (0x0027) 0 X Resolution 72 dots X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.jpeg.JpegParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 7 XMP Value Count 4 Y Resolution 72 dots date 2019-09-25T09:30:03 dcterms:created 2019-09-25T09:30:03 dcterms:modified 2019-09-25T09:30:03 exif:DateTimeOriginal 2019-09-25T09:30:03 exif:ExposureTime 0.03333333333333333 exif:FNumber 2.4 exif:Flash false exif:FocalLength 3.3 exif:IsoSpeedRatings 64 meta:creation-date 2019-09-25T09:30:03 meta:save-date 2019-09-25T09:30:03 modified 2019-09-25T09:30:03 resourceName b'sites-nd-edu-3471.jpg' tiff:BitsPerSample 8 tiff:ImageLength 225 tiff:ImageWidth 300 tiff:Make Apple tiff:Model iPad Pro tiff:Orientation 1 tiff:ResolutionUnit Inch tiff:Software 12.4.1 tiff:XResolution 72.0 tiff:YResolution 72.0 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-6245 txt/../wrd/sites-nd-edu-6245.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, 
word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-460 txt/../pos/sites-nd-edu-460.pos sites-nd-edu-1522 txt/../wrd/sites-nd-edu-1522.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-2650 txt/../wrd/sites-nd-edu-2650.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: 
mean requires at least one data point sites-nd-edu-8419 txt/../ent/sites-nd-edu-8419.ent infomotions-com-7836 txt/../wrd/infomotions-com-7836.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point bit-ly-5230 txt/../pos/bit-ly-5230.pos sites-nd-edu-3940 txt/../ent/sites-nd-edu-3940.ent planet-infomotions-com-8963 txt/../wrd/planet-infomotions-com-8963.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point === file2bib.sh === id: sites-nd-edu-8489 author: title: sites-nd-edu-8489 date: pages: extension: .png txt: ./txt/sites-nd-edu-8489.txt cache: ./cache/sites-nd-edu-8489.png Chroma BlackIsZero true Chroma ColorSpaceType RGB 
Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=1454, height=989, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 1454 Screenshot 989 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 989 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 1454 Screenshot 989 resourceName b'sites-nd-edu-8489.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 989 tiff:ImageWidth 1454 width 1454 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-9146 author: title: sites-nd-edu-9146 date: pages: extension: .png txt: ./txt/sites-nd-edu-9146.txt cache: ./cache/sites-nd-edu-9146.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data 
PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=300, height=204, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 204 pHYs pixelsPerUnitXAxis=72, pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName b'sites-nd-edu-9146.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 204 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' infomotions-com-172 txt/../pos/infomotions-com-172.pos sites-nd-edu-755 txt/../ent/sites-nd-edu-755.ent bit-ly-5230 txt/../ent/bit-ly-5230.ent === file2bib.sh === id: sites-nd-edu-3574 author: title: sites-nd-edu-3574 date: pages: extension: .png txt: ./txt/sites-nd-edu-3574.txt cache: ./cache/sites-nd-edu-3574.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension HorizontalPixelSize 0.17639795 Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 Dimension 
VerticalPixelSize 0.17639795 IHDR width=2520, height=2094, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 2520 2094 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk iDOT X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 height 2094 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 2520 2094 pHYs pixelsPerUnitXAxis=5669, pixelsPerUnitYAxis=5669, unitSpecifier=meter resourceName b'sites-nd-edu-3574.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 2094 tiff:ImageWidth 2520 width 2520 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-6582 txt/../wrd/sites-nd-edu-6582.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data 
point') statistics.StatisticsError: mean requires at least one data point ucla-zoom-us-1408 txt/../ent/ucla-zoom-us-1408.ent sites-nd-edu-6181 txt/../pos/sites-nd-edu-6181.pos === file2bib.sh === id: sites-nd-edu-3187 author: title: sites-nd-edu-3187 date: pages: extension: .png txt: ./txt/sites-nd-edu-3187.txt cache: ./cache/sites-nd-edu-3187.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=300, height=204, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 204 pHYs pixelsPerUnitXAxis=72, pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName b'sites-nd-edu-3187.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 204 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-8762 author: title: sites-nd-edu-8762 date: pages: 
extension: .png txt: ./txt/sites-nd-edu-8762.txt cache: ./cache/sites-nd-edu-8762.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=284, height=300, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 300 pHYs pixelsPerUnitXAxis=72, pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName b'sites-nd-edu-8762.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 300 tiff:ImageWidth 284 width 284 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' planet-infomotions-com-7919 txt/../ent/planet-infomotions-com-7919.ent sites-nd-edu-2908 txt/../wrd/sites-nd-edu-2908.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File 
"/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point infomotions-com-3852 txt/../pos/infomotions-com-3852.pos sites-nd-edu-6302 txt/../ent/sites-nd-edu-6302.ent sites-nd-edu-2910 txt/../ent/sites-nd-edu-2910.ent infomotions-com-3637 txt/../wrd/infomotions-com-3637.wrd sites-nd-edu-2178 txt/../ent/sites-nd-edu-2178.ent infomotions-com-172 txt/../wrd/infomotions-com-172.wrd sites-nd-edu-9191 txt/../pos/sites-nd-edu-9191.pos sites-nd-edu-393 txt/../pos/sites-nd-edu-393.pos bit-ly-8913 txt/../pos/bit-ly-8913.pos === file2bib.sh === id: sites-nd-edu-1664 author: title: sites-nd-edu-1664 date: pages: extension: .png txt: ./txt/sites-nd-edu-1664.txt cache: ./cache/sites-nd-edu-1664.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=1245, height=940, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 1245 Screenshot 940 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 940 
iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 1245 Screenshot 940 resourceName b'sites-nd-edu-1664.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 940 tiff:ImageWidth 1245 width 1245 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' serials-infomotions-com-5908 txt/../ent/serials-infomotions-com-5908.ent infomotions-com-555 txt/../wrd/infomotions-com-555.wrd distantreader-org-7009 txt/../ent/distantreader-org-7009.ent === file2bib.sh === id: sites-nd-edu-3469 author: title: sites-nd-edu-3469 date: pages: extension: .html txt: ./txt/sites-nd-edu-3469.txt cache: ./cache/sites-nd-edu-3469.html Content-Type application/octet-stream X-TIKA:EXCEPTION:runtime org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122) at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:233) at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409) at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:147) at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:123) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at 
org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 1 resourceName b'sites-nd-edu-3469.html' Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-6245 txt/../pos/sites-nd-edu-6245.pos sites-nd-edu-460 txt/../wrd/sites-nd-edu-460.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File 
"/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-tufts-edu-6731 txt/../ent/sites-tufts-edu-6731.ent www-xsede-org-5929 txt/../pos/www-xsede-org-5929.pos bit-ly-8913 txt/../wrd/bit-ly-8913.wrd sites-tufts-edu-6731 txt/../wrd/sites-tufts-edu-6731.wrd planet-infomotions-com-4104 txt/../pos/planet-infomotions-com-4104.pos youtu-be-1944 txt/../pos/youtu-be-1944.pos bit-ly-8913 txt/../ent/bit-ly-8913.ent === file2bib.sh === id: sites-nd-edu-1918 author: title: sites-nd-edu-1918 date: pages: extension: .png txt: ./txt/sites-nd-edu-1918.txt cache: ./cache/sites-nd-edu-1918.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=935, height=807, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 935 Screenshot 807 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 807 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 935 Screenshot 807 resourceName b'sites-nd-edu-1918.png' tiff:BitsPerSample 8 8 8 8 
tiff:ImageLength 807 tiff:ImageWidth 935 width 935 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-tufts-edu-6731 txt/../pos/sites-tufts-edu-6731.pos sites-nd-edu-3471 txt/../ent/sites-nd-edu-3471.ent mallet-cs-umass-edu-3654 txt/../ent/mallet-cs-umass-edu-3654.ent sites-nd-edu-6245 txt/../ent/sites-nd-edu-6245.ent sites-nd-edu-3469 txt/../ent/sites-nd-edu-3469.ent sites-nd-edu-2908 txt/../pos/sites-nd-edu-2908.pos === file2bib.sh === id: sites-nd-edu-5464 author: title: sites-nd-edu-5464 date: pages: extension: .png txt: ./txt/sites-nd-edu-5464.txt cache: ./cache/sites-nd-edu-5464.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=300, height=259, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 259 pHYs pixelsPerUnitXAxis=72, pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName 
b'sites-nd-edu-5464.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 259 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' === file2bib.sh === id: sites-nd-edu-8707 author: title: sites-nd-edu-8707 date: pages: extension: .png txt: ./txt/sites-nd-edu-8707.txt cache: ./cache/sites-nd-edu-8707.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=300, height=259, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 259 pHYs pixelsPerUnitXAxis=72, pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName b'sites-nd-edu-8707.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 259 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = 
textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' twitter-com-9838 txt/../pos/twitter-com-9838.pos === file2bib.sh === id: sites-nd-edu-7631 author: title: sites-nd-edu-7631 date: pages: extension: .png txt: ./txt/sites-nd-edu-7631.txt cache: ./cache/sites-nd-edu-7631.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=300, height=227, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 227 pHYs pixelsPerUnitXAxis=72, pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName b'sites-nd-edu-7631.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 227 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return 
text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-755 txt/../pos/sites-nd-edu-755.pos planet-infomotions-com-4104 txt/../wrd/planet-infomotions-com-4104.wrd === file2bib.sh === id: sites-nd-edu-8089 author: title: sites-nd-edu-8089 date: pages: extension: .png txt: ./txt/sites-nd-edu-8089.txt cache: ./cache/sites-nd-edu-8089.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=1245, height=940, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 1245 Screenshot 940 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 940 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 1245 Screenshot 940 resourceName b'sites-nd-edu-8089.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 940 tiff:ImageWidth 1245 width 1245 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-2497 
txt/../pos/sites-nd-edu-2497.pos docs-pkp-sfu-ca-7101 txt/../ent/docs-pkp-sfu-ca-7101.ent sites-nd-edu-3721 txt/../pos/sites-nd-edu-3721.pos sites-nd-edu-1886 txt/../pos/sites-nd-edu-1886.pos sites-nd-edu-6432 txt/../ent/sites-nd-edu-6432.ent infomotions-com-3852 txt/../ent/infomotions-com-3852.ent infomotions-com-3852 txt/../wrd/infomotions-com-3852.wrd 2020-code4lib-org-5785 txt/../pos/2020-code4lib-org-5785.pos sites-nd-edu-755 txt/../wrd/sites-nd-edu-755.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-2910 txt/../pos/sites-nd-edu-2910.pos sites-nd-edu-3721 txt/../wrd/sites-nd-edu-3721.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean 
requires at least one data point www-gutenberg-org-941 txt/../ent/www-gutenberg-org-941.ent === file2bib.sh === id: sites-nd-edu-2573 author: title: sites-nd-edu-2573 date: pages: extension: .png txt: ./txt/sites-nd-edu-2573.txt cache: ./cache/sites-nd-edu-2573.png Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=1454, height=989, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Text TextEntry keyword=XML:com.adobe.xmp, value= 1454 Screenshot 989 , language=, compression=none Transparency Alpha nonpremultipled UnknownChunks UnknownChunk eXIf X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 height 989 iCCP profileName=ICC Profile, compressionMethod=deflate iTXt iTXtEntry keyword=XML:com.adobe.xmp, compressionFlag=false, compressionMethod=0, languageTag=, translatedKeyword=, text= 1454 Screenshot 989 resourceName b'sites-nd-edu-2573.png' tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 989 tiff:ImageWidth 1454 width 1454 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-6582 txt/../pos/sites-nd-edu-6582.pos serials-infomotions-com-5908 txt/../wrd/serials-infomotions-com-5908.wrd sites-nd-edu-3187 
txt/../ent/sites-nd-edu-3187.ent sites-nd-edu-3574 txt/../pos/sites-nd-edu-3574.pos === file2bib.sh === id: sites-nd-edu-8691 author: title: sites-nd-edu-8691 date: pages: extension: .png txt: ./txt/sites-nd-edu-8691.txt cache: ./cache/sites-nd-edu-8691.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=250, height=300, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 300 pHYs pixelsPerUnitXAxis=72, pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName b'sites-nd-edu-8691.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 300 tiff:ImageWidth 250 width 250 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-5464 txt/../ent/sites-nd-edu-5464.ent pkp-sfu-ca-4628 txt/../ent/pkp-sfu-ca-4628.ent sites-nd-edu-1179 txt/../wrd/sites-nd-edu-1179.wrd Traceback (most recent call 
last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point === file2bib.sh === id: sites-nd-edu-6066 author: title: sites-nd-edu-6066 date: 2019-12-16 pages: extension: .jpg txt: ./txt/sites-nd-edu-6066.txt cache: ./cache/sites-nd-edu-6066.jpg Application Record Version 2 Caption Digest 71 196 39 236 28 2 22 214 155 55 53 161 163 157 71 171 Coded Character Set UTF-8 Component 1 Y component: Quantization table 0, Sampling factors 2 horiz/2 vert Component 2 Cb component: Quantization table 1, Sampling factors 1 horiz/1 vert Component 3 Cr component: Quantization table 1, Sampling factors 1 horiz/1 vert Compression Type Baseline Content-Type image/jpeg Creation-Date 2019-12-16T11:28:11 Data Precision 8 bits Date Created 2019:12:16 Digital Date Created 2019:12:16 Digital Time Created 11:28:11 Exif IFD0:Date/Time 2019:12:16 11:28:11 Exif IFD0:Make Apple Exif IFD0:Model iPhone SE Exif IFD0:Orientation Top, left side (Horizontal / normal) Exif IFD0:Resolution Unit Inch Exif IFD0:Software 13.1.3 Exif IFD0:X Resolution 72 dots per inch Exif IFD0:Y Resolution 72 dots per inch Exif SubIFD:Aperture Value f/2.2 Exif SubIFD:Brightness Value 3.445 Exif SubIFD:Color Space sRGB Exif SubIFD:Components Configuration YCbCr Exif SubIFD:Date/Time Digitized 2019:12:16 11:28:11 Exif SubIFD:Date/Time Original 2019:12:16 11:28:11 Exif SubIFD:Exif Image 
Height 480 pixels Exif SubIFD:Exif Image Width 640 pixels Exif SubIFD:Exif Version 2.31 Exif SubIFD:Exposure Bias Value 0 EV Exif SubIFD:Exposure Mode Auto exposure Exif SubIFD:Exposure Program Program normal Exif SubIFD:Exposure Time 1/30 sec Exif SubIFD:F-Number f/2.2 Exif SubIFD:Flash Flash did not fire Exif SubIFD:FlashPix Version 1.00 Exif SubIFD:Focal Length 4.2 mm Exif SubIFD:Focal Length 35 29 mm Exif SubIFD:ISO Speed Ratings 40 Exif SubIFD:Lens Make Apple Exif SubIFD:Lens Model iPhone SE back camera 4.15mm f/2.2 Exif SubIFD:Lens Specification 2175733/524273mm f/2.2 Exif SubIFD:Metering Mode Spot Exif SubIFD:Scene Capture Type Standard Exif SubIFD:Scene Type Directly photographed image Exif SubIFD:Sensing Method One-chip color area sensor Exif SubIFD:Shutter Speed Value 1/30 sec Exif SubIFD:Sub-Sec Time Digitized 419 Exif SubIFD:Sub-Sec Time Original 419 Exif SubIFD:Subject Location 2762 674 753 756 Exif SubIFD:White Balance Mode Auto white balance File Modified Date Sat Nov 28 14:29:59 +00:00 2020 File Name apache-tika-8616309937564124826.tmp File Size 99839 bytes GPS:GPS Altitude 241.1 metres GPS:GPS Altitude Ref Sea level GPS:GPS Dest Bearing 134.6 degrees GPS:GPS Dest Bearing Ref True direction GPS:GPS H Positioning Error 200 metres GPS:GPS Img Direction 134.6 degrees GPS:GPS Img Direction Ref True direction GPS:GPS Latitude 39° 10' 27.35" GPS:GPS Latitude Ref N GPS:GPS Longitude -86° 30' 2.11" GPS:GPS Longitude Ref W GPS:GPS Speed 0.06 km/h GPS:GPS Speed Ref km/h Image Height 480 pixels Image Width 640 pixels Last-Modified 2019-12-16T11:28:11 Last-Save-Date 2019-12-16T11:28:11 Number of Components 3 Number of Tables 4 Huffman tables Resolution Units none Run Time [104 values] Thumbnail Height Pixels 0 Thumbnail Width Pixels 0 Time Created 11:28:11 Unknown tag (0x0001) 11 Unknown tag (0x0002) [558 values] Unknown tag (0x0004) 1 Unknown tag (0x0005) 140 Unknown tag (0x0006) 142 Unknown tag (0x0007) 1 Unknown tag (0x0008) -98432/99331 -2344/880371 
-8437/54382 Unknown tag (0x0009) 4371 Unknown tag (0x000e) 0 Unknown tag (0x0014) 4 Unknown tag (0x0017) 0 Unknown tag (0x0019) 0 Unknown tag (0x001f) 0 Unknown tag (0x0020) EF39FB3F-333E-4979-B070-105EDE6963D5 Unknown tag (0x0025) 0 Unknown tag (0x0026) 0 Unknown tag (0x0027) 0 Unknown tag (0x002b) C036B148-EA00-489F-B7D6-44DF432883FF X Resolution 72 dots X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.jpeg.JpegParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 20 XMP Value Count 18 Y Resolution 72 dots date 2019-12-16T11:28:11 dcterms:created 2019-12-16T11:28:11 dcterms:modified 2019-12-16T11:28:11 exif:DateTimeOriginal 2019-12-16T11:28:11 exif:ExposureTime 0.03333333333333333 exif:FNumber 2.2 exif:Flash false exif:FocalLength 4.15 exif:IsoSpeedRatings 40 geo:lat 39.174264 geo:long -86.500586 meta:creation-date 2019-12-16T11:28:11 meta:save-date 2019-12-16T11:28:11 modified 2019-12-16T11:28:11 resourceName b'sites-nd-edu-6066.jpg' tiff:BitsPerSample 8 tiff:ImageLength 480 tiff:ImageWidth 640 tiff:Make Apple tiff:Model iPhone SE tiff:Orientation 1 tiff:ResolutionUnit Inch tiff:Software 13.1.3 tiff:XResolution 72.0 tiff:YResolution 72.0 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-3678 txt/../ent/sites-nd-edu-3678.ent mallet-cs-umass-edu-3654 txt/../pos/mallet-cs-umass-edu-3654.pos sites-nd-edu-1664 txt/../ent/sites-nd-edu-1664.ent www-gutenberg-org-6207 txt/../wrd/www-gutenberg-org-6207.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, 
ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point 2020-code4lib-org-5785 txt/../wrd/2020-code4lib-org-5785.wrd sites-nd-edu-1179 txt/../pos/sites-nd-edu-1179.pos sites-nd-edu-3940 txt/../wrd/sites-nd-edu-3940.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-8762 txt/../ent/sites-nd-edu-8762.ent === file2bib.sh === id: sites-nd-edu-2818 author: title: sites-nd-edu-2818 date: pages: extension: .html txt: ./txt/sites-nd-edu-2818.txt cache: ./cache/sites-nd-edu-2818.html Content-Type application/octet-stream X-TIKA:EXCEPTION:runtime org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122) at 
org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:233) at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409) at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:147) at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:123) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 1 resourceName b'sites-nd-edu-2818.html' Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) 
AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-3585 txt/../wrd/sites-nd-edu-3585.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-2497 txt/../wrd/sites-nd-edu-2497.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-6432 txt/../wrd/sites-nd-edu-6432.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, 
word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point www-xsede-org-5929 txt/../wrd/www-xsede-org-5929.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-7928 txt/../pos/sites-nd-edu-7928.pos sites-nd-edu-6181 txt/../wrd/sites-nd-edu-6181.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') 
statistics.StatisticsError: mean requires at least one data point youtu-be-1944 txt/../ent/youtu-be-1944.ent www-xsede-org-5929 txt/../ent/www-xsede-org-5929.ent sites-nd-edu-6432 txt/../pos/sites-nd-edu-6432.pos infomotions-com-555 txt/../pos/infomotions-com-555.pos === file2bib.sh === id: sites-nd-edu-2910 author: title: sites-nd-edu-2910 date: pages: extension: .png txt: ./txt/sites-nd-edu-2910.txt cache: ./cache/sites-nd-edu-2910.png Chroma BackgroundColor red=255, green=255, blue=255 Chroma BlackIsZero true Chroma ColorSpaceType RGB Chroma Gamma 0.45455 Chroma NumChannels 4 Compression CompressionTypeName deflate Compression Lossless true Compression NumProgressiveScans 1 Content-Type image/png Data BitsPerSample 8 8 8 8 Data PlanarConfiguration PixelInterleaved Data SampleFormat UnsignedIntegral Dimension ImageOrientation Normal Dimension PixelAspectRatio 1.0 IHDR width=300, height=204, bitDepth=8, colorType=RGBAlpha, compressionMethod=deflate, filterMethod=adaptive, interlaceMethod=none Transparency Alpha nonpremultipled X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.image.ImageParser'] X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 bKGD bKGD_RGB red=255, green=255, blue=255 cHRM whitePointX=31270, whitePointY=32900, redX=64000, redY=33000, greenX=30000, greenY=60000, blueX=15000, blueY=6000 gAMA 45455 height 204 pHYs pixelsPerUnitXAxis=72, pixelsPerUnitYAxis=72, unitSpecifier=unknown resourceName b'sites-nd-edu-2910.png' sRGB Perceptual tiff:BitsPerSample 8 8 8 8 tiff:ImageLength 204 tiff:ImageWidth 300 width 300 Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 
'translate' sites-nd-edu-8707 txt/../ent/sites-nd-edu-8707.ent 2020-code4lib-org-5785 txt/../ent/2020-code4lib-org-5785.ent sites-nd-edu-2246 txt/../ent/sites-nd-edu-2246.ent sites-nd-edu-8707 txt/../pos/sites-nd-edu-8707.pos dh-crc-nd-edu-1806 txt/../ent/dh-crc-nd-edu-1806.ent sites-nd-edu-6875 txt/../wrd/sites-nd-edu-6875.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-8419 txt/../wrd/sites-nd-edu-8419.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-8448 txt/../wrd/sites-nd-edu-8448.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", 
line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-6302 txt/../wrd/sites-nd-edu-6302.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-6875 txt/../pos/sites-nd-edu-6875.pos sites-nd-edu-6302 txt/../pos/sites-nd-edu-6302.pos sites-nd-edu-7631 txt/../wrd/sites-nd-edu-7631.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, 
in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-393 txt/../wrd/sites-nd-edu-393.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-3678 txt/../pos/sites-nd-edu-3678.pos === file2bib.sh === id: sites-nd-edu-6366 author: title: sites-nd-edu-6366 date: pages: extension: .html txt: ./txt/sites-nd-edu-6366.txt cache: ./cache/sites-nd-edu-6366.html Content-Type application/octet-stream X-TIKA:EXCEPTION:runtime org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122) at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:233) at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409) at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:147) at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:123) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at 
org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 1 resourceName b'sites-nd-edu-6366.html' Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' www-gutenberg-org-6207 txt/../ent/www-gutenberg-org-6207.ent pkp-sfu-ca-4628 txt/../pos/pkp-sfu-ca-4628.pos twitter-com-9838 txt/../wrd/twitter-com-9838.wrd distantreader-org-6471 txt/../pos/distantreader-org-6471.pos sites-nd-edu-3471 txt/../pos/sites-nd-edu-3471.pos sites-nd-edu-7631 txt/../ent/sites-nd-edu-7631.ent www-gutenberg-org-6207 
txt/../pos/www-gutenberg-org-6207.pos sites-nd-edu-1886 txt/../ent/sites-nd-edu-1886.ent dh-crc-nd-edu-7757 txt/../pos/dh-crc-nd-edu-7757.pos sites-nd-edu-7840 txt/../ent/sites-nd-edu-7840.ent sites-nd-edu-2246 txt/../pos/sites-nd-edu-2246.pos sites-nd-edu-3469 txt/../pos/sites-nd-edu-3469.pos === file2bib.sh === id: sites-nd-edu-5154 author: title: sites-nd-edu-5154 date: pages: extension: .html txt: ./txt/sites-nd-edu-5154.txt cache: ./cache/sites-nd-edu-5154.html Content-Type application/octet-stream X-TIKA:EXCEPTION:runtime org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122) at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:233) at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409) at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:147) at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:123) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at 
org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 1 resourceName b'sites-nd-edu-5154.html' Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-9146 txt/../ent/sites-nd-edu-9146.ent bit-ly-5230 txt/../wrd/bit-ly-5230.wrd sites-nd-edu-8089 txt/../ent/sites-nd-edu-8089.ent sites-nd-edu-1918 txt/../ent/sites-nd-edu-1918.ent serials-infomotions-com-5908 txt/../pos/serials-infomotions-com-5908.pos sites-nd-edu-8762 txt/../pos/sites-nd-edu-8762.pos www-gutenberg-org-941 txt/../pos/www-gutenberg-org-941.pos sites-nd-edu-1918 txt/../wrd/sites-nd-edu-1918.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point 
sites-nd-edu-2178 txt/../wrd/sites-nd-edu-2178.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point infomotions-com-953 txt/../pos/infomotions-com-953.pos sites-nd-edu-8691 txt/../ent/sites-nd-edu-8691.ent distantreader-org-6471 txt/../wrd/distantreader-org-6471.wrd ucla-zoom-us-1408 txt/../wrd/ucla-zoom-us-1408.wrd github-com-2983 txt/../ent/github-com-2983.ent dh-crc-nd-edu-7757 txt/../wrd/dh-crc-nd-edu-7757.wrd sites-nd-edu-8089 txt/../pos/sites-nd-edu-8089.pos github-com-7801 txt/../pos/github-com-7801.pos sites-nd-edu-2573 txt/../pos/sites-nd-edu-2573.pos sites-nd-edu-6066 txt/../wrd/sites-nd-edu-6066.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean 
requires at least one data point www-gutenberg-org-941 txt/../wrd/www-gutenberg-org-941.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point www-gnu-org-8892 txt/../wrd/www-gnu-org-8892.wrd sites-nd-edu-6066 txt/../pos/sites-nd-edu-6066.pos sites-nd-edu-2818 txt/../pos/sites-nd-edu-2818.pos github-com-7801 txt/../ent/github-com-7801.ent github-com-2983 txt/../wrd/github-com-2983.wrd sites-nd-edu-6366 txt/../pos/sites-nd-edu-6366.pos stedolan-github-io-4569 txt/../pos/stedolan-github-io-4569.pos sites-nd-edu-8489 txt/../pos/sites-nd-edu-8489.pos sites-nd-edu-1664 txt/../pos/sites-nd-edu-1664.pos sites-nd-edu-8489 txt/../wrd/sites-nd-edu-8489.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data 
point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-5154 txt/../pos/sites-nd-edu-5154.pos distantreader-org-7009 txt/../wrd/distantreader-org-7009.wrd sites-nd-edu-2818 txt/../ent/sites-nd-edu-2818.ent infomotions-com-953 txt/../ent/infomotions-com-953.ent sites-nd-edu-1918 txt/../pos/sites-nd-edu-1918.pos sites-nd-edu-7631 txt/../pos/sites-nd-edu-7631.pos github-com-379 txt/../ent/github-com-379.ent sites-nd-edu-7840 txt/../pos/sites-nd-edu-7840.pos sites-nd-edu-5154 txt/../ent/sites-nd-edu-5154.ent distantreader-org-7009 txt/../pos/distantreader-org-7009.pos sites-nd-edu-5464 txt/../wrd/sites-nd-edu-5464.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point youtu-be-1944 txt/../wrd/youtu-be-1944.wrd ucla-zoom-us-1408 txt/../pos/ucla-zoom-us-1408.pos dh-crc-nd-edu-1806 txt/../pos/dh-crc-nd-edu-1806.pos stedolan-github-io-4569 txt/../wrd/stedolan-github-io-4569.wrd distantreader-org-6471 txt/../ent/distantreader-org-6471.ent sites-nd-edu-2178 txt/../pos/sites-nd-edu-2178.pos curl-haxx-se-8721 txt/../ent/curl-haxx-se-8721.ent sites-nd-edu-9146 txt/../wrd/sites-nd-edu-9146.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File 
"/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-8089 txt/../wrd/sites-nd-edu-8089.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point docs-pkp-sfu-ca-7101 txt/../wrd/docs-pkp-sfu-ca-7101.wrd sites-nd-edu-8691 txt/../wrd/sites-nd-edu-8691.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File 
"/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point github-com-7801 txt/../wrd/github-com-7801.wrd github-com-379 txt/../wrd/github-com-379.wrd docs-pkp-sfu-ca-7101 txt/../pos/docs-pkp-sfu-ca-7101.pos sites-nd-edu-6066 txt/../ent/sites-nd-edu-6066.ent sites-nd-edu-3471 txt/../wrd/sites-nd-edu-3471.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-3678 txt/../wrd/sites-nd-edu-3678.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-9146 
txt/../pos/sites-nd-edu-9146.pos sites-nd-edu-3574 txt/../wrd/sites-nd-edu-3574.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-8419 txt/../pos/sites-nd-edu-8419.pos sites-nd-edu-3187 txt/../pos/sites-nd-edu-3187.pos infomotions-com-3769 txt/../pos/infomotions-com-3769.pos dh-crc-nd-edu-1806 txt/../wrd/dh-crc-nd-edu-1806.wrd === file2bib.sh === id: sites-nd-edu-3073 author: title: sites-nd-edu-3073 date: pages: extension: .html txt: ./txt/sites-nd-edu-3073.txt cache: ./cache/sites-nd-edu-3073.html Content-Type application/octet-stream X-TIKA:EXCEPTION:runtime org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122) at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:233) at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409) at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:147) at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:123) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at 
java.lang.reflect.Method.invoke(Method.java:498) at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at 
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 1 resourceName b'sites-nd-edu-3073.html' Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' sites-nd-edu-8762 txt/../wrd/sites-nd-edu-8762.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File 
"/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-1664 txt/../wrd/sites-nd-edu-1664.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-3187 txt/../wrd/sites-nd-edu-3187.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-8691 txt/../pos/sites-nd-edu-8691.pos 
sites-nd-edu-2573 txt/../ent/sites-nd-edu-2573.ent curl-haxx-se-8721 txt/../wrd/curl-haxx-se-8721.wrd infomotions-com-953 txt/../wrd/infomotions-com-953.wrd planet-infomotions-com-9545 txt/../ent/planet-infomotions-com-9545.ent sites-nd-edu-1886 txt/../wrd/sites-nd-edu-1886.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point pkp-sfu-ca-4628 txt/../wrd/pkp-sfu-ca-4628.wrd www-gnu-org-8892 txt/../ent/www-gnu-org-8892.ent === file2bib.sh === id: sites-nd-edu-3118 author: title: sites-nd-edu-3118 date: pages: extension: .html txt: ./txt/sites-nd-edu-3118.txt cache: ./cache/sites-nd-edu-3118.html Content-Type application/octet-stream X-TIKA:EXCEPTION:runtime org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122) at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:233) at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409) at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:147) at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:123) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at 
org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 0 resourceName b'sites-nd-edu-3118.html' Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' github-com-2983 txt/../pos/github-com-2983.pos www-gnu-org-8892 txt/../pos/www-gnu-org-8892.pos mallet-cs-umass-edu-3654 txt/../wrd/mallet-cs-umass-edu-3654.wrd sites-nd-edu-8489 txt/../ent/sites-nd-edu-8489.ent sites-nd-edu-2246 txt/../wrd/sites-nd-edu-2246.wrd Traceback (most recent call last): File 
"/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-3469 txt/../wrd/sites-nd-edu-3469.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-7840 txt/../wrd/sites-nd-edu-7840.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = 
statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-6366 txt/../ent/sites-nd-edu-6366.ent infomotions-com-3769 txt/../ent/infomotions-com-3769.ent sites-nd-edu-5464 txt/../pos/sites-nd-edu-5464.pos sites-nd-edu-2818 txt/../wrd/sites-nd-edu-2818.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-6366 txt/../wrd/sites-nd-edu-6366.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point github-com-8202 
txt/../ent/github-com-8202.ent curl-haxx-se-8721 txt/../pos/curl-haxx-se-8721.pos sites-nd-edu-2910 txt/../wrd/sites-nd-edu-2910.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-8707 txt/../wrd/sites-nd-edu-8707.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point === file2bib.sh === id: sites-nd-edu-1720 author: title: sites-nd-edu-1720 date: pages: extension: .html txt: ./txt/sites-nd-edu-1720.txt cache: ./cache/sites-nd-edu-1720.html Content-Type application/octet-stream X-TIKA:EXCEPTION:runtime org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes at 
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:122) at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:233) at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:409) at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:147) at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:123) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179) at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59) at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 1 resourceName b'sites-nd-edu-1720.html' Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in text = textacy.preprocessing.normalize.normalize_quotation_marks( text ) File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in 
normalize_quotation_marks return text.translate(QUOTE_TRANSLATION_TABLE) AttributeError: 'NoneType' object has no attribute 'translate' www-laurenceanthony-net-8779 txt/../wrd/www-laurenceanthony-net-8779.wrd infomotions-com-3769 txt/../wrd/infomotions-com-3769.wrd github-com-379 txt/../pos/github-com-379.pos === file2bib.sh === id: sites-tufts-edu-6731 author: title: Comments on: date: pages: extension: .xml txt: ./txt/sites-tufts-edu-6731.txt cache: ./cache/sites-tufts-edu-6731.xml Content-Type application/rss+xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.feed.FeedParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 dc:description Tufts Self-Serve Blogs and Websites. dc:title Comments on: description Tufts Self-Serve Blogs and Websites. resourceName b'sites-tufts-edu-6731.xml' title Comments on: www-laurenceanthony-net-8779 txt/../pos/www-laurenceanthony-net-8779.pos github-com-8202 txt/../pos/github-com-8202.pos github-com-9780 txt/../pos/github-com-9780.pos sites-nd-edu-5154 txt/../wrd/sites-nd-edu-5154.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-2573 txt/../wrd/sites-nd-edu-2573.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", 
line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point www-laurenceanthony-net-8779 txt/../ent/www-laurenceanthony-net-8779.ent github-com-8202 txt/../wrd/github-com-8202.wrd github-com-8326 txt/../wrd/github-com-8326.wrd infomotions-com-9966 txt/../pos/infomotions-com-9966.pos github-com-8025 txt/../pos/github-com-8025.pos github-com-9780 txt/../wrd/github-com-9780.wrd github-com-9780 txt/../ent/github-com-9780.ent === file2bib.sh === id: youtu-be-1944 author: title: Michael Hart in Roanoke (Indiana) - YouTube date: pages: extension: .html txt: ./txt/youtu-be-1944.txt cache: ./cache/youtu-be-1944.html Content-Encoding ISO-8859-1 Content-Language en-US Content-Type text/html; charset=ISO-8859-1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 15 X-UA-Compatible IE=edge al:android:app_name YouTube al:android:package com.google.android.youtube al:android:url vnd.youtube://www.youtube.com/watch?v=eeoBbSN9Esg&feature=applinks al:ios:app_name YouTube al:ios:app_store_id 544007664 al:ios:url vnd.youtube://www.youtube.com/watch?v=eeoBbSN9Esg&feature=applinks al:web:url https://www.youtube.com/watch?v=eeoBbSN9Esg&feature=applinks dc:title Michael Hart in Roanoke (Indiana) - YouTube description I made this movie on Saturday, February 27, while Paul Turner and I made our way 
to Roanoke (Indiana) to listen to Michael Hart tell stories about electronic... fb:app_id 87741124305 keywords Michael, Hart, Roanoke, libraries, Project, Gutenberg og:description I made this movie on Saturday, February 27, while Paul Turner and I made our way to Roanoke (Indiana) to listen to Michael Hart tell stories about electronic... og:image https://i.ytimg.com/vi/eeoBbSN9Esg/hqdefault.jpg og:image:height 360 og:image:width 480 og:site_name YouTube og:title Michael Hart in Roanoke (Indiana) og:type video.other og:url https://www.youtube.com/watch?v=eeoBbSN9Esg og:video:height 360 og:video:secure_url https://www.youtube.com/embed/eeoBbSN9Esg og:video:tag ['Michael', 'Hart', 'Roanoke', 'libraries', 'Project', 'Gutenberg'] og:video:type text/html og:video:url https://www.youtube.com/embed/eeoBbSN9Esg og:video:width 480 resourceName b'youtu-be-1944.html' theme-color rgba(255, 255, 255, 0.98) title ['Michael Hart in Roanoke (Indiana) - YouTube', 'Michael Hart in Roanoke (Indiana)'] twitter:app:id:googleplay com.google.android.youtube twitter:app:id:ipad 544007664 twitter:app:id:iphone 544007664 twitter:app:name:googleplay YouTube twitter:app:name:ipad YouTube twitter:app:name:iphone YouTube twitter:app:url:googleplay https://www.youtube.com/watch?v=eeoBbSN9Esg twitter:app:url:ipad vnd.youtube://www.youtube.com/watch?v=eeoBbSN9Esg&feature=applinks twitter:app:url:iphone vnd.youtube://www.youtube.com/watch?v=eeoBbSN9Esg&feature=applinks twitter:card player twitter:description I made this movie on Saturday, February 27, while Paul Turner and I made our way to Roanoke (Indiana) to listen to Michael Hart tell stories about electronic... 
twitter:image https://i.ytimg.com/vi/eeoBbSN9Esg/hqdefault.jpg twitter:player https://www.youtube.com/embed/eeoBbSN9Esg twitter:player:height 360 twitter:player:width 480 twitter:site @youtube twitter:title Michael Hart in Roanoke (Indiana) twitter:url https://www.youtube.com/watch?v=eeoBbSN9Esg === file2bib.sh === id: planet-infomotions-com-7919 author: title: Water de Jour date: pages: extension: .xml txt: ./txt/planet-infomotions-com-7919.txt cache: ./cache/planet-infomotions-com-7919.xml Content-Type application/rdf+xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 dc:title Water de Jour resourceName b'planet-infomotions-com-7919.xml' title Water de Jour === file2bib.sh === id: bit-ly-5230 author: title: Link Grabber - Chrome Web Store date: pages: extension: .html txt: ./txt/bit-ly-5230.txt cache: ./cache/bit-ly-5230.html Content-Encoding UTF-8 Content-Language en Content-Type text/html; charset=UTF-8 Description An easy to use extractor or grabber for hyperlinks on an HTML page X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 5 dc:title Link Grabber - Chrome Web Store google-site-verification 6MQ3V3iNTp9Gaek0rQdI1BT1b5HKKsN8_WzyFbu1uWU og:description An easy to use extractor or grabber for hyperlinks on an HTML page og:image https://lh3.googleusercontent.com/gCfcuzXDdWjAG1cALH2c80QynoYf-rxRK3etawkmibZ3JNskafTuo9YHJ27Iau-aHVM4ZtpzP9Y=w128-h128-e365-rj-sc0x00ffffff og:title Link Grabber og:type website og:url https://chrome.google.com/webstore/detail/link-grabber/caodelkhipncidmoebgbbeemedohcdma referrer origin resourceName b'bit-ly-5230.html' title Link Grabber - Chrome Web Store viewport width=device-width, initial-scale=1.0, maximum-scale=1.0 x-ua-compatible IE=edge === 
file2bib.sh === id: dh-crc-nd-edu-7757 author: title: Project Gutenberg - Home date: pages: extension: .html txt: ./txt/dh-crc-nd-edu-7757.txt cache: ./cache/dh-crc-nd-edu-7757.html Content-Encoding ISO-8859-1 Content-Type text/html; charset=ISO-8859-1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 dc:title Project Gutenberg - Home resourceName b'dh-crc-nd-edu-7757.html' title Project Gutenberg - Home viewport width=device-width, initial-scale=1.0 === file2bib.sh === id: infomotions-com-172 author: title: Index of /sandbox date: pages: extension: .html txt: ./txt/infomotions-com-172.txt cache: ./cache/infomotions-com-172.html Content-Encoding windows-1252 Content-Type text/html; charset=windows-1252 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 4 dc:title Index of /sandbox resourceName b'infomotions-com-172.html' title Index of /sandbox planet-infomotions-com-9545 txt/../pos/planet-infomotions-com-9545.pos === file2bib.sh === id: infomotions-com-3637 author: title: Alex Catalogue of Electronic Texts date: pages: extension: .html txt: ./txt/infomotions-com-3637.txt cache: ./cache/infomotions-com-3637.html Content-Encoding ISO-8859-1 Content-Type application/xhtml+xml; charset=ISO-8859-1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 dc:title Alex Catalogue of Electronic Texts resourceName b'infomotions-com-3637.html' title Alex Catalogue of Electronic Texts sites-nd-edu-1720 txt/../ent/sites-nd-edu-1720.ent sites-nd-edu-3118 txt/../ent/sites-nd-edu-3118.ent github-com-8025 txt/../ent/github-com-8025.ent infomotions-com-9966 
txt/../wrd/infomotions-com-9966.wrd tika-apache-org-2948 txt/../ent/tika-apache-org-2948.ent === file2bib.sh === id: infomotions-com-3852 author: title: Infomotions, LLC date: pages: extension: .html txt: ./txt/infomotions-com-3852.txt cache: ./cache/infomotions-com-3852.html Content-Encoding ISO-8859-1 Content-Type application/xhtml+xml; charset=ISO-8859-1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 dc:title Infomotions, LLC resourceName b'infomotions-com-3852.html' title Infomotions, LLC sites-nd-edu-3073 txt/../ent/sites-nd-edu-3073.ent === file2bib.sh === id: serials-infomotions-com-5908 author: title: Index of / date: pages: extension: .html txt: ./txt/serials-infomotions-com-5908.txt cache: ./cache/serials-infomotions-com-5908.html Content-Encoding windows-1252 Content-Type text/html; charset=windows-1252 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 dc:title Index of / resourceName b'serials-infomotions-com-5908.html' title Index of / === file2bib.sh === id: infomotions-com-555 author: title: Water date: pages: extension: .html txt: ./txt/infomotions-com-555.txt cache: ./cache/infomotions-com-555.html Content-Encoding windows-1252 Content-Type text/html; charset=windows-1252 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 dc:title Water resourceName b'infomotions-com-555.html' title Water === file2bib.sh === id: 2020-code4lib-org-5785 author: title: Using & hacking the Distant Reader date: pages: extension: .html txt: ./txt/2020-code4lib-org-5785.txt cache: ./cache/2020-code4lib-org-5785.html Content-Encoding UTF-8 Content-Language en 
Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 X-UA-Compatible IE=edge,chrome=1 dc:title Using & hacking the Distant Reader description resourceName b'2020-code4lib-org-5785.html' title Using & hacking the Distant Reader viewport width=device-width, initial-scale=1 github-com-8326 txt/../pos/github-com-8326.pos === file2bib.sh === id: mallet-cs-umass-edu-3654 author: title: MALLET homepage date: pages: extension: .html txt: ./txt/mallet-cs-umass-edu-3654.txt cache: ./cache/mallet-cs-umass-edu-3654.html Content-Encoding ISO-8859-1 Content-Type text/html; charset=ISO-8859-1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 dc:title MALLET homepage resourceName b'mallet-cs-umass-edu-3654.html' title MALLET homepage === file2bib.sh === id: planet-infomotions-com-4104 author: title: Eric Lease Morgan's Writings Timeline date: pages: extension: .html txt: ./txt/planet-infomotions-com-4104.txt cache: ./cache/planet-infomotions-com-4104.html Content-Encoding UTF-8 Content-Type text/html; charset=UTF-8 Content-Type-Hint text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 dc:title Eric Lease Morgan's Writings Timeline resourceName b'planet-infomotions-com-4104.html' title Eric Lease Morgan's Writings Timeline === file2bib.sh === id: twitter-com-9838 author: title: twitter-com-9838 date: pages: extension: .html txt: ./txt/twitter-com-9838.txt cache: ./cache/twitter-com-9838.html Content-Encoding UTF-8 Content-Language en Content-Type text/html; charset=UTF-8 X-Parsed-By 
['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 4 apple-mobile-web-app-status-bar-style white apple-mobile-web-app-title Twitter fb:app_id 2231777543 google-site-verification V0yIS0Ec_o3Ii9KThrCoMCkwTYMMJ_JYx_RSaGhFYvw mobile-web-app-capable yes og:site_name Twitter origin-trial Apir4chqTX+4eFxKD+ErQlKRB/VtZ/dvnLfd9Y9Nenl5r1xJcf81alryTHYQiuUlz9Q49MqGXqyaiSmqWzHUqQwAAABneyJvcmlnaW4iOiJodHRwczovL3R3aXR0ZXIuY29tOjQ0MyIsImZlYXR1cmUiOiJDb250YWN0c01hbmFnZXIiLCJleHBpcnkiOjE1NzUwMzUyODMsImlzU3ViZG9tYWluIjp0cnVlfQ== resourceName b'twitter-com-9838.html' theme-color #ffffff viewport width=device-width,initial-scale=1,maximum-scale=1,user-scalable=0,viewport-fit=cover infomotions-com-9966 txt/../ent/infomotions-com-9966.ent tika-apache-org-2948 txt/../pos/tika-apache-org-2948.pos === file2bib.sh === id: stedolan-github-io-4569 author: title: jq date: pages: extension: .html txt: ./txt/stedolan-github-io-4569.txt cache: ./cache/stedolan-github-io-4569.html Content-Encoding UTF-8 Content-Language en Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 X-UA-Compatible IE=edge dc:title jq resourceName b'stedolan-github-io-4569.html' title jq viewport width=device-width, initial-scale=1 github-com-8025 txt/../wrd/github-com-8025.wrd planet-infomotions-com-9545 txt/../wrd/planet-infomotions-com-9545.wrd === file2bib.sh === id: distantreader-org-7009 author: title: Home date: pages: extension: .html txt: ./txt/distantreader-org-7009.txt cache: ./cache/distantreader-org-7009.html Content-Encoding UTF-8 Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler 
X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 X-UA-Compatible IE=edge dc:title Home description resourceName b'distantreader-org-7009.html' title Home viewport width=device-width, initial-scale=1 sites-nd-edu-3073 txt/../pos/sites-nd-edu-3073.pos tika-apache-org-2948 txt/../wrd/tika-apache-org-2948.wrd === file2bib.sh === id: curl-haxx-se-8721 author: title: curl date: pages: extension: .html txt: ./txt/curl-haxx-se-8721.txt cache: ./cache/curl-haxx-se-8721.html Content-Encoding UTF-8 Content-Language en Content-Type text/html; charset=UTF-8 Content-Type-Hint text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 dc:title curl resourceName b'curl-haxx-se-8721.html' title curl viewport width=device-width, initial-scale=1.0 === file2bib.sh === id: infomotions-com-3769 author: title: Fun with RSS and the RSS aggregator called Planet – Infomotions Mini-Musings date: pages: extension: .html txt: ./txt/infomotions-com-3769.txt cache: ./cache/infomotions-com-3769.html Content-Encoding UTF-8 Content-Language en-US Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 5 dc:title Fun with RSS and the RSS aggregator called Planet – Infomotions Mini-Musings generator WordPress 5.1.8 resourceName b'infomotions-com-3769.html' title Fun with RSS and the RSS aggregator called Planet – Infomotions Mini-Musings viewport width=device-width github-com-8326 txt/../ent/github-com-8326.ent === file2bib.sh === id: infomotions-com-953 author: Eric Lease Morgan title: Infomotions' Musings on Information and Librarianship date: pages: extension: .html txt: ./txt/infomotions-com-953.txt cache: ./cache/infomotions-com-953.html Content-Encoding ISO-8859-1 
Content-Language en-US Content-Type application/xhtml+xml; charset=ISO-8859-1 Content-Type-Hint text/html; charset=iso-8859-1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 2 creator Eric Lease Morgan dc:title Infomotions' Musings on Information and Librarianship description This is an (incomplete) collection of the things I've written -- my musings.... It contains pre-edited, formally published articles, travel logs, descriptions of software applications, and the hand-outs of workshops and presentations. Feel free to use the items in this collection as you see fit, but please don't call the works your own. Place the blame and/or credit where the blame and/or credit is due. identifier http://infomotions.com/musings/ publisher Infomotions, Inc. resourceName b'infomotions-com-953.html' rights This document is distributed under the GNU Public License. 
subject libraries, librarians, and librarianship title Infomotions' Musings on Information and Librarianship sites-nd-edu-3073 txt/../wrd/sites-nd-edu-3073.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point sites-nd-edu-1720 txt/../pos/sites-nd-edu-1720.pos sites-nd-edu-3118 txt/../pos/sites-nd-edu-3118.pos === file2bib.sh === id: distantreader-org-6471 author: title: Home date: pages: extension: .html txt: ./txt/distantreader-org-6471.txt cache: ./cache/distantreader-org-6471.html Content-Encoding UTF-8 Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 X-UA-Compatible IE=edge dc:title Home description resourceName b'distantreader-org-6471.html' title Home viewport width=device-width, initial-scale=1 sites-nd-edu-1720 txt/../wrd/sites-nd-edu-1720.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File 
"/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point === file2bib.sh === id: dh-crc-nd-edu-1806 author: title: DH Blog @ Notre Dame date: pages: extension: .xml txt: ./txt/dh-crc-nd-edu-1806.txt cache: ./cache/dh-crc-nd-edu-1806.xml Content-Type application/rss+xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.feed.FeedParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 5 dc:description Learning about human expression through the use of computers dc:title DH Blog @ Notre Dame description Learning about human expression through the use of computers resourceName b'dh-crc-nd-edu-1806.xml' title DH Blog @ Notre Dame === file2bib.sh === id: docs-pkp-sfu-ca-7101 author: title: REST API Reference, 3.1.x - Open Journal Systems date: pages: extension: .html txt: ./txt/docs-pkp-sfu-ca-7101.txt cache: ./cache/docs-pkp-sfu-ca-7101.html Content-Encoding UTF-8 Content-Language en-US Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 3 dc:title REST API Reference, 3.1.x - Open Journal Systems description This guide documents the REST API endpoints which can be accessed in Open Journal Systems v3.1.x. It is a technical reference for software developers who wish to build custom interactions with the platform. 
resourceName b'docs-pkp-sfu-ca-7101.html' title REST API Reference, 3.1.x - Open Journal Systems viewport width=device-width, initial-scale=1.0 sites-nd-edu-3118 txt/../wrd/sites-nd-edu-3118.wrd Traceback (most recent call last): File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) : File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words) File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw) File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean raise StatisticsError('mean requires at least one data point') statistics.StatisticsError: mean requires at least one data point === file2bib.sh === id: www-laurenceanthony-net-8779 author: title: Laurence Anthony's AntConc date: pages: extension: .html txt: ./txt/www-laurenceanthony-net-8779.txt cache: ./cache/www-laurenceanthony-net-8779.html Content-Encoding UTF-8 Content-Language en Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 4 author Laurence Anthony dc:title Laurence Anthony's AntConc description The website of Laurence Anthony. 
Professor at Waseda University Japan, developer of AntConc, a freeware concordancer software program for Windows, Linux, and Macintosh OS X keywords freeware concordance, concordancer, windows, linux, macintosh, mac, os x, osx, antconc, AntConc, AntMover, ESP, CELESE resourceName b'www-laurenceanthony-net-8779.html' title Laurence Anthony's AntConc viewport width=device-width, initial-scale=1.0 === file2bib.sh === id: pkp-sfu-ca-4628 author: title: Open Journal Systems | Public Knowledge Project date: pages: extension: .html txt: ./txt/pkp-sfu-ca-4628.txt cache: ./cache/pkp-sfu-ca-4628.html Content-Encoding UTF-8 Content-Language en-US Content-Type text/html; charset=UTF-8 Content-Type-Hint text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 6 dc:title Open Journal Systems | Public Knowledge Project description Public Knowledge Project - Open Journal Systems generator WordPress 5.5.3 resourceName b'pkp-sfu-ca-4628.html' title Open Journal Systems | Public Knowledge Project === file2bib.sh === id: github-com-379 author: title: GitHub - ericleasemorgan/reader-gutenberg: A system for implementing an index to Project Gutenberg date: pages: extension: .html txt: ./txt/github-com-379.txt cache: ./cache/github-com-379.html Content-Encoding UTF-8 Content-Language en Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 10 analytics-location // apple-itunes-app app-id=1477376905 browser-errors-url https://api.github.com/_private/browser/errors browser-optimizely-client-errors-url https://api.github.com/_private/browser/optimizely_client/errors browser-stats-url https://api.github.com/_private/browser/stats cookie-consent-required false dc:title 
GitHub - ericleasemorgan/reader-gutenberg: A system for implementing an index to Project Gutenberg description A system for implementing an index to Project Gutenberg - ericleasemorgan/reader-gutenberg enabled-features MARKETPLACE_PENDING_INSTALLATIONS expected-hostname github.com fb:app_id 1401488693436528 github-keyboard-shortcuts repository go-import github.com/ericleasemorgan/reader-gutenberg git https://github.com/ericleasemorgan/reader-gutenberg.git google-site-verification ['c1kuD-K2HIVF635lypcsWPoD4kilo5-jA_wBFyT4uMY', 'KT5gs8h0wvaagLKAVWq8bbeNwnZZK1r1XQysX3xurLU', 'ZzhVyEFwb7w3e0-uOTltm8Jsck2F5StVihD0exw2fsA', 'GXs5KoUUkNCoaAZn7wPN-t01Pywp9M3sEjnt_3_ZWPc'] hostname github.com hovercard-subject-tag repository:184406460 html-safe-nonce 3d9bc5b63a2bc016135aca9e96793c2fc3b4fce4567b5fb26c40264e94c43393 octolytics-app-id github octolytics-dimension-repository_explore_github_marketplace_ci_cta_shown false octolytics-dimension-repository_id 184406460 octolytics-dimension-repository_is_fork false octolytics-dimension-repository_network_root_id 184406460 octolytics-dimension-repository_network_root_nwo ericleasemorgan/reader-gutenberg octolytics-dimension-repository_nwo ericleasemorgan/reader-gutenberg octolytics-dimension-repository_public true octolytics-dimension-user_id 539005 octolytics-dimension-user_login ericleasemorgan octolytics-event-url https://collector.githubapp.com/github-external/browser_event octolytics-host collector.githubapp.com og:description A system for implementing an index to Project Gutenberg - ericleasemorgan/reader-gutenberg og:image https://avatars1.githubusercontent.com/u/539005?s=400&v=4 og:site_name GitHub og:title ericleasemorgan/reader-gutenberg og:type object og:url https://github.com/ericleasemorgan/reader-gutenberg optimizely-datafile {"version": "4", "rollouts": [], "typedAudiences": [], "anonymizeIP": true, "projectId": "16737760170", "variables": [], "featureFlags": [], "experiments": [{"status": "Running", "audienceIds": [], 
"variations": [{"variables": [], "id": "18630402174", "key": "launchpad"}, {"variables": [], "id": "18866331456", "key": "control"}], "id": "18651193356", "key": "_features_redesign_rollout", "layerId": "18645992876", "trafficAllocation": [{"entityId": "18630402174", "endOfRange": 500}, {"entityId": "18866331456", "endOfRange": 1000}, {"entityId": "18630402174", "endOfRange": 5000}, {"entityId": "18630402174", "endOfRange": 5500}, {"entityId": "18866331456", "endOfRange": 10000}], "forcedVariations": {"143327983.1601483920": "launchpad", "1955030087.1562868941": "launchpad", "1983887325.1550021416": "launchpad", "1947530619.1600461583": "launchpad"}}, {"status": "Running", "audienceIds": [], "variations": [{"variables": [], "id": "19136700362", "key": "show_plans"}, {"variables": [], "id": "19157700511", "key": "control"}], "id": "19062314978", "key": "account_billing_plans", "layerId": "19068014945", "trafficAllocation": [{"entityId": "19136700362", "endOfRange": 5000}, {"entityId": "19157700511", "endOfRange": 10000}], "forcedVariations": {"1238720267648ea2c88a74b410aa3c5c": "show_plans", "c4abf59d1620c671458b2a74df2a2410": "control"}}], "audiences": [{"conditions": "[\"or\", {\"match\": \"exact\", \"name\": \"$opt_dummy_attribute\", \"type\": \"custom_attribute\", \"value\": \"$opt_dummy_value\"}]", "id": "$opt_dummy_audience", "name": "Optimizely-Generated Audience for Backwards Compatibility"}], "groups": [], "attributes": [{"id": "16822470375", "key": "user_id"}, {"id": "17143601254", "key": "spammy"}, {"id": "18175660309", "key": "organization_plan"}, {"id": "18813001570", "key": "is_logged_in"}, {"id": "19073851829", "key": "geo"}], "botFiltering": false, "accountId": "16737760170", "events": [{"experimentIds": [], "id": "17911811441", "key": "hydro_click.dashboard.teacher_toolbox_cta"}, {"experimentIds": [], "id": "18124116703", "key": "submit.organizations.complete_sign_up"}, {"experimentIds": [], "id": "18145892387", "key": 
"no_metric.tracked_outside_of_optimizely"}, {"experimentIds": [], "id": "18178755568", "key": "click.org_onboarding_checklist.add_repo"}, {"experimentIds": [], "id": "18180553241", "key": "submit.repository_imports.create"}, {"experimentIds": [], "id": "18186103728", "key": "click.help.learn_more_about_repository_creation"}, {"experimentIds": [], "id": "18188530140", "key": "test_event.do_not_use_in_production"}, {"experimentIds": [], "id": "18191963644", "key": "click.empty_org_repo_cta.transfer_repository"}, {"experimentIds": [], "id": "18195612788", "key": "click.empty_org_repo_cta.import_repository"}, {"experimentIds": [], "id": "18210945499", "key": "click.org_onboarding_checklist.invite_members"}, {"experimentIds": [], "id": "18211063248", "key": "click.empty_org_repo_cta.create_repository"}, {"experimentIds": [], "id": "18215721889", "key": "click.org_onboarding_checklist.update_profile"}, {"experimentIds": [], "id": "18224360785", "key": "click.org_onboarding_checklist.dismiss"}, {"experimentIds": [], "id": "18234832286", "key": "submit.organization_activation.complete"}, {"experimentIds": [], "id": "18252392383", "key": "submit.org_repository.create"}, {"experimentIds": [], "id": "18257551537", "key": "submit.org_member_invitation.create"}, {"experimentIds": [], "id": "18259522260", "key": "submit.organization_profile.update"}, {"experimentIds": [], "id": "18564603625", "key": "view.classroom_select_organization"}, {"experimentIds": [], "id": "18568612016", "key": "click.classroom_sign_in_click"}, {"experimentIds": [], "id": "18572592540", "key": "view.classroom_name"}, {"experimentIds": [], "id": "18574203855", "key": "click.classroom_create_organization"}, {"experimentIds": [], "id": "18582053415", "key": "click.classroom_select_organization"}, {"experimentIds": [], "id": "18589463420", "key": "click.classroom_create_classroom"}, {"experimentIds": [], "id": "18591323364", "key": "click.classroom_create_first_classroom"}, {"experimentIds": [], "id": 
"18591652321", "key": "click.classroom_grant_access"}, {"experimentIds": [], "id": "18607131425", "key": "view.classroom_creation"}, {"experimentIds": [], "id": "18831680583", "key": "upgrade_account_plan"}, {"experimentIds": [], "id": "19064064515", "key": "click.signup"}, {"experimentIds": [], "id": "19075373687", "key": "click.view_account_billing_page"}, {"experimentIds": [], "id": "19077355841", "key": "click.dismiss_signup_prompt"}, {"experimentIds": ["19062314978"], "id": "19079713938", "key": "click.contact_sales"}, {"experimentIds": ["19062314978"], "id": "19120963070", "key": "click.compare_account_plans"}, {"experimentIds": ["19062314978"], "id": "19151690317", "key": "click.upgrade_account_cta"}], "revision": "338"} request-id CBC4:04A1:19BE609:24B3ED2:5FC25ECD resourceName b'github-com-379.html' theme-color #1e2327 title GitHub - ericleasemorgan/reader-gutenberg: A system for implementing an index to Project Gutenberg twitter:card summary twitter:description A system for implementing an index to Project Gutenberg - ericleasemorgan/reader-gutenberg twitter:image:src https://avatars1.githubusercontent.com/u/539005?s=400&v=4 twitter:site @github twitter:title ericleasemorgan/reader-gutenberg user-login viewport width=device-width visitor-hmac 39d653471abf758139166af29e79755c713618d52004eabf88f8700a427cb4e8 visitor-payload eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiJDQkM0OjA0QTE6MTlCRTYwOToyNEIzRUQyOjVGQzI1RUNEIiwidmlzaXRvcl9pZCI6IjY3NTU3NjIzNjczOTg4Mjk3NzIiLCJyZWdpb25fZWRnZSI6ImlhZCIsInJlZ2lvbl9yZW5kZXIiOiJpYWQifQ== x-pjax-version 2e8f71f42810547327b61d588a3a343a43cfff382213dc6ed94d44854f0d23c0 dh-crc-nd-edu-9558 txt/../wrd/dh-crc-nd-edu-9558.wrd === file2bib.sh === id: github-com-2983 author: title: GitHub - ericleasemorgan/reader-toolbox: A suite of scripts use to report on and analyze the content of Distant Reader "study carrels" date: pages: extension: .html txt: ./txt/github-com-2983.txt cache: ./cache/github-com-2983.html Content-Encoding UTF-8 
Content-Language en Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 22 analytics-location // apple-itunes-app app-id=1477376905 browser-errors-url https://api.github.com/_private/browser/errors browser-optimizely-client-errors-url https://api.github.com/_private/browser/optimizely_client/errors browser-stats-url https://api.github.com/_private/browser/stats cookie-consent-required false dc:title GitHub - ericleasemorgan/reader-toolbox: A suite of scripts use to report on and analyze the content of Distant Reader "study carrels" description A suite of scripts use to report on and analyze the content of Distant Reader "study carrels" - ericleasemorgan/reader-toolbox enabled-features MARKETPLACE_PENDING_INSTALLATIONS expected-hostname github.com fb:app_id 1401488693436528 github-keyboard-shortcuts repository go-import github.com/ericleasemorgan/reader-toolbox git https://github.com/ericleasemorgan/reader-toolbox.git google-site-verification ['c1kuD-K2HIVF635lypcsWPoD4kilo5-jA_wBFyT4uMY', 'KT5gs8h0wvaagLKAVWq8bbeNwnZZK1r1XQysX3xurLU', 'ZzhVyEFwb7w3e0-uOTltm8Jsck2F5StVihD0exw2fsA', 'GXs5KoUUkNCoaAZn7wPN-t01Pywp9M3sEjnt_3_ZWPc'] hostname github.com hovercard-subject-tag repository:202905079 html-safe-nonce 8ac3c55a00f6fc69c53ea24c32e2ce9e3d072ed7206c72cdafebec0143673c21 octolytics-app-id github octolytics-dimension-repository_explore_github_marketplace_ci_cta_shown false octolytics-dimension-repository_id 202905079 octolytics-dimension-repository_is_fork false octolytics-dimension-repository_network_root_id 202905079 octolytics-dimension-repository_network_root_nwo ericleasemorgan/reader-toolbox octolytics-dimension-repository_nwo ericleasemorgan/reader-toolbox octolytics-dimension-repository_public true octolytics-dimension-user_id 539005 octolytics-dimension-user_login ericleasemorgan 
octolytics-event-url https://collector.githubapp.com/github-external/browser_event octolytics-host collector.githubapp.com og:description A suite of scripts use to report on and analyze the content of Distant Reader "study carrels" - ericleasemorgan/reader-toolbox og:image https://avatars1.githubusercontent.com/u/539005?s=400&v=4 og:site_name GitHub og:title ericleasemorgan/reader-toolbox og:type object og:url https://github.com/ericleasemorgan/reader-toolbox optimizely-datafile {"version": "4", "rollouts": [], "typedAudiences": [], "anonymizeIP": true, "projectId": "16737760170", "variables": [], "featureFlags": [], "experiments": [{"status": "Running", "audienceIds": [], "variations": [{"variables": [], "id": "18630402174", "key": "launchpad"}, {"variables": [], "id": "18866331456", "key": "control"}], "id": "18651193356", "key": "_features_redesign_rollout", "layerId": "18645992876", "trafficAllocation": [{"entityId": "18630402174", "endOfRange": 500}, {"entityId": "18866331456", "endOfRange": 1000}, {"entityId": "18630402174", "endOfRange": 5000}, {"entityId": "18630402174", "endOfRange": 5500}, {"entityId": "18866331456", "endOfRange": 10000}], "forcedVariations": {"143327983.1601483920": "launchpad", "1955030087.1562868941": "launchpad", "1983887325.1550021416": "launchpad", "1947530619.1600461583": "launchpad"}}, {"status": "Running", "audienceIds": [], "variations": [{"variables": [], "id": "19136700362", "key": "show_plans"}, {"variables": [], "id": "19157700511", "key": "control"}], "id": "19062314978", "key": "account_billing_plans", "layerId": "19068014945", "trafficAllocation": [{"entityId": "19136700362", "endOfRange": 5000}, {"entityId": "19157700511", "endOfRange": 10000}], "forcedVariations": {"1238720267648ea2c88a74b410aa3c5c": "show_plans", "c4abf59d1620c671458b2a74df2a2410": "control"}}], "audiences": [{"conditions": "[\"or\", {\"match\": \"exact\", \"name\": \"$opt_dummy_attribute\", \"type\": \"custom_attribute\", \"value\": 
\"$opt_dummy_value\"}]", "id": "$opt_dummy_audience", "name": "Optimizely-Generated Audience for Backwards Compatibility"}], "groups": [], "attributes": [{"id": "16822470375", "key": "user_id"}, {"id": "17143601254", "key": "spammy"}, {"id": "18175660309", "key": "organization_plan"}, {"id": "18813001570", "key": "is_logged_in"}, {"id": "19073851829", "key": "geo"}], "botFiltering": false, "accountId": "16737760170", "events": [{"experimentIds": [], "id": "17911811441", "key": "hydro_click.dashboard.teacher_toolbox_cta"}, {"experimentIds": [], "id": "18124116703", "key": "submit.organizations.complete_sign_up"}, {"experimentIds": [], "id": "18145892387", "key": "no_metric.tracked_outside_of_optimizely"}, {"experimentIds": [], "id": "18178755568", "key": "click.org_onboarding_checklist.add_repo"}, {"experimentIds": [], "id": "18180553241", "key": "submit.repository_imports.create"}, {"experimentIds": [], "id": "18186103728", "key": "click.help.learn_more_about_repository_creation"}, {"experimentIds": [], "id": "18188530140", "key": "test_event.do_not_use_in_production"}, {"experimentIds": [], "id": "18191963644", "key": "click.empty_org_repo_cta.transfer_repository"}, {"experimentIds": [], "id": "18195612788", "key": "click.empty_org_repo_cta.import_repository"}, {"experimentIds": [], "id": "18210945499", "key": "click.org_onboarding_checklist.invite_members"}, {"experimentIds": [], "id": "18211063248", "key": "click.empty_org_repo_cta.create_repository"}, {"experimentIds": [], "id": "18215721889", "key": "click.org_onboarding_checklist.update_profile"}, {"experimentIds": [], "id": "18224360785", "key": "click.org_onboarding_checklist.dismiss"}, {"experimentIds": [], "id": "18234832286", "key": "submit.organization_activation.complete"}, {"experimentIds": [], "id": "18252392383", "key": "submit.org_repository.create"}, {"experimentIds": [], "id": "18257551537", "key": "submit.org_member_invitation.create"}, {"experimentIds": [], "id": "18259522260", "key": 
"submit.organization_profile.update"}, {"experimentIds": [], "id": "18564603625", "key": "view.classroom_select_organization"}, {"experimentIds": [], "id": "18568612016", "key": "click.classroom_sign_in_click"}, {"experimentIds": [], "id": "18572592540", "key": "view.classroom_name"}, {"experimentIds": [], "id": "18574203855", "key": "click.classroom_create_organization"}, {"experimentIds": [], "id": "18582053415", "key": "click.classroom_select_organization"}, {"experimentIds": [], "id": "18589463420", "key": "click.classroom_create_classroom"}, {"experimentIds": [], "id": "18591323364", "key": "click.classroom_create_first_classroom"}, {"experimentIds": [], "id": "18591652321", "key": "click.classroom_grant_access"}, {"experimentIds": [], "id": "18607131425", "key": "view.classroom_creation"}, {"experimentIds": [], "id": "18831680583", "key": "upgrade_account_plan"}, {"experimentIds": [], "id": "19064064515", "key": "click.signup"}, {"experimentIds": [], "id": "19075373687", "key": "click.view_account_billing_page"}, {"experimentIds": [], "id": "19077355841", "key": "click.dismiss_signup_prompt"}, {"experimentIds": ["19062314978"], "id": "19079713938", "key": "click.contact_sales"}, {"experimentIds": ["19062314978"], "id": "19120963070", "key": "click.compare_account_plans"}, {"experimentIds": ["19062314978"], "id": "19151690317", "key": "click.upgrade_account_cta"}], "revision": "338"} request-id CBB0:7D3F:2797549:35CD6A2:5FC25ECC resourceName b'github-com-2983.html' theme-color #1e2327 title GitHub - ericleasemorgan/reader-toolbox: A suite of scripts use to report on and analyze the content of Distant Reader "study carrels" twitter:card summary twitter:description A suite of scripts use to report on and analyze the content of Distant Reader "study carrels" - ericleasemorgan/reader-toolbox twitter:image:src https://avatars1.githubusercontent.com/u/539005?s=400&v=4 twitter:site @github twitter:title ericleasemorgan/reader-toolbox user-login viewport 
width=device-width visitor-hmac 140303e30ddbc3bce315bfec20dc8fd454d2d08ed4244c301f2e695be367f604 visitor-payload eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiJDQkIwOjdEM0Y6Mjc5NzU0OTozNUNENkEyOjVGQzI1RUNDIiwidmlzaXRvcl9pZCI6IjQxNzQyODE2NzU4MjE0NDA3MTYiLCJyZWdpb25fZWRnZSI6ImlhZCIsInJlZ2lvbl9yZW5kZXIiOiJpYWQifQ== x-pjax-version 2e8f71f42810547327b61d588a3a343a43cfff382213dc6ed94d44854f0d23c0 === file2bib.sh === id: github-com-7801 author: title: GitHub - ericleasemorgan/reader-lite: Given a file and a directory, output analysis of file to directory date: pages: extension: .html txt: ./txt/github-com-7801.txt cache: ./cache/github-com-7801.html Content-Encoding UTF-8 Content-Language en Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 9 analytics-location // apple-itunes-app app-id=1477376905 browser-errors-url https://api.github.com/_private/browser/errors browser-optimizely-client-errors-url https://api.github.com/_private/browser/optimizely_client/errors browser-stats-url https://api.github.com/_private/browser/stats cookie-consent-required false dc:title GitHub - ericleasemorgan/reader-lite: Given a file and a directory, output analysis of file to directory description Given a file and a directory, output analysis of file to directory - ericleasemorgan/reader-lite enabled-features MARKETPLACE_PENDING_INSTALLATIONS expected-hostname github.com fb:app_id 1401488693436528 github-keyboard-shortcuts repository go-import github.com/ericleasemorgan/reader-lite git https://github.com/ericleasemorgan/reader-lite.git google-site-verification ['c1kuD-K2HIVF635lypcsWPoD4kilo5-jA_wBFyT4uMY', 'KT5gs8h0wvaagLKAVWq8bbeNwnZZK1r1XQysX3xurLU', 'ZzhVyEFwb7w3e0-uOTltm8Jsck2F5StVihD0exw2fsA', 'GXs5KoUUkNCoaAZn7wPN-t01Pywp9M3sEjnt_3_ZWPc'] hostname github.com hovercard-subject-tag repository:227520485 html-safe-nonce 
4370482afcf527e6ecad848e8a803737e3032c7f9670553ed50388dad6dcbd98 octolytics-app-id github octolytics-dimension-repository_explore_github_marketplace_ci_cta_shown false octolytics-dimension-repository_id 227520485 octolytics-dimension-repository_is_fork false octolytics-dimension-repository_network_root_id 227520485 octolytics-dimension-repository_network_root_nwo ericleasemorgan/reader-lite octolytics-dimension-repository_nwo ericleasemorgan/reader-lite octolytics-dimension-repository_public true octolytics-dimension-user_id 539005 octolytics-dimension-user_login ericleasemorgan octolytics-event-url https://collector.githubapp.com/github-external/browser_event octolytics-host collector.githubapp.com og:description Given a file and a directory, output analysis of file to directory - ericleasemorgan/reader-lite og:image https://avatars1.githubusercontent.com/u/539005?s=400&v=4 og:site_name GitHub og:title ericleasemorgan/reader-lite og:type object og:url https://github.com/ericleasemorgan/reader-lite
request-id
CBA6:14BB:29AD218:38BE668:5FC25ECC resourceName b'github-com-7801.html' theme-color #1e2327 title GitHub - ericleasemorgan/reader-lite: Given a file and a directory, output analysis of file to directory twitter:card summary twitter:description Given a file and a directory, output analysis of file to directory - ericleasemorgan/reader-lite twitter:image:src https://avatars1.githubusercontent.com/u/539005?s=400&v=4 twitter:site @github twitter:title ericleasemorgan/reader-lite user-login viewport width=device-width visitor-hmac 8366c979338f47e26f91fd209711d002777f0fd66df9f669eb3aefe8e2b87b0f visitor-payload eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiJDQkE2OjE0QkI6MjlBRDIxODozOEJFNjY4OjVGQzI1RUNDIiwidmlzaXRvcl9pZCI6IjEwNjQ2MzMyMTgwMTYwMjA0IiwicmVnaW9uX2VkZ2UiOiJpYWQiLCJyZWdpb25fcmVuZGVyIjoiaWFkIn0= x-pjax-version 2e8f71f42810547327b61d588a3a343a43cfff382213dc6ed94d44854f0d23c0 dh-crc-nd-edu-9558 txt/../pos/dh-crc-nd-edu-9558.pos === file2bib.sh === id: github-com-8326 author: title: GitHub - ericleasemorgan/htid2books: Given an access key, secret token, and a HathiTrust identifier, output plain text as well as PDF versions of a book. date: pages: extension: .html txt: ./txt/github-com-8326.txt cache: ./cache/github-com-8326.html Content-Encoding UTF-8 Content-Language en Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 10 analytics-location // apple-itunes-app app-id=1477376905 browser-errors-url https://api.github.com/_private/browser/errors browser-optimizely-client-errors-url https://api.github.com/_private/browser/optimizely_client/errors browser-stats-url https://api.github.com/_private/browser/stats cookie-consent-required false dc:title GitHub - ericleasemorgan/htid2books: Given an access key, secret token, and a HathiTrust identifier, output plain text as well as PDF versions of a book. 
description Given an access key, secret token, and a HathiTrust identifier, output plain text as well as PDF versions of a book. - ericleasemorgan/htid2books enabled-features MARKETPLACE_PENDING_INSTALLATIONS expected-hostname github.com fb:app_id 1401488693436528 github-keyboard-shortcuts repository go-import github.com/ericleasemorgan/htid2books git https://github.com/ericleasemorgan/htid2books.git google-site-verification ['c1kuD-K2HIVF635lypcsWPoD4kilo5-jA_wBFyT4uMY', 'KT5gs8h0wvaagLKAVWq8bbeNwnZZK1r1XQysX3xurLU', 'ZzhVyEFwb7w3e0-uOTltm8Jsck2F5StVihD0exw2fsA', 'GXs5KoUUkNCoaAZn7wPN-t01Pywp9M3sEjnt_3_ZWPc'] hostname github.com hovercard-subject-tag repository:171007366 html-safe-nonce a7e877d73b9f87a81f0c6619afc9fd5e8a2de68d303bf1289a91d5802d30db48 octolytics-app-id github octolytics-dimension-repository_explore_github_marketplace_ci_cta_shown false octolytics-dimension-repository_id 171007366 octolytics-dimension-repository_is_fork false octolytics-dimension-repository_network_root_id 171007366 octolytics-dimension-repository_network_root_nwo ericleasemorgan/htid2books octolytics-dimension-repository_nwo ericleasemorgan/htid2books octolytics-dimension-repository_public true octolytics-dimension-user_id 539005 octolytics-dimension-user_login ericleasemorgan octolytics-event-url https://collector.githubapp.com/github-external/browser_event octolytics-host collector.githubapp.com og:description Given an access key, secret token, and a HathiTrust identifier, output plain text as well as PDF versions of a book. 
- ericleasemorgan/htid2books og:image https://avatars1.githubusercontent.com/u/539005?s=400&v=4 og:site_name GitHub og:title ericleasemorgan/htid2books og:type object og:url https://github.com/ericleasemorgan/htid2books
request-id CBAA:7917:408CA93:52D4314:5FC25ECC resourceName b'github-com-8326.html' theme-color #1e2327 title GitHub - ericleasemorgan/htid2books: Given an access key, secret token, and a HathiTrust identifier, output plain text as well as PDF versions of a book. twitter:card summary twitter:description Given an access key, secret token, and a HathiTrust identifier, output plain text as well as PDF versions of a book.
- ericleasemorgan/htid2books twitter:image:src https://avatars1.githubusercontent.com/u/539005?s=400&v=4 twitter:site @github twitter:title ericleasemorgan/htid2books user-login viewport width=device-width visitor-hmac eb25167276341f2650f82511d5755e9fdb3f6bf190afd14e6c0ab4858908d24c visitor-payload eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiJDQkFBOjc5MTc6NDA4Q0E5Mzo1MkQ0MzE0OjVGQzI1RUNDIiwidmlzaXRvcl9pZCI6IjE3MjYxNjUwMjQ2NDM3MjQyOCIsInJlZ2lvbl9lZGdlIjoiaWFkIiwicmVnaW9uX3JlbmRlciI6ImlhZCJ9 x-pjax-version 2e8f71f42810547327b61d588a3a343a43cfff382213dc6ed94d44854f0d23c0 === file2bib.sh === id: github-com-9780 author: title: GitHub - ericleasemorgan/reader: Distant Reader, a tool for using & understanding a corpus date: pages: extension: .html txt: ./txt/github-com-9780.txt cache: ./cache/github-com-9780.html Content-Encoding UTF-8 Content-Language en Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 11 analytics-location // apple-itunes-app app-id=1477376905 browser-errors-url https://api.github.com/_private/browser/errors browser-optimizely-client-errors-url https://api.github.com/_private/browser/optimizely_client/errors browser-stats-url https://api.github.com/_private/browser/stats cookie-consent-required false dc:title GitHub - ericleasemorgan/reader: Distant Reader, a tool for using & understanding a corpus description Distant Reader, a tool for using & understanding a corpus - ericleasemorgan/reader enabled-features MARKETPLACE_PENDING_INSTALLATIONS expected-hostname github.com fb:app_id 1401488693436528 github-keyboard-shortcuts repository go-import github.com/ericleasemorgan/reader git https://github.com/ericleasemorgan/reader.git google-site-verification ['c1kuD-K2HIVF635lypcsWPoD4kilo5-jA_wBFyT4uMY', 'KT5gs8h0wvaagLKAVWq8bbeNwnZZK1r1XQysX3xurLU', 
'ZzhVyEFwb7w3e0-uOTltm8Jsck2F5StVihD0exw2fsA', 'GXs5KoUUkNCoaAZn7wPN-t01Pywp9M3sEjnt_3_ZWPc'] hostname github.com hovercard-subject-tag repository:139059669 html-safe-nonce aeb4f143e0b1a10d3857c2f9db67ab32788b35588415ce55e3739194d6a1a598 octolytics-app-id github octolytics-dimension-repository_explore_github_marketplace_ci_cta_shown false octolytics-dimension-repository_id 139059669 octolytics-dimension-repository_is_fork false octolytics-dimension-repository_network_root_id 139059669 octolytics-dimension-repository_network_root_nwo ericleasemorgan/reader octolytics-dimension-repository_nwo ericleasemorgan/reader octolytics-dimension-repository_public true octolytics-dimension-user_id 539005 octolytics-dimension-user_login ericleasemorgan octolytics-event-url https://collector.githubapp.com/github-external/browser_event octolytics-host collector.githubapp.com og:description Distant Reader, a tool for using & understanding a corpus - ericleasemorgan/reader og:image https://avatars1.githubusercontent.com/u/539005?s=400&v=4 og:site_name GitHub og:title ericleasemorgan/reader og:type object og:url https://github.com/ericleasemorgan/reader
request-id B54A:1B67:4084171:5306E53:5FC25ECD resourceName b'github-com-9780.html' theme-color #1e2327 title GitHub - ericleasemorgan/reader: Distant Reader, a tool for using & understanding a corpus twitter:card summary twitter:description Distant Reader, a tool for using & understanding a corpus - ericleasemorgan/reader twitter:image:src https://avatars1.githubusercontent.com/u/539005?s=400&v=4 twitter:site @github twitter:title ericleasemorgan/reader user-login viewport width=device-width visitor-hmac 14e528972b743ccc4b298d65341c28c591f30b6814c6d8974cf318a1e91a8372 visitor-payload eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiJCNTRBOjFCNjc6NDA4NDE3MTo1MzA2RTUzOjVGQzI1RUNEIiwidmlzaXRvcl9pZCI6IjYyNDg4NDgzMjQwMDc0NTIzNjUiLCJyZWdpb25fZWRnZSI6ImlhZCIsInJlZ2lvbl9yZW5kZXIiOiJpYWQifQ== x-pjax-version 2e8f71f42810547327b61d588a3a343a43cfff382213dc6ed94d44854f0d23c0 === file2bib.sh === id: github-com-8025 author: title: GitHub - ericleasemorgan/ojs-toolbox: Given a Open Journal System (OJS) root URL and an authorization token, cache all JSON files associated with the given OJS title, and optionally output rudimentary bibliographics in the form of a tab-separated value (TSV) stream.
date: pages: extension: .html txt: ./txt/github-com-8025.txt cache: ./cache/github-com-8025.html Content-Encoding UTF-8 Content-Language en Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 10 analytics-location // apple-itunes-app app-id=1477376905 browser-errors-url https://api.github.com/_private/browser/errors browser-optimizely-client-errors-url https://api.github.com/_private/browser/optimizely_client/errors browser-stats-url https://api.github.com/_private/browser/stats cookie-consent-required false dc:title GitHub - ericleasemorgan/ojs-toolbox: Given a Open Journal System (OJS) root URL and an authorization token, cache all JSON files associated with the given OJS title, and optionally output rudimentary bibliographics in the form of a tab-separated value (TSV) stream. description Given a Open Journal System (OJS) root URL and an authorization token, cache all JSON files associated with the given OJS title, and optionally output rudimentary bibliographics in the form of a tab-separated value (TSV) stream. 
- ericleasemorgan/ojs-toolbox enabled-features MARKETPLACE_PENDING_INSTALLATIONS expected-hostname github.com fb:app_id 1401488693436528 github-keyboard-shortcuts repository go-import github.com/ericleasemorgan/ojs-toolbox git https://github.com/ericleasemorgan/ojs-toolbox.git google-site-verification ['c1kuD-K2HIVF635lypcsWPoD4kilo5-jA_wBFyT4uMY', 'KT5gs8h0wvaagLKAVWq8bbeNwnZZK1r1XQysX3xurLU', 'ZzhVyEFwb7w3e0-uOTltm8Jsck2F5StVihD0exw2fsA', 'GXs5KoUUkNCoaAZn7wPN-t01Pywp9M3sEjnt_3_ZWPc'] hostname github.com hovercard-subject-tag repository:216553348 html-safe-nonce d5ad2f1598a4ae41d90ea3274d8d8efa4dda4c92eda27aeb8412acaaf800059f octolytics-app-id github octolytics-dimension-repository_explore_github_marketplace_ci_cta_shown false octolytics-dimension-repository_id 216553348 octolytics-dimension-repository_is_fork false octolytics-dimension-repository_network_root_id 216553348 octolytics-dimension-repository_network_root_nwo ericleasemorgan/ojs-toolbox octolytics-dimension-repository_nwo ericleasemorgan/ojs-toolbox octolytics-dimension-repository_public true octolytics-dimension-user_id 539005 octolytics-dimension-user_login ericleasemorgan octolytics-event-url https://collector.githubapp.com/github-external/browser_event octolytics-host collector.githubapp.com og:description Given a Open Journal System (OJS) root URL and an authorization token, cache all JSON files associated with the given OJS title, and optionally output rudimentary bibliographics in the form of a ta... 
og:image https://avatars1.githubusercontent.com/u/539005?s=400&v=4 og:site_name GitHub og:title ericleasemorgan/ojs-toolbox og:type object og:url https://github.com/ericleasemorgan/ojs-toolbox optimizely-datafile {"version": "4", "rollouts": [], "typedAudiences": [], "anonymizeIP": true, "projectId": "16737760170", "variables": [], "featureFlags": [], "experiments": [{"status": "Running", "audienceIds": [], "variations": [{"variables": [], "id": "18630402174", "key": "launchpad"}, {"variables": [], "id": "18866331456", "key": "control"}], "id": "18651193356", "key": "_features_redesign_rollout", "layerId": "18645992876", "trafficAllocation": [{"entityId": "18630402174", "endOfRange": 500}, {"entityId": "18866331456", "endOfRange": 1000}, {"entityId": "18630402174", "endOfRange": 5000}, {"entityId": "18630402174", "endOfRange": 5500}, {"entityId": "18866331456", "endOfRange": 10000}], "forcedVariations": {"143327983.1601483920": "launchpad", "1955030087.1562868941": "launchpad", "1983887325.1550021416": "launchpad", "1947530619.1600461583": "launchpad"}}, {"status": "Running", "audienceIds": [], "variations": [{"variables": [], "id": "19136700362", "key": "show_plans"}, {"variables": [], "id": "19157700511", "key": "control"}], "id": "19062314978", "key": "account_billing_plans", "layerId": "19068014945", "trafficAllocation": [{"entityId": "19136700362", "endOfRange": 5000}, {"entityId": "19157700511", "endOfRange": 10000}], "forcedVariations": {"1238720267648ea2c88a74b410aa3c5c": "show_plans", "c4abf59d1620c671458b2a74df2a2410": "control"}}], "audiences": [{"conditions": "[\"or\", {\"match\": \"exact\", \"name\": \"$opt_dummy_attribute\", \"type\": \"custom_attribute\", \"value\": \"$opt_dummy_value\"}]", "id": "$opt_dummy_audience", "name": "Optimizely-Generated Audience for Backwards Compatibility"}], "groups": [], "attributes": [{"id": "16822470375", "key": "user_id"}, {"id": "17143601254", "key": "spammy"}, {"id": "18175660309", "key": "organization_plan"}, 
{"id": "18813001570", "key": "is_logged_in"}, {"id": "19073851829", "key": "geo"}], "botFiltering": false, "accountId": "16737760170", "events": [{"experimentIds": [], "id": "17911811441", "key": "hydro_click.dashboard.teacher_toolbox_cta"}, {"experimentIds": [], "id": "18124116703", "key": "submit.organizations.complete_sign_up"}, {"experimentIds": [], "id": "18145892387", "key": "no_metric.tracked_outside_of_optimizely"}, {"experimentIds": [], "id": "18178755568", "key": "click.org_onboarding_checklist.add_repo"}, {"experimentIds": [], "id": "18180553241", "key": "submit.repository_imports.create"}, {"experimentIds": [], "id": "18186103728", "key": "click.help.learn_more_about_repository_creation"}, {"experimentIds": [], "id": "18188530140", "key": "test_event.do_not_use_in_production"}, {"experimentIds": [], "id": "18191963644", "key": "click.empty_org_repo_cta.transfer_repository"}, {"experimentIds": [], "id": "18195612788", "key": "click.empty_org_repo_cta.import_repository"}, {"experimentIds": [], "id": "18210945499", "key": "click.org_onboarding_checklist.invite_members"}, {"experimentIds": [], "id": "18211063248", "key": "click.empty_org_repo_cta.create_repository"}, {"experimentIds": [], "id": "18215721889", "key": "click.org_onboarding_checklist.update_profile"}, {"experimentIds": [], "id": "18224360785", "key": "click.org_onboarding_checklist.dismiss"}, {"experimentIds": [], "id": "18234832286", "key": "submit.organization_activation.complete"}, {"experimentIds": [], "id": "18252392383", "key": "submit.org_repository.create"}, {"experimentIds": [], "id": "18257551537", "key": "submit.org_member_invitation.create"}, {"experimentIds": [], "id": "18259522260", "key": "submit.organization_profile.update"}, {"experimentIds": [], "id": "18564603625", "key": "view.classroom_select_organization"}, {"experimentIds": [], "id": "18568612016", "key": "click.classroom_sign_in_click"}, {"experimentIds": [], "id": "18572592540", "key": "view.classroom_name"}, 
{"experimentIds": [], "id": "18574203855", "key": "click.classroom_create_organization"}, {"experimentIds": [], "id": "18582053415", "key": "click.classroom_select_organization"}, {"experimentIds": [], "id": "18589463420", "key": "click.classroom_create_classroom"}, {"experimentIds": [], "id": "18591323364", "key": "click.classroom_create_first_classroom"}, {"experimentIds": [], "id": "18591652321", "key": "click.classroom_grant_access"}, {"experimentIds": [], "id": "18607131425", "key": "view.classroom_creation"}, {"experimentIds": [], "id": "18831680583", "key": "upgrade_account_plan"}, {"experimentIds": [], "id": "19064064515", "key": "click.signup"}, {"experimentIds": [], "id": "19075373687", "key": "click.view_account_billing_page"}, {"experimentIds": [], "id": "19077355841", "key": "click.dismiss_signup_prompt"}, {"experimentIds": ["19062314978"], "id": "19079713938", "key": "click.contact_sales"}, {"experimentIds": ["19062314978"], "id": "19120963070", "key": "click.compare_account_plans"}, {"experimentIds": ["19062314978"], "id": "19151690317", "key": "click.upgrade_account_cta"}], "revision": "338"} request-id CBC2:0413:3AF5509:4C125A8:5FC25ECC resourceName b'github-com-8025.html' theme-color #1e2327 title GitHub - ericleasemorgan/ojs-toolbox: Given a Open Journal System (OJS) root URL and an authorization token, cache all JSON files associated with the given OJS title, and optionally output rudimentary bibliographics in the form of a tab-separated value (TSV) stream. twitter:card summary twitter:description Given a Open Journal System (OJS) root URL and an authorization token, cache all JSON files associated with the given OJS title, and optionally output rudimentary bibliographics in the form of a ta... 
twitter:image:src https://avatars1.githubusercontent.com/u/539005?s=400&v=4 twitter:site @github twitter:title ericleasemorgan/ojs-toolbox user-login viewport width=device-width visitor-hmac 11c6242ac4a482007103fc6e40a9142d0cccf086eeae51c4eb48090deef34cb4 visitor-payload eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiJDQkMyOjA0MTM6M0FGNTUwOTo0QzEyNUE4OjVGQzI1RUNDIiwidmlzaXRvcl9pZCI6IjExOTkzNzE1NTYwNDQ4MjQyNjgiLCJyZWdpb25fZWRnZSI6ImlhZCIsInJlZ2lvbl9yZW5kZXIiOiJpYWQifQ== x-pjax-version 2e8f71f42810547327b61d588a3a343a43cfff382213dc6ed94d44854f0d23c0 === file2bib.sh === id: www-gnu-org-8892 author: title: GNU Parallel - GNU Project - Free Software Foundation date: pages: extension: .html txt: ./txt/www-gnu-org-8892.txt cache: ./cache/www-gnu-org-8892.html Content-Encoding UTF-8 Content-Language en Content-Type application/xhtml+xml; charset=UTF-8 Content-Type-Hint text/html; charset=utf-8 DC.title gnu.org ICBM 42.355469, -71.058627 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 4 dc:title GNU Parallel - GNU Project - Free Software Foundation geo:lat 42.355469 geo:long -71.058627 resourceName b'www-gnu-org-8892.html' title GNU Parallel - GNU Project - Free Software Foundation viewport width=device-width, initial-scale=1 === file2bib.sh === id: github-com-8202 author: title: GitHub - senderle/topic-modeling-tool: A point-and-click tool for creating and analyzing topic models produced by MALLET. 
date: pages: extension: .html txt: ./txt/github-com-8202.txt cache: ./cache/github-com-8202.html Content-Encoding UTF-8 Content-Language en Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 22 analytics-location // apple-itunes-app app-id=1477376905 browser-errors-url https://api.github.com/_private/browser/errors browser-optimizely-client-errors-url https://api.github.com/_private/browser/optimizely_client/errors browser-stats-url https://api.github.com/_private/browser/stats cookie-consent-required false dc:title GitHub - senderle/topic-modeling-tool: A point-and-click tool for creating and analyzing topic models produced by MALLET. description A point-and-click tool for creating and analyzing topic models produced by MALLET. - senderle/topic-modeling-tool enabled-features MARKETPLACE_PENDING_INSTALLATIONS expected-hostname github.com fb:app_id 1401488693436528 github-keyboard-shortcuts repository go-import github.com/senderle/topic-modeling-tool git https://github.com/senderle/topic-modeling-tool.git google-site-verification ['c1kuD-K2HIVF635lypcsWPoD4kilo5-jA_wBFyT4uMY', 'KT5gs8h0wvaagLKAVWq8bbeNwnZZK1r1XQysX3xurLU', 'ZzhVyEFwb7w3e0-uOTltm8Jsck2F5StVihD0exw2fsA', 'GXs5KoUUkNCoaAZn7wPN-t01Pywp9M3sEjnt_3_ZWPc'] hostname github.com hovercard-subject-tag repository:47996186 html-safe-nonce 49fe2ea6d55073c0add5afe38a65960ab38c5359824385b21fe8d4a7c058ac2f octolytics-app-id github octolytics-dimension-repository_explore_github_marketplace_ci_cta_shown false octolytics-dimension-repository_id 47996186 octolytics-dimension-repository_is_fork false octolytics-dimension-repository_network_root_id 47996186 octolytics-dimension-repository_network_root_nwo senderle/topic-modeling-tool octolytics-dimension-repository_nwo senderle/topic-modeling-tool octolytics-dimension-repository_public true 
octolytics-dimension-user_id 4257267 octolytics-dimension-user_login senderle octolytics-event-url https://collector.githubapp.com/github-external/browser_event octolytics-host collector.githubapp.com og:description A point-and-click tool for creating and analyzing topic models produced by MALLET. - senderle/topic-modeling-tool og:image https://avatars2.githubusercontent.com/u/4257267?s=400&v=4 og:site_name GitHub og:title senderle/topic-modeling-tool og:type object og:url https://github.com/senderle/topic-modeling-tool
request-id CBC8:7051:282E57D:36957EE:5FC25ECD resourceName b'github-com-8202.html' theme-color #1e2327 title GitHub - senderle/topic-modeling-tool: A point-and-click tool for creating and analyzing topic models produced by MALLET. twitter:card summary twitter:description A point-and-click tool for creating and analyzing topic models produced by MALLET.
- senderle/topic-modeling-tool twitter:image:src https://avatars2.githubusercontent.com/u/4257267?s=400&v=4 twitter:site @github twitter:title senderle/topic-modeling-tool user-login viewport width=device-width visitor-hmac d114b467f020d64cbdb047488083cd584e3ee6977989eff206d3307e9ae0d7df visitor-payload eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiJDQkM4OjcwNTE6MjgyRTU3RDozNjk1N0VFOjVGQzI1RUNEIiwidmlzaXRvcl9pZCI6IjUwODUyNzcxNjg4MTcyOTkxNDkiLCJyZWdpb25fZWRnZSI6ImlhZCIsInJlZ2lvbl9yZW5kZXIiOiJpYWQifQ== x-pjax-version 2e8f71f42810547327b61d588a3a343a43cfff382213dc6ed94d44854f0d23c0 === file2bib.sh === id: infomotions-com-9966 author: title: Michael Hart in Roanoke (Indiana) – Infomotions Mini-Musings date: pages: extension: .html txt: ./txt/infomotions-com-9966.txt cache: ./cache/infomotions-com-9966.html Content-Encoding UTF-8 Content-Language en-US Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 6 dc:title Michael Hart in Roanoke (Indiana) – Infomotions Mini-Musings generator WordPress 5.1.8 resourceName b'infomotions-com-9966.html' title Michael Hart in Roanoke (Indiana) – Infomotions Mini-Musings viewport width=device-width === file2bib.sh === id: tika-apache-org-2948 author: title: Apache Tika – Apache Tika date: pages: extension: .html txt: ./txt/tika-apache-org-2948.txt cache: ./cache/tika-apache-org-2948.html Content-Encoding UTF-8 Content-Type application/xhtml+xml; charset=UTF-8 Content-Type-Hint text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 5 dc:title Apache Tika – Apache Tika resourceName b'tika-apache-org-2948.html' title Apache Tika – Apache Tika dh-crc-nd-edu-9558 txt/../ent/dh-crc-nd-edu-9558.ent === file2bib.sh === id: 
planet-infomotions-com-9545 author: title: Planet Eric Lease Morgan date: pages: extension: .xml txt: ./txt/planet-infomotions-com-9545.txt cache: ./cache/planet-infomotions-com-9545.xml Content-Type application/atom+xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.feed.FeedParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 273 dc:description dc:title Planet Eric Lease Morgan description resourceName b'planet-infomotions-com-9545.xml' title Planet Eric Lease Morgan === file2bib.sh === id: dh-crc-nd-edu-9558 author: title: DH Blog @ Notre Dame | Learning about human expression through the use of computers date: pages: extension: .html txt: ./txt/dh-crc-nd-edu-9558.txt cache: ./cache/dh-crc-nd-edu-9558.html Content-Encoding UTF-8 Content-Language en-US Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 6 dc:title DH Blog @ Notre Dame | Learning about human expression through the use of computers generator WordPress 5.1.8 og:description Learning about human expression through the use of computers og:image https://s0.wp.com/i/blank.jpg og:locale en_US og:site_name DH Blog @ Notre Dame og:title DH Blog @ Notre Dame og:type website og:url http://dh.crc.nd.edu/blog/ resourceName b'dh-crc-nd-edu-9558.html' title DH Blog @ Notre Dame | Learning about human expression through the use of computers viewport width=device-width infomotions-com-9318 txt/../pos/infomotions-com-9318.pos infomotions-com-6757 txt/../pos/infomotions-com-6757.pos infomotions-com-6757 txt/../wrd/infomotions-com-6757.wrd infomotions-com-9318 txt/../wrd/infomotions-com-9318.wrd infomotions-com-9318 txt/../ent/infomotions-com-9318.ent infomotions-com-6757 txt/../ent/infomotions-com-6757.ent === file2bib.sh === id: infomotions-com-6757 author: title: 
Infomotions Mini-Musings date: pages: extension: .xml txt: ./txt/infomotions-com-6757.txt cache: ./cache/infomotions-com-6757.xml Content-Type application/rss+xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.feed.FeedParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 69 dc:description Artist- and Librarian-At-Large dc:title Infomotions Mini-Musings description Artist- and Librarian-At-Large resourceName b'infomotions-com-6757.xml' title Infomotions Mini-Musings === file2bib.sh === id: infomotions-com-9318 author: title: Infomotions' Musings on Information and Librarianship date: pages: extension: .xml txt: ./txt/infomotions-com-9318.txt cache: ./cache/infomotions-com-9318.xml Content-Type application/rss+xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.feed.FeedParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 78 dc:description This is a list of travel logs, software descriptions, formally published articles, or presentation handouts. Four are the most recently written. One is randomly chosen, and one is selected from the archives. All of these items, Musings on Information and Librarianship, were written by Eric Lease Morgan. dc:title Infomotions' Musings on Information and Librarianship description This is a list of travel logs, software descriptions, formally published articles, or presentation handouts. Four are the most recently written. One is randomly chosen, and one is selected from the archives. All of these items, Musings on Information and Librarianship, were written by Eric Lease Morgan. 
resourceName b'infomotions-com-9318.xml' title Infomotions' Musings on Information and Librarianship infomotions-com-2987 txt/../pos/infomotions-com-2987.pos infomotions-com-9504 txt/../wrd/infomotions-com-9504.wrd infomotions-com-9504 txt/../pos/infomotions-com-9504.pos infomotions-com-2987 txt/../wrd/infomotions-com-2987.wrd infomotions-com-9504 txt/../ent/infomotions-com-9504.ent infomotions-com-2987 txt/../ent/infomotions-com-2987.ent === file2bib.sh === id: infomotions-com-9504 author: title: Infomotions Mini-Musings – Artist- and Librarian-At-Large date: pages: extension: .html txt: ./txt/infomotions-com-9504.txt cache: ./cache/infomotions-com-9504.html Content-Encoding UTF-8 Content-Language en-US Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 111 dc:title Infomotions Mini-Musings – Artist- and Librarian-At-Large generator WordPress 5.1.8 resourceName b'infomotions-com-9504.html' title Infomotions Mini-Musings – Artist- and Librarian-At-Large viewport width=device-width === file2bib.sh === id: infomotions-com-2987 author: title: Infomotions Mini-Musings – Artist- and Librarian-At-Large date: pages: extension: .html txt: ./txt/infomotions-com-2987.txt cache: ./cache/infomotions-com-2987.html Content-Encoding UTF-8 Content-Language en-US Content-Type text/html; charset=UTF-8 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.html.HtmlParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 81 dc:title Infomotions Mini-Musings – Artist- and Librarian-At-Large generator WordPress 5.1.8 resourceName b'infomotions-com-2987.html' title Infomotions Mini-Musings – Artist- and Librarian-At-Large viewport width=device-width planet-infomotions-com-8900 txt/../pos/planet-infomotions-com-8900.pos planet-infomotions-com-8900 
txt/../wrd/planet-infomotions-com-8900.wrd planet-infomotions-com-3359 txt/../pos/planet-infomotions-com-3359.pos planet-infomotions-com-3359 txt/../wrd/planet-infomotions-com-3359.wrd planet-infomotions-com-8900 txt/../ent/planet-infomotions-com-8900.ent planet-infomotions-com-3359 txt/../ent/planet-infomotions-com-3359.ent === file2bib.sh === id: planet-infomotions-com-8900 author: title: Planet Eric Lease Morgan date: pages: extension: .xml txt: ./txt/planet-infomotions-com-8900.txt cache: ./cache/planet-infomotions-com-8900.xml Content-Type application/rss+xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.feed.FeedParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 563 dc:description Planet Eric Lease Morgan - http://planet.infomotions.com/ dc:title Planet Eric Lease Morgan description Planet Eric Lease Morgan - http://planet.infomotions.com/ resourceName b'planet-infomotions-com-8900.xml' title Planet Eric Lease Morgan === file2bib.sh === id: planet-infomotions-com-3359 author: title: planet-infomotions-com-3359 date: pages: extension: .xml txt: ./txt/planet-infomotions-com-3359.txt cache: ./cache/planet-infomotions-com-3359.xml Content-Type application/rdf+xml Creation-Date 1978-06-30T05:00:00+00:00 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 178 dcterms:created 1978-06-30T05:00:00+00:00 meta:creation-date 1978-06-30T05:00:00+00:00 resourceName b'planet-infomotions-com-3359.xml' Done mapping. 
Reducing planet-infomotions === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === id = infomotions-com-9318 author = title = Infomotions' Musings on Information and Librarianship date = pages = extension = .xml mime = application/rss+xml words = 13253 sentences = 1102 flesch = 57 summary = Keywords: user-centered design; SOCHE; presentations; librarianship; Source: This essay was never formally published, but it was created for Southwestern Ohio Council for Higher Education (SOCHE) and a conference called 'The Human Face of Information (technology)' Wednesday, May 6, 2009 at Wright State University. In a sentence I learned two things: 1) institutional repository software such as Fedora, DSpace, and EPrints are increasingly being used for more than open access publishing efforts, and 2) the Web Services API of Fedora makes it relatively easy for developers using any programming language to interface with the underlying core. Keywords: Gruene, Texas; institutional repositories; digital libraries; travel log; Source: This file was never formally published. The purpose of OCKHAM is to articulate and design a set of "light weight reference models" for creating and maintaining digital library services and collections. Keywords: OCKHAM (Open Community Knowledge Hypermedia Administration and Metadata); Atlanta, GA; travel log; Source: This text was never published.
cache = ./cache/infomotions-com-9318.xml txt = ./txt/infomotions-com-9318.txt === reduce.pl bib === id = planet-infomotions-com-9545 author = title = Planet Eric Lease Morgan date = pages = extension = .xml mime = application/atom+xml words = 3844 sentences = 362 flesch = 57 summary = Rome in three days, an archivists introduction to linked data publishing Questions from a library science student about RDF and linked data Publishing archival descriptions as linked data via databases Simple linked data recipe for libraries, museums, and archives Cloud-sourcing Research Collections: Managing Print in the Mass-digitized Library Environment Selected Internet Resources on Digital Research Data Curation Open source software and libraries: A current SWOT analysis Web-scale discovery indexes and "next generation" library catalogs MyLibrary: A Digital library framework & toolbox Open Library Developer's Meeting: One Web Page for Every Book Ever Published Open source software at the Montana State University Libraries Symposium Open source software for libraries in 30 minutes Exploiting "Light-weight" Protocols and Open Source Tools to Implement Digital Library Collections and Services Open source software in libraries: A workshop Open source software in libraries Open source software in libraries Open source software in libraries cache = ./cache/planet-infomotions-com-9545.xml txt = ./txt/planet-infomotions-com-9545.txt === reduce.pl bib === === reduce.pl bib === id = planet-infomotions-com-3359 author = title = planet-infomotions-com-3359 date = pages = extension = .xml mime = application/rdf+xml words = 389347 sentences = 61656 flesch = 74 summary = [15] Next steps include: calculating an integer denoting the number of pages in an item, implementing a Web-based search interface to a subset’s full text as well as metadata, putting the source code (written in Python and Bash) on GitHub. 
After that I need to: identify more robust ways to create subsets from the whole of EEBO, provide links to the raw TEI/XML as well as HTML versions of items, implement quite a number of cosmetic enhancements, and most importantly, support the means to compare & contrast items of interest in each subset. The next steps are numerous and listed in no priority order: putting the whole thing on GitHub, outputting the reports in generic formats so other things can easily read them, improving the terminal-based search interface, implementing a Web-based search interface, writing advanced programs in R that chart and graph analysis, providing a means for comparing & contrasting two or more items from a corpus, indexing the corpus with a (real) indexer such as Solr, writing a "cookbook" describing how to use the browser to do "kewl" things, making the metadata of corpora available as Linked Data, etc. cache = ./cache/planet-infomotions-com-3359.xml txt = ./txt/planet-infomotions-com-3359.txt === reduce.pl bib === id = infomotions-com-9504 author = title = Infomotions Mini-Musings – Artist- and Librarian-At-Large date = pages = extension = .html mime = text/html words = 59918 sentences = 4704 flesch = 69 summary = To all these ends, Voyant Tools counts & tabulates the frequencies of words, plots the results in a number of useful ways, supports topic modeling, and the comparison of documents across a corpus. This essay describes, illustrates, and demonstrates how the Digital Public Library of America (DPLA) can build on the good work of others who support the creation and maintenance of collections and provide value-added services against texts — a concept we call "use & understand".
More specifically, this proposal assumes the collections of the DPLA include things like but not necessarily limited to: digitized versions of public domain works, the full-text of open access scholarly journals and/or trade magazines, scholarly and governmental data sets, theses & dissertations, a substantial portion of the existing United States government documents, the archives of selected mailing lists, and maybe even the archives of blog postings and Twitter feeds. cache = ./cache/infomotions-com-9504.html txt = ./txt/infomotions-com-9504.txt === reduce.pl bib === id = planet-infomotions-com-7919 author = title = Water de Jour date = pages = extension = .xml mime = application/rdf+xml words = 58 sentences = 9 flesch = 52 summary = Planet Eric Lease Morgan http://planet.infomotions.com/ Catholic Portal DH @ Notre Dame DH Blog @ Notre Dame LiAM: Linked Archival Metadata LiAM: Linked Archival Metadata Life of a Librarian Days in the Life of a Librarian Mini-musings Infomotions Mini-Musings Infomotions Mini-Musings Musings Infomotions' Musings on Information and Librarianship Readings What's Eric Reading? Water collection Water de Jour cache = ./cache/planet-infomotions-com-7919.xml txt = ./txt/planet-infomotions-com-7919.txt === reduce.pl bib === id = infomotions-com-3637 author = title = Alex Catalogue of Electronic Texts date = pages = extension = .html mime = application/xhtml+xml words = 105 sentences = 12 flesch = 65 summary = Home Serials Blog Musings Sandbox Alex Catalogue Alex Catalogue Browse by author Browse by title Browse by tag About the Catalogue Alex Catalogue of Electronic Texts Alex Catalogue of Electronic Texts This is a collection of public domain and open access documents with a focus on American and English literature as well as Western philosophy. Its purpose is to help facilitate a person's liberal arts education. "Big ideas don't fit on a mobile." Discover what books you consider "great". Take the Great Books Survey. 
Creator: Eric Lease Morgan Date created: 1994-07-23 Date updated: 2014-12-12 URL: http://infomotions.com/alex cache = ./cache/infomotions-com-3637.html txt = ./txt/infomotions-com-3637.txt === reduce.pl bib === id = infomotions-com-172 author = title = Index of /sandbox date = pages = extension = .html mime = text/html words = 183 sentences = 20 flesch = 87 summary = Index of /sandbox Home Alex Catalogue Serials Blog Musings Sandbox Sandbox Sandbox This is the Infomotions Sandbox, a place where we do applied research & development. The things found in here are experimental and may not work correctly. Your mileage may vary, but they ought to be fun anyway. Name Last modified Size Description Parent Directory C4LJ/ 2008-05-24 13:17 alex-lite/ 2011-04-11 20:15 alex/ 2009-09-24 20:12 bibframe/ 2016-03-06 15:40 blues/ 2020-10-12 18:16 books/ 2014-12-13 07:11 concordance/ 2009-06-13 07:46 great-books-redux/ 2013-04-17 02:12 great-books/ 2010-12-31 10:39 gutenberg-index/ 2019-05-04 17:18 gutenberg/ 2009-01-19 08:57 liam/ 2018-01-04 10:47 mbooks/ 2014-11-08 20:20 mine-alamw11/ 2014-11-03 21:23 mine-mail/ 2014-11-03 21:33 mylibrary/ 2007-10-24 20:15 solr-sru/ 2014-09-22 21:59 timeline/ 2014-03-19 11:03 Author: Eric Lease Morgan Date created: 2009-01-19 (Martin Luther King Day) Date updated: 2009-01-19 URL: http://infomotions.com/sandbox/ cache = ./cache/infomotions-com-172.html txt = ./txt/infomotions-com-172.txt === reduce.pl bib === id = infomotions-com-3852 author = title = Infomotions, LLC date = pages = extension = .html mime = application/xhtml+xml words = 325 sentences = 20 flesch = 51 summary = With more than twenty years of experience, Infomotions can assist you, your staff, and your fellow employees learn about, create, and maintain digital library collections and services that are usable, scalable, sustainable, and relevant to your patrons.
For example, Infomotions has been practicing open access publishing and open source software distribution for more than fifteen years. All of our articles, presentations, workshops, handouts, travel logs, and software are freely available through our Musings on Information and Librarianship. For example, try searching the Musings for articles, librarians, libraries, and librarianship, presentations, or travel logs. Alex Catalogue of Electronic Texts a collection of "great" American and English literature as well as Western philosophy Mr. Serials Collection a set of library-related electronic serials If you think Infomotions can assist you and your organization with your digital library collections and services, then don't hesitate to drop us a line. eric_morgan@infomotions.com cache = ./cache/infomotions-com-3852.html txt = ./txt/infomotions-com-3852.txt === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === id = planet-infomotions-com-8900 author = title = Planet Eric Lease Morgan date = pages = extension = .xml mime = application/rss+xml words = 300279 sentences = 22489 flesch = 67 summary = [15] Next steps include: calculating an integer denoting the number of pages in an item, implementing a Web-based search interface to a subset’s full text as well as metadata, putting the source code (written in Python and Bash) on GitHub. After that I need to: identify more robust ways to create subsets from the whole of EEBO, provide links to the raw TEI/XML as well as HTML versions of items, implement quite a number of cosmetic enhancements, and most importantly, support the means to compare & contrast items of interest in each subset. 
The next steps are numerous and listed in no priority order: putting the whole thing on GitHub, outputting the reports in generic formats so other things can easily read them, improving the terminal-based search interface, implementing a Web-based search interface, writing advanced programs in R that chart and graph analysis, provide a means for comparing & contrasting two or more items from a corpus, indexing the corpus with a (real) indexer such as Solr, writing a "cookbook" describing how to use the browser to to "kewl" things, making the metadata of corpora available as Linked Data, etc. cache = ./cache/planet-infomotions-com-8900.xml txt = ./txt/planet-infomotions-com-8900.txt === reduce.pl bib === id = bit-ly-5230 author = title = Link Grabber - Chrome Web Store date = pages = extension = .html mime = text/html words = 131 sentences = 31 flesch = 68 summary = Link Grabber Chrome Web Store Link Grabber offered by Don 70,000+ users Overview An easy to use extractor or grabber for hyperlinks on an HTML page Extract links from an HTML page and display them in another tab. 
Features: * Requires no special permissions * No usage information / analytics are collected from you * Use either browser action button or context menu item to activate * Configurable list of blocked domains * Filter links by substring match * Copy links to clipboard * Show/hide links that appear more than once on the page * Show/hide links to the same hostname as the source page * Group links by domain Thanks to: Icons by FatCow http://www.fatcow.com/free-icons React http://facebook.github.io/react/ Twitter Bootstrap http://twitter.github.com/bootstrap Report Abuse Additional Information Version: 0.5.2 Updated: November 10, 2017 Size: 199KiB Language: English (United States) cache = ./cache/bit-ly-5230.html txt = ./txt/bit-ly-5230.txt === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === id = mallet-cs-umass-edu-3654 author = title = MALLET homepage date = pages = extension = .html mime = text/html words = 354 sentences = 45 flesch = 44 summary = MAchine Learning for LanguagE Toolkit MALLET is open source software For research use, please remember to cite MALLET. MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text. MALLET includes sophisticated tools for document classification: efficient routines for converting text to "features", In addition to classification, MALLET includes tools for sequence tagging Topic models are useful for analyzing large collections of The MALLET topic modeling toolkit contains efficient, Many of the algorithms in MALLET depend on numerical optimization. 
MALLET includes an efficient implementation of Limited Memory BFGS, In addition to sophisticated Machine Learning applications, MALLET includes routines for transforming text documents into [Quick Start] [Developer's Guide] [Quick Start] [Developer's Guide] An add-on package to MALLET, called GRMM, contains support for inference in general graphical models, "MALLET: A Machine Learning for Language Toolkit." cache = ./cache/mallet-cs-umass-edu-3654.html txt = ./txt/mallet-cs-umass-edu-3654.txt === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === id = infomotions-com-555 author = title = Water date = pages = extension = .html mime = text/html words = 37 sentences = 5 flesch = 80 summary = Home Alex Catalogue Serials Water Water Blog Musings Planet Sandbox Water Collection Alas, as of December 13, 2014, my water collection has gone off-line, but you can read about it in a series of blog postings. cache = ./cache/infomotions-com-555.html txt = ./txt/infomotions-com-555.txt === reduce.pl bib === id = planet-infomotions-com-4104 author = title = Eric Lease Morgan's Writings Timeline date = pages = extension = .html mime = text/html words = 88 sentences = 12 flesch = 77 summary = Eric Lease Morgan's Writings Timeline Eric Lease Morgan's Writings Timeline This is timeline of my writings to date. (Well, the vast majority of 'em.) Click & drag or use your mouse wheel to navigate backwards and forwards through time. Click on an item to read a synopsis or link to the full text. See also the "planet" for a textual view. For more information see the blog posting. 
Author: Eric Lease Morgan Date created: December 20, 2010 Date updated: June 4, 2011 URL: http://planet.infomotions.com/timeline/ cache = ./cache/planet-infomotions-com-4104.html txt = ./txt/planet-infomotions-com-4104.txt === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === id = serials-infomotions-com-5908 author = title = Index of / date = pages = extension = .html mime = text/html words = 343 sentences = 28 flesch = 78 summary = Index of / Serials Serials Electronic serials This is a loose collection of electronic journals (serials), mostly from the area of library science. As a librarian this sort of information interests me and that is why it has been collected. The process to create this collection has been coined the Mr. Serials Process. Since fewer and fewer electronic serials are distributed via electronic mail, the Mr. Serials Process is slowly becoming obsolete, but for some things it still works just fine. Read more about the Mr. Serials Process in Eric Lease Morgan "Description and Evaluation of the 'Mr. Serials' Process: Automatically Collecting, Organizing, Archiving, Indexing, and Disseminating Electronic Serials" Serials Review 21 no. For the latest information regarding Mr. Serials see "Mr. Serials is Dead. Long live Mr. Serials." dated January 11, 2009. 
Name Last modified Size Description Author: Eric Lease Morgan Date created: 1992-06-21 Date updated: 2009-01-12 URL: http://serials.infomotions.com cache = ./cache/serials-infomotions-com-5908.html txt = ./txt/serials-infomotions-com-5908.txt === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === id = 2020-code4lib-org-5785 author = title = Using & hacking the Distant Reader date = pages = extension = .html mime = text/html words = 289 sentences = 22 flesch = 66 summary = Put another way, the Reader consumes just about any number of files in any just about any format, and it outputs plain text files, delimited files, a relational database, and a set of HTML reports all for the purposes of systematic reading. The first half of this workshop will be on the use of the Distant Reader. Attendees will learn how to submit content to the Reader, and then how to interact with the HTML reports. The second half of the workshop will be on hacking the Reader's structured data. Given the plain text files, tab-delimited files, and relational database the Reader also outputs, attendees will learn how to do various visualizations against the data, subset the data with SQL, index the data with Solr, normalize the data with OpenRefine, use machine learning against the data, etc., all for the purposes of more in-depth analysis. cache = ./cache/2020-code4lib-org-5785.html txt = ./txt/2020-code4lib-org-5785.txt === reduce.pl bib === id = sites-tufts-edu-6731 author = title = Comments on: date = pages = extension = .xml mime = application/rss+xml words = 10 sentences = 2 flesch = 91 summary = cache = ./cache/sites-tufts-edu-6731.xml txt = ./txt/sites-tufts-edu-6731.txt === reduce.pl bib === id = twitter-com-9838 author = title = twitter-com-9838 date = pages = extension = .html mime = text/html words = 32 sentences = 6 flesch = 90 summary = We've detected that JavaScript is disabled in your browser. 
Would you like to proceed to legacy Twitter? Yes Something went wrong, but don't fret — let's give it another shot. cache = ./cache/twitter-com-9838.html txt = ./txt/twitter-com-9838.txt === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === id = stedolan-github-io-4569 author = title = jq date = pages = extension = .html mime = text/html words = 301 sentences = 35 flesch = 85 summary = Tutorial Manual Source Try online! Linux (64-bit) OS X (64-bit) Windows (64-bit) Other platforms, older versions, and source Try online at jqplay.org! jq is like sed for JSON data you can use it to slice and filter and map and transform structured data with the same ease that sed, You can download a single binary, scp it to a far away machine of the same type, and expect it to work. jq can mangle the data format that you have into the one that you shorter and simpler than you'd expect. Go read the tutorial for more, or the manual See installation options on the download page, and the release notes jq 1.5 released, including new datetime, math, and regexp functions, See installation options on the release notes releases page. releases page. jq 1.4 (finally) released! Get it on the download page. Get it on the download page. jq 1.3 released. jq 1.3 released. cache = ./cache/stedolan-github-io-4569.html txt = ./txt/stedolan-github-io-4569.txt === reduce.pl bib === id = infomotions-com-953 author = Eric Lease Morgan title = Infomotions' Musings on Information and Librarianship date = pages = extension = .html mime = application/xhtml+xml words = 355 sentences = 26 flesch = 63 summary = Browse by date Browse by subject Infomotions' Musings on Information and Librarianship Infomotions' Musings on Information and Librarianship This is a collection of the things I've written -my musings. 
It includes pre-edited as well as formally published articles, travel logs, descriptions of software applications, and the hand-outs of workshops and presentations. Adding Internet resources to our OPACs Description: This essay advocates the addition of bibliographic records describing Internet-based electronic serials and Internet resources in general to library online public access catalogs (OPAC), addresses a few implications of this proposition, and finally, suggests a few solutions to accomplish this goal. Subject(s): cataloging; articles; URL: http://infomotions.com/musings/adding-internet-resources/ A lot of the time, this means thinking, studying, writing, sharing, and repeating the process. I believe it is important to share one's ideas freely. This collection is a manifestation of that idea. To these ends I am sharing the texts in this collection with you. URL: http://infomotions.com/musings/ cache = ./cache/infomotions-com-953.html txt = ./txt/infomotions-com-953.txt === reduce.pl bib === id = dh-crc-nd-edu-7757 author = title = Project Gutenberg - Home date = pages = extension = .html mime = text/html words = 29 sentences = 8 flesch = 83 summary = Project Gutenberg Home Project Gutenberg Home Get URLs This is selected fulltext index to the content of Project Gutenberg. Enter a query. Query: Eric Lease Morgan April 30, 2019 cache = ./cache/dh-crc-nd-edu-7757.html txt = ./txt/dh-crc-nd-edu-7757.txt === reduce.pl bib === === reduce.pl bib === id = dh-crc-nd-edu-1806 author = title = DH Blog @ Notre Dame date = pages = extension = .xml mime = application/rss+xml words = 512 sentences = 33 flesch = 60 summary = Once parts-of-speech are denoted, a reader can begin to analyze a text on a … Continue reading → A student here at Notre Dame wants to do computer and text mining analyze a set of websites. 
Beth Plale and Yiming Sun, both from the HathiTrust Research Center, came to Notre Dame on Tuesday (May 7) to give the digital humanities group an update of some of the things happening at the Center. This posting documents some … Continue reading → In his words, “I will explain how practices such as text mining present a fundamental challenge … Continue reading → This Friday (April 12) the Notre Dame Digital Humanities group will be sponsoring a lunchtime presentation by Matthew Sag called Copyright And The Digital Humanities: I will explain how practices such as text mining present a fundamental challenge to our … Continue reading → cache = ./cache/dh-crc-nd-edu-1806.xml txt = ./txt/dh-crc-nd-edu-1806.txt === reduce.pl bib === id = tika-apache-org-2948 author = title = Apache Tika – Apache Tika date = pages = extension = .html mime = application/xhtml+xml words = 3460 sentences = 345 flesch = 80 summary = This release includes a new artifact to enable starting tika-server as a service via Eric Pugh, improved detection of zip-based formats, more complex PDF processing options, security fixes and numerous bug fixes and dependency upgrades. Please see the CHANGES.txt file for a full list of changes in this release, and have a look at the download page for more information on how to obtain Apache Tika 1.2. Please see the CHANGES.txt file for a full list of changes in this release, and have a look at the download page for more information on how to obtain Apache Tika 1.2. Please see the CHANGES.txt file for a full list of changes in this release, and have a look at the download page for more information on how to obtain Apache Tika 1.2. Please see the CHANGES.txt file for a full list of changes in this release, and have a look at the download page for more information on how to obtain Apache Tika 1.2. 
cache = ./cache/tika-apache-org-2948.html txt = ./txt/tika-apache-org-2948.txt === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === id = www-gnu-org-8892 author = title = GNU Parallel - GNU Project - Free Software Foundation date = pages = extension = .html mime = application/xhtml+xml words = 1214 sentences = 145 flesch = 73 summary = GNU Project Free Software Foundation GNU parallel can then split the input and pipe it into If you use xargs and tee today you will find GNU parallel very easy to use as GNU parallel is written to have the same options as xargs. GNU parallel makes sure output from the commands is the same output as possible to use output from GNU parallel as input for other programs. For each line of input GNU parallel will execute command with If you prefer reading a book buy GNU Parallel 2018 at https://www.lulu.com/shop/ole-tange/gnu-parallel-2018/paperback/product-23558902.html of OPTIONS in man parallel (Use LESS=+/EXAMPLE: https://www.gnu.org/software/parallel/parallel_cheat.pdf For alternatives to GNU parallel, see GNU parallel, see: man parallel_design The GNU Parallel Citation FAQ. GNU parallel has two mailing lists: for discussing uses of GNU parallel. You can show your support for GNU parallel using our merchandise. O. Tange (2018): GNU Parallel 2018, March 2018, https://doi.org/10.5281/zenodo.1146014. cache = ./cache/www-gnu-org-8892.html txt = ./txt/www-gnu-org-8892.txt === reduce.pl bib === === reduce.pl bib === id = curl-haxx-se-8721 author = title = curl date = pages = extension = .html mime = text/html words = 393 sentences = 65 flesch = 73 summary = tiny-curl Releases curl tool curl-library curl-users Book: Everything curl Release Procedure Test curl Release table curl supports SSL certificates, HTTP POST, HTTP PUT, FTP uploading, curl is used in command lines or scripts to transfer data. Who makes curl? curl is free and open source software and exists What's the latest curl? 
The most recent stable version is 7.73.0, released on 14th of October 2020. Currently, 90 of the listed downloads are of the latest version. Time to donate to the curl project? Check out the latest source code Everything curl is a detailed everything there is to know about curl, libcurl and the associated project. Learn how to use curl. perhaps how the curl project accepts contributions. Everything curl is itself an open project that accepts your contributions and help. cache = ./cache/curl-haxx-se-8721.html txt = ./txt/curl-haxx-se-8721.txt === reduce.pl bib === id = infomotions-com-9966 author = title = Michael Hart in Roanoke (Indiana) – Infomotions Mini-Musings date = pages = extension = .html mime = text/html words = 2496 sentences = 237 flesch = 77 summary = Michael Hart in Roanoke (Indiana) – Infomotions Mini-Musings On Saturday, February 27, Paul Turner and I made our way to Roanoke (Indiana) to listen to Michael Hart tell stories about electronic texts and Project Gutenberg. To celebrate its 100th birthday, the Roanoke Public Library invited Michael Hart of Project Gutenberg fame to share his experience regarding electronic texts in a presentation called "Books & eBooks: Past, Present & Future Libraries". "The things Project Gutenberg creates are electronic texts, not ebooks. Maybe I should have phrased it differently and asked him, the way Paul did, to compare the experience of reading physical books and electronic texts. Posted on March 7, 2010 March 11, 2010 Author Eric Lease Morgan Categories Alex Catalogue, Travelogues Tags Michael Hart, Project Gutenberg, Roanoke (Indiana) 5 thoughts on "Michael Hart in Roanoke (Indiana)" As president of the Roanoke Public Library Board, I appreciate what you have written and how the Michael Hart presentation impacted you.
cache = ./cache/infomotions-com-9966.html txt = ./txt/infomotions-com-9966.txt === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === id = infomotions-com-3769 author = title = Fun with RSS and the RSS aggregator called Planet – Infomotions Mini-Musings date = pages = extension = .html mime = text/html words = 1474 sentences = 184 flesch = 77 summary = Fun with RSS and the RSS aggregator called Planet – Infomotions Mini-Musings This posting outlines how I refined a number of my RSS feeds and then aggregated them into a coherent whole using Planet. The result is a fledgling system I call "What's Eric Reading?" Since I wanted to share my wealth (after all, I am a librarian) I created an RSS feed against this system too. I went back to my water collection and created a full-fledged RSS feed against it as well. A couple of years ago the Code4Lib community created an RSS "planet" called Planet Code4Lib — "Blogs and feeds of interest to the Code4Lib community, aggregated." I think it is maintained by Jonathan Rochkind, but I'm not sure. Use the Planet software to aggregate RSS fitting your library's collection development policy. cache = ./cache/infomotions-com-3769.html txt = ./txt/infomotions-com-3769.txt === reduce.pl bib === === reduce.pl bib === id = pkp-sfu-ca-4628 author = title = Open Journal Systems | Public Knowledge Project date = pages = extension = .html mime = text/html words = 492 sentences = 47 flesch = 51 summary = PKP is a multi-university initiative developing (free) open source software and conducting research to improve the quality and reach of scholarly publishing Public Knowledge Project > Open Journal Systems Public Knowledge Project > Open Journal Systems Open Journal Systems (OJS) is an open source software application for managing and publishing scholarly journals. 
Originally developed and released by PKP in 2001 to improve access to research, it is the most widely used open source journal publishing platform in existence, with over 10,000 journals using it worldwide. PKP Publishing Services also offers a fee-based service which provides the installation and hosting of OJS, as well as performing daily backups of your data, applying security patches and upgrades, and priority answering your support questions. All revenue generated by the hosting service goes into developing PKP software and supporting the Public Knowledge Project. For support with PKP software we encourage users to consult our documentation and search our support forums. cache = ./cache/pkp-sfu-ca-4628.html txt = ./txt/pkp-sfu-ca-4628.txt === reduce.pl bib === === reduce.pl bib === id = infomotions-com-6757 author = title = Infomotions Mini-Musings date = pages = extension = .xml mime = application/rss+xml words = 11414 sentences = 796 flesch = 66 summary = It means PDF files need to have been “born digitally” or they need to have been processed with optical character recognition (OCR), and then … Continue reading Creating a plain text version of a corpus with Tika This essay describes, illustrates, and demonstrates how the Digital Public Library of America (DPLA) can build on the good work of others who support the creation and maintenance of collections and provide value-added services against texts — a concept we call “use & understand”. 
I decided to give it a whirl and participate in the DPLA Beta Sprint, and below is my submission: DPLA Beta Sprint Submission My DPLA Beta Sprint submission will describe and demonstrate how the digitized versions of library collections can be made more useful through the application of text mining and various other digital humanities … Continue reading DPLA Beta Sprint Submission This posting describes the initial process I am using to do such a thing, but the important thing to note is that this process is more about librarianship than it is … Continue reading Collecting the Great Books cache = ./cache/infomotions-com-6757.xml txt = ./txt/infomotions-com-6757.txt === reduce.pl bib === === reduce.pl bib === id = github-com-7801 author = title = GitHub - ericleasemorgan/reader-lite: Given a file and a directory, output analysis of file to directory date = pages = extension = .html mime = text/html words = 625 sentences = 132 flesch = 74 summary = GitHub ericleasemorgan/reader-lite: Given a file and a directory, output analysis of file to directory Explore GitHub → GitHub Education GitHub Stars program GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together. GitHub is where the world builds software GitHub CLI Open with GitHub Desktop Launching GitHub Desktop If nothing happens, download GitHub Desktop and try again. If nothing happens, download the GitHub extension for Visual Studio and try again. Latest commit Failed to load latest commit information. Latest commit message Commit time No releases published No packages published Contact GitHub We use optional third-party analytics cookies to understand how you use GitHub.com so we can build better products.
cache = ./cache/github-com-7801.html txt = ./txt/github-com-7801.txt === reduce.pl bib === id = distantreader-org-6471 author = title = Home date = pages = extension = .html mime = text/html words = 313 sentences = 25 flesch = 68 summary = Distant Reader Gateway The Distant Reader is a tool for reading. The Distant Reader empowers you to use & understand large amounts of textual information both quickly & easily. Technically speaking, the Distant Reader is a system which locally harvests/caches content you specify. It then transforms the content into plain text, performs sets of natural language processing & text mining against the text, saves the results in a number of formats, reduces the whole to a cross-platform database file, queries the database thus summarizing the collection, zips the results of the entire process into a single file, and makes the file available to you for further investigation -"reading". Sample output of the Reader ("study carrels") I don't know about you, but now-a-days I can find plenty of scholarly & authoritative content. The Distant Reader is intended to address this question by making observations against a corpus and providing tools for interpreting the results. cache = ./cache/distantreader-org-6471.html txt = ./txt/distantreader-org-6471.txt === reduce.pl bib === id = distantreader-org-7009 author = title = Home date = pages = extension = .html mime = text/html words = 313 sentences = 25 flesch = 68 summary = Distant Reader Gateway The Distant Reader is a tool for reading. The Distant Reader empowers you to use & understand large amounts of textual information both quickly & easily. Technically speaking, the Distant Reader is a system which locally harvests/caches content you specify. 
It then transforms the content into plain text, performs sets of natural language processing & text mining against the text, saves the results in a number of formats, reduces the whole to a cross-platform database file, queries the database thus summarizing the collection, zips the results of the entire process into a single file, and makes the file available to you for further investigation -"reading". Sample output of the Reader ("study carrels") I don't know about you, but now-a-days I can find plenty of scholarly & authoritative content. The Distant Reader is intended to address this question by making observations against a corpus and providing tools for interpreting the results. cache = ./cache/distantreader-org-7009.html txt = ./txt/distantreader-org-7009.txt === reduce.pl bib === id = github-com-2983 author = title = GitHub - ericleasemorgan/reader-toolbox: A suite of scripts use to report on and analyze the content of Distant Reader "study carrels" date = pages = extension = .html mime = text/html words = 739 sentences = 139 flesch = 74 summary = GitHub ericleasemorgan/reader-toolbox: A suite of scripts use to report on and analyze the content of Distant Reader "study carrels" GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together. Open with GitHub Desktop Launching GitHub Desktop Launching GitHub Desktop If nothing happens, download GitHub Desktop and try again. If nothing happens, download GitHub Desktop and try again. If nothing happens, download the GitHub extension for Visual Studio and try again. A suite of scripts use to report on and analyze the content of Distant Reader "study carrels" A suite of scripts use to report on and analyze the content of Distant Reader "study carrels" Contact GitHub We use optional third-party analytics cookies to understand how you use GitHub.com so we can build better products. 
cache = ./cache/github-com-2983.html txt = ./txt/github-com-2983.txt === reduce.pl bib === id = github-com-8326 author = title = GitHub - ericleasemorgan/htid2books: Given an access key, secret token, and a HathiTrust identifier, output plain text as well as PDF versions of a book. date = pages = extension = .html mime = text/html words = 2298 sentences = 278 flesch = 71 summary = GitHub ericleasemorgan/htid2books: Given an access key, secret token, and a HathiTrust identifier, output plain text as well as PDF versions of a book. For example, ./bin/htid2txt.sh 194dfe2bg3 xa5350f0c44548487778e942518a nyp.33433082524681 In this case, the script will do the tiniest bit of validation, repeatedly run a Perl script (htid2txt.pl) to get the OCR of an individual page, cache the result, and when there are no more pages in the given book, concatenate the cache into a text file saved in the directory named ./books. cache = ./cache/github-com-8326.html txt = ./txt/github-com-8326.txt === reduce.pl bib === id = infomotions-com-2987 author = title = Infomotions Mini-Musings – Artist- and Librarian-At-Large date = pages = extension = .html mime = text/html words = 59918 sentences = 4704 flesch = 69 summary = To all these ends, Voyant Tools counts & tabulates the frequencies of words, plots the results in a number of useful ways, supports topic modeling, and the comparison of documents across a corpus.
This essay describes, illustrates, and demonstrates how the Digital Public Library of America (DPLA) can build on the good work of others who support the creation and maintenance of collections and provide value-added services against texts — a concept we call "use & understand". More specifically, this proposal assumes the collections of the DPLA include things like but not necessarily limited to: digitized versions of public domain works, the full-text of open access scholarly journals and/or trade magazines, scholarly and governmental data sets, theses & dissertations, a substantial portion of the existing United States government documents, the archives of selected mailing lists, and maybe even the archives of blog postings and Twitter feeds. cache = ./cache/infomotions-com-2987.html txt = ./txt/infomotions-com-2987.txt === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === id = docs-pkp-sfu-ca-7101 author = title = REST API Reference, 3.1.x - Open Journal Systems date = pages = extension = .html mime = text/html words = 28 sentences = 7 flesch = 30 summary = REST API Reference, 3.1.x Open Journal Systems Community Documentation Interest Group Contributing Documentation Translating Guide Community Forum Public Knowledge Project PKP|Publishing Services Contact Us Contact Us cache = ./cache/docs-pkp-sfu-ca-7101.html txt = ./txt/docs-pkp-sfu-ca-7101.txt === reduce.pl bib === id = github-com-8025 author = title = GitHub - ericleasemorgan/ojs-toolbox: Given a Open Journal System (OJS) root URL and an authorization token, cache all JSON files associated with the given OJS title, and optionally output rudimentary bibliographics in the form of a tab-separated value (TSV) stream. 
date = pages = extension = .html mime = text/html words = 1219 sentences = 168 flesch = 72 summary = GitHub ericleasemorgan/ojs-toolbox: Given a Open Journal System (OJS) root URL and an authorization token, cache all JSON files associated with the given OJS title, and optionally output rudimentary bibliographics in the form of a tab-separated value (TSV) stream. cache = ./cache/github-com-8025.html txt = ./txt/github-com-8025.txt === reduce.pl bib === === reduce.pl bib === id = github-com-8202 author = title = GitHub - senderle/topic-modeling-tool: A point-and-click tool for creating and analyzing topic models produced by MALLET. date = pages = extension = .html mime = text/html words = 1460 sentences = 201 flesch = 72 summary = GitHub senderle/topic-modeling-tool: A point-and-click tool for creating and analyzing topic models produced by MALLET. GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together. If nothing happens, download GitHub Desktop and try again.
The Topic Modeling Tool now has native Windows and Mac apps, and because of your operating system and version, and let us know the other tools you're $ cd topic-modeling-tool/TopicModelingTool Work on this version of the tool has benefited from the support of A point-and-click tool for creating and analyzing topic models produced by MALLET. A point-and-click tool for creating and analyzing topic models produced by MALLET. senderle.github.io/topic-modeling-tool/documentation/2017/01/06/quickstart.html senderle.github.io/topic-modeling-tool/documentation/2017/01/06/quickstart.html We use optional third-party analytics cookies to understand how you use GitHub.com so we can build better products. We use optional third-party analytics cookies to understand how you use GitHub.com so we can build better products. cache = ./cache/github-com-8202.html txt = ./txt/github-com-8202.txt === reduce.pl bib === id = github-com-379 author = title = GitHub - ericleasemorgan/reader-gutenberg: A system for implementing an index to Project Gutenberg date = pages = extension = .html mime = text/html words = 582 sentences = 127 flesch = 73 summary = GitHub ericleasemorgan/reader-gutenberg: A system for implementing an index to Project Gutenberg Explore GitHub → GitHub Education GitHub Stars program GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together. GitHub is where the world builds software GitHub CLI Open with GitHub Desktop Launching GitHub Desktop Launching GitHub Desktop If nothing happens, download GitHub Desktop and try again. If nothing happens, download GitHub Desktop and try again. If nothing happens, download the GitHub extension for Visual Studio and try again. Failed to load latest commit information. 
Latest commit message A system for implementing an index to Project Gutenberg Contact GitHub We use optional third-party analytics cookies to understand how you use GitHub.com so we can build better products. We use optional third-party analytics cookies to understand how you use GitHub.com so we can build better products. cache = ./cache/github-com-379.html txt = ./txt/github-com-379.txt === reduce.pl bib === === reduce.pl bib === id = www-laurenceanthony-net-8779 author = title = Laurence Anthony's AntConc date = pages = extension = .html mime = text/html words = 797 sentences = 122 flesch = 65 summary = Laurence Anthony's AntConc All previous releases of AntConc can be found at the following link. AntConc 3.2.1 Tutorial (in English) Latest version available here. For example if you download AntConc 3.5.8, which was released in 2019, you would cite/reference it as follows: AntConc (Version 3.5.8) [Computer Software]. These lists can be imported into AntConc and used as reference corpora word lists to create keyword lists. Brown Corpus word frequency list (lowercase) These can be imported into AntConc to create lemma word lists. An English lemma list based on all words in the BNC corpus with a frequency greater than 2 (created by Laurence Anthony). To use this list, *append* a hyphen (-) and apostrophe (') character to the AntConc token definition to ensure the processed correctly (see global settings). To use this list, *append* a hyphen (-) and apostrophe (') character to the AntConc token definition (see global settings). 
cache = ./cache/www-laurenceanthony-net-8779.html
txt = ./txt/www-laurenceanthony-net-8779.txt
=== reduce.pl bib ===
id = youtu-be-1944
author =
title = Michael Hart in Roanoke (Indiana) - YouTube
date =
pages =
extension = .html
mime = text/html
words = 22
sentences = 10
flesch = 81
summary = Press Copyright Contact us Creators Advertise Developers Terms Privacy Policy & Safety How YouTube works Test new features © 2020 Google LLC
cache = ./cache/youtu-be-1944.html
txt = ./txt/youtu-be-1944.txt
=== reduce.pl bib ===
id = github-com-9780
author =
title = GitHub - ericleasemorgan/reader: Distant Reader, a tool for using & understanding a corpus
date =
pages =
extension = .html
mime = text/html
words = 1348
sentences = 190
flesch = 69
summary = GitHub ericleasemorgan/reader: Distant Reader, a tool for using & understanding a corpus GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together. The Distant Reader CORD is a high performance computing (HPC) system which: 1) takes an almost arbitrary amount of unstructured data (text) as input and outputs a set of structured data for analysis, and 2) does this work against a specific data set called CORD-19. As an HPC, the Distant Reader CORD is not a single computer program but instead a suite of software comprised of many individual scripts and applications. This suite of software will prepare a data set called "CORD-19" for processing with the Distant Reader. As a pre-processing step for the Distant Reader, the suite processes the CORD-19 metadata and its associated JSON files.
cache = ./cache/github-com-9780.html
txt = ./txt/github-com-9780.txt
=== reduce.pl bib ===
=== reduce.pl bib ===
id = dh-crc-nd-edu-9558
author =
title = DH Blog @ Notre Dame | Learning about human expression through the use of computers
date =
pages =
extension = .html
mime = text/html
words = 4666
sentences = 355
flesch = 69
summary = Once parts-of-speech are denoted, a reader can begin to analyze a text on a dimension beyond the simple tabulating of words. Beth Plale and Yiming Sun, both from the HathiTrust Research Center, came to Notre Dame on Tuesday (May 7) to give the digital humanities group an update of some of the things happening at the Center. In his words, "I will explain how practices such as text mining present a fundamental challenge to our understanding of copyright law and what this means for scholars in the digital humanities." To answer his own question, Sag does not believe processes like text mining violate copyright because the results are generated automatically — created by machines. I will explain how practices such as text mining present a fundamental challenge to our understanding of copyright law and what this means for scholars in the digital humanities.
cache = ./cache/dh-crc-nd-edu-9558.html txt = ./txt/dh-crc-nd-edu-9558.txt === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === === reduce.pl bib === Building ./etc/reader.txt planet-infomotions-com-3359 planet-infomotions-com-8900 infomotions-com-9504 youtu-be-1944 www-xsede-org-5929 www-laurenceanthony-net-8779 number of items: 99 sum of words: 865,064 average size in words: 20,596 average readability score: 69 nouns: data; text; p; content; words; library; number; things; file; information; word; process; libraries; people; time; sandbox; files; use; list; source; collection; web; services; href="http://dh.crc.nd.edu; xml; search; book; set; books; reader; way; software; results; water; musings; example; database; blog; computer; work; documents; amp; records; document; analysis; items; texts; access; #; metadata verbs: is; are; be; was; have; do; were; used; has; use; linked; given; called; been; create; make; using; see; get; based; find; read; created; described; does; done; think; had; provide; written; need; know; go; made; published; describes; include; learn; being; intended; comes; reading; understand; take; creating; named; learned; want; making; found adjectives: other; more; many; open; such; digital; new; first; few; good; available; same; full; different; strong; bibliographic; possible; second; next; able; great; simple; plain; most; -; much; traditional; local; long; wp; single; specific; particular; similar; interesting; whole; easy; complete; own; library; short; triple; useful; difficult; original; common; various; current; human; important adverbs: not; then; more; as; well; very; also; just; up; most; only; really; here; so; now; much; first; out; 
too; specifically; instead; even; about; never; together; finally; back; thus; consequently; rather; again; there; still; originally; almost; simply; probably; all; usually; often; easily; above; ago; sometimes; always; yet; on; maybe; necessarily; below pronouns: i; it; my; you; they; we; their; your; them; he; our; its; me; his; she; us; itself; her; themselves; myself; one; him; yourself; ourselves; himself; #; >; ’s; let’s; year’s; http://example.org/city; safe.” one’s; em; you.” years’ we’ll; us” there.” ii; herself; apache2::const::ok; wordclouds; useful” solve?” mine; http://example.org/europe; http://dbpedia.org/resource/walt_disney; y’; yours proper nouns: /p; li; #; rdf; library; university; marc; href="http://infomotions.com; td; alex; p; data; perl; digital; notre; dame; tcp; libraries; great; src="http://infomotions.com; thoreau; amp; google; search.cgi?q; project; eric; browser; htrc; h2; zzzz&word; concordance/?cmd; >; nbsp; Date created: 1994-07-23 Date updated: 2014-12-12 URL: http://infomotions.com/alex id: infomotions-com-3769 author: title: Fun with RSS and the RSS aggregator called Planet – Infomotions Mini-Musings date: words: 1474.0 sentences: 184.0 pages: flesch: 77.0 cache: ./cache/infomotions-com-3769.html txt: ./txt/infomotions-com-3769.txt summary: Fun with RSS and the RSS aggregator called Planet – Infomotions Mini-Musings This posting outlines how I refined a number of my RSS feeds and then aggregated them into a coherent whole using Planet. The result is a fledgling system I call "What''s Eric Reading?" Since I wanted to share my wealth (after all, I am a librarian) I created an RSS feed against this system too. I went back to my water collection and created a full-fledged RSS feed against it as well. A couple of years ago the Code4Lib community created an RSS "planet" called Planet Code4Lib — "Blogs and feeds of interest to the Code4Lib community, aggregated." I think it is maintained by Jonathan Rochkind, but I''m not sure. 
Use the Planet software to aggregate RSS fitting your library''s collection development policy. id: infomotions-com-3852 author: title: Infomotions, LLC date: words: 325.0 sentences: 20.0 pages: flesch: 51.0 cache: ./cache/infomotions-com-3852.html txt: ./txt/infomotions-com-3852.txt summary: With more than twenty years of experience, Infomotions can assist you, your staff, and your fellow employees learn about, create, and maintain digital library collections and services that are usable, scalable, sustainable, and relevant to your patrons. For example, Infomotions has been practicing open access publishing and open source software distribution for more than fifteen years. All of our articles, presentations, workshops, handouts, travel logs, and software are freely available through our Musings on Information and Librarianship. For example, try searching the Musings for articles, librarians, libraries, and librarianship, presentations, or travel logs. Alex Catalogue of Electronic Texts a collection of "great" American and English literature as well as Western philosophy Mr. Serials Collection a set of library-related electronic serials If you think Infomotions can assist you and your organization with your digital library collections and services, then don''t hesitate to drop us a line. eric_morgan@infomotions.com id: infomotions-com-555 author: title: Water date: words: 37.0 sentences: 5.0 pages: flesch: 80.0 cache: ./cache/infomotions-com-555.html txt: ./txt/infomotions-com-555.txt summary: Home Alex Catalogue Serials Water Water Blog Musings Planet Sandbox Water Collection Alas, as of December 13, 2014, my water collection has gone off-line, but you can read about it in a series of blog postings. 
id: infomotions-com-6757 author: title: Infomotions Mini-Musings date: words: 11414.0 sentences: 796.0 pages: flesch: 66.0 cache: ./cache/infomotions-com-6757.xml txt: ./txt/infomotions-com-6757.txt summary: It means PDF files need to have been “born digitally” or they need to have been processed with optical character recognition (OCR), and then … Continue reading Creating a plain text version of a corpus with Tika This essay describes, illustrates, and demonstrates how the Digital Public Library of America (DPLA) can build on the good work of others who support the creation and maintenance of collections and provide value-added services against texts — a concept we call “use & understand”. I decided to give it a whirl and particpate in the DPLA Beta Sprint, and below is my submission: DPLA Beta Sprint Submission My DPLA Beta Sprint submission will describe and demonstrate how the digitized versions of library collections can be made more useful through the application of text mining and various other digital humanities … Continue reading DPLA Beta Sprint Submission This posting describes the initial process I am using to do such a thing, but the imporant thing to note is that this process is more about librarianship than it is … Continue reading Collecting the Great Books id: infomotions-com-9318 author: title: Infomotions'' Musings on Information and Librarianship date: words: 13253.0 sentences: 1102.0 pages: flesch: 57.0 cache: ./cache/infomotions-com-9318.xml txt: ./txt/infomotions-com-9318.txt summary: Keywords: user-centered design; SOCHE; presentations; librarianship; Source: This essay was never formally published, but it was created for Southwestern Ohio Council for Higher Education (SOCHE) and a conference called ''The Human Face of Information (technology)'' Wednesday, May 6, 2009 at Wright State University In a sentence I learned two things: 1) institutional repository software such as Fedora, DSpace, and EPrints are increasingly being used for more 
than open access publishing efforts, and 2) the Web Services API of Fedora makes it relatively easy for developers using any programming language to interface with the underlying core.Keywords: Gruene, Texas; institutional repositories; digital libraries; travel log; Source: This file was never formally published. The purpose of OCKHAM is to articulate and design a set of "light weight reference models" for creating and maintaining digital library services and collections.Keywords: OCKHAM (Open Community Knowledge Hypermedia Administration and Metadata); Atlanta, GA; travel log; Source: This text was never published. id: infomotions-com-9504 author: title: Infomotions Mini-Musings – Artist- and Librarian-At-Large date: words: 59918.0 sentences: 4704.0 pages: flesch: 69.0 cache: ./cache/infomotions-com-9504.html txt: ./txt/infomotions-com-9504.txt summary: To all these ends, Voyant Tools counts & tabulates the frequencies of words, plots the results in a number of useful ways, supports topic modeling, and the comparison documents across a corpus. This essay describes, illustrates, and demonstrates how the Digital Public Library of America (DPLA) can build on the good work of others who support the creation and maintenance of collections and provide value-added services against texts — a concept we call "use & understand". More specifically, this proposal assumes the collections of the DPLA include things like but not necessarily limited to: digitized versions of public domain works, the full-text of open access scholarly journals and/or trade magazines, scholarly and governmental data sets, theses & dissertations, a substantial portion of the existing United States government documents, the archives of selected mailing lists, and maybe even the archives of blog postings and Twitter feeds. 
id: infomotions-com-9966 author: title: Michael Hart in Roanoke (Indiana) – Infomotions Mini-Musings date: words: 2496.0 sentences: 237.0 pages: flesch: 77.0 cache: ./cache/infomotions-com-9966.html txt: ./txt/infomotions-com-9966.txt summary: Michael Hart in Roanoke (Indiana) – Infomotions Mini-Musings On Saturday, February 27, Paul Turner and I made our way to Roanoke (Indiana) to listen to Michael Hart tell stories about electronic texts and Project Gutenberg. To celebrate its 100th birthday, the Roanoke Public Library invited Michael Hart of Project Gutenberg fame to share his experience regarding electronic texts in a presentation called "Books & eBooks: Past, Present & Future Libraries". "The things Project Gutenberg creates are electronic texts, not ebooks. Maybe I should have phrased it differently and asked him, the way Paul did, to compare the experience of reading physical books and electronic texts. Posted on March 7, 2010March 11, 2010Author Eric Lease MorganCategories Alex Catalogue, TraveloguesTags Michael Hart, Project Gutenberg, Roanoke (Indiana) 5 thoughts on "Michael Hart in Roanoke (Indiana)" As president of the Roanoke Public Library Board, I appreciate what you have written and how the Michael Hart presentation impacted you. id: mallet-cs-umass-edu-3654 author: title: MALLET homepage date: words: 354.0 sentences: 45.0 pages: flesch: 44.0 cache: ./cache/mallet-cs-umass-edu-3654.html txt: ./txt/mallet-cs-umass-edu-3654.txt summary: MAchine Learning for LanguagE Toolkit MALLET is open source software For research use, please remember to cite MALLET. MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text. 
MALLET includes sophisticated tools for document classification: efficient routines for converting text to "features", In addition to classification, MALLET includes tools for sequence tagging Topic models are useful for analyzing large collections of The MALLET topic modeling toolkit contains efficient, Many of the algorithms in MALLET depend on numerical optimization. MALLET includes an efficient implementation of Limited Memory BFGS, In addition to sophisticated Machine Learning applications, MALLET includes routines for transforming text documents into [Quick Start] [Developer''s Guide] [Quick Start] [Developer''s Guide] An add-on package to MALLET, called GRMM, contains support for inference in general graphical models, "MALLET: A Machine Learning for Language Toolkit." id: pkp-sfu-ca-4628 author: title: Open Journal Systems | Public Knowledge Project date: words: 492.0 sentences: 47.0 pages: flesch: 51.0 cache: ./cache/pkp-sfu-ca-4628.html txt: ./txt/pkp-sfu-ca-4628.txt summary: PKP is a multi-university initiative developing (free) open source software and conducting research to improve the quality and reach of scholarly publishing Public Knowledge Project > Open Journal Systems Public Knowledge Project > Open Journal Systems Open Journal Systems (OJS) is an open source software application for managing and publishing scholarly journals. Originally developed and released by PKP in 2001 to improve access to research, it is the most widely used open source journal publishing platform in existence, with over 10,000 journals using it worldwide. PKP Publishing Services also offers a fee-based service which provides the installation and hosting of OJS, as well as performing daily backups of your data, applying security patches and upgrades, and priority answering your support questions. All revenue generated by the hosting service goes into developing PKP software and supporting the Public Knowledge Project. 
For support with PKP software we encourage users to consult our documentation and search our support forums. id: planet-infomotions-com-3359 author: title: planet-infomotions-com-3359 date: words: 389347.0 sentences: 61656.0 pages: flesch: 74.0 cache: ./cache/planet-infomotions-com-3359.xml txt: ./txt/planet-infomotions-com-3359.txt summary: [15] Next steps include: calculating an integer denoting the number of pages in an item, implementing a Web-based search interface to a subset’s full text as well as metadata, putting the source code (written in Python and Bash) on GitHub. After that I need to: identify more robust ways to create subsets from the whole of EEBO, provide links to the raw TEI/XML as well as HTML versions of items, implement quite a number of cosmetic enhancements, and most importantly, support the means to compare & contrast items of interest in each subset. The next steps are numerous and listed in no priority order: putting the whole thing on GitHub, outputting the reports in generic formats so other things can easily read them, improving the terminal-based search interface, implementing a Web-based search interface, writing advanced programs in R that chart and graph analysis, provide a means for comparing & contrasting two or more items from a corpus, indexing the corpus with a (real) indexer such as Solr, writing a "cookbook" describing how to use the browser to to "kewl" things, making the metadata of corpora available as Linked Data, etc. id: planet-infomotions-com-4104 author: title: Eric Lease Morgan''s Writings Timeline date: words: 88.0 sentences: 12.0 pages: flesch: 77.0 cache: ./cache/planet-infomotions-com-4104.html txt: ./txt/planet-infomotions-com-4104.txt summary: Eric Lease Morgan''s Writings Timeline Eric Lease Morgan''s Writings Timeline This is timeline of my writings to date. (Well, the vast majority of ''em.) Click & drag or use your mouse wheel to navigate backwards and forwards through time. 
Click on an item to read a synopsis or link to the full text. See also the "planet" for a textual view. For more information see the blog posting. Author: Eric Lease Morgan Date created: December 20, 2010 Date updated: June 4, 2011 URL: http://planet.infomotions.com/timeline/ id: planet-infomotions-com-7919 author: title: Water de Jour date: words: 58.0 sentences: 9.0 pages: flesch: 52.0 cache: ./cache/planet-infomotions-com-7919.xml txt: ./txt/planet-infomotions-com-7919.txt summary: Planet Eric Lease Morgan http://planet.infomotions.com/ Catholic Portal DH @ Notre Dame DH Blog @ Notre Dame LiAM: Linked Archival Metadata LiAM: Linked Archival Metadata Life of a Librarian Days in the Life of a Librarian Mini-musings Infomotions Mini-Musings Infomotions Mini-Musings Musings Infomotions'' Musings on Information and Librarianship Readings What''s Eric Reading? Water collection Water de Jour id: planet-infomotions-com-8900 author: title: Planet Eric Lease Morgan date: words: 300279.0 sentences: 22489.0 pages: flesch: 67.0 cache: ./cache/planet-infomotions-com-8900.xml txt: ./txt/planet-infomotions-com-8900.txt summary: [15] Next steps include: calculating an integer denoting the number of pages in an item, implementing a Web-based search interface to a subset’s full text as well as metadata, putting the source code (written in Python and Bash) on GitHub. After that I need to: identify more robust ways to create subsets from the whole of EEBO, provide links to the raw TEI/XML as well as HTML versions of items, implement quite a number of cosmetic enhancements, and most importantly, support the means to compare & contrast items of interest in each subset. 
The next steps are numerous and listed in no priority order: putting the whole thing on GitHub, outputting the reports in generic formats so other things can easily read them, improving the terminal-based search interface, implementing a Web-based search interface, writing advanced programs in R that chart and graph analysis, provide a means for comparing & contrasting two or more items from a corpus, indexing the corpus with a (real) indexer such as Solr, writing a "cookbook" describing how to use the browser to to "kewl" things, making the metadata of corpora available as Linked Data, etc. id: planet-infomotions-com-9545 author: title: Planet Eric Lease Morgan date: words: 3844.0 sentences: 362.0 pages: flesch: 57.0 cache: ./cache/planet-infomotions-com-9545.xml txt: ./txt/planet-infomotions-com-9545.txt summary: Rome in three days, an archivists introduction to linked data publishing Questions from a library science student about RDF and linked data Publishing archival descriptions as linked data via databases Simple linked data recipe for libraries, museums, and archives Cloud-sourcing Research Collections: Managing Print in the Mass-digitized Library Environment Selected Internet Resources on Digital Research Data Curation Open source software and libraries: A current SWOT analysis Web-scale discovery indexes and "next generation" library catalogs MyLibrary: A Digital library framework & toolbox Open Library Developer''s Meeting: One Web Page for Every Book Ever Published Open source software at the Montana State University Libraries Symposium Open source software for libraries in 30 minutes Exploiting "Light-weight" Protocols and Open Source Tools to Implement Digital Library Collections and Services Open source software in libraries: A workshop Open source software in libraries Open source software in libraries Open source software in libraries id: serials-infomotions-com-5908 author: title: Index of / date: words: 343.0 sentences: 28.0 pages: flesch: 78.0 
cache: ./cache/serials-infomotions-com-5908.html txt: ./txt/serials-infomotions-com-5908.txt summary: Index of / Serials Serials Electronic serials This is a loose collection of electronic journals (serials), mostly from the area of library science. As a librarian this sort of information interests me and that is why it has been collected. The process to create this collection has been coined the Mr. Serials Process. Since fewer and fewer electronic serials are distributed via electronic mail, the Mr. Serials Process is slowly becoming obsolete, but for some things it still works just fine. Read more about the Mr. Serials Process in Eric Lease Morgan "Description and Evaluation of the ''Mr. Serials'' Process: Automatically Collecting, Organizing, Archiving, Indexing, and Disseminating Electronic Serials" Serials Review 21 no. For the latest information regarding Mr. Serials see "Mr. Serials is Dead. Long live Mr. Serials." dated January 11, 2009. Name Last modified Size Description Author: Eric Lease Morgan Date created: 1992-06-21 Date updated: 2009-01-12 URL: http://serials.infomotions.com id: sites-tufts-edu-6731 author: title: Comments on: date: words: 10.0 sentences: 2.0 pages: flesch: 91.0 cache: ./cache/sites-tufts-edu-6731.xml txt: ./txt/sites-tufts-edu-6731.txt summary: id: stedolan-github-io-4569 author: title: jq date: words: 301.0 sentences: 35.0 pages: flesch: 85.0 cache: ./cache/stedolan-github-io-4569.html txt: ./txt/stedolan-github-io-4569.txt summary: Tutorial Manual Source Try online! Linux (64-bit) OS X (64-bit) Windows (64-bit) Other platforms, older versions, and source Try online at jqplay.org! jq is like sed for JSON data you can use it to slice and filter and map and transform structured data with the same ease that sed, You can download a single binary, scp it to a far away machine of the same type, and expect it to work. jq can mangle the data format that you have into the one that you shorter and simpler than you''d expect. 
Go read the tutorial for more, or the manual See installation options on the download page, and the release notes jq 1.5 released, including new datetime, math, and regexp functions, See installation options on the release notes releases page. releases page. jq 1.4 (finally) released! Get it on the download page. Get it on the download page. jq 1.3 released. jq 1.3 released. id: tika-apache-org-2948 author: title: Apache Tika – Apache Tika date: words: 3460.0 sentences: 345.0 pages: flesch: 80.0 cache: ./cache/tika-apache-org-2948.html txt: ./txt/tika-apache-org-2948.txt summary: This release includes a new artifact to enable starting tika-server as a service via Eric Pugh, improved detection of zip-based formats, more complex PDF processing options, security fixes and numerous bug fixes and dependency upgrades. Please see the CHANGES.txt file for a full list of changes in this release, and have a look at the download page for more information on how to obtain Apache Tika 1.2. Please see the CHANGES.txt file for a full list of changes in this release, and have a look at the download page for more information on how to obtain Apache Tika 1.2. Please see the CHANGES.txt file for a full list of changes in this release, and have a look at the download page for more information on how to obtain Apache Tika 1.2. Please see the CHANGES.txt file for a full list of changes in this release, and have a look at the download page for more information on how to obtain Apache Tika 1.2. id: twitter-com-9838 author: title: twitter-com-9838 date: words: 32.0 sentences: 6.0 pages: flesch: 90.0 cache: ./cache/twitter-com-9838.html txt: ./txt/twitter-com-9838.txt summary: We''ve detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter? Yes Something went wrong, but don''t fret — let''s give it another shot. 
id: www-gnu-org-8892 author: title: GNU Parallel - GNU Project - Free Software Foundation date: words: 1214.0 sentences: 145.0 pages: flesch: 73.0 cache: ./cache/www-gnu-org-8892.html txt: ./txt/www-gnu-org-8892.txt summary: GNU Project Free Software Foundation GNU parallel can then split the input and pipe it into If you use xargs and tee today you will find GNU parallel very easy to use as GNU parallel is written to have the same options as xargs. GNU parallel makes sure output from the commands is the same output as possible to use output from GNU parallel as input for other programs. For each line of input GNU parallel will execute command with If you prefer reading a book buy GNU Parallel 2018 at https://www.lulu.com/shop/ole-tange/gnu-parallel-2018/paperback/product-23558902.html of OPTIONS in man parallel (Use LESS=+/EXAMPLE: https://www.gnu.org/software/parallel/parallel_cheat.pdf For alternatives to GNU parallel, see GNU parallel, see: man parallel_design The GNU Parallel Citation FAQ. GNU parallel has two mailing lists: for discussing uses of GNU parallel. You can show your support for GNU parallel using our merchandise. O. Tange (2018): GNU Parallel 2018, March 2018, https://doi.org/10.5281/zenodo.1146014. id: www-laurenceanthony-net-8779 author: title: Laurence Anthony''s AntConc date: words: 797.0 sentences: 122.0 pages: flesch: 65.0 cache: ./cache/www-laurenceanthony-net-8779.html txt: ./txt/www-laurenceanthony-net-8779.txt summary: Laurence Anthony''s AntConc All previous releases of AntConc can be found at the following link. AntConc 3.2.1 Tutorial (in English) Latest version available here. For example if you download AntConc 3.5.8, which was released in 2019, you would cite/reference it as follows: AntConc (Version 3.5.8) [Computer Software]. These lists can be imported into AntConc and used as reference corpora word lists to create keyword lists. 
Brown Corpus word frequency list (lowercase) These can be imported into AntConc to create lemma word lists. An English lemma list based on all words in the BNC corpus with a frequency greater than 2 (created by Laurence Anthony). To use this list, *append* a hyphen (-) and apostrophe ('') character to the AntConc token definition to ensure the processed correctly (see global settings). To use this list, *append* a hyphen (-) and apostrophe ('') character to the AntConc token definition (see global settings). id: youtu-be-1944 author: title: Michael Hart in Roanoke (Indiana) - YouTube date: words: 22.0 sentences: 10.0 pages: flesch: 81.0 cache: ./cache/youtu-be-1944.html txt: ./txt/youtu-be-1944.txt summary: Press Copyright Contact us Creators Advertise Developers Terms Privacy Policy & Safety How YouTube works Test new features © 2020 Google LLC id: infomotions-com-953 author: Eric Lease Morgan title: Infomotions'' Musings on Information and Librarianship date: words: 355.0 sentences: 26.0 pages: flesch: 63.0 cache: ./cache/infomotions-com-953.html txt: ./txt/infomotions-com-953.txt summary: Browse by date Browse by subject Infomotions'' Musings on Information and Librarianship Infomotions'' Musings on Information and Librarianship This is a collection of the things I''ve written -my musings. It includes pre-edited as well as formally published articles, travel logs, descriptions of software applications, and the hand-outs of workshops and presentations. Adding Internet resources to our OPACs Description: This essay advocates the addition of bibliographic records describing Internet-based electronic serials and Internet resources in general to library online public access catalogs (OPAC), addresses a few implications of this proposition, and finally, suggests a few solutions to accomplish this goal. 
Subject(s): cataloging; articles; URL: http://infomotions.com/musings/adding-internet-resources/ A lot of the time, this means thinking, studying, writing, sharing, and repeating the process. I believe it is important to share one''s ideas freely. This collection is a manifestation of that idea. To these ends I am sharing the texts in this collection with you. URL: http://infomotions.com/musings/ ==== make-pages.sh questions ==== make-pages.sh search ==== make-pages.sh topic modeling corpus Zipping study carrel Done