Creating study carrel named neuroscience-from-bioarxiv
Initializing database
Creating cache from Bioarxiv xml file
10.1101/2020.09.21.305516
10_1101-2020_09_21_305516.pdf

10.1101/2021.02.10.430649
10_1101-2021_02_10_430649.pdf

10.1101/2021.02.12.431018
10_1101-2021_02_12_431018.pdf

10.1101/2021.02.11.430847
10_1101-2021_02_11_430847.pdf

10.1101/2021.02.11.430871
10_1101-2021_02_11_430871.pdf

10.1101/2021.02.12.430963
10_1101-2021_02_12_430963.pdf

10.1101/2021.02.13.429885
10_1101-2021_02_13_429885.pdf

10.1101/2021.02.12.430830
10_1101-2021_02_12_430830.pdf

10.1101/2021.02.12.430739
10_1101-2021_02_12_430739.pdf

10.1101/2021.02.12.430764
10_1101-2021_02_12_430764.pdf

10.1101/2020.01.28.923532
10_1101-2020_01_28_923532.pdf

10.1101/2021.02.11.430762
10_1101-2021_02_11_430762.pdf

10.1101/2020.09.23.308239
10_1101-2020_09_23_308239.pdf

10.1101/2020.09.23.310276
10_1101-2020_09_23_310276.pdf

10.1101/2021.02.11.430806
10_1101-2021_02_11_430806.pdf

10.1101/2021.02.11.430695
10_1101-2021_02_11_430695.pdf

10.1101/2021.02.12.430989
10_1101-2021_02_12_430989.pdf

10.1101/2021.02.12.430979
10_1101-2021_02_12_430979.pdf

10.1101/2021.02.12.430923
10_1101-2021_02_12_430923.pdf

10.1101/727867
10_1101-727867.pdf

10.1101/2020.12.24.424317
10_1101-2020_12_24_424317.pdf

10.1101/2020.10.08.327718
10_1101-2020_10_08_327718.pdf

10.1101/2021.02.09.430550
10_1101-2021_02_09_430550.pdf

10.1101/2021.02.10.430656
10_1101-2021_02_10_430656.pdf

10.1101/2021.02.10.430619
10_1101-2021_02_10_430619.pdf

10.1101/2021.02.11.430789
10_1101-2021_02_11_430789.pdf

10.1101/2021.02.10.430512
10_1101-2021_02_10_430512.pdf

10.1101/2021.02.10.430705
10_1101-2021_02_10_430705.pdf

10.1101/2021.02.10.430606
10_1101-2021_02_10_430606.pdf

10.1101/698605
10_1101-698605.pdf

10.1101/2021.02.10.430563
10_1101-2021_02_10_430563.pdf

10.1101/2020.11.17.386649
10_1101-2020_11_17_386649.pdf

10.1101/2020.05.15.090266
10_1101-2020_05_15_090266.pdf

10.1101/2021.02.01.429246
10_1101-2021_02_01_429246.pdf

10.1101/2021.02.10.430623
10_1101-2021_02_10_430623.pdf

10.1101/2021.02.09.430405
10_1101-2021_02_09_430405.pdf

10.1101/2021.02.10.430367
10_1101-2021_02_10_430367.pdf

10.1101/2021.02.09.430536
10_1101-2021_02_09_430536.pdf

10.1101/2021.02.09.430363
10_1101-2021_02_09_430363.pdf

10.1101/2021.02.09.430460
10_1101-2021_02_09_430460.pdf

10.1101/2021.02.08.430070
10_1101-2021_02_08_430070.pdf

10.1101/2021.02.09.430036
10_1101-2021_02_09_430036.pdf

10.1101/2021.02.08.428881
10_1101-2021_02_08_428881.pdf

10.1101/2021.02.08.430343
10_1101-2021_02_08_430343.pdf

10.1101/2021.02.08.430275
10_1101-2021_02_08_430275.pdf

10.1101/2021.02.08.430270
10_1101-2021_02_08_430270.pdf

10.1101/2021.02.10.430604
10_1101-2021_02_10_430604.pdf

10.1101/2021.02.08.430280
10_1101-2021_02_08_430280.pdf

10.1101/2020.09.02.279521
10_1101-2020_09_02_279521.pdf

10.1101/2020.02.04.934216
10_1101-2020_02_04_934216.pdf

2021-02-14 21:22:16 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.431018v1.full.pdf [149895] -> "./cache/10_1101-2021_02_12_431018.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430764v1.full.pdf [730138] -> "./cache/10_1101-2021_02_12_430764.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430979v1.full.pdf [842591] -> "./cache/10_1101-2021_02_12_430979.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430604v1.full.pdf [679851] -> "./cache/10_1101-2021_02_10_430604.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2020.12.24.424317v2.full.pdf [209489] -> "./cache/10_1101-2020_12_24_424317.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430963v1.full.pdf [696163] -> "./cache/10_1101-2021_02_12_430963.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.11.430695v1.full.pdf [1150493] -> "./cache/10_1101-2021_02_11_430695.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430830v1.full.pdf [643872] -> "./cache/10_1101-2021_02_12_430830.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.11.430762v1.full.pdf [378049] -> "./cache/10_1101-2021_02_11_430762.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430619v1.full.pdf [661726] -> "./cache/10_1101-2021_02_10_430619.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430989v1.full.pdf [427940] -> "./cache/10_1101-2021_02_12_430989.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.11.430806v1.full.pdf [577783] -> "./cache/10_1101-2021_02_11_430806.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2020.01.28.923532v3.full.pdf [1791954] -> "./cache/10_1101-2020_01_28_923532.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430923v1.full.pdf [715550] -> "./cache/10_1101-2021_02_12_430923.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.09.430036v1.full.pdf [544464] -> "./cache/10_1101-2021_02_09_430036.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2020.05.15.090266v2.full.pdf [1203677] -> "./cache/10_1101-2020_05_15_090266.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.08.430270v1.full.pdf [804427] -> "./cache/10_1101-2021_02_08_430270.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.01.429246v2.full.pdf [877153] -> "./cache/10_1101-2021_02_01_429246.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.08.430275v1.full.pdf [1459264] -> "./cache/10_1101-2021_02_08_430275.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430367v1.full.pdf [689268] -> "./cache/10_1101-2021_02_10_430367.pdf" [1]
2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430512v1.full.pdf [1286961] -> "./cache/10_1101-2021_02_10_430512.pdf" [1]
2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2020.11.17.386649v2.full.pdf [956127] -> "./cache/10_1101-2020_11_17_386649.pdf" [1]
2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2021.02.08.430280v1.full.pdf [1963137] -> "./cache/10_1101-2021_02_08_430280.pdf" [1]
2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2020.09.02.279521v5.full.pdf [1869837] -> "./cache/10_1101-2020_09_02_279521.pdf" [1]
2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2021.02.11.430871v1.full.pdf [1785729] -> "./cache/10_1101-2021_02_11_430871.pdf" [1]
2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430656v1.full.pdf [2480984] -> "./cache/10_1101-2021_02_10_430656.pdf" [1]
2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2021.02.08.430343v1.full.pdf [1886281] -> "./cache/10_1101-2021_02_08_430343.pdf" [1]
2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2020.10.08.327718v2.full.pdf [2805772] -> "./cache/10_1101-2020_10_08_327718.pdf" [1]
2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430705v1.full.pdf [4222890] -> "./cache/10_1101-2021_02_10_430705.pdf" [1]
2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/698605v3.full.pdf [1986213] -> "./cache/10_1101-698605.pdf" [1]
2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2021.02.11.430847v1.full.pdf [3059886] -> "./cache/10_1101-2021_02_11_430847.pdf" [1]
2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430649v2.full.pdf [4971886] -> "./cache/10_1101-2021_02_10_430649.pdf" [1]
2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430623v1.full.pdf [1966479] -> "./cache/10_1101-2021_02_10_430623.pdf" [1]
2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.08.430070v1.full.pdf [4088950] -> "./cache/10_1101-2021_02_08_430070.pdf" [1]
2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.09.430536v1.full.pdf [2648870] -> "./cache/10_1101-2021_02_09_430536.pdf" [1]
2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.09.430405v1.full.pdf [3503948] -> "./cache/10_1101-2021_02_09_430405.pdf" [1]
2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.09.430550v2.full.pdf [4191549] -> "./cache/10_1101-2021_02_09_430550.pdf" [1]
2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.09.430460v1.full.pdf [4089353] -> "./cache/10_1101-2021_02_09_430460.pdf" [1]
2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430563v1.full.pdf [3094960] -> "./cache/10_1101-2021_02_10_430563.pdf" [1]
2021-02-14 21:22:20 URL:https://www.biorxiv.org/content/10.1101/2021.02.13.429885v1.full.pdf [6052365] -> "./cache/10_1101-2021_02_13_429885.pdf" [1]
2021-02-14 21:22:20 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430739v1.full.pdf [4157988] -> "./cache/10_1101-2021_02_12_430739.pdf" [1]
2021-02-14 21:22:20 URL:https://www.biorxiv.org/content/10.1101/2021.02.08.428881v1.full.pdf [8688452] -> "./cache/10_1101-2021_02_08_428881.pdf" [1]
2021-02-14 21:22:21 URL:https://www.biorxiv.org/content/10.1101/2020.02.04.934216v2.full.pdf [9321917] -> "./cache/10_1101-2020_02_04_934216.pdf" [1]
2021-02-14 21:22:21 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430606v1.full.pdf [7489918] -> "./cache/10_1101-2021_02_10_430606.pdf" [1]
2021-02-14 21:22:22 URL:https://www.biorxiv.org/content/10.1101/727867v2.full.pdf [10837058] -> "./cache/10_1101-727867.pdf" [1]
2021-02-14 21:22:22 URL:https://www.biorxiv.org/content/10.1101/2020.09.21.305516v2.full.pdf [9506384] -> "./cache/10_1101-2020_09_21_305516.pdf" [1]
2021-02-14 21:22:27 URL:https://www.biorxiv.org/content/10.1101/2021.02.11.430789v1.full.pdf [591073] -> "./cache/10_1101-2021_02_11_430789.pdf" [1]
2021-02-14 21:22:29 URL:https://www.biorxiv.org/content/10.1101/2020.09.23.308239v4.full.pdf [22093690] -> "./cache/10_1101-2020_09_23_308239.pdf" [1]
2021-02-14 21:22:30 URL:https://www.biorxiv.org/content/10.1101/2020.09.23.310276v3.full.pdf [6105307] -> "./cache/10_1101-2020_09_23_310276.pdf" [1]
2021-02-14 21:22:34 URL:https://www.biorxiv.org/content/10.1101/2021.02.09.430363v1.full.pdf [19076613] -> "./cache/10_1101-2021_02_09_430363.pdf" [1]
=== updating bibliographic database
Building study carrel named neuroscience-from-bioarxiv
  FILE: cache/10_1101-2021_02_12_431018.pdf
OUTPUT: txt/10_1101-2021_02_12_431018.txt
  FILE: cache/10_1101-2020_12_24_424317.pdf
OUTPUT: txt/10_1101-2020_12_24_424317.txt
  FILE: cache/10_1101-2021_02_10_430619.pdf
OUTPUT: txt/10_1101-2021_02_10_430619.txt
  FILE: cache/10_1101-2021_02_12_430830.pdf
OUTPUT: txt/10_1101-2021_02_12_430830.txt
  FILE: cache/10_1101-2021_02_12_430979.pdf
OUTPUT: txt/10_1101-2021_02_12_430979.txt
  FILE: cache/10_1101-2020_09_21_305516.pdf
OUTPUT: txt/10_1101-2020_09_21_305516.txt
  FILE: cache/10_1101-2021_02_12_430923.pdf
OUTPUT: txt/10_1101-2021_02_12_430923.txt
  FILE: cache/10_1101-2020_05_15_090266.pdf
OUTPUT: txt/10_1101-2020_05_15_090266.txt
  FILE: cache/10_1101-2021_02_11_430871.pdf
OUTPUT: txt/10_1101-2021_02_11_430871.txt
  FILE: cache/10_1101-2021_02_11_430695.pdf
OUTPUT: txt/10_1101-2021_02_11_430695.txt
  FILE: cache/10_1101-2021_02_11_430806.pdf
OUTPUT: txt/10_1101-2021_02_11_430806.txt
  FILE: cache/10_1101-2021_02_11_430847.pdf
OUTPUT: txt/10_1101-2021_02_11_430847.txt
  FILE: cache/10_1101-2021_02_12_430963.pdf
OUTPUT: txt/10_1101-2021_02_12_430963.txt
  FILE: cache/10_1101-2021_02_08_430070.pdf
OUTPUT: txt/10_1101-2021_02_08_430070.txt
  FILE: cache/10_1101-2021_02_10_430367.pdf
OUTPUT: txt/10_1101-2021_02_10_430367.txt
  FILE: cache/10_1101-2021_02_09_430036.pdf
OUTPUT: txt/10_1101-2021_02_09_430036.txt
  FILE: cache/10_1101-698605.pdf
OUTPUT: txt/10_1101-698605.txt
  FILE: cache/10_1101-2021_02_11_430789.pdf
OUTPUT: txt/10_1101-2021_02_11_430789.txt
  FILE: cache/10_1101-2020_11_17_386649.pdf
OUTPUT: txt/10_1101-2020_11_17_386649.txt
  FILE: cache/10_1101-2021_02_12_430989.pdf
OUTPUT: txt/10_1101-2021_02_12_430989.txt
  FILE: cache/10_1101-2021_02_12_430739.pdf
OUTPUT: txt/10_1101-2021_02_12_430739.txt
  FILE: cache/10_1101-2021_02_11_430762.pdf
OUTPUT: txt/10_1101-2021_02_11_430762.txt
  FILE: cache/10_1101-2021_02_10_430604.pdf
OUTPUT: txt/10_1101-2021_02_10_430604.txt
  FILE: cache/10_1101-2021_02_10_430656.pdf
OUTPUT: txt/10_1101-2021_02_10_430656.txt
  FILE: cache/10_1101-2021_02_10_430512.pdf
OUTPUT: txt/10_1101-2021_02_10_430512.txt
  FILE: cache/10_1101-2020_10_08_327718.pdf
OUTPUT: txt/10_1101-2020_10_08_327718.txt
  FILE: cache/10_1101-2020_09_23_310276.pdf
OUTPUT: txt/10_1101-2020_09_23_310276.txt
  FILE: cache/10_1101-2021_02_09_430363.pdf
OUTPUT: txt/10_1101-2021_02_09_430363.txt
  FILE: cache/10_1101-2021_02_10_430623.pdf
OUTPUT: txt/10_1101-2021_02_10_430623.txt
  FILE: cache/10_1101-2021_02_10_430606.pdf
OUTPUT: txt/10_1101-2021_02_10_430606.txt
  FILE: cache/10_1101-2020_02_04_934216.pdf
OUTPUT: txt/10_1101-2020_02_04_934216.txt
  FILE: cache/10_1101-2021_02_09_430550.pdf
OUTPUT: txt/10_1101-2021_02_09_430550.txt
  FILE: cache/10_1101-2021_02_08_430270.pdf
OUTPUT: txt/10_1101-2021_02_08_430270.txt
  FILE: cache/10_1101-2021_02_08_430343.pdf
OUTPUT: txt/10_1101-2021_02_08_430343.txt
  FILE: cache/10_1101-2021_02_01_429246.pdf
OUTPUT: txt/10_1101-2021_02_01_429246.txt
  FILE: cache/10_1101-2020_09_02_279521.pdf
OUTPUT: txt/10_1101-2020_09_02_279521.txt
  FILE: cache/10_1101-2021_02_08_430280.pdf
OUTPUT: txt/10_1101-2021_02_08_430280.txt
  FILE: cache/10_1101-2021_02_12_430764.pdf
OUTPUT: txt/10_1101-2021_02_12_430764.txt
  FILE: cache/10_1101-2020_01_28_923532.pdf
OUTPUT: txt/10_1101-2020_01_28_923532.txt
  FILE: cache/10_1101-2021_02_09_430536.pdf
OUTPUT: txt/10_1101-2021_02_09_430536.txt
  FILE: cache/10_1101-2021_02_10_430705.pdf
OUTPUT: txt/10_1101-2021_02_10_430705.txt
  FILE: cache/10_1101-2021_02_13_429885.pdf
OUTPUT: txt/10_1101-2021_02_13_429885.txt
  FILE: cache/10_1101-2021_02_09_430405.pdf
OUTPUT: txt/10_1101-2021_02_09_430405.txt
  FILE: cache/10_1101-2021_02_08_430275.pdf
OUTPUT: txt/10_1101-2021_02_08_430275.txt
  FILE: cache/10_1101-2021_02_10_430563.pdf
OUTPUT: txt/10_1101-2021_02_10_430563.txt
  FILE: cache/10_1101-2021_02_09_430460.pdf
OUTPUT: txt/10_1101-2021_02_09_430460.txt
  FILE: cache/10_1101-2021_02_10_430649.pdf
OUTPUT: txt/10_1101-2021_02_10_430649.txt
  FILE: cache/10_1101-2021_02_08_428881.pdf
OUTPUT: txt/10_1101-2021_02_08_428881.txt
  FILE: cache/10_1101-727867.pdf
OUTPUT: txt/10_1101-727867.txt
  FILE: cache/10_1101-2020_09_23_308239.pdf
OUTPUT: txt/10_1101-2020_09_23_308239.txt
=== file2bib.sh ===
         id: 10_1101-2021_02_10_430604
     author: Youngblut, Nicholas D.
      title: Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets
       date: 2021
      pages: 4
  extension: .pdf
        txt: ./txt/10_1101-2021_02_10_430604.txt
      cache: ./cache/10_1101-2021_02_10_430604.pdf

Content-Type	application/pdf
Creation-Date	2021-02-10T14:11:24Z
Keywords	
Last-Modified	2021-02-14T18:25:42Z
Last-Save-Date	2021-02-14T18:25:42Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	109
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-10T14:11:24Z
date	2021-02-14T18:25:42Z
dc:format	application/pdf; version=1.4
dc:subject	
dc:title	Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets
dcterms:created	2021-02-10T14:11:24Z
dcterms:modified	2021-02-14T18:25:42Z
meta:creation-date	2021-02-10T14:11:24Z
meta:keyword	
meta:save-date	2021-02-14T18:25:42Z
modified	2021-02-14T18:25:42Z
pdf:PDFVersion	1.4
pdf:charsPerPage	['930', '4650', '2600', '3359']
pdf:docinfo:created	2021-02-10T14:11:24Z
pdf:docinfo:creator_tool	Chrome
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T18:25:42Z
pdf:docinfo:producer	macOS Version 10.14.6 (Build 18G7016) Quartz PDFContext
pdf:docinfo:title	Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0']
producer	macOS Version 10.14.6 (Build 18G7016) Quartz PDFContext
resourceName	b'10_1101-2021_02_10_430604.pdf'
subject	
title	Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets
xmp:CreatorTool	Chrome
xmpMM:DocumentID	uuid:46a164ee-1dd2-11b2-0a00-ef0927edca00
xmpTPg:NPages	4
=== file2bib.sh ===
         id: 10_1101-2020_05_15_090266
     author: Zhang, R.
      title: SpacePHARER: Sensitive identification of phages from CRISPR spacers in prokaryotic hosts
       date: 2021
      pages: 6
  extension: .pdf
        txt: ./txt/10_1101-2020_05_15_090266.txt
      cache: ./cache/10_1101-2020_05_15_090266.pdf

Appligent	AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct  2 2014 Library 10.1.0
Content-Type	application/pdf
Creation-Date	2021-02-10T16:31:46Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	43
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-10T16:31:46Z
date	2021-02-14T21:22:17Z
dc:format	application/pdf; version=1.5
dc:title	34320483
dcterms:created	2021-02-10T16:31:46Z
dcterms:modified	2021-02-14T21:22:17Z
meta:creation-date	2021-02-10T16:31:46Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['4925', '4142', '2075', '344', '344', '344']
pdf:docinfo:created	2021-02-10T16:31:46Z
pdf:docinfo:creator_tool	Appligent AppendPDF Pro 5.5
pdf:docinfo:custom:Appligent	AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct  2 2014 Library 10.1.0
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	xdvipdfmx (20200315)
pdf:docinfo:title	34320483
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0']
producer	xdvipdfmx (20200315)
resourceName	b'10_1101-2020_05_15_090266.pdf'
title	34320483
xmp:CreatorTool	Appligent AppendPDF Pro 5.5
xmpMM:DocumentID	uuid:11f1d5d2-b085-11b2-0a00-782dad000000
xmpTPg:NPages	6
=== file2bib.sh ===
         id: 10_1101-2021_02_12_431018
     author: Truong Nguyen, Phuoc
      title: HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences.
       date: 2021
      pages: 14
  extension: .pdf
        txt: ./txt/10_1101-2021_02_12_431018.txt
      cache: ./cache/10_1101-2021_02_12_431018.pdf

Content-Type	application/pdf
Creation-Date	2021-02-12T22:55:36Z
Last-Modified	2021-02-14T19:16:35Z
Last-Save-Date	2021-02-14T19:16:35Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	393
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-12T22:55:36Z
date	2021-02-14T19:16:35Z
dc:format	application/pdf; version=1.4
dc:title	HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences.
dcterms:created	2021-02-12T22:55:36Z
dcterms:modified	2021-02-14T19:16:35Z
meta:creation-date	2021-02-12T22:55:36Z
meta:save-date	2021-02-14T19:16:35Z
modified	2021-02-14T19:16:35Z
pdf:PDFVersion	1.4
pdf:charsPerPage	['1876', '2014', '2602', '2132', '1961', '953', '1972', '1160', '2014', '1880', '2254', '2405', '1553', '833']
pdf:docinfo:created	2021-02-12T22:55:36Z
pdf:docinfo:creator_tool	Word
pdf:docinfo:modified	2021-02-14T19:16:35Z
pdf:docinfo:producer	macOS Version 11.2.1 (Build 20D74) Quartz PDFContext
pdf:docinfo:title	HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences.
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	macOS Version 11.2.1 (Build 20D74) Quartz PDFContext
resourceName	b'10_1101-2021_02_12_431018.pdf'
title	HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences.
xmp:CreatorTool	Word
xmpMM:DocumentID	uuid:58d2adfc-1dd2-11b2-0a00-b30827bd7700
xmpTPg:NPages	14
=== file2bib.sh ===
         id: 10_1101-2021_02_11_430806
     author: Badaczewska-Dawid, Aleksandra
      title: BIAPSS - BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences
       date: 2021
      pages: 3
  extension: .pdf
        txt: ./txt/10_1101-2021_02_11_430806.txt
      cache: ./cache/10_1101-2021_02_11_430806.pdf

Author	
Content-Type	application/pdf
Creation-Date	2021-02-04T20:10:30Z
Keywords	
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	35
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-04T20:10:30Z
creator	
date	2021-02-14T21:22:17Z
dc:creator	
dc:format	application/pdf; version=1.5
dc:subject	
dc:title	BIAPSS - BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences
dcterms:created	2021-02-04T20:10:30Z
dcterms:modified	2021-02-14T21:22:17Z
meta:author	
meta:creation-date	2021-02-04T20:10:30Z
meta:keyword	
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['4645', '5732', '5920']
pdf:docinfo:created	2021-02-04T20:10:30Z
pdf:docinfo:creator	
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	pdfTeX-1.40.20
pdf:docinfo:subject	
pdf:docinfo:title	BIAPSS - BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['2', '0', '0']
producer	pdfTeX-1.40.20
resourceName	b'10_1101-2021_02_11_430806.pdf'
subject	
title	BIAPSS - BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences
trapped	False
xmp:CreatorTool	LaTeX with hyperref
xmpMM:DocumentID	uuid:85c70edf-1dd2-11b2-0a00-020a27bd7700
xmpTPg:NPages	3
=== file2bib.sh ===
         id: 10_1101-2021_02_09_430036
     author: Goldsborough, Thibaut
      title: A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea
       date: 2021
      pages: 7
  extension: .pdf
        txt: ./txt/10_1101-2021_02_09_430036.txt
      cache: ./cache/10_1101-2021_02_09_430036.pdf

Author	Thibaut Gold
Content-Type	application/pdf
Creation-Date	2021-02-09T17:39:12Z
Keywords	
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	44
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-09T17:39:12Z
creator	Thibaut Gold
date	2021-02-14T21:22:17Z
dc:creator	Thibaut Gold
dc:format	application/pdf; version=1.4
dc:subject	
dc:title	A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea
dcterms:created	2021-02-09T17:39:12Z
dcterms:modified	2021-02-14T21:22:17Z
meta:author	Thibaut Gold
meta:creation-date	2021-02-09T17:39:12Z
meta:keyword	
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.4
pdf:charsPerPage	['3613', '3079', '2428', '2810', '1917', '3361', '1048']
pdf:docinfo:created	2021-02-09T17:39:12Z
pdf:docinfo:creator	Thibaut Gold
pdf:docinfo:creator_tool	Word
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	Mac OS X 10.13.6 Quartz PDFContext
pdf:docinfo:subject	
pdf:docinfo:title	A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0']
producer	Mac OS X 10.13.6 Quartz PDFContext
resourceName	b'10_1101-2021_02_09_430036.pdf'
subject	
title	A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea
xmp:CreatorTool	Word
xmpMM:DocumentID	uuid:85c6fc4a-1dd2-11b2-0a00-ca0827edca00
xmpTPg:NPages	7
=== file2bib.sh ===
         id: 10_1101-2021_02_10_430367
     author: Chen, Meili
      title: Genome Warehouse: A Public Repository Housing Genome-scale Data
       date: 2021
      pages: 18
  extension: .pdf
        txt: ./txt/10_1101-2021_02_10_430367.txt
      cache: ./cache/10_1101-2021_02_10_430367.pdf

Appligent	AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct  2 2014 Library 10.1.0
Content-Type	application/pdf
Creation-Date	2021-02-11T01:56:12Z
Last-Modified	2021-02-14T20:28:35Z
Last-Save-Date	2021-02-14T20:28:35Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	50
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-11T01:56:12Z
date	2021-02-14T20:28:35Z
dc:format	application/pdf; version=1.5
dc:title	9691071
dcterms:created	2021-02-11T01:56:12Z
dcterms:modified	2021-02-14T20:28:35Z
meta:creation-date	2021-02-11T01:56:12Z
meta:save-date	2021-02-14T20:28:35Z
modified	2021-02-14T20:28:35Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['1753', '469', '1895', '2624', '2620', '2657', '2533', '2599', '2634', '2173', '753', '3213', '1644', '926', '1127', '344', '344', '344']
pdf:docinfo:created	2021-02-11T01:56:12Z
pdf:docinfo:creator_tool	Appligent AppendPDF Pro 5.5
pdf:docinfo:custom:Appligent	AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct  2 2014 Library 10.1.0
pdf:docinfo:modified	2021-02-14T20:28:35Z
pdf:docinfo:producer	Acrobat Distiller 9.0.0 (Windows)
pdf:docinfo:title	9691071
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Acrobat Distiller 9.0.0 (Windows)
resourceName	b'10_1101-2021_02_10_430367.pdf'
title	9691071
xmp:CreatorTool	Appligent AppendPDF Pro 5.5
xmpMM:DocumentID	uuid:dbcd43fb-b085-11b2-0a00-782dad000000
xmpTPg:NPages	18
=== file2bib.sh ===
         id: 10_1101-2021_02_10_430619
     author: Schutz, Sacha
      title: Cutevariant: a GUI-based desktop application to explore genetics variations
       date: 2021
      pages: 8
  extension: .pdf
        txt: ./txt/10_1101-2021_02_10_430619.txt
      cache: ./cache/10_1101-2021_02_10_430619.pdf

Author	
Content-Type	application/pdf
Creation-Date	2021-02-10T23:59:47Z
Keywords	
Last-Modified	2021-02-14T21:22:16Z
Last-Save-Date	2021-02-14T21:22:16Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	68
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-10T23:59:47Z
creator	
date	2021-02-14T21:22:16Z
dc:creator	
dc:format	application/pdf; version=1.5
dc:subject	
dc:title	Cutevariant: a GUI-based desktop application to explore genetics variations
dcterms:created	2021-02-10T23:59:47Z
dcterms:modified	2021-02-14T21:22:16Z
meta:author	
meta:creation-date	2021-02-10T23:59:47Z
meta:keyword	
meta:save-date	2021-02-14T21:22:16Z
modified	2021-02-14T21:22:16Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['4040', '3428', '2261', '3414', '3216', '4578', '5638', '627']
pdf:docinfo:created	2021-02-10T23:59:47Z
pdf:docinfo:creator	
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:16Z
pdf:docinfo:producer	pdfTeX-1.40.21
pdf:docinfo:subject	
pdf:docinfo:title	Cutevariant: a GUI-based desktop application to explore genetics variations
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '1', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.21
resourceName	b'10_1101-2021_02_10_430619.pdf'
subject	
title	Cutevariant: a GUI-based desktop application to explore genetics variations
trapped	False
xmp:CreatorTool	LaTeX with hyperref
xmpMM:DocumentID	uuid:85c6c1a1-1dd2-11b2-0a00-0e0a27fd5800
xmpTPg:NPages	8
=== file2bib.sh ===
         id: 10_1101-2020_12_24_424317
     author: Muazzam, Fariha
      title: Multi-class Cancer Classification and Biomarker Identification using Deep Learning
       date: 2021
      pages: 12
  extension: .pdf
        txt: ./txt/10_1101-2020_12_24_424317.txt
      cache: ./cache/10_1101-2020_12_24_424317.pdf

Content-Type	application/pdf
Creation-Date	2021-02-11T09:45:39Z
Last-Modified	2021-02-14T21:22:16Z
Last-Save-Date	2021-02-14T21:22:16Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	71
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-11T09:45:39Z
date	2021-02-14T21:22:16Z
dc:format	application/pdf; version=1.6
dc:language	en-GB
dc:title	Multi-class Cancer Classification and Biomarker Identification using Deep Learning
dcterms:created	2021-02-11T09:45:39Z
dcterms:modified	2021-02-14T21:22:16Z
language	en-GB
meta:creation-date	2021-02-11T09:45:39Z
meta:save-date	2021-02-14T21:22:16Z
modified	2021-02-14T21:22:16Z
pdf:PDFVersion	1.6
pdf:charsPerPage	['2052', '4575', '4008', '1734', '1778', '1769', '1698', '1509', '909', '2553', '2652', '3042']
pdf:docinfo:created	2021-02-11T09:45:39Z
pdf:docinfo:creator_tool	Writer
pdf:docinfo:modified	2021-02-14T21:22:16Z
pdf:docinfo:producer	LibreOffice 7.0
pdf:docinfo:title	Multi-class Cancer Classification and Biomarker Identification using Deep Learning
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	LibreOffice 7.0
resourceName	b'10_1101-2020_12_24_424317.pdf'
title	Multi-class Cancer Classification and Biomarker Identification using Deep Learning
xmp:CreatorTool	Writer
xmpMM:DocumentID	uuid:85c6c437-1dd2-11b2-0a00-2c09276d7200
xmpTPg:NPages	12
=== file2bib.sh ===
         id: 10_1101-2021_02_09_430405
     author: Quazi, Sameer
      title: In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD
       date: 2021
      pages: 23
  extension: .pdf
        txt: ./txt/10_1101-2021_02_09_430405.txt
      cache: ./cache/10_1101-2021_02_09_430405.pdf

Author	Administrator
Content-Type	application/pdf
Creation-Date	2021-02-09T11:42:11Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	290
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-09T11:42:11Z
creator	Administrator
date	2021-02-14T21:22:17Z
dc:creator	Administrator
dc:format	application/pdf; version=1.4
dc:title	In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD
dcterms:created	2021-02-09T11:42:11Z
dcterms:modified	2021-02-14T21:22:17Z
meta:author	Administrator
meta:creation-date	2021-02-09T11:42:11Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.4
pdf:charsPerPage	['1009', '1911', '2821', '2570', '2562', '2577', '1720', '428', '1276', '556', '1603', '1747', '1012', '1189', '1356', '1452', '1064', '2808', '1825', '1961', '1940', '1464', '342']
pdf:docinfo:created	2021-02-09T11:42:11Z
pdf:docinfo:creator	Administrator
pdf:docinfo:creator_tool	PScript5.dll Version 5.2.2
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	Acrobat Distiller 8.1.0 (Windows)
pdf:docinfo:title	In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Acrobat Distiller 8.1.0 (Windows)
resourceName	b'10_1101-2021_02_09_430405.pdf'
title	In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD
xmp:CreatorTool	PScript5.dll Version 5.2.2
xmpMM:DocumentID	uuid:85c8294e-1dd2-11b2-0a00-d70827bd3700
xmpTPg:NPages	23
=== file2bib.sh ===
         id: 10_1101-2020_09_23_310276
     author: Greenfest-Allen, Emily
      title: NIAGADS Alzheimer's GenomicsDB: A resource for exploring Alzheimer's Disease genetic and genomic knowledge
       date: 2021
      pages: 19
  extension: .pdf
        txt: ./txt/10_1101-2020_09_23_310276.txt
      cache: ./cache/10_1101-2020_09_23_310276.pdf

Appligent	AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct  2 2014 Library 10.1.0
Author	Emily Greenfest-Allen
Content-Type	application/pdf
Creation-Date	2021-02-12T15:45:35Z
Last-Modified	2021-02-14T21:22:27Z
Last-Save-Date	2021-02-14T21:22:27Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	155
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-12T15:45:35Z
creator	Emily Greenfest-Allen
date	2021-02-14T21:22:27Z
dc:creator	Emily Greenfest-Allen
dc:format	application/pdf; version=1.7
dc:language	en-US
dc:title	97992561
dcterms:created	2021-02-12T15:45:35Z
dcterms:modified	2021-02-14T21:22:27Z
language	en-US
meta:author	Emily Greenfest-Allen
meta:creation-date	2021-02-12T15:45:35Z
meta:save-date	2021-02-14T21:22:27Z
modified	2021-02-14T21:22:27Z
pdf:PDFVersion	1.7
pdf:charsPerPage	['1219', '2048', '3230', '2954', '2587', '3080', '3145', '3484', '3621', '3162', '2797', '3464', '3377', '1506', '1005', '722', '265', '309', '266']
pdf:docinfo:created	2021-02-12T15:45:35Z
pdf:docinfo:creator	Emily Greenfest-Allen
pdf:docinfo:creator_tool	Appligent AppendPDF Pro 5.5
pdf:docinfo:custom:Appligent	AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct  2 2014 Library 10.1.0
pdf:docinfo:modified	2021-02-14T21:22:27Z
pdf:docinfo:producer	Microsoft® Word for Microsoft 365
pdf:docinfo:title	97992561
pdf:encrypted	false
pdf:hasMarkedContent	true
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Microsoft® Word for Microsoft 365
resourceName	b'10_1101-2020_09_23_310276.pdf'
title	97992561
xmp:CreatorTool	Appligent AppendPDF Pro 5.5
xmpMM:DocumentID	uuid:076548d5-b089-11b2-0a00-782dad000000
xmpTPg:NPages	19
=== file2bib.sh ===
         id: 10_1101-2021_02_12_430739
     author: Malekian, Negin
      title: Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli
       date: 2021
      pages: 13
  extension: .pdf
        txt: ./txt/10_1101-2021_02_12_430739.txt
      cache: ./cache/10_1101-2021_02_12_430739.pdf

Author	 Negin Malekian, Ali Al-Fatlawi, Thomas U. Berendonk, Michael Schroeder
Content-Type	application/pdf
Creation-Date	2021-02-12T10:09:54Z
Keywords	E Coli, Quinolone, Antibiotic Resistance, Genome-Wide Association Study (GWAS)
Last-Modified	2021-02-14T21:22:16Z
Last-Save-Date	2021-02-14T21:22:16Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	104
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-12T10:09:54Z
creator	 Negin Malekian, Ali Al-Fatlawi, Thomas U. Berendonk, Michael Schroeder
date	2021-02-14T21:22:16Z
dc:creator	 Negin Malekian, Ali Al-Fatlawi, Thomas U. Berendonk, Michael Schroeder
dc:format	application/pdf; version=1.5
dc:subject	E Coli, Quinolone, Antibiotic Resistance, Genome-Wide Association Study (GWAS)
dc:title	Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli
dcterms:created	2021-02-12T10:09:54Z
dcterms:modified	2021-02-14T21:22:16Z
meta:author	 Negin Malekian, Ali Al-Fatlawi, Thomas U. Berendonk, Michael Schroeder
meta:creation-date	2021-02-12T10:09:54Z
meta:keyword	E Coli, Quinolone, Antibiotic Resistance, Genome-Wide Association Study (GWAS)
meta:save-date	2021-02-14T21:22:16Z
modified	2021-02-14T21:22:16Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['3702', '4965', '4466', '4704', '4734', '1542', '1200', '1401', '2846', '987', '740', '6975', '1619']
pdf:docinfo:created	2021-02-12T10:09:54Z
pdf:docinfo:creator	 Negin Malekian, Ali Al-Fatlawi, Thomas U. Berendonk, Michael Schroeder
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1
pdf:docinfo:keywords	E Coli, Quinolone, Antibiotic Resistance, Genome-Wide Association Study (GWAS)
pdf:docinfo:modified	2021-02-14T21:22:16Z
pdf:docinfo:producer	pdfTeX-1.40.20
pdf:docinfo:subject	
pdf:docinfo:title	Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.20
resourceName	b'10_1101-2021_02_12_430739.pdf'
subject	
title	Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli
trapped	False
xmp:CreatorTool	LaTeX with hyperref
xmpMM:DocumentID	uuid:85c69c39-1dd2-11b2-0a00-ce09271d5700
xmpTPg:NPages	13
=== file2bib.sh ===
         id: 10_1101-2021_02_08_430270
     author: Gerard, David
      title: Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty
       date: 2021
      pages: 22
  extension: .pdf
        txt: ./txt/10_1101-2021_02_08_430270.txt
      cache: ./cache/10_1101-2021_02_08_430270.pdf

Author	
Content-Type	application/pdf
Creation-Date	2021-02-06T15:17:17Z
Keywords	
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/Debian) kpathsea version 6.3.1
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	140
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-06T15:17:17Z
creator	
date	2021-02-14T21:22:17Z
dc:creator	
dc:format	application/pdf; version=1.5
dc:subject	
dc:title	Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty
dcterms:created	2021-02-06T15:17:17Z
dcterms:modified	2021-02-14T21:22:17Z
meta:author	
meta:creation-date	2021-02-06T15:17:17Z
meta:keyword	
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['2802', '2526', '2381', '3475', '1934', '1023', '1087', '826', '1646', '807', '1100', '1292', '1510', '1328', '1406', '1480', '2341', '1019', '1098', '754', '3273', '1665']
pdf:docinfo:created	2021-02-06T15:17:17Z
pdf:docinfo:creator	
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/Debian) kpathsea version 6.3.1
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	pdfTeX-1.40.20
pdf:docinfo:subject	
pdf:docinfo:title	Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '14', '14', '2', '0', '0', '0', '0', '23', '32', '30', '28', '0', '7', '16', '10', '8', '19', '0', '0', '0', '0']
producer	pdfTeX-1.40.20
resourceName	b'10_1101-2021_02_08_430270.pdf'
subject	
title	Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty
trapped	False
xmp:CreatorTool	LaTeX with hyperref
xmpMM:DocumentID	uuid:85c73f0b-1dd2-11b2-0a00-810827edca00
xmpTPg:NPages	22
=== file2bib.sh ===
         id: 10_1101-2021_02_08_430275
     author: Zhang, Jianbo
      title: Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes
       date: 2021
      pages: 6
  extension: .pdf
        txt: ./txt/10_1101-2021_02_08_430275.txt
      cache: ./cache/10_1101-2021_02_08_430275.pdf

Author	
Content-Type	application/pdf
Creation-Date	2020-11-24T15:53:05Z
Keywords	
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/Debian) kpathsea version 6.3.1
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	456
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2020-11-24T15:53:05Z
creator	
date	2021-02-14T21:22:17Z
dc:creator	
dc:format	application/pdf; version=1.5
dc:subject	
dc:title	Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes
dcterms:created	2020-11-24T15:53:05Z
dcterms:modified	2021-02-14T21:22:17Z
meta:author	
meta:creation-date	2020-11-24T15:53:05Z
meta:keyword	
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['5882', '7329', '6126', '2575', '3579', '6462']
pdf:docinfo:created	2020-11-24T15:53:05Z
pdf:docinfo:creator	
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/Debian) kpathsea version 6.3.1
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	pdfTeX-1.40.20
pdf:docinfo:subject	
pdf:docinfo:title	Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['3', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.20
resourceName	b'10_1101-2021_02_08_430275.pdf'
subject	
title	Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes
trapped	False
xmp:CreatorTool	LaTeX with hyperref
xmpMM:DocumentID	uuid:85c6d549-1dd2-11b2-0a00-840827bd7200
xmpTPg:NPages	6
=== file2bib.sh ===
         id: 10_1101-2020_11_17_386649
     author: Danciu, Daniel
      title: Topology-based Sparsification of Graph Annotations
       date: 2021
      pages: 15
  extension: .pdf
        txt: ./txt/10_1101-2020_11_17_386649.txt
      cache: ./cache/10_1101-2020_11_17_386649.pdf

Content-Type	application/pdf
Creation-Date	2021-02-10T17:24:37Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	107
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-10T17:24:37Z
date	2021-02-14T21:22:17Z
dc:format	application/pdf; version=1.5
dc:title	Topology-based Sparsification of Graph Annotations
dcterms:created	2021-02-10T17:24:37Z
dcterms:modified	2021-02-14T21:22:17Z
meta:creation-date	2021-02-10T17:24:37Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['3229', '3783', '3431', '3239', '3042', '2396', '2551', '2253', '3283', '3496', '2308', '1552', '3245', '2696', '2690']
pdf:docinfo:created	2021-02-10T17:24:37Z
pdf:docinfo:creator_tool	TeX
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	pdfTeX-1.40.21
pdf:docinfo:title	Topology-based Sparsification of Graph Annotations
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '2', '5', '1', '7', '0', '0', '0', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.21
resourceName	b'10_1101-2020_11_17_386649.pdf'
title	Topology-based Sparsification of Graph Annotations
trapped	False
xmp:CreatorTool	TeX
xmpMM:DocumentID	uuid:85c770f7-1dd2-11b2-0a00-fe08275dc400
xmpTPg:NPages	15
=== file2bib.sh ===
         id: 10_1101-2021_02_11_430847
     author: Pinatti, Lisa M.
      title: SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer
       date: 2021
      pages: 26
  extension: .pdf
        txt: ./txt/10_1101-2021_02_11_430847.txt
      cache: ./cache/10_1101-2021_02_11_430847.pdf

Author	Brenner, Chad
Content-Type	application/pdf
Creation-Date	2021-02-11T21:58:56Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	250
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-11T21:58:56Z
creator	Brenner, Chad
date	2021-02-14T21:22:17Z
dc:creator	Brenner, Chad
dc:format	application/pdf; version=1.6
dc:title	SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer
dcterms:created	2021-02-11T21:58:56Z
dcterms:modified	2021-02-14T21:22:17Z
meta:author	Brenner, Chad
meta:creation-date	2021-02-11T21:58:56Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.6
pdf:charsPerPage	['2756', '1833', '2101', '558', '2288', '2027', '1758', '2115', '2279', '1965', '2251', '2283', '2408', '2357', '2974', '3686', '3256', '2101', '1500', '655', '352', '537', '441', '190', '385', '30']
pdf:docinfo:created	2021-02-11T21:58:56Z
pdf:docinfo:creator	Brenner, Chad
pdf:docinfo:creator_tool	Acrobat PDFMaker 21 for Word
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	Adobe Acrobat Pro DC (32-bit) 21.1.20135
pdf:docinfo:title	SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer
pdf:encrypted	false
pdf:hasMarkedContent	true
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Adobe Acrobat Pro DC (32-bit) 21.1.20135
resourceName	b'10_1101-2021_02_11_430847.pdf'
title	SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer
xmp:CreatorTool	Acrobat PDFMaker 21 for Word
xmpMM:DocumentID	uuid:480d24e0-a20f-4afa-a462-139f117bb089
xmpTPg:NPages	26
=== file2bib.sh ===
         id: 10_1101-2021_02_10_430656
     author: Zakeri, Mohsen
      title: A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing
       date: 2021
      pages: 7
  extension: .pdf
        txt: ./txt/10_1101-2021_02_10_430656.txt
      cache: ./cache/10_1101-2021_02_10_430656.pdf

Author	
Content-Type	application/pdf
Creation-Date	2021-02-10T20:57:39Z
Keywords	
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	124
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-10T20:57:39Z
creator	
date	2021-02-14T21:22:17Z
dc:creator	
dc:format	application/pdf; version=1.5
dc:subject	
dc:title	A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing
dcterms:created	2021-02-10T20:57:39Z
dcterms:modified	2021-02-14T21:22:17Z
meta:author	
meta:creation-date	2021-02-10T20:57:39Z
meta:keyword	
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['5995', '6561', '3463', '792', '6766', '6861', '5533']
pdf:docinfo:created	2021-02-10T20:57:39Z
pdf:docinfo:creator	
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	pdfTeX-1.40.21
pdf:docinfo:subject	
pdf:docinfo:title	A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['1', '0', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.21
resourceName	b'10_1101-2021_02_10_430656.pdf'
subject	
title	A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing
trapped	False
xmp:CreatorTool	LaTeX with hyperref
xmpMM:DocumentID	uuid:85c7268f-1dd2-11b2-0a00-ee08271d5700
xmpTPg:NPages	7
=== file2bib.sh ===
         id: 10_1101-2021_02_08_430070
     author: Zhang, Yao-zhong
      title: On the application of BERT models for nanopore methylation detection
       date: 2021
      pages: 7
  extension: .pdf
        txt: ./txt/10_1101-2021_02_08_430070.txt
      cache: ./cache/10_1101-2021_02_08_430070.pdf

Content-Type	application/pdf
Creation-Date	2021-02-09T06:48:34Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	220
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-09T06:48:34Z
date	2021-02-14T21:22:17Z
dc:format	application/pdf; version=1.7
dc:title	On the application of BERT models for nanopore methylation detection
dcterms:created	2021-02-09T06:48:34Z
dcterms:modified	2021-02-14T21:22:17Z
meta:creation-date	2021-02-09T06:48:34Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.7
pdf:charsPerPage	['4064', '4333', '5898', '4026', '4707', '1134', '4710']
pdf:docinfo:created	2021-02-09T06:48:34Z
pdf:docinfo:creator_tool	dvips(k) 2020.1 Copyright 2020 Radical Eye Software
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	GPL Ghostscript 9.50
pdf:docinfo:title	On the application of BERT models for nanopore methylation detection
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '2', '0', '0', '0', '0']
producer	GPL Ghostscript 9.50
resourceName	b'10_1101-2021_02_08_430070.pdf'
title	On the application of BERT models for nanopore methylation detection
xmp:CreatorTool	dvips(k) 2020.1 Copyright 2020 Radical Eye Software
xmpMM:DocumentID	uuid:60da6715-a2bf-11f6-0000-35e12fd4d910
xmpTPg:NPages	7
=== file2bib.sh ===
         id: 10_1101-2021_02_12_430923
     author: Modi, Vivek
      title: Kincore: a web resource for structural classification of protein kinases and their inhibitors
       date: 2021
      pages: 18
  extension: .pdf
        txt: ./txt/10_1101-2021_02_12_430923.txt
      cache: ./cache/10_1101-2021_02_12_430923.pdf

Author	vivekmodi
Content-Type	application/pdf
Creation-Date	2021-02-12T12:59:47Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	121
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-12T12:59:47Z
creator	vivekmodi
date	2021-02-14T21:22:17Z
dc:creator	vivekmodi
dc:format	application/pdf; version=1.7
dc:language	en-US
dc:title	Kincore: a web resource for structural classification of protein kinases and their inhibitors
dcterms:created	2021-02-12T12:59:47Z
dcterms:modified	2021-02-14T21:22:17Z
language	en-US
meta:author	vivekmodi
meta:creation-date	2021-02-12T12:59:47Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.7
pdf:charsPerPage	['548', '2461', '4120', '4318', '3898', '1805', '3138', '1650', '3611', '1882', '1633', '2942', '3687', '3445', '2654', '2853', '3053', '2600']
pdf:docinfo:created	2021-02-12T12:59:47Z
pdf:docinfo:creator	vivekmodi
pdf:docinfo:creator_tool	Microsoft Word
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:title	Kincore: a web resource for structural classification of protein kinases and their inhibitors
pdf:encrypted	false
pdf:hasMarkedContent	true
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
resourceName	b'10_1101-2021_02_12_430923.pdf'
title	Kincore: a web resource for structural classification of protein kinases and their inhibitors
xmp:CreatorTool	Microsoft Word
xmpMM:DocumentID	uuid:86B06038-C2B3-4083-8EB1-C3E7E10688FB
xmpTPg:NPages	18
=== file2bib.sh ===
         id: 10_1101-2021_02_12_430989
     author: Sofer, Tamar
      title: Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies
       date: 2021
      pages: 27
  extension: .pdf
        txt: ./txt/10_1101-2021_02_12_430989.txt
      cache: ./cache/10_1101-2021_02_12_430989.pdf

Author	Administrator
Content-Type	application/pdf
Creation-Date	2021-02-12T20:11:32Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	102
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-12T20:11:32Z
creator	Administrator
date	2021-02-14T21:22:17Z
dc:creator	Administrator
dc:format	application/pdf; version=1.4
dc:title	Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies
dcterms:created	2021-02-12T20:11:32Z
dcterms:modified	2021-02-14T21:22:17Z
meta:author	Administrator
meta:creation-date	2021-02-12T20:11:32Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.4
pdf:charsPerPage	['1793', '1826', '2384', '2072', '2125', '2065', '2176', '2085', '2167', '2092', '2248', '2203', '2202', '1934', '2013', '2319', '2363', '2242', '2021', '1882', '2977', '3460', '1111', '868', '1358', '994', '946']
pdf:docinfo:created	2021-02-12T20:11:32Z
pdf:docinfo:creator	Administrator
pdf:docinfo:creator_tool	PScript5.dll Version 5.2.2
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	Acrobat Distiller 8.1.0 (Windows)
pdf:docinfo:title	Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '44', '71', '10', '47', '0', '2', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0']
producer	Acrobat Distiller 8.1.0 (Windows)
resourceName	b'10_1101-2021_02_12_430989.pdf'
title	Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies
xmp:CreatorTool	PScript5.dll Version 5.2.2
xmpMM:DocumentID	uuid:85c70ff8-1dd2-11b2-0a00-b709275d6100
xmpTPg:NPages	27
=== file2bib.sh ===
         id: 10_1101-2021_02_12_430963
     author: Gerber, Stefan
      title: Streamlining differential exon and 3' UTR usage with diffUTR
       date: 2021
      pages: 17
  extension: .pdf
        txt: ./txt/10_1101-2021_02_12_430963.txt
      cache: ./cache/10_1101-2021_02_12_430963.pdf

Author	
Content-Type	application/pdf
Creation-Date	2021-02-12T16:44:54Z
Keywords	
Last-Modified	2021-02-14T21:22:16Z
Last-Save-Date	2021-02-14T21:22:16Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	313
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-12T16:44:54Z
creator	
date	2021-02-14T21:22:16Z
dc:creator	
dc:format	application/pdf; version=1.5
dc:subject	
dc:title	Streamlining differential exon and 3' UTR usage with diffUTR
dcterms:created	2021-02-12T16:44:54Z
dcterms:modified	2021-02-14T21:22:16Z
meta:author	
meta:creation-date	2021-02-12T16:44:54Z
meta:keyword	
meta:save-date	2021-02-14T21:22:16Z
modified	2021-02-14T21:22:16Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['1413', '2472', '1908', '2213', '2583', '2365', '1177', '2644', '2325', '2335', '2542', '2188', '2388', '2261', '2673', '3135', '1179']
pdf:docinfo:created	2021-02-12T16:44:54Z
pdf:docinfo:creator	
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:16Z
pdf:docinfo:producer	pdfTeX-1.40.21
pdf:docinfo:subject	
pdf:docinfo:title	Streamlining differential exon and 3' UTR usage with diffUTR
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '6', '0', '0', '5', '1', '0']
producer	pdfTeX-1.40.21
resourceName	b'10_1101-2021_02_12_430963.pdf'
subject	
title	Streamlining differential exon and 3' UTR usage with diffUTR
trapped	False
xmp:CreatorTool	LaTeX with hyperref
xmpMM:DocumentID	uuid:85c690de-1dd2-11b2-0a00-740827bd7200
xmpTPg:NPages	17
=== file2bib.sh ===
         id: 10_1101-2021_02_11_430871
     author: Vadnais, David
      title: ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data
       date: 2021
      pages: 24
  extension: .pdf
        txt: ./txt/10_1101-2021_02_11_430871.txt
      cache: ./cache/10_1101-2021_02_11_430871.pdf

Author	David Vadnais
Content-Type	application/pdf
Creation-Date	2021-02-12T05:29:17Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	128
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-12T05:29:17Z
creator	David Vadnais
date	2021-02-14T21:22:17Z
dc:creator	David Vadnais
dc:format	application/pdf; version=1.7
dc:language	en-US
dc:title	ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data
dcterms:created	2021-02-12T05:29:17Z
dcterms:modified	2021-02-14T21:22:17Z
language	en-US
meta:author	David Vadnais
meta:creation-date	2021-02-12T05:29:17Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.7
pdf:charsPerPage	['4816', '5498', '3558', '1617', '3207', '2894', '1245', '2154', '1938', '1478', '2449', '2665', '1841', '1873', '1080', '2052', '1332', '1975', '2190', '3674', '4372', '4469', '1590', '1943']
pdf:docinfo:created	2021-02-12T05:29:17Z
pdf:docinfo:creator	David Vadnais
pdf:docinfo:creator_tool	Microsoft Word
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:title	ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data
pdf:encrypted	false
pdf:hasMarkedContent	true
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
resourceName	b'10_1101-2021_02_11_430871.pdf'
title	ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data
xmp:CreatorTool	Microsoft Word
xmpMM:DocumentID	uuid:1B1B6485-34DE-46C5-9550-0CC4C4D733CD
xmpTPg:NPages	24
=== file2bib.sh ===
         id: 10_1101-2021_02_11_430695
     author: Gordon-Rodriguez, Elliott
      title: Learning Sparse Log-Ratios for High-Throughput Sequencing Data
       date: 2021
      pages: 12
  extension: .pdf
        txt: ./txt/10_1101-2021_02_11_430695.txt
      cache: ./cache/10_1101-2021_02_11_430695.pdf

Author	Elliott Gordon-Rodriguez, Thomas P. Quinn, John P. Cunningham
Content-Type	application/pdf
Creation-Date	2021-02-11T17:27:55Z
Keywords	Machine Learning, ICML
Last-Modified	2021-02-11T17:27:55Z
Last-Save-Date	2021-02-11T17:27:55Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	115
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	Proceedings of the International Conference on Machine Learning 2021
created	2021-02-11T17:27:55Z
creator	Elliott Gordon-Rodriguez, Thomas P. Quinn, John P. Cunningham
date	2021-02-11T17:27:55Z
dc:creator	Elliott Gordon-Rodriguez, Thomas P. Quinn, John P. Cunningham
dc:format	application/pdf; version=1.5
dc:subject	Machine Learning, ICML
dc:title	Learning Sparse Log-Ratios for High-Throughput Sequencing Data
dcterms:created	2021-02-11T17:27:55Z
dcterms:modified	2021-02-11T17:27:55Z
meta:author	Elliott Gordon-Rodriguez, Thomas P. Quinn, John P. Cunningham
meta:creation-date	2021-02-11T17:27:55Z
meta:keyword	Machine Learning, ICML
meta:save-date	2021-02-11T17:27:55Z
modified	2021-02-11T17:27:55Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['4061', '4908', '4284', '3291', '3759', '3982', '4289', '3758', '3949', '3927', '4057', '230']
pdf:docinfo:created	2021-02-11T17:27:55Z
pdf:docinfo:creator	Elliott Gordon-Rodriguez, Thomas P. Quinn, John P. Cunningham
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1
pdf:docinfo:keywords	Machine Learning, ICML
pdf:docinfo:modified	2021-02-11T17:27:55Z
pdf:docinfo:producer	pdfTeX-1.40.20
pdf:docinfo:subject	Proceedings of the International Conference on Machine Learning 2021
pdf:docinfo:title	Learning Sparse Log-Ratios for High-Throughput Sequencing Data
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	false
pdf:unmappedUnicodeCharsPerPage	['0', '2', '8', '14', '6', '0', '0', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.20
resourceName	b'10_1101-2021_02_11_430695.pdf'
subject	Proceedings of the International Conference on Machine Learning 2021
title	Learning Sparse Log-Ratios for High-Throughput Sequencing Data
trapped	False
xmp:CreatorTool	LaTeX with hyperref
xmpTPg:NPages	12
=== file2bib.sh ===
         id: 10_1101-2020_09_21_305516
     author: Nikolic, Ana
      title: Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer
       date: 2021
      pages: 32
  extension: .pdf
        txt: ./txt/10_1101-2020_09_21_305516.txt
      cache: ./cache/10_1101-2020_09_21_305516.pdf

Content-Type	application/pdf
Creation-Date	2021-02-12T16:20:22Z
Last-Modified	2021-02-14T21:22:18Z
Last-Save-Date	2021-02-14T21:22:18Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	126
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-12T16:20:22Z
date	2021-02-14T21:22:18Z
dc:format	application/pdf; version=1.4
dc:title	Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer
dcterms:created	2021-02-12T16:20:22Z
dcterms:modified	2021-02-14T21:22:18Z
meta:creation-date	2021-02-12T16:20:22Z
meta:save-date	2021-02-14T21:22:18Z
modified	2021-02-14T21:22:18Z
pdf:PDFVersion	1.4
pdf:charsPerPage	['3381', '4467', '1165', '5419', '866', '4521', '1520', '4215', '669', '1345', '4158', '4270', '3473', '3942', '4230', '3824', '810', '1165', '884', '856', '689', '1095', '704', '725', '776', '807', '846', '1061', '990', '704', '703', '627']
pdf:docinfo:created	2021-02-12T16:20:22Z
pdf:docinfo:creator_tool	Word
pdf:docinfo:modified	2021-02-14T21:22:18Z
pdf:docinfo:producer	macOS Version 11.2.1 (Build 20D74) Quartz PDFContext
pdf:docinfo:title	Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	macOS Version 11.2.1 (Build 20D74) Quartz PDFContext
resourceName	b'10_1101-2020_09_21_305516.pdf'
title	Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer
xmp:CreatorTool	Word
xmpMM:DocumentID	uuid:85c8661d-1dd2-11b2-0a00-f209277d8900
xmpTPg:NPages	32
=== file2bib.sh ===
         id: 10_1101-2021_02_12_430830
     author: Gergely, Tibély
      title: Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data
       date: 2021
      pages: 19
  extension: .pdf
        txt: ./txt/10_1101-2021_02_12_430830.txt
      cache: ./cache/10_1101-2021_02_12_430830.pdf

Author	
Content-Type	application/pdf
Creation-Date	2021-02-12T12:51:07Z
Keywords	
Last-Modified	2021-02-14T21:22:16Z
Last-Save-Date	2021-02-14T21:22:16Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015) kpathsea version 6.2.1
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	300
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-12T12:51:07Z
creator	
date	2021-02-14T21:22:16Z
dc:creator	
dc:format	application/pdf; version=1.5
dc:subject	
dc:title	Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data
dcterms:created	2021-02-12T12:51:07Z
dcterms:modified	2021-02-14T21:22:16Z
meta:author	
meta:creation-date	2021-02-12T12:51:07Z
meta:keyword	
meta:save-date	2021-02-14T21:22:16Z
modified	2021-02-14T21:22:16Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['2344', '2892', '3313', '1197', '2366', '1948', '2587', '2787', '1337', '1536', '3109', '1199', '2709', '1289', '3331', '3112', '2139', '2457', '945']
pdf:docinfo:created	2021-02-12T12:51:07Z
pdf:docinfo:creator	
pdf:docinfo:creator_tool	LaTeX with hyperref package
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015) kpathsea version 6.2.1
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:16Z
pdf:docinfo:producer	pdfTeX-1.40.16
pdf:docinfo:subject	
pdf:docinfo:title	Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '1', '0', '8', '25', '6', '3', '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '0']
producer	pdfTeX-1.40.16
resourceName	b'10_1101-2021_02_12_430830.pdf'
subject	
title	Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data
trapped	False
xmp:CreatorTool	LaTeX with hyperref package
xmpMM:DocumentID	uuid:85c6b3e8-1dd2-11b2-0a00-130a277d8900
xmpTPg:NPages	19
=== file2bib.sh ===
         id: 10_1101-2021_02_11_430789
     author: Tyagin, Ilya
      title: Accelerating COVID-19 research with graph mining and transformer-based learning
       date: 2021
      pages: 9
  extension: .pdf
        txt: ./txt/10_1101-2021_02_11_430789.txt
      cache: ./cache/10_1101-2021_02_11_430789.pdf

Author	Ilya Tyagin, Ankit Kulshrestha, Justin Sybrandt, Krish Matta, Michael Shtutman, and Ilya Safro
Content-Type	application/pdf
Creation-Date	2021-02-10T14:30:51Z
Keywords	 Hypothesis Generation, Literature-Based Discovery, Transformer Models, Semantic Networks, Biomedical Recommendation, 
Last-Modified	2021-02-14T21:22:26Z
Last-Save-Date	2021-02-14T21:22:26Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	130
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	-  Applied computing  ->  Bioinformatics.Document management and text processing.-  Computing methodologies  ->  Learning latent representations.Neural networks.Information extraction.Semantic networks.
created	2021-02-10T14:30:51Z
creator	Ilya Tyagin, Ankit Kulshrestha, Justin Sybrandt, Krish Matta, Michael Shtutman, and Ilya Safro
date	2021-02-14T21:22:26Z
dc:creator	Ilya Tyagin, Ankit Kulshrestha, Justin Sybrandt, Krish Matta, Michael Shtutman, and Ilya Safro
dc:description	-  Applied computing  ->  Bioinformatics.Document management and text processing.-  Computing methodologies  ->  Learning latent representations.Neural networks.Information extraction.Semantic networks.
dc:format	application/pdf; version=1.5
dc:language	en
dc:subject	 Hypothesis Generation, Literature-Based Discovery, Transformer Models, Semantic Networks, Biomedical Recommendation, 
dc:title	Accelerating COVID-19 research with graph mining and transformer-based learning
dcterms:created	2021-02-10T14:30:51Z
dcterms:modified	2021-02-14T21:22:26Z
description	-  Applied computing  ->  Bioinformatics.Document management and text processing.-  Computing methodologies  ->  Learning latent representations.Neural networks.Information extraction.Semantic networks.
language	en
meta:author	Ilya Tyagin, Ankit Kulshrestha, Justin Sybrandt, Krish Matta, Michael Shtutman, and Ilya Safro
meta:creation-date	2021-02-10T14:30:51Z
meta:keyword	 Hypothesis Generation, Literature-Based Discovery, Transformer Models, Semantic Networks, Biomedical Recommendation, 
meta:save-date	2021-02-14T21:22:26Z
modified	2021-02-14T21:22:26Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['4414', '5587', '5796', '4953', '6020', '5204', '4953', '6112', '9146']
pdf:docinfo:created	2021-02-10T14:30:51Z
pdf:docinfo:creator	Ilya Tyagin, Ankit Kulshrestha, Justin Sybrandt, Krish Matta, Michael Shtutman, and Ilya Safro
pdf:docinfo:creator_tool	LaTeX with acmart 2020/04/30 v1.71 Typesetting articles for the Association for Computing Machinery and hyperref 2020-05-15 v7.00e Hypertext links for LaTeX
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
pdf:docinfo:keywords	 Hypothesis Generation, Literature-Based Discovery, Transformer Models, Semantic Networks, Biomedical Recommendation, 
pdf:docinfo:modified	2021-02-14T21:22:26Z
pdf:docinfo:producer	pdfTeX-1.40.21
pdf:docinfo:subject	-  Applied computing  ->  Bioinformatics.Document management and text processing.-  Computing methodologies  ->  Learning latent representations.Neural networks.Information extraction.Semantic networks.
pdf:docinfo:title	Accelerating COVID-19 research with graph mining and transformer-based learning
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '5', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.21
resourceName	b'10_1101-2021_02_11_430789.pdf'
subject	-  Applied computing  ->  Bioinformatics.Document management and text processing.-  Computing methodologies  ->  Learning latent representations.Neural networks.Information extraction.Semantic networks.
title	Accelerating COVID-19 research with graph mining and transformer-based learning
trapped	False
xmp:CreatorTool	LaTeX with acmart 2020/04/30 v1.71 Typesetting articles for the Association for Computing Machinery and hyperref 2020-05-15 v7.00e Hypertext links for LaTeX
xmpMM:DocumentID	uuid:85d60a5f-1dd2-11b2-0a00-e00927bd7700
xmpTPg:NPages	9
=== file2bib.sh ===
         id: 10_1101-2021_02_12_430764
     author: Ascensión, Alex M.
      title: Triku: a feature selection method based on nearest neighbors for single-cell data
       date: 2021
      pages: 18
  extension: .pdf
        txt: ./txt/10_1101-2021_02_12_430764.txt
      cache: ./cache/10_1101-2021_02_12_430764.pdf

Author	Alex M. Ascensión, Olga Ibañez-Solé, Inaki Inza, Ander Izeta, Marcos J. Araúzo-Bravo
Content-Type	application/pdf
Creation-Date	2021-02-12T10:37:24Z
Keywords	scRNAseq, feature selection, bioinformatics, python
Last-Modified	2021-02-14T20:14:07Z
Last-Save-Date	2021-02-14T20:14:07Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	327
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-12T10:37:24Z
creator	Alex M. Ascensión, Olga Ibañez-Solé, Inaki Inza, Ander Izeta, Marcos J. Araúzo-Bravo
date	2021-02-14T20:14:07Z
dc:creator	Alex M. Ascensión, Olga Ibañez-Solé, Inaki Inza, Ander Izeta, Marcos J. Araúzo-Bravo
dc:format	application/pdf; version=1.5
dc:subject	scRNAseq, feature selection, bioinformatics, python
dc:title	Triku: a feature selection method based on nearest neighbors for single-cell data
dcterms:created	2021-02-12T10:37:24Z
dcterms:modified	2021-02-14T20:14:07Z
meta:author	Alex M. Ascensión, Olga Ibañez-Solé, Inaki Inza, Ander Izeta, Marcos J. Araúzo-Bravo
meta:creation-date	2021-02-12T10:37:24Z
meta:keyword	scRNAseq, feature selection, bioinformatics, python
meta:save-date	2021-02-14T20:14:07Z
modified	2021-02-14T20:14:07Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['2989', '3432', '3121', '3143', '3126', '3107', '3377', '3213', '3024', '2866', '2104', '2654', '5322', '3235', '2377', '1210', '1017', '851']
pdf:docinfo:created	2021-02-12T10:37:24Z
pdf:docinfo:creator	Alex M. Ascensión, Olga Ibañez-Solé, Inaki Inza, Ander Izeta, Marcos J. Araúzo-Bravo
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:keywords	scRNAseq, feature selection, bioinformatics, python
pdf:docinfo:modified	2021-02-14T20:14:07Z
pdf:docinfo:producer	xdvipdfmx (20190225)
pdf:docinfo:title	Triku: a feature selection method based on nearest neighbors for single-cell data
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '396', '0', '0', '0']
producer	xdvipdfmx (20190225)
resourceName	b'10_1101-2021_02_12_430764.pdf'
subject	scRNAseq, feature selection, bioinformatics, python
title	Triku: a feature selection method based on nearest neighbors for single-cell data
xmp:CreatorTool	LaTeX with hyperref
xmpMM:DocumentID	uuid:6d664b1d-1dd2-11b2-0a00-8208278d5b00
xmpTPg:NPages	18
=== file2bib.sh ===
         id: 10_1101-2020_02_04_934216
     author: Kirchoff, Kathryn E.
      title: EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning
       date: 2021
      pages: 13
  extension: .pdf
        txt: ./txt/10_1101-2020_02_04_934216.txt
      cache: ./cache/10_1101-2020_02_04_934216.pdf

Content-Type	application/pdf
Creation-Date	2021-02-10T16:35:43Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	276
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-10T16:35:43Z
date	2021-02-14T21:22:17Z
dc:format	application/pdf; version=1.4
dc:title	EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning
dcterms:created	2021-02-10T16:35:43Z
dcterms:modified	2021-02-14T21:22:17Z
meta:creation-date	2021-02-10T16:35:43Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.4
pdf:charsPerPage	['5129', '5413', '4261', '3539', '4285', '3469', '4716', '3179', '7747', '530', '1564', '978', '677']
pdf:docinfo:created	2021-02-10T16:35:43Z
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	macOS Version 10.15.7 (Build 19H2) Quartz PDFContext
pdf:docinfo:title	EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '18', '29', '4']
producer	macOS Version 10.15.7 (Build 19H2) Quartz PDFContext
resourceName	b'10_1101-2020_02_04_934216.pdf'
title	EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning
xmp:CreatorTool	LaTeX with hyperref
xmpMM:DocumentID	uuid:85c76a22-1dd2-11b2-0a00-000927fd5800
xmpTPg:NPages	13
=== file2bib.sh ===
         id: 10_1101-2021_02_08_428881
     author: Lu, Yang Young
      title: ACE: Explaining cluster from an adversarial perspective
       date: 2021
      pages: 12
  extension: .pdf
        txt: ./txt/10_1101-2021_02_08_428881.txt
      cache: ./cache/10_1101-2021_02_08_428881.pdf

Author	Yang Young Lu, Timothy C. Yu, Giancarlo Bonora, William Stafford Noble
Content-Type	application/pdf
Creation-Date	2021-02-09T13:00:55Z
Keywords	Machine Learning, ICML
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
PTEX.Fullbanner	This is MiKTeX-pdfTeX 2.9.4307 (1.40.12)
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	1420
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	Proceedings of the International Conference on Machine Learning 2021
created	2021-02-09T13:00:55Z
creator	Yang Young Lu, Timothy C. Yu, Giancarlo Bonora, William Stafford Noble
date	2021-02-14T21:22:17Z
dc:creator	Yang Young Lu, Timothy C. Yu, Giancarlo Bonora, William Stafford Noble
dc:description	Proceedings of the International Conference on Machine Learning 2021
dc:format	application/pdf; version=1.5
dc:subject	Machine Learning, ICML
dc:title	ACE: Explaining cluster from an adversarial perspective
dcterms:created	2021-02-09T13:00:55Z
dcterms:modified	2021-02-14T21:22:17Z
description	Proceedings of the International Conference on Machine Learning 2021
meta:author	Yang Young Lu, Timothy C. Yu, Giancarlo Bonora, William Stafford Noble
meta:creation-date	2021-02-09T13:00:55Z
meta:keyword	Machine Learning, ICML
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['4324', '4945', '3406', '4075', '4894', '3533', '4631', '3844', '4142', '3413', '561', '665']
pdf:docinfo:created	2021-02-09T13:00:55Z
pdf:docinfo:creator	Yang Young Lu, Timothy C. Yu, Giancarlo Bonora, William Stafford Noble
pdf:docinfo:creator_tool	LaTeX with hyperref package
pdf:docinfo:custom:PTEX.Fullbanner	This is MiKTeX-pdfTeX 2.9.4307 (1.40.12)
pdf:docinfo:keywords	Machine Learning, ICML
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	pdfTeX-1.40.12
pdf:docinfo:subject	Proceedings of the International Conference on Machine Learning 2021
pdf:docinfo:title	ACE: Explaining cluster from an adversarial perspective
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '10', '27', '0', '0', '0', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.12
resourceName	b'10_1101-2021_02_08_428881.pdf'
subject	Proceedings of the International Conference on Machine Learning 2021
title	ACE: Explaining cluster from an adversarial perspective
trapped	False
xmp:CreatorTool	LaTeX with hyperref package
xmpMM:DocumentID	uuid:85c6f0fe-1dd2-11b2-0a00-1a0927edca00
xmpTPg:NPages	12
=== file2bib.sh ===
         id: 10_1101-2021_02_10_430649
     author: Wen, Zi-Hang
      title: Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data
       date: 2021
      pages: 11
  extension: .pdf
        txt: ./txt/10_1101-2021_02_10_430649.txt
      cache: ./cache/10_1101-2021_02_10_430649.pdf

Author	
Content-Type	application/pdf
Creation-Date	2021-02-12T16:42:36Z
Keywords	
Last-Modified	2021-02-14T21:22:16Z
Last-Save-Date	2021-02-14T21:22:16Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	1286
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-12T16:42:36Z
creator	
date	2021-02-14T21:22:16Z
dc:creator	
dc:format	application/pdf; version=1.5
dc:subject	
dc:title	Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data
dcterms:created	2021-02-12T16:42:36Z
dcterms:modified	2021-02-14T21:22:16Z
meta:author	
meta:creation-date	2021-02-12T16:42:36Z
meta:keyword	
meta:save-date	2021-02-14T21:22:16Z
modified	2021-02-14T21:22:16Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['4396', '3587', '3999', '3258', '3868', '3519', '6427', '2602', '3216', '5326', '3921']
pdf:docinfo:created	2021-02-12T16:42:36Z
pdf:docinfo:creator	
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:16Z
pdf:docinfo:producer	pdfTeX-1.40.21
pdf:docinfo:subject	
pdf:docinfo:title	Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '57', '14', '0', '0', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.21
resourceName	b'10_1101-2021_02_10_430649.pdf'
subject	
title	Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data
trapped	False
xmp:CreatorTool	LaTeX with hyperref
xmpMM:DocumentID	uuid:85c66f59-1dd2-11b2-0a00-f508278d5b00
xmpTPg:NPages	11
=== file2bib.sh ===
         id: 10_1101-2021_02_12_430979
     author: Da Silva, Kévin
      title: StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs
       date: 2021
      pages: 20
  extension: .pdf
        txt: ./txt/10_1101-2021_02_12_430979.txt
      cache: ./cache/10_1101-2021_02_12_430979.pdf

Content-Type	application/pdf
Creation-Date	2021-02-12T17:18:42Z
Last-Modified	2021-02-14T19:23:11Z
Last-Save-Date	2021-02-14T19:23:11Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	94
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-12T17:18:42Z
date	2021-02-14T19:23:11Z
dc:format	application/pdf; version=1.5
dc:title	StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs
dcterms:created	2021-02-12T17:18:42Z
dcterms:modified	2021-02-14T19:23:11Z
meta:creation-date	2021-02-12T17:18:42Z
meta:save-date	2021-02-14T19:23:11Z
modified	2021-02-14T19:23:11Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['3449', '4370', '2071', '4378', '2438', '4050', '3798', '3720', '2144', '2109', '4338', '4305', '4821', '3388', '3022', '1456', '1895', '1533', '1522', '541']
pdf:docinfo:created	2021-02-12T17:18:42Z
pdf:docinfo:creator_tool	TeX
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
pdf:docinfo:modified	2021-02-14T19:23:11Z
pdf:docinfo:producer	pdfTeX-1.40.21
pdf:docinfo:title	StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.21
resourceName	b'10_1101-2021_02_12_430979.pdf'
title	StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs
trapped	False
xmp:CreatorTool	TeX
xmpMM:DocumentID	uuid:5b2f399e-1dd2-11b2-0a00-d30827bd3700
xmpTPg:NPages	20
=== file2bib.sh ===
         id: 10_1101-2021_02_13_429885
     author: Househam, Jacob
      title: A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing
       date: 2021
      pages: 36
  extension: .pdf
        txt: ./txt/10_1101-2021_02_13_429885.txt
      cache: ./cache/10_1101-2021_02_13_429885.pdf

Content-Type	application/pdf
Creation-Date	2021-02-13T13:37:27Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	216
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-13T13:37:27Z
date	2021-02-14T21:22:17Z
dc:format	application/pdf; version=1.5
dc:title	A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing
dcterms:created	2021-02-13T13:37:27Z
dcterms:modified	2021-02-14T21:22:17Z
meta:creation-date	2021-02-13T13:37:27Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['2646', '3593', '3421', '2574', '2758', '3251', '2949', '3250', '4228', '4125', '2811', '2417', '2788', '2811', '2908', '729', '1969', '1259', '1715', '1413', '932', '646', '1253', '862', '744', '744', '745', '710', '711', '711', '711', '711', '712', '1331', '1098', '484']
pdf:docinfo:created	2021-02-13T13:37:27Z
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	Skia/PDF m90
pdf:docinfo:title	A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Skia/PDF m90
resourceName	b'10_1101-2021_02_13_429885.pdf'
title	A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing
xmpMM:DocumentID	uuid:a5aab842-f0f5-5f42-b83c-1d098c720a7a
xmpTPg:NPages	36
=== file2bib.sh ===
         id: 10_1101-2021_02_10_430623
     author: Aberasturi, Dillon
      title: “Single-subject studies”-derived analyses unveil altered biomechanisms between very small cohorts: implications for rare diseases
       date: 2021
      pages: 9
  extension: .pdf
        txt: ./txt/10_1101-2021_02_10_430623.txt
      cache: ./cache/10_1101-2021_02_10_430623.pdf

Author	Nima Pouladi
Content-Type	application/pdf
Creation-Date	2021-02-10T16:03:33Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	388
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-10T16:03:33Z
creator	Nima Pouladi
date	2021-02-14T21:22:17Z
dc:creator	Nima Pouladi
dc:format	application/pdf; version=1.4
dc:title	“Single-subject studies”-derived analyses unveil altered biomechanisms between very small cohorts: implications for rare diseases
dcterms:created	2021-02-10T16:03:33Z
dcterms:modified	2021-02-14T21:22:17Z
meta:author	Nima Pouladi
meta:creation-date	2021-02-10T16:03:33Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.4
pdf:charsPerPage	['4080', '6240', '7703', '4913', '7277', '7679', '7670', '7920', '8386']
pdf:docinfo:created	2021-02-10T16:03:33Z
pdf:docinfo:creator	Nima Pouladi
pdf:docinfo:creator_tool	Word
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	macOS Version 10.15.7 (Build 19H2) Quartz PDFContext
pdf:docinfo:title	“Single-subject studies”-derived analyses unveil altered biomechanisms between very small cohorts: implications for rare diseases
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '43', '145', '4', '0', '2', '0', '0']
producer	macOS Version 10.15.7 (Build 19H2) Quartz PDFContext
resourceName	b'10_1101-2021_02_10_430623.pdf'
title	“Single-subject studies”-derived analyses unveil altered biomechanisms between very small cohorts: implications for rare diseases
xmp:CreatorTool	Word
xmpMM:DocumentID	uuid:85c83ffc-1dd2-11b2-0a00-1509278d5b00
xmpTPg:NPages	9
=== file2bib.sh ===
         id: 10_1101-2020_09_02_279521
     author: Abi Nader, Clément
      title: Simulating the outcome of amyloid treatments in Alzheimer’s disease from imaging and clinical data
       date: 2021
      pages: 32
  extension: .pdf
        txt: ./txt/10_1101-2020_09_02_279521.txt
      cache: ./cache/10_1101-2020_09_02_279521.pdf

Author	Luigi
Content-Type	application/pdf
Creation-Date	2021-02-10T15:18:40Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	151
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-10T15:18:40Z
creator	Luigi
date	2021-02-14T21:22:17Z
dc:creator	Luigi
dc:format	application/pdf; version=1.7
dc:language	en-US
dc:title	Simulating the outcome of amyloid treatments in Alzheimer’s disease from imaging and clinical data
dcterms:created	2021-02-10T15:18:40Z
dcterms:modified	2021-02-14T21:22:17Z
language	en-US
meta:author	Luigi
meta:creation-date	2021-02-10T15:18:40Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.7
pdf:charsPerPage	['2434', '2713', '3212', '3195', '2791', '3085', '3958', '1263', '1767', '2315', '1805', '2788', '2584', '1738', '1346', '2581', '2596', '658', '3084', '3005', '3258', '3194', '2934', '2319', '2637', '2729', '2500', '2660', '2549', '2716', '2848', '509']
pdf:docinfo:created	2021-02-10T15:18:40Z
pdf:docinfo:creator	Luigi
pdf:docinfo:creator_tool	Microsoft® Word for Microsoft 365
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	Microsoft® Word for Microsoft 365
pdf:docinfo:title	Simulating the outcome of amyloid treatments in Alzheimer’s disease from imaging and clinical data
pdf:encrypted	false
pdf:hasMarkedContent	true
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Microsoft® Word for Microsoft 365
resourceName	b'10_1101-2020_09_02_279521.pdf'
title	Simulating the outcome of amyloid treatments in Alzheimer’s disease from imaging and clinical data
xmp:CreatorTool	Microsoft® Word for Microsoft 365
xmpMM:DocumentID	uuid:06032D44-3EB9-44B9-86B0-2BD226EEB074
xmpTPg:NPages	32
=== file2bib.sh ===
         id: 10_1101-2021_02_10_430512
     author: Kim, Catherine
      title: Prediction of adverse drug reactions associated with drug-drug interactions using hierarchical classification
       date: 2021
      pages: 41
  extension: .pdf
        txt: ./txt/10_1101-2021_02_10_430512.txt
      cache: ./cache/10_1101-2021_02_10_430512.pdf

Author	Administrator
Content-Type	application/pdf
Creation-Date	2021-02-10T16:35:24Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	138
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-10T16:35:24Z
creator	Administrator
date	2021-02-14T21:22:17Z
dc:creator	Administrator
dc:format	application/pdf; version=1.4
dc:title	Prediction of adverse drug reactions associated with drug-drug interactions using hierarchical classification
dcterms:created	2021-02-10T16:35:24Z
dcterms:modified	2021-02-14T21:22:17Z
meta:author	Administrator
meta:creation-date	2021-02-10T16:35:24Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.4
pdf:charsPerPage	['799', '1424', '3057', '1322', '2396', '2821', '838', '2839', '2649', '2927', '3046', '2894', '2854', '3091', '2975', '2901', '1457', '2391', '2551', '2648', '2590', '2505', '2444', '2515', '2623', '2360', '2741', '519', '572', '2642', '2718', '1974', '358', '358', '359', '361', '359', '358', '358', '358', '520']
pdf:docinfo:created	2021-02-10T16:35:24Z
pdf:docinfo:creator	Administrator
pdf:docinfo:creator_tool	PScript5.dll Version 5.2.2
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	Acrobat Distiller 8.1.0 (Windows)
pdf:docinfo:title	Prediction of adverse drug reactions associated with drug-drug interactions using hierarchical classification
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '16', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Acrobat Distiller 8.1.0 (Windows)
resourceName	b'10_1101-2021_02_10_430512.pdf'
title	Prediction of adverse drug reactions associated with drug-drug interactions using hierarchical classification
xmp:CreatorTool	PScript5.dll Version 5.2.2
xmpMM:DocumentID	uuid:85c74254-1dd2-11b2-0a00-a108275dc400
xmpTPg:NPages	41
=== file2bib.sh ===
         id: 10_1101-2021_02_08_430343
     author: Gibbs, David L
      title: Patient-specific cell communication networks associate with disease progression in cancer
       date: 2021
      pages: 29
  extension: .pdf
        txt: ./txt/10_1101-2021_02_08_430343.txt
      cache: ./cache/10_1101-2021_02_08_430343.pdf

Appligent	AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct  2 2014 Library 10.1.0
Author	Dave
Content-Type	application/pdf
Creation-Date	2021-02-09T04:01:31Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	334
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-09T04:01:31Z
creator	Dave
date	2021-02-14T21:22:17Z
dc:creator	Dave
dc:format	application/pdf; version=1.7
dc:language	en-US
dc:title	96291204
dcterms:created	2021-02-09T04:01:31Z
dcterms:modified	2021-02-14T21:22:17Z
language	en-US
meta:author	Dave
meta:creation-date	2021-02-09T04:01:31Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.7
pdf:charsPerPage	['2561', '4352', '3379', '3527', '3566', '3828', '4402', '4535', '4228', '4333', '4207', '3279', '1677', '2262', '2297', '2203', '3241', '3317', '2921', '2659', '338', '338', '338', '338', '338', '338', '338', '338', '338']
pdf:docinfo:created	2021-02-09T04:01:31Z
pdf:docinfo:creator	Dave
pdf:docinfo:creator_tool	Appligent AppendPDF Pro 5.5
pdf:docinfo:custom:Appligent	AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct  2 2014 Library 10.1.0
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	Microsoft® Word for Microsoft 365
pdf:docinfo:title	96291204
pdf:encrypted	false
pdf:hasMarkedContent	true
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Microsoft® Word for Microsoft 365
resourceName	b'10_1101-2021_02_08_430343.pdf'
title	96291204
xmp:CreatorTool	Appligent AppendPDF Pro 5.5
xmpMM:DocumentID	uuid:02a690eb-b082-11b2-0a00-782dad000000
xmpTPg:NPages	29
=== file2bib.sh ===
         id: 10_1101-2021_02_09_430550
     author: Song, Dongyuan
      title: scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling
       date: 2021
      pages: 37
  extension: .pdf
        txt: ./txt/10_1101-2021_02_09_430550.txt
      cache: ./cache/10_1101-2021_02_09_430550.pdf

Author	
Content-Type	application/pdf
Creation-Date	2021-02-10T07:08:44Z
Keywords	
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	153
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-10T07:08:44Z
creator	
date	2021-02-14T21:22:17Z
dc:creator	
dc:format	application/pdf; version=1.5
dc:subject	
dc:title	scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling
dcterms:created	2021-02-10T07:08:44Z
dcterms:modified	2021-02-14T21:22:17Z
meta:author	
meta:creation-date	2021-02-10T07:08:44Z
meta:keyword	
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['2602', '3331', '3092', '1969', '2476', '2626', '2136', '2674', '2221', '2436', '1603', '2850', '2182', '2921', '3297', '1944', '2318', '2544', '2453', '2588', '860', '2144', '2326', '2412', '896', '970', '1053', '767', '438', '525', '435', '529', '2318', '2544', '2453', '2588', '860']
pdf:docinfo:created	2021-02-10T07:08:44Z
pdf:docinfo:creator	
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	pdfTeX-1.40.21
pdf:docinfo:subject	
pdf:docinfo:title	scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '69', '6', '8', '4', '0', '0', '6', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '4', '2', '34', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.21
resourceName	b'10_1101-2021_02_09_430550.pdf'
subject	
title	scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling
trapped	False
xmp:CreatorTool	LaTeX with hyperref
xmpMM:DocumentID	uuid:85c7129e-1dd2-11b2-0a00-4808278d5b00
xmpTPg:NPages	37
=== file2bib.sh ===
         id: 10_1101-2021_02_10_430606
     author: Wei, Zheng
      title: NeuronMotif: Deciphering transcriptional cis-regulatory codes from deep neural networks
       date: 2021
      pages: 31
  extension: .pdf
        txt: ./txt/10_1101-2021_02_10_430606.txt
      cache: ./cache/10_1101-2021_02_10_430606.pdf

Content-Type	application/pdf
Creation-Date	2021-02-11T07:26:25Z
Keywords	
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	161
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-11T07:26:25Z
date	2021-02-14T21:22:17Z
dc:format	application/pdf; version=1.4
dc:subject	
dc:title	NeuronMotif: Deciphering transcriptional cis-regulatory codes from deep neural networks
dcterms:created	2021-02-11T07:26:25Z
dcterms:modified	2021-02-14T21:22:17Z
meta:creation-date	2021-02-11T07:26:25Z
meta:keyword	
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.4
pdf:charsPerPage	['1957', '1927', '1871', '1344', '1768', '265', '1654', '1374', '3237', '1483', '3133', '3345', '3309', '2950', '3308', '3422', '2952', '1097', '2527', '2241', '2894', '2551', '2511', '2949', '3107', '3322', '2890', '1419', '2894', '3279', '1145']
pdf:docinfo:created	2021-02-11T07:26:25Z
pdf:docinfo:creator_tool	Word
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	macOS 版本 10.14.6（版号 18G8012） Quartz PDFContext
pdf:docinfo:title	NeuronMotif: Deciphering transcriptional cis-regulatory codes from deep neural networks
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '32', '4', '0', '12', '0', '0', '0', '0', '0', '1', '20', '0', '0', '0', '0', '0', '0', '55', '132', '27', '119', '88', '8', '0', '0', '5', '0', '0', '0', '0']
producer	macOS 版本 10.14.6（版号 18G8012） Quartz PDFContext
resourceName	b'10_1101-2021_02_10_430606.pdf'
subject	
title	NeuronMotif: Deciphering transcriptional cis-regulatory codes from deep neural networks
xmp:CreatorTool	Word
xmpMM:DocumentID	uuid:85c819d0-1dd2-11b2-0a00-9708277d8900
xmpTPg:NPages	31
=== file2bib.sh ===
         id: 10_1101-2020_10_08_327718
     author: Jambor, Helena
      title: Creating Clear and Informative Image-based Figures for Scientific Publications
       date: 2021
      pages: 36
  extension: .pdf
        txt: ./txt/10_1101-2020_10_08_327718.txt
      cache: ./cache/10_1101-2020_10_08_327718.pdf

Author	Tracey Weissgerber
Comments	
Company	
Content-Type	application/pdf
Creation-Date	2021-02-11T08:45:34Z
Keywords	
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
SourceModified	D:20210211084452
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	166
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-11T08:45:34Z
creator	Tracey Weissgerber
date	2021-02-14T21:22:17Z
dc:creator	Tracey Weissgerber
dc:format	application/pdf; version=1.6
dc:language	EN-US
dc:subject	
dc:title	
dcterms:created	2021-02-11T08:45:34Z
dcterms:modified	2021-02-14T21:22:17Z
language	EN-US
meta:author	Tracey Weissgerber
meta:creation-date	2021-02-11T08:45:34Z
meta:keyword	
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.6
pdf:charsPerPage	['3190', '1904', '3819', '3897', '910', '2670', '754', '2640', '1136', '3780', '2209', '2046', '2394', '2062', '1692', '1524', '1280', '2079', '2312', '2250', '2370', '3715', '3836', '2374', '3859', '4070', '3003', '1054', '1685', '1658', '1475', '890', '412', '3866', '3908', '2364']
pdf:docinfo:created	2021-02-11T08:45:34Z
pdf:docinfo:creator	Tracey Weissgerber
pdf:docinfo:creator_tool	Acrobat PDFMaker 20 for Word
pdf:docinfo:custom:Comments	
pdf:docinfo:custom:Company	
pdf:docinfo:custom:SourceModified	D:20210211084452
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	Adobe PDF Library 20.13.96
pdf:docinfo:subject	
pdf:docinfo:title	
pdf:encrypted	false
pdf:hasMarkedContent	true
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Adobe PDF Library 20.13.96
resourceName	b'10_1101-2020_10_08_327718.pdf'
subject	
title	
xmp:CreatorTool	Acrobat PDFMaker 20 for Word
xmpMM:DocumentID	uuid:5d3787c4-53fe-422c-847f-b8011720723d
xmpTPg:NPages	36
=== file2bib.sh ===
         id: 10_1101-698605
     author: Sarantopoulou, Dimitra
      title: Comparative evaluation of full-length isoform quantification from RNA-Seq
       date: 2021
      pages: 37
  extension: .pdf
        txt: ./txt/10_1101-698605.txt
      cache: ./cache/10_1101-698605.pdf

Author	Thomas Brooks
Content-Type	application/pdf
Creation-Date	2021-02-11T17:18:36Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	315
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-11T17:18:36Z
creator	Thomas Brooks
date	2021-02-14T21:22:17Z
dc:creator	Thomas Brooks
dc:format	application/pdf; version=1.7
dc:title	Comparative evaluation of full-length isoform quantification from RNA-Seq
dcterms:created	2021-02-11T17:18:36Z
dcterms:modified	2021-02-14T21:22:17Z
meta:author	Thomas Brooks
meta:creation-date	2021-02-11T17:18:36Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.7
pdf:charsPerPage	['2300', '2678', '3181', '3316', '2068', '3156', '2759', '1252', '2934', '2853', '1939', '1704', '2616', '1349', '1899', '2026', '1841', '2304', '2390', '2985', '1795', '2952', '3087', '3064', '2972', '2970', '2485', '1728', '2104', '2438', '2468', '2365', '613', '1062', '771', '951', '697']
pdf:docinfo:created	2021-02-11T17:18:36Z
pdf:docinfo:creator	Thomas Brooks
pdf:docinfo:creator_tool	Microsoft® Word for Microsoft 365
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	Microsoft® Word for Microsoft 365
pdf:docinfo:title	Comparative evaluation of full-length isoform quantification from RNA-Seq
pdf:encrypted	false
pdf:hasMarkedContent	true
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Microsoft® Word for Microsoft 365
resourceName	b'10_1101-698605.pdf'
title	Comparative evaluation of full-length isoform quantification from RNA-Seq
xmp:CreatorTool	Microsoft® Word for Microsoft 365
xmpMM:DocumentID	uuid:c591608c-c000-4b6e-931d-a5352ac54f59
xmpTPg:NPages	37
=== file2bib.sh ===
         id: 10_1101-2021_02_10_430563
     author: Bandrowski, Anita
      title: SPARC Data Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data
       date: 2021
      pages: 16
  extension: .pdf
        txt: ./txt/10_1101-2021_02_10_430563.txt
      cache: ./cache/10_1101-2021_02_10_430563.pdf

Author	Calmi2
Content-Type	application/pdf
Creation-Date	2021-02-10T08:53:11Z
Last-Modified	2021-02-14T21:22:18Z
Last-Save-Date	2021-02-14T21:22:18Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	530
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-10T08:53:11Z
creator	Calmi2
date	2021-02-14T21:22:18Z
dc:creator	Calmi2
dc:format	application/pdf; version=1.7
dc:language	en-US
dc:title	SPARC Data Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data
dcterms:created	2021-02-10T08:53:11Z
dcterms:modified	2021-02-14T21:22:18Z
language	en-US
meta:author	Calmi2
meta:creation-date	2021-02-10T08:53:11Z
meta:save-date	2021-02-14T21:22:18Z
modified	2021-02-14T21:22:18Z
pdf:PDFVersion	1.7
pdf:charsPerPage	['2645', '6469', '5580', '4241', '6477', '4383', '4414', '5014', '3391', '4754', '3775', '6024', '3764', '3020', '2517', '3837']
pdf:docinfo:created	2021-02-10T08:53:11Z
pdf:docinfo:creator	Calmi2
pdf:docinfo:creator_tool	Microsoft® Word for Microsoft 365
pdf:docinfo:modified	2021-02-14T21:22:18Z
pdf:docinfo:producer	Microsoft® Word for Microsoft 365
pdf:docinfo:title	SPARC Data Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data
pdf:encrypted	false
pdf:hasMarkedContent	true
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Microsoft® Word for Microsoft 365
resourceName	b'10_1101-2021_02_10_430563.pdf'
title	SPARC Data Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data
xmp:CreatorTool	Microsoft® Word for Microsoft 365
xmpMM:DocumentID	uuid:F3D69740-A838-49AB-AADD-0C81C78493F2
xmpTPg:NPages	16
=== file2bib.sh ===
         id: 10_1101-2021_02_01_429246
     author: Zheng, Hongyu
      title: Sequence-specific minimizers via polar sets
       date: 2021
      pages: 24
  extension: .pdf
        txt: ./txt/10_1101-2021_02_01_429246.txt
      cache: ./cache/10_1101-2021_02_01_429246.pdf

Author	
Content-Type	application/pdf
Creation-Date	2021-02-10T23:12:39Z
Keywords	
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.17 (TeX Live 2016) kpathsea version 6.2.2
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	202
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-10T23:12:39Z
creator	
date	2021-02-14T21:22:17Z
dc:creator	
dc:format	application/pdf; version=1.5
dc:subject	
dc:title	Sequence-specific minimizers via polar sets
dcterms:created	2021-02-10T23:12:39Z
dcterms:modified	2021-02-14T21:22:17Z
meta:author	
meta:creation-date	2021-02-10T23:12:39Z
meta:keyword	
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['3341', '4822', '3505', '2808', '3432', '3527', '3426', '3056', '2923', '3266', '3567', '3791', '3419', '4322', '3187', '3142', '1097', '3740', '2460', '3254', '3286', '1064', '1574', '785']
pdf:docinfo:created	2021-02-10T23:12:39Z
pdf:docinfo:creator	
pdf:docinfo:creator_tool	LaTeX with hyperref package
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.17 (TeX Live 2016) kpathsea version 6.2.2
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	pdfTeX-1.40.17
pdf:docinfo:subject	
pdf:docinfo:title	Sequence-specific minimizers via polar sets
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '1', '0', '7', '4', '10', '13', '12', '0', '2', '3', '0', '0', '0', '0', '0', '0', '13', '2', '2', '0', '0', '0']
producer	pdfTeX-1.40.17
resourceName	b'10_1101-2021_02_01_429246.pdf'
subject	
title	Sequence-specific minimizers via polar sets
trapped	False
xmp:CreatorTool	LaTeX with hyperref package
xmpMM:DocumentID	uuid:85c73a57-1dd2-11b2-0a00-6f09275d6100
xmpTPg:NPages	24
=== file2bib.sh ===
         id: 10_1101-2021_02_08_430280
     author: Kasukurthi, Mohan V
      title: SALTS – SURFR (sncRNA) And LAGOOn (lncRNA) Transcriptomics Suite
       date: 2021
      pages: 23
  extension: .pdf
        txt: ./txt/10_1101-2021_02_08_430280.txt
      cache: ./cache/10_1101-2021_02_08_430280.pdf

Author	glen borchert
Content-Type	application/pdf
Creation-Date	2021-02-08T17:59:42Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	332
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-08T17:59:42Z
creator	glen borchert
date	2021-02-14T21:22:17Z
dc:creator	glen borchert
dc:format	application/pdf; version=1.5
dc:language	en-US
dc:title	SALTS – SURFR (sncRNA) And LAGOOn (lncRNA) Transcriptomics Suite
dcterms:created	2021-02-08T17:59:42Z
dcterms:modified	2021-02-14T21:22:17Z
language	en-US
meta:author	glen borchert
meta:creation-date	2021-02-08T17:59:42Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['1572', '3069', '4757', '4428', '2929', '2020', '5103', '3677', '2141', '4289', '4322', '3554', '3166', '6567', '2370', '4458', '4580', '4205', '3674', '4080', '3597', '3637', '1271']
pdf:docinfo:created	2021-02-08T17:59:42Z
pdf:docinfo:creator	glen borchert
pdf:docinfo:creator_tool	Microsoft® Word 2016
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	Microsoft® Word 2016
pdf:docinfo:title	SALTS – SURFR (sncRNA) And LAGOOn (lncRNA) Transcriptomics Suite
pdf:encrypted	false
pdf:hasMarkedContent	true
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Microsoft® Word 2016
resourceName	b'10_1101-2021_02_08_430280.pdf'
title	SALTS – SURFR (sncRNA) And LAGOOn (lncRNA) Transcriptomics Suite
xmp:CreatorTool	Microsoft® Word 2016
xmpMM:DocumentID	uuid:85c75b5e-1dd2-11b2-0a00-c70827bd7200
xmpTPg:NPages	23
=== file2bib.sh ===
         id: 10_1101-2021_02_10_430705
     author: Stassen, Shobana V.
      title: VIA: Generalized and scalable trajectory inference in single-cell omics data
       date: 2021
      pages: 24
  extension: .pdf
        txt: ./txt/10_1101-2021_02_10_430705.txt
      cache: ./cache/10_1101-2021_02_10_430705.pdf

Content-Type	application/pdf
Creation-Date	2021-02-10T05:27:48Z
Last-Modified	2021-02-14T18:00:57Z
Last-Save-Date	2021-02-14T18:00:57Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	391
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-10T05:27:48Z
date	2021-02-14T18:00:57Z
dc:format	application/pdf; version=1.4
dc:title	VIA: Generalized and scalable trajectory inference in single-cell omics data
dcterms:created	2021-02-10T05:27:48Z
dcterms:modified	2021-02-14T18:00:57Z
meta:creation-date	2021-02-10T05:27:48Z
meta:save-date	2021-02-14T18:00:57Z
modified	2021-02-14T18:00:57Z
pdf:PDFVersion	1.4
pdf:charsPerPage	['5373', '2864', '5483', '5526', '5130', '345', '6236', '5474', '2917', '4605', '4025', '3650', '5031', '4563', '5222', '5654', '5634', '4294', '3842', '3498', '4377', '4589', '4431', '1921']
pdf:docinfo:created	2021-02-10T05:27:48Z
pdf:docinfo:creator_tool	Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.75 Safari/537.36
pdf:docinfo:modified	2021-02-14T18:00:57Z
pdf:docinfo:producer	Skia/PDF m77
pdf:docinfo:title	VIA: Generalized and scalable trajectory inference in single-cell omics data
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Skia/PDF m77
resourceName	b'10_1101-2021_02_10_430705.pdf'
title	VIA: Generalized and scalable trajectory inference in single-cell omics data
xmp:CreatorTool	Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.75 Safari/537.36
xmpMM:DocumentID	uuid:3dc71ea7-1dd2-11b2-0a00-7a08275dc400
xmpTPg:NPages	24
=== file2bib.sh ===
         id: 10_1101-2021_02_09_430363
     author: Bayer, Johanna M. M.
      title: Accommodating site variation in neuroimaging data using hierarchical and Bayesian models
       date: 2021
      pages: 20
  extension: .pdf
        txt: ./txt/10_1101-2021_02_09_430363.txt
      cache: ./cache/10_1101-2021_02_09_430363.pdf

Author	
Content-Type	application/pdf
Creation-Date	2021-02-09T06:02:04Z
Keywords	
Last-Modified	2021-02-14T21:22:27Z
Last-Save-Date	2021-02-14T21:22:27Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.19 (TeX Live 2018) kpathsea version 6.3.0
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	165
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-09T06:02:04Z
creator	
date	2021-02-14T21:22:27Z
dc:creator	
dc:format	application/pdf; version=1.5
dc:subject	
dc:title	Accommodating site variation in neuroimaging data using hierarchical and Bayesian models
dcterms:created	2021-02-09T06:02:04Z
dcterms:modified	2021-02-14T21:22:27Z
meta:author	
meta:creation-date	2021-02-09T06:02:04Z
meta:keyword	
meta:save-date	2021-02-14T21:22:27Z
modified	2021-02-14T21:22:27Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['1502', '4844', '5626', '4771', '2956', '3003', '1660', '2526', '3391', '4243', '1765', '3323', '2090', '3826', '5707', '3101', '4217', '4665', '4787', '959']
pdf:docinfo:created	2021-02-09T06:02:04Z
pdf:docinfo:creator	
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.19 (TeX Live 2018) kpathsea version 6.3.0
pdf:docinfo:keywords	
pdf:docinfo:modified	2021-02-14T21:22:27Z
pdf:docinfo:producer	pdfTeX-1.40.19
pdf:docinfo:subject	
pdf:docinfo:title	Accommodating site variation in neuroimaging data using hierarchical and Bayesian models
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '10', '9', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.19
resourceName	b'10_1101-2021_02_09_430363.pdf'
subject	
title	Accommodating site variation in neuroimaging data using hierarchical and Bayesian models
trapped	False
xmp:CreatorTool	LaTeX with hyperref
xmpMM:DocumentID	uuid:85d672f8-1dd2-11b2-0a00-b309275d6100
xmpTPg:NPages	20
=== file2bib.sh ===
         id: 10_1101-2021_02_11_430762
     author: Schäffer, Alejandro A.
      title: Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation
       date: 2021
      pages: 28
  extension: .pdf
        txt: ./txt/10_1101-2021_02_11_430762.txt
      cache: ./cache/10_1101-2021_02_11_430762.pdf

Content-Type	application/pdf
Creation-Date	2021-02-11T12:21:20Z
Last-Modified	2021-02-14T21:22:17Z
Last-Save-Date	2021-02-14T21:22:17Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/MacPorts 2019.50896_2) kpathsea version 6.3.1
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	186
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-11T12:21:20Z
date	2021-02-14T21:22:17Z
dc:format	application/pdf; version=1.5
dc:title	Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation
dcterms:created	2021-02-11T12:21:20Z
dcterms:modified	2021-02-14T21:22:17Z
meta:creation-date	2021-02-11T12:21:20Z
meta:save-date	2021-02-14T21:22:17Z
modified	2021-02-14T21:22:17Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['3536', '3441', '3231', '3411', '3332', '2676', '3013', '3264', '3585', '3505', '3340', '3147', '3237', '2993', '3597', '3360', '3849', '3022', '3179', '3559', '3323', '3293', '3281', '3253', '3263', '5532', '1767', '692']
pdf:docinfo:created	2021-02-11T12:21:20Z
pdf:docinfo:creator_tool	TeX
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/MacPorts 2019.50896_2) kpathsea version 6.3.1
pdf:docinfo:modified	2021-02-14T21:22:17Z
pdf:docinfo:producer	pdfTeX-1.40.20
pdf:docinfo:title	Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '17', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.20
resourceName	b'10_1101-2021_02_11_430762.pdf'
title	Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation
trapped	False
xmp:CreatorTool	TeX
xmpMM:DocumentID	uuid:85c6efdd-1dd2-11b2-0a00-d909278d5b00
xmpTPg:NPages	28
=== file2bib.sh ===
         id: 10_1101-2020_09_23_308239
     author: Schultz, Bruce T
      title: The COVID-19 PHARMACOME: A method for the rational selection of drug repurposing candidates from multimodal knowledge harmonization
       date: 2021
      pages: 31
  extension: .pdf
        txt: ./txt/10_1101-2020_09_23_308239.txt
      cache: ./cache/10_1101-2020_09_23_308239.pdf

Appligent	AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct  2 2014 Library 10.1.0
Content-Type	application/pdf
Creation-Date	2021-02-11T08:22:35Z
Last-Modified	2021-02-14T21:22:20Z
Last-Save-Date	2021-02-14T21:22:20Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	5558
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-11T08:22:35Z
date	2021-02-14T21:22:20Z
dc:format	application/pdf; version=1.7
dc:language	en-US
dc:title	2599906
dcterms:created	2021-02-11T08:22:35Z
dcterms:modified	2021-02-14T21:22:20Z
language	en-US
meta:creation-date	2021-02-11T08:22:35Z
meta:save-date	2021-02-14T21:22:20Z
modified	2021-02-14T21:22:20Z
pdf:PDFVersion	1.7
pdf:charsPerPage	['2584', '1515', '2194', '2218', '491', '2118', '1934', '1624', '2074', '2074', '1754', '1826', '2154', '1228', '1624', '1173', '1263', '2086', '2088', '508', '2314', '1363', '534', '4487', '5088', '348', '344', '347', '349', '344', '7141']
pdf:docinfo:created	2021-02-11T08:22:35Z
pdf:docinfo:creator_tool	Appligent AppendPDF Pro 5.5
pdf:docinfo:custom:Appligent	AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct  2 2014 Library 10.1.0
pdf:docinfo:modified	2021-02-14T21:22:20Z
pdf:docinfo:producer	Microsoft® Word for Microsoft 365
pdf:docinfo:title	2599906
pdf:encrypted	false
pdf:hasMarkedContent	true
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Microsoft® Word for Microsoft 365
resourceName	b'10_1101-2020_09_23_308239.pdf'
title	2599906
xmp:CreatorTool	Appligent AppendPDF Pro 5.5
xmpMM:DocumentID	uuid:65fc1ff4-b086-11b2-0a00-782dad000000
xmpTPg:NPages	31
=== file2bib.sh ===
         id: 10_1101-2020_01_28_923532
     author: Ahmadi, Saba
      title: The Landscape of Precision Cancer Combination Therapy: A Single-Cell Perspective
       date: 2021
      pages: 46
  extension: .pdf
        txt: ./txt/10_1101-2020_01_28_923532.txt
      cache: ./cache/10_1101-2020_01_28_923532.pdf

Author	Schaffer, Alejandro (NIH/NLM/NCBI) [E]
Content-Type	application/pdf
Creation-Date	2021-02-12T10:57:20Z
Last-Modified	2021-02-14T20:40:41Z
Last-Save-Date	2021-02-14T20:40:41Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	310
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-12T10:57:20Z
creator	Schaffer, Alejandro (NIH/NLM/NCBI) [E]
date	2021-02-14T20:40:41Z
dc:creator	Schaffer, Alejandro (NIH/NLM/NCBI) [E]
dc:format	application/pdf; version=1.7
dc:language	en-US
dc:title	The Landscape of Precision Cancer Combination Therapy: A Single-Cell Perspective
dcterms:created	2021-02-12T10:57:20Z
dcterms:modified	2021-02-14T20:40:41Z
language	en-US
meta:author	Schaffer, Alejandro (NIH/NLM/NCBI) [E]
meta:creation-date	2021-02-12T10:57:20Z
meta:save-date	2021-02-14T20:40:41Z
modified	2021-02-14T20:40:41Z
pdf:PDFVersion	1.7
pdf:charsPerPage	['1785', '909', '2031', '2819', '3037', '2594', '1347', '1985', '2038', '3076', '2477', '1952', '1872', '1658', '1185', '2016', '2878', '1058', '1201', '2930', '2007', '1643', '2801', '3005', '2714', '3060', '2616', '3072', '1293', '2812', '2597', '2058', '1247', '856', '1289', '2511', '2560', '2403', '2051', '1906', '2272', '3658', '3506', '3579', '3509', '1722']
pdf:docinfo:created	2021-02-12T10:57:20Z
pdf:docinfo:creator	Schaffer, Alejandro (NIH/NLM/NCBI) [E]
pdf:docinfo:creator_tool	Microsoft® Word for Office 365
pdf:docinfo:modified	2021-02-14T20:40:41Z
pdf:docinfo:producer	Microsoft® Word for Office 365
pdf:docinfo:title	The Landscape of Precision Cancer Combination Therapy: A Single-Cell Perspective
pdf:encrypted	false
pdf:hasMarkedContent	true
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Microsoft® Word for Office 365
resourceName	b'10_1101-2020_01_28_923532.pdf'
title	The Landscape of Precision Cancer Combination Therapy: A Single-Cell Perspective
xmp:CreatorTool	Microsoft® Word for Office 365
xmpMM:DocumentID	uuid:647F7B6F-D5FD-4AA7-A927-7A5C48053B39
xmpTPg:NPages	46
=== file2bib.sh ===
         id: 10_1101-2021_02_09_430460
     author: Banerjee, Shayantan
      title: Sequence neighborhoods enable reliable prediction of pathogenic mutations in cancer genomes
       date: 2021
      pages: 39
  extension: .pdf
        txt: ./txt/10_1101-2021_02_09_430460.txt
      cache: ./cache/10_1101-2021_02_09_430460.pdf

Appligent	AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct  2 2014 Library 10.1.0
Content-Type	application/pdf
Creation-Date	2021-02-09T18:13:05Z
Last-Modified	2021-02-14T21:22:18Z
Last-Save-Date	2021-02-14T21:22:18Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	501
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-09T18:13:05Z
date	2021-02-14T21:22:18Z
dc:format	application/pdf; version=1.5
dc:title	99120235
dcterms:created	2021-02-09T18:13:05Z
dcterms:modified	2021-02-14T21:22:18Z
meta:creation-date	2021-02-09T18:13:05Z
meta:save-date	2021-02-14T21:22:18Z
modified	2021-02-14T21:22:18Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['2933', '3586', '3699', '3184', '3141', '3534', '3190', '3343', '3362', '3202', '3398', '3616', '3061', '3423', '3333', '3280', '3702', '3698', '2901', '3903', '3853', '3930', '3937', '1941', '1445', '809', '880', '1975', '3658', '3874', '1427', '341', '341', '341', '341', '341', '341', '341', '341']
pdf:docinfo:created	2021-02-09T18:13:05Z
pdf:docinfo:creator_tool	Appligent AppendPDF Pro 5.5
pdf:docinfo:custom:Appligent	AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct  2 2014 Library 10.1.0
pdf:docinfo:modified	2021-02-14T21:22:18Z
pdf:docinfo:producer	Skia/PDF m90
pdf:docinfo:title	99120235
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	Skia/PDF m90
resourceName	b'10_1101-2021_02_09_430460.pdf'
title	99120235
xmp:CreatorTool	Appligent AppendPDF Pro 5.5
xmpMM:DocumentID	uuid:3332bc21-b083-11b2-0a00-782dad000000
xmpTPg:NPages	39
=== file2bib.sh ===
         id: 10_1101-2021_02_09_430536
     author: Lin, Cui-Xiang
      title: Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes
       date: 2021
      pages: 47
  extension: .pdf
        txt: ./txt/10_1101-2021_02_09_430536.txt
      cache: ./cache/10_1101-2021_02_09_430536.pdf

Content-Type	application/pdf
Creation-Date	2021-02-08T13:33:39Z
Last-Modified	2021-02-14T21:22:18Z
Last-Save-Date	2021-02-14T21:22:18Z
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	509
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
created	2021-02-08T13:33:39Z
date	2021-02-14T21:22:18Z
dc:format	application/pdf; version=1.4
dc:title	Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes
dcterms:created	2021-02-08T13:33:39Z
dcterms:modified	2021-02-14T21:22:18Z
meta:creation-date	2021-02-08T13:33:39Z
meta:save-date	2021-02-14T21:22:18Z
modified	2021-02-14T21:22:18Z
pdf:PDFVersion	1.4
pdf:charsPerPage	['1664', '2073', '2208', '2343', '2100', '2396', '2229', '2243', '2075', '2132', '2088', '2141', '2348', '2222', '2377', '2256', '2313', '2108', '2317', '2344', '2180', '2187', '2176', '2145', '2140', '2082', '2090', '2146', '2191', '2089', '3111', '2933', '2995', '3073', '2378', '1830', '1583', '1697', '2746', '2631', '1348', '1962', '1975', '681', '7158', '1185', '1009']
pdf:docinfo:created	2021-02-08T13:33:39Z
pdf:docinfo:creator_tool	Word
pdf:docinfo:modified	2021-02-14T21:22:18Z
pdf:docinfo:producer	macOS Version 10.15.7 (Build 19H114) Quartz PDFContext
pdf:docinfo:title	Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '9', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	macOS Version 10.15.7 (Build 19H114) Quartz PDFContext
resourceName	b'10_1101-2021_02_09_430536.pdf'
title	Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes
xmp:CreatorTool	Word
xmpMM:DocumentID	uuid:85c859ad-1dd2-11b2-0a00-5408271d5700
xmpTPg:NPages	47
=== file2bib.sh ===
         id: 10_1101-727867
     author: Tangherloni, Andrea
      title: scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data
       date: 2021
      pages: 28
  extension: .pdf
        txt: ./txt/10_1101-727867.txt
      cache: ./cache/10_1101-727867.pdf

Author	Andrea Tangherloni, Federico Ricciuti, Daniela Besozzi, Pietro Liò, Ana Cvejic
Content-Type	application/pdf
Creation-Date	2021-02-12T16:25:04Z
Keywords	Autoencoders, scRNA-Seq, Dimensionality reduction, Clustering, Batch correction, Data integration
Last-Modified	2021-02-14T21:11:17Z
Last-Save-Date	2021-02-14T21:11:17Z
PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1
X-Parsed-By	['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser']
X-TIKA:content_handler	ToTextContentHandler
X-TIKA:embedded_depth	0
X-TIKA:parse_time_millis	3963
access_permission:assemble_document	true
access_permission:can_modify	true
access_permission:can_print	true
access_permission:can_print_degraded	true
access_permission:extract_content	true
access_permission:extract_for_accessibility	true
access_permission:fill_in_form	true
access_permission:modify_annotations	true
cp:subject	
created	2021-02-12T16:25:04Z
creator	Andrea Tangherloni, Federico Ricciuti, Daniela Besozzi, Pietro Liò, Ana Cvejic
date	2021-02-14T21:11:17Z
dc:creator	Andrea Tangherloni, Federico Ricciuti, Daniela Besozzi, Pietro Liò, Ana Cvejic
dc:format	application/pdf; version=1.5
dc:subject	Autoencoders, scRNA-Seq, Dimensionality reduction, Clustering, Batch correction, Data integration
dc:title	scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data
dcterms:created	2021-02-12T16:25:04Z
dcterms:modified	2021-02-14T21:11:17Z
meta:author	Andrea Tangherloni, Federico Ricciuti, Daniela Besozzi, Pietro Liò, Ana Cvejic
meta:creation-date	2021-02-12T16:25:04Z
meta:keyword	Autoencoders, scRNA-Seq, Dimensionality reduction, Clustering, Batch correction, Data integration
meta:save-date	2021-02-14T21:11:17Z
modified	2021-02-14T21:11:17Z
pdf:PDFVersion	1.5
pdf:charsPerPage	['1552', '2911', '3437', '3290', '3172', '2974', '2835', '2923', '3313', '3204', '3158', '3151', '3207', '3309', '3262', '2383', '4689', '5805', '6016', '2696', '1341', '1788', '1428', '1745', '1871', '1746', '1423', '1664']
pdf:docinfo:created	2021-02-12T16:25:04Z
pdf:docinfo:creator	Andrea Tangherloni, Federico Ricciuti, Daniela Besozzi, Pietro Liò, Ana Cvejic
pdf:docinfo:creator_tool	LaTeX with hyperref
pdf:docinfo:custom:PTEX.Fullbanner	This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1
pdf:docinfo:keywords	Autoencoders, scRNA-Seq, Dimensionality reduction, Clustering, Batch correction, Data integration
pdf:docinfo:modified	2021-02-14T21:11:17Z
pdf:docinfo:producer	pdfTeX-1.40.20
pdf:docinfo:subject	
pdf:docinfo:title	scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data
pdf:docinfo:trapped	False
pdf:encrypted	false
pdf:hasMarkedContent	false
pdf:hasXFA	false
pdf:hasXMP	true
pdf:unmappedUnicodeCharsPerPage	['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '2', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
producer	pdfTeX-1.40.20
resourceName	b'10_1101-727867.pdf'
subject	
title	scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data
trapped	False
xmp:CreatorTool	LaTeX with hyperref
xmpMM:DocumentID	uuid:81d7d1e9-1dd2-11b2-0a00-030a276d7200
xmpTPg:NPages	28
10_1101-2021_02_10_430604  txt/../ent/10_1101-2021_02_10_430604.ent
10_1101-2020_05_15_090266  txt/../ent/10_1101-2020_05_15_090266.ent
10_1101-2021_02_11_430806  txt/../ent/10_1101-2021_02_11_430806.ent
10_1101-2021_02_09_430036  txt/../ent/10_1101-2021_02_09_430036.ent
10_1101-2021_02_10_430367  txt/../ent/10_1101-2021_02_10_430367.ent
10_1101-2021_02_08_430070  txt/../ent/10_1101-2021_02_08_430070.ent
10_1101-2021_02_10_430619  txt/../ent/10_1101-2021_02_10_430619.ent
10_1101-2021_02_12_431018  txt/../ent/10_1101-2021_02_12_431018.ent
10_1101-2021_02_08_430275  txt/../ent/10_1101-2021_02_08_430275.ent
10_1101-2021_02_10_430656  txt/../ent/10_1101-2021_02_10_430656.ent
10_1101-2020_12_24_424317  txt/../ent/10_1101-2020_12_24_424317.ent
10_1101-2020_09_23_308239  txt/../ent/10_1101-2020_09_23_308239.ent
10_1101-2021_02_08_430270  txt/../ent/10_1101-2021_02_08_430270.ent
10_1101-2021_02_12_430963  txt/../ent/10_1101-2021_02_12_430963.ent
10_1101-2021_02_09_430405  txt/../ent/10_1101-2021_02_09_430405.ent
10_1101-2021_02_12_430923  txt/../ent/10_1101-2021_02_12_430923.ent
10_1101-2021_02_12_430989  txt/../ent/10_1101-2021_02_12_430989.ent
10_1101-2020_09_23_310276  txt/../ent/10_1101-2020_09_23_310276.ent
10_1101-2020_11_17_386649  txt/../ent/10_1101-2020_11_17_386649.ent
10_1101-2021_02_12_430979  txt/../ent/10_1101-2021_02_12_430979.ent
10_1101-2021_02_08_428881  txt/../ent/10_1101-2021_02_08_428881.ent
10_1101-2021_02_12_430739  txt/../ent/10_1101-2021_02_12_430739.ent
10_1101-2021_02_11_430695  txt/../ent/10_1101-2021_02_11_430695.ent
10_1101-2020_02_04_934216  txt/../ent/10_1101-2020_02_04_934216.ent
10_1101-2021_02_12_430830  txt/../ent/10_1101-2021_02_12_430830.ent
10_1101-2021_02_10_430623  txt/../ent/10_1101-2021_02_10_430623.ent
10_1101-2021_02_11_430847  txt/../ent/10_1101-2021_02_11_430847.ent
10_1101-2021_02_10_430649  txt/../ent/10_1101-2021_02_10_430649.ent
10_1101-2021_02_11_430789  txt/../ent/10_1101-2021_02_11_430789.ent
10_1101-2021_02_12_430764  txt/../ent/10_1101-2021_02_12_430764.ent
10_1101-2020_09_21_305516  txt/../ent/10_1101-2020_09_21_305516.ent
10_1101-2021_02_10_430563  txt/../ent/10_1101-2021_02_10_430563.ent
10_1101-2021_02_08_430343  txt/../ent/10_1101-2021_02_08_430343.ent
10_1101-2021_02_09_430550  txt/../ent/10_1101-2021_02_09_430550.ent
10_1101-2021_02_09_430363  txt/../ent/10_1101-2021_02_09_430363.ent
10_1101-2020_09_02_279521  txt/../ent/10_1101-2020_09_02_279521.ent
10_1101-2021_02_11_430871  txt/../ent/10_1101-2021_02_11_430871.ent
10_1101-2020_10_08_327718  txt/../ent/10_1101-2020_10_08_327718.ent
10_1101-2021_02_13_429885  txt/../ent/10_1101-2021_02_13_429885.ent
10_1101-698605  txt/../ent/10_1101-698605.ent
10_1101-2021_02_08_430280  txt/../ent/10_1101-2021_02_08_430280.ent
10_1101-2021_02_01_429246  txt/../ent/10_1101-2021_02_01_429246.ent
10_1101-2021_02_10_430705  txt/../ent/10_1101-2021_02_10_430705.ent
10_1101-2021_02_10_430606  txt/../ent/10_1101-2021_02_10_430606.ent
10_1101-2021_02_10_430512  txt/../ent/10_1101-2021_02_10_430512.ent
10_1101-2021_02_11_430762  txt/../ent/10_1101-2021_02_11_430762.ent
10_1101-2021_02_09_430460  txt/../ent/10_1101-2021_02_09_430460.ent
10_1101-727867  txt/../ent/10_1101-727867.ent
10_1101-2020_01_28_923532  txt/../ent/10_1101-2020_01_28_923532.ent
10_1101-2021_02_09_430536  txt/../ent/10_1101-2021_02_09_430536.ent
10_1101-2021_02_10_430604  txt/../pos/10_1101-2021_02_10_430604.pos
10_1101-2021_02_11_430806  txt/../pos/10_1101-2021_02_11_430806.pos
10_1101-2021_02_12_431018  txt/../pos/10_1101-2021_02_12_431018.pos
10_1101-2020_05_15_090266  txt/../pos/10_1101-2020_05_15_090266.pos
10_1101-2021_02_10_430619  txt/../pos/10_1101-2021_02_10_430619.pos
10_1101-2020_12_24_424317  txt/../pos/10_1101-2020_12_24_424317.pos
10_1101-2021_02_09_430036  txt/../pos/10_1101-2021_02_09_430036.pos
10_1101-2021_02_08_430070  txt/../pos/10_1101-2021_02_08_430070.pos
10_1101-2020_09_23_310276  txt/../pos/10_1101-2020_09_23_310276.pos
10_1101-2021_02_09_430405  txt/../pos/10_1101-2021_02_09_430405.pos
10_1101-2021_02_08_430275  txt/../pos/10_1101-2021_02_08_430275.pos
10_1101-2021_02_12_430963  txt/../pos/10_1101-2021_02_12_430963.pos
10_1101-2021_02_10_430367  txt/../pos/10_1101-2021_02_10_430367.pos
10_1101-2021_02_10_430656  txt/../pos/10_1101-2021_02_10_430656.pos
10_1101-2021_02_08_430270  txt/../pos/10_1101-2021_02_08_430270.pos
10_1101-2021_02_11_430847  txt/../pos/10_1101-2021_02_11_430847.pos
10_1101-2020_09_23_308239  txt/../pos/10_1101-2020_09_23_308239.pos
10_1101-2021_02_12_430739  txt/../pos/10_1101-2021_02_12_430739.pos
10_1101-2021_02_12_430923  txt/../pos/10_1101-2021_02_12_430923.pos
10_1101-2021_02_12_430989  txt/../pos/10_1101-2021_02_12_430989.pos
10_1101-2021_02_08_428881  txt/../pos/10_1101-2021_02_08_428881.pos
10_1101-2021_02_12_430830  txt/../pos/10_1101-2021_02_12_430830.pos
10_1101-2021_02_10_430623  txt/../pos/10_1101-2021_02_10_430623.pos
10_1101-2020_11_17_386649  txt/../pos/10_1101-2020_11_17_386649.pos
10_1101-2021_02_12_430764  txt/../pos/10_1101-2021_02_12_430764.pos
10_1101-2020_02_04_934216  txt/../pos/10_1101-2020_02_04_934216.pos
10_1101-2021_02_11_430695  txt/../pos/10_1101-2021_02_11_430695.pos
10_1101-2021_02_11_430871  txt/../pos/10_1101-2021_02_11_430871.pos
10_1101-2021_02_12_430979  txt/../pos/10_1101-2021_02_12_430979.pos
10_1101-2021_02_13_429885  txt/../pos/10_1101-2021_02_13_429885.pos
10_1101-2021_02_11_430789  txt/../pos/10_1101-2021_02_11_430789.pos
10_1101-2021_02_10_430563  txt/../pos/10_1101-2021_02_10_430563.pos
10_1101-2021_02_10_430649  txt/../pos/10_1101-2021_02_10_430649.pos
10_1101-2020_09_21_305516  txt/../pos/10_1101-2020_09_21_305516.pos
10_1101-2021_02_09_430550  txt/../pos/10_1101-2021_02_09_430550.pos
10_1101-2021_02_08_430343  txt/../pos/10_1101-2021_02_08_430343.pos
10_1101-698605  txt/../pos/10_1101-698605.pos
10_1101-2021_02_01_429246  txt/../pos/10_1101-2021_02_01_429246.pos
10_1101-2021_02_10_430606  txt/../pos/10_1101-2021_02_10_430606.pos
10_1101-2021_02_08_430280  txt/../pos/10_1101-2021_02_08_430280.pos
10_1101-2020_09_02_279521  txt/../pos/10_1101-2020_09_02_279521.pos
10_1101-2021_02_10_430512  txt/../pos/10_1101-2021_02_10_430512.pos
10_1101-2020_10_08_327718  txt/../pos/10_1101-2020_10_08_327718.pos
10_1101-2021_02_09_430363  txt/../pos/10_1101-2021_02_09_430363.pos
10_1101-2021_02_10_430705  txt/../pos/10_1101-2021_02_10_430705.pos
10_1101-727867  txt/../pos/10_1101-727867.pos
10_1101-2020_01_28_923532  txt/../pos/10_1101-2020_01_28_923532.pos
10_1101-2021_02_09_430460  txt/../pos/10_1101-2021_02_09_430460.pos
10_1101-2021_02_11_430762  txt/../pos/10_1101-2021_02_11_430762.pos
10_1101-2021_02_09_430536  txt/../pos/10_1101-2021_02_09_430536.pos
10_1101-2021_02_10_430604  txt/../wrd/10_1101-2021_02_10_430604.wrd
10_1101-2020_05_15_090266  txt/../wrd/10_1101-2020_05_15_090266.wrd
10_1101-2021_02_09_430036  txt/../wrd/10_1101-2021_02_09_430036.wrd
10_1101-2021_02_12_431018  txt/../wrd/10_1101-2021_02_12_431018.wrd
10_1101-2021_02_11_430806  txt/../wrd/10_1101-2021_02_11_430806.wrd
10_1101-2020_12_24_424317  txt/../wrd/10_1101-2020_12_24_424317.wrd
10_1101-2021_02_08_430070  txt/../wrd/10_1101-2021_02_08_430070.wrd
10_1101-2021_02_10_430367  txt/../wrd/10_1101-2021_02_10_430367.wrd
10_1101-2021_02_10_430619  txt/../wrd/10_1101-2021_02_10_430619.wrd
10_1101-2021_02_11_430847  txt/../wrd/10_1101-2021_02_11_430847.wrd
10_1101-2021_02_10_430656  txt/../wrd/10_1101-2021_02_10_430656.wrd
10_1101-2020_09_23_310276  txt/../wrd/10_1101-2020_09_23_310276.wrd
10_1101-2021_02_12_430830  txt/../wrd/10_1101-2021_02_12_430830.wrd
10_1101-2021_02_12_430989  txt/../wrd/10_1101-2021_02_12_430989.wrd
10_1101-2021_02_08_430275  txt/../wrd/10_1101-2021_02_08_430275.wrd
10_1101-2021_02_12_430739  txt/../wrd/10_1101-2021_02_12_430739.wrd
10_1101-2021_02_11_430695  txt/../wrd/10_1101-2021_02_11_430695.wrd
10_1101-2021_02_12_430923  txt/../wrd/10_1101-2021_02_12_430923.wrd
10_1101-2021_02_12_430963  txt/../wrd/10_1101-2021_02_12_430963.wrd
10_1101-2021_02_09_430405  txt/../wrd/10_1101-2021_02_09_430405.wrd
10_1101-2021_02_08_430270  txt/../wrd/10_1101-2021_02_08_430270.wrd
10_1101-2020_02_04_934216  txt/../wrd/10_1101-2020_02_04_934216.wrd
10_1101-2021_02_10_430649  txt/../wrd/10_1101-2021_02_10_430649.wrd
10_1101-2021_02_12_430764  txt/../wrd/10_1101-2021_02_12_430764.wrd
10_1101-2020_11_17_386649  txt/../wrd/10_1101-2020_11_17_386649.wrd
10_1101-2021_02_11_430789  txt/../wrd/10_1101-2021_02_11_430789.wrd
10_1101-2020_09_23_308239  txt/../wrd/10_1101-2020_09_23_308239.wrd
10_1101-2021_02_12_430979  txt/../wrd/10_1101-2021_02_12_430979.wrd
10_1101-2021_02_10_430623  txt/../wrd/10_1101-2021_02_10_430623.wrd
10_1101-2021_02_11_430871  txt/../wrd/10_1101-2021_02_11_430871.wrd
10_1101-2021_02_08_428881  txt/../wrd/10_1101-2021_02_08_428881.wrd
10_1101-2021_02_10_430563  txt/../wrd/10_1101-2021_02_10_430563.wrd
10_1101-2021_02_13_429885  txt/../wrd/10_1101-2021_02_13_429885.wrd
10_1101-2021_02_08_430343  txt/../wrd/10_1101-2021_02_08_430343.wrd
10_1101-2021_02_10_430606  txt/../wrd/10_1101-2021_02_10_430606.wrd
10_1101-2020_09_02_279521  txt/../wrd/10_1101-2020_09_02_279521.wrd
10_1101-2021_02_10_430512  txt/../wrd/10_1101-2021_02_10_430512.wrd
10_1101-2020_09_21_305516  txt/../wrd/10_1101-2020_09_21_305516.wrd
10_1101-698605  txt/../wrd/10_1101-698605.wrd
10_1101-2020_10_08_327718  txt/../wrd/10_1101-2020_10_08_327718.wrd
10_1101-2021_02_08_430280  txt/../wrd/10_1101-2021_02_08_430280.wrd
10_1101-2021_02_10_430705  txt/../wrd/10_1101-2021_02_10_430705.wrd
10_1101-2021_02_09_430363  txt/../wrd/10_1101-2021_02_09_430363.wrd
10_1101-2021_02_09_430550  txt/../wrd/10_1101-2021_02_09_430550.wrd
10_1101-2021_02_01_429246  txt/../wrd/10_1101-2021_02_01_429246.wrd
10_1101-2021_02_11_430762  txt/../wrd/10_1101-2021_02_11_430762.wrd
10_1101-2021_02_09_430460  txt/../wrd/10_1101-2021_02_09_430460.wrd
10_1101-2020_01_28_923532  txt/../wrd/10_1101-2020_01_28_923532.wrd
10_1101-727867  txt/../wrd/10_1101-727867.wrd
10_1101-2021_02_09_430536  txt/../wrd/10_1101-2021_02_09_430536.wrd
Done mapping.
Reducing neuroscience-from-bioarxiv
=== reduce.pl bib ===
         id = 10_1101-2020_09_21_305516
     author = Nikolic, Ana
      title = Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer
       date = 2021
      pages = 32
  extension = .pdf
       mime = application/pdf
      words = 10376
  sentences = 1280
     flesch = 67
    summary = Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer. ... uses single-cell epigenomic data to infer copy number variants (CNVs) that define cancer cells. We have tested the ability of Copy-scAT to use scATAC data to call CNVs with three different approaches ... genome sequencing (WGS) data for adult GBM (aGBM) surgical resections (n = 4 samples, 3,647 cells). ... adult GBM samples identified using both methods, versus total numbers of gains detected by scATAC or ... Number of chromosome-arm level gains detected in adult GBM samples identified using both methods, ... (c) Multiple myeloma samples were profiled by both scATAC and the single-cell CNV assay. ... chromosome-arm level gains detected in adult GBM samples identified using both methods, versus total ... CNVs are detected in scATAC clusters with Copy-scAT in pediatric GBM samples.
      cache = ./cache/10_1101-2020_09_21_305516.pdf
       txt  = ./txt/10_1101-2020_09_21_305516.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_10_430649
     author = Wen, Zi-Hang
      title = Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data
       date = 2021
      pages = 11
  extension = .pdf
       mime = application/pdf
      words = 8418
  sentences = 1302
     flesch = 71
    summary = Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data Recovering dropout events in a sparse gene expression matrix for scRNA-seq data is a long-standing matrix completion We introduce Bfimpute, a Bayesian factorization imputation algorithm that reconstructs two latent gene and cell matrices to impute final gene expression matrix within each cell group, with or without the aid of cell type labels or bulk Bfimpute achieves better accuracy than other six publicly notable scRNA-seq imputation methods on simulated Key words: single cell; RNA-seq; imputation; Bayesian factorization impute dropout events by adopting the bulk RNA-seq data imputation of single cell RNA-seq data could be applied by Bfimpute recovers dropout values and improves cell type identification in the simulated data. and the imputed data by Bfimpute, scImpute, and DrImpute for the human embryonic stem cell differentiation study. imputation method scimpute for single-cell rna-seq data.
      cache = ./cache/10_1101-2021_02_10_430649.pdf
       txt  = ./txt/10_1101-2021_02_10_430649.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_12_431018
     author = Truong Nguyen, Phuoc
      title = HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences.
       date = 2021
      pages = 14
  extension = .pdf
       mime = application/pdf
      words = 3786
  sentences = 502
     flesch = 61
    summary = HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences. Several new variants of SARS-CoV-2 have emerged globally, of which the based assemblies on raw SARS-CoV-2 sequences in addition to identifying lineages to detect variants of concern, we have developed an open source bioinformatic pipeline called HaVoC monitor the spread of SARS-CoV-2 variants of concern during local outbreaks. currently being used in Finland for monitoring the spread of SARS-CoV-2 variants. SARS-CoV2, variant detection, reference assembly, lineage identification, coronavirus, surveillance of virus variants by sequencing the SARS-CoV-2 genomes would provide a fast to query SARS-CoV-2 fastq sequence libraries and assigns lineages to them individually in processing and a reference genome of SARS-CoV-2 in a separate FASTA file. The likelihood of emergence of novel SARS-CoV-2 variants of concern is increased and Emerging SARS-CoV-2 Variants.
      cache = ./cache/10_1101-2021_02_12_431018.pdf
       txt  = ./txt/10_1101-2021_02_12_431018.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_11_430847
     author = Pinatti, Lisa M.
      title = SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer
       date = 2021
      pages = 26
  extension = .pdf
       mime = application/pdf
      words = 6849
  sentences = 788
     flesch = 57
    summary = SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer squamous cell carcinomas; however, the impact of HPV integration into the host human genome SearcHPV uncovered HPV integration sites adjacent to known cancer-related detection of HPV-human integration sites from targeted capture DNA sequencing data. developed a novel HPV integration detection tool for targeted capture sequencing data, which we SearcHPV showed a high frequency of HPV16 integration with a total of six events in UM-SCC In this study, SearcHPV also called HPV integration sites within TP63. HPV integration sites have been associated with structural variations in the human genome [3, 8, 37], which supports an additional genetic mechanism as to why HPV integration sites Genome-wide analysis of HPV integration in human and their integration sites in host genomes through next generation sequencing data. identify viruses and their integration sites using next-generation sequencing of human cancer
      cache = ./cache/10_1101-2021_02_11_430847.pdf
       txt  = ./txt/10_1101-2021_02_11_430847.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_12_430963
     author = Gerber, Stefan
      title = Streamlining differential exon and 3' UTR usage with diffUTR
       date = 2021
      pages = 17
  extension = .pdf
       mime = application/pdf
      words = 6710
  sentences = 896
     flesch = 62
    summary = adenylation site databases to enable differential 3' UTR usage analysis. Conclusions: diffUTR enables differential 3' UTR analysis and more generally facilitates DEU Popular bin-based DEU methods are provided by the limma [25,24], edgeR [23] and DEXSeq [22] Bins are prepared from various types of gene annotations as well as, optionally, additional APA-driven segmentation and extension, then read counts among statistically-significant genes, especially for bins with a higher expression (Figure 3A). diffUTR provides three main plot types to explore differential bin usage analyses, each with a Plotted are the UTR bins found statistically significant (bin- and gene-level FDR deuBinPlot (Figure 4B) provides bin-level statistic plots for a given gene, similar to those than CDS bins, including counts of 3' UTR when calculating overall gene expression could under- diffUTR streamlines DEU analysis and outperforms alternative methods in inferring UTR changes, For differential UTR analysis, gene-level results are ob-
      cache = ./cache/10_1101-2021_02_12_430963.pdf
       txt  = ./txt/10_1101-2021_02_12_430963.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_11_430871
     author = Vadnais, David
      title = ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data
       date = 2021
      pages = 24
  extension = .pdf
       mime = application/pdf
      words = 10071
  sentences = 1053
     flesch = 63
    summary = ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data chromosome and genome structure reconstruction from Hi-C data using Particle Swarm Optimization approach chromosome bin, according to the particle swarm algorithm, and then iterates its position towards a global best This paper presents ParticleChromo3D, a new distance-based algorithm for chromosome 3D structure The structures generated by ParticleChromo3D also shows that the result at swarm size Structures generated by ParticleChromo3D at different swarm size values. obtained by comparing the ParticleChromo3D algorithm's output structure to the simulated dataset's true plot of ParticleChromo3D SCC performance on 500KB GM12878 cell Hi-C data for chromosome 1 to 23. chromosome 3D structure reconstruction algorithms on the GM12878 data set at both the 1MB and 500KB chromosome and genome structures reconstructed from Hi-C data.
      cache = ./cache/10_1101-2021_02_11_430871.pdf
       txt  = ./txt/10_1101-2021_02_11_430871.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_13_429885
     author = Househam, Jacob
      title = A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing
       date = 2021
      pages = 36
  extension = .pdf
       mime = application/pdf
      words = 10584
  sentences = 1257
     flesch = 49
    summary = know tumour purity and the ploidy of a CNA segment, then the VAF mutations mapped A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing.
      cache = ./cache/10_1101-2021_02_13_429885.pdf
       txt  = ./txt/10_1101-2021_02_13_429885.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_12_430830
     author = Gergely, Tibély
      title = Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data
       date = 2021
      pages = 19
  extension = .pdf
       mime = application/pdf
      words = 8181
  sentences = 793
     flesch = 68
    summary = Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data widely available bulk sequencing data where mutations from individual cells are and genomic mutation rate from bulk sequencing data. based on the maximum likelihood estimation of the parameters of a generative model of tumor growth and mutations. human hepatocellular carcinoma sample reveals an elevated per cell division mutation rate and high cell turnover. Due to the limitations of bulk sequencing, which only assays mutation frequencies for a population of cells from each tumor sample and does not The estimation is based on a maximum likelihood fit of the parameters of a birth-death model to the measured mutant and be estimated from readcount data, to separate the effects of the mutation rate We use pre-generated division trees from the ELynx suite at predetermined turnover rate values. Using the turnover rate, we also estimated the number of cell
      cache = ./cache/10_1101-2021_02_12_430830.pdf
       txt  = ./txt/10_1101-2021_02_12_430830.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_12_430739
     author = Malekian, Negin
      title = Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli
       date = 2021
      pages = 13
  extension = .pdf
       mime = application/pdf
      words = 7516
  sentences = 1093
     flesch = 70
    summary = Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli Here, we systematically screen for candidate quinolone resistance-conferring mutations. coli and performed a genome-wide association study (GWAS) correlating over 200,000 mutations against quinolone resistance phenotypes. significant mutations including one located at the active site of the biofilm dispersal genes bdcA and six silent In summary, we demonstrate that GWAS effectively and comprehensively identifies resistance mutations Keywords: E Coli; Quinolone; Antibiotic Resistance; Genome-Wide Association Study (GWAS) direct route to resistance is mutations in the drug targets gyrA and parC. In summary, we aim to show that a bacterial genome-wide association study can effectively and comprehensively identify targets relevant to antibiotic resistance. Based on representative resistance phenotypes, the authors selected 103 isolates for sequencing with Illumina MiSeq, 92 of which are available from coli bdcA may act indirectly on antibiotic resistance.
      cache = ./cache/10_1101-2021_02_12_430739.pdf
       txt  = ./txt/10_1101-2021_02_12_430739.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_12_430764
     author = Ascensión, Alex M.
      title = Triku: a feature selection method based on nearest neighbors for single-cell data
       date = 2021
      pages = 18
  extension = .pdf
       mime = application/pdf
      words = 9518
  sentences = 1135
     flesch = 64
    summary = Triku: a feature selection method based on nearest neighbors for single-cell data Triku is a feature selection method that favours genes defining the main Single-cell RNA sequencing (scRNA-seq) is a powerful technology to study the biological heterogeneity of tissues at the individual cell level, allowing the characterization of new cell populations and cell states–i.e. cell types responding to different scRNA-seq datasets are multidimensional, i.e. the expression profile per cell consists of multiple genes. feature selection method: 1) the ability to recover basic dataset structure (main cell low, meaning that features selected with the different methods yielded clustering solutions that were quite similar to the manually-labeled cell types, although there are We first studied the expression pattern of genes selected by triku and other methods, To evaluate the cluster expression of selected genes in benchmarking datasets, for proteins within the genes selected by different FS methods in the two sets of benchmarking datasets.
      cache = ./cache/10_1101-2021_02_12_430764.pdf
       txt  = ./txt/10_1101-2021_02_12_430764.txt
=== reduce.pl bib ===
         id = 10_1101-2020_01_28_923532
     author = Ahmadi, Saba
      title = The Landscape of Precision Cancer Combination Therapy: A Single-Cell Perspective
       date = 2021
      pages = 46
  extension = .pdf
       mime = application/pdf
      words = 16705
  sentences = 1572
     flesch = 66
    summary = We focus our analysis on genes encoding protein targets that encode receptors on the cell all "modular", including one part that specifically targets the tumor cell via one gene/protein and MadHitter and each patient receives an optimal personalized combination of targeted therapies from a prespecified set (pill bottle). Cohort and Individual Target Set Sizes as Functions of Tumor Killing and Given the single-cell tumor data sets and the ILP optimization framework described above, we filtering as this threshold is decreased), decreases the size of the target cell surface receptor gene heterogeneity of the cancer, number of patients within the data set, size of target gene set, lack of used for filtering the gene set to avoid targeting non-cancerous tissues. the genes in the optimal target set, the expression of that gene in that non-tumor cell exceeds the set of genes which is known to be targetable to cell 𝐶.
      cache = ./cache/10_1101-2020_01_28_923532.pdf
       txt  = ./txt/10_1101-2020_01_28_923532.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_11_430762
     author = Schäffer, Alejandro A.
      title = Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation
       date = 2021
      pages = 28
  extension = .pdf
       mime = application/pdf
      words = 16496
  sentences = 1489
     flesch = 65
    summary = Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation alignments of SSU, LSU and 5S rRNA from all three domains as well as from organelles, along with secondary structure predictions for selected sequences. Ribovore software package for the analysis of SSU rRNA and LSU rRNA sequences 18S SSU rRNA database of 1091 sequences was updated most recently on September 27, 2018 by running version 0.28 of the Ribovore program ribodbmaker on an input set of 579,279 GenBank sequences returned from the eukaryotic SSU rRNA The results of ribotyper and rRNA sensor are combined and each sequence is separated into one of four outcome classes depending on whether it passed or failed each input a set of candidate sequences and a specified rRNA model (e.g. SSU.Bacteria) two blastn databases: one of 1267 bacterial and archaeal 16S SSU rRNA sequences
      cache = ./cache/10_1101-2021_02_11_430762.pdf
       txt  = ./txt/10_1101-2021_02_11_430762.txt
=== reduce.pl bib ===
         id = 10_1101-2020_09_23_308239
     author = Schultz, Bruce T
      title = The COVID-19 PHARMACOME: A method for the rational selection of drug repurposing candidates from multimodal knowledge harmonization
       date = 2021
      pages = 31
  extension = .pdf
       mime = application/pdf
      words = 8797
  sentences = 1318
     flesch = 57
    summary = The COVID-19 PHARMACOME: A method for the rational selection of drug repurposing COVID-19 PHARMACOME, a comprehensive drug-target-mechanism graph generated from a initial version of the COVID-19 PHARMACOME, a comprehensive drug-target-mechanism graph representing COVID-19 pathophysiology mechanisms that includes both drug targets Figure 3: Overlap of compound hits between different drug repurposing screening experiments. space overlap between different COVID-19 drug repurposing screenings. The COVID-19 PHARMACOME associates pathways derived from drug repurposing targets Figure 4 shows the distribution of repurposing drugs in the COVID-19 cause-and-effect graph, overlap analysis allows for the identification of repurposing drugs targeting mechanisms that Virus-response mechanisms are targets for repurposing drugs Figure 5: Visualization of drug repurposing candidates (and their targets) used in combination treatment as our own drug repurposing screening results, we were able to identify mechanisms targeted COVID-19 PHARMACOME, we are now able to link repurposing drugs, their targets and the SARS-CoV-2 protein interaction map reveals targets for drug repurposing.
      cache = ./cache/10_1101-2020_09_23_308239.pdf
       txt  = ./txt/10_1101-2020_09_23_308239.txt
=== reduce.pl bib ===
         id = 10_1101-2020_09_23_310276
     author = Greenfest-Allen, Emily
      title = NIAGADS Alzheimer's GenomicsDB: A resource for exploring Alzheimer's Disease genetic and genomic knowledge
       date = 2021
      pages = 19
  extension = .pdf
       mime = application/pdf
      words = 5987
  sentences = 592
     flesch = 52
    summary = The NIAGADS Alzheimer's Genomics Database (GenomicsDB) is an interactive knowledgebase for Alzheimer's disease (AD) genetics that provides access to GWAS summary statistics datasets The website makes available >70 genome-wide summary statistics datasets from GWAS and efficient real-time data analysis and variant or gene report generation. Gene reports provide summaries of co-located ADRD risk-associated variants and have pages linking summary statistics to variant and gene annotations, this resource makes these summary statistics available for browsing (on dataset, gene, and variant reports and as genome NIAGADS GenomicsDB variant reports and a track is available on the genome browser. The NIAGADS GenomicsDB includes allele frequency data from 1000 Genomes (phase 3, version visualizations for summarizing search results and annotations in gene and variant reports. compare NIAGADS GWAS summary statistics tracks to each other, against annotated gene or A detailed report is provided for each of the GWAS summary statistics and ADSP meta-analysis 
      cache = ./cache/10_1101-2020_09_23_310276.pdf
       txt  = ./txt/10_1101-2020_09_23_310276.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_11_430806
     author = Badaczewska-Dawid, Aleksandra
      title = BIAPSS - BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences
       date = 2021
      pages = 3
  extension = .pdf
       mime = application/pdf
      words = 2698
  sentences = 301
     flesch = 53
    summary = BIAPSS BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences web platform named BIAPSS (BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences) which offers the users interactive data analytic tools for facilitating the discovery of statistically significant sequence signals for proteins with Phase-Separating protein Sequences. The objective of BIAPSS is to enable a rapid and on-the-fly deep statistical analysis of LLPS-driver proteins using the pool of sequences with The comparison to benchmarks of various protein groups enables statistical inference of specific phase-separating affinities. Furthermore, the residue-resolution biophysical regularities inferred from BIAPSS will help not only to accurately identify regions prone to phase separation but also to design sequence modifications targeting various biomedical applications. for comprehensive sequence-based analysis of LLPS proteins. the driving forces for phase separation of prion-like RNA binding proteins. disordered protein regions encode a driving force for liquid-liquid phase separation? of proteins driving liquid-liquid phase separation.
      cache = ./cache/10_1101-2021_02_11_430806.pdf
       txt  = ./txt/10_1101-2021_02_11_430806.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_11_430695
     author = Gordon-Rodriguez, Elliott
      title = Learning Sparse Log-Ratios for High-Throughput Sequencing Data
       date = 2021
      pages = 12
  extension = .pdf
       mime = application/pdf
      words = 7973
  sentences = 817
     flesch = 60
    summary = Log-ratios are an important class of features for analyzing high-throughput sequencing (HTS) metagenomic data for HTS data, and more generally, high-dimensional CoDa. Unlike existing methods, CoDaCoRe is simultaneously scalable, interpretable, sparse, and accurate. unlabelled datasets, {xi}ni=1, as a method for identi Learning Sparse Log-Ratios for High-Throughput Sequencing Data CoDaCoRe variable selection for the first (most explanatory) log-ratio on the Crohn disease data (Rivera-Pinto et al., 2018). more generally, in the field of CoDa.
      cache = ./cache/10_1101-2021_02_11_430695.pdf
       txt  = ./txt/10_1101-2021_02_11_430695.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_12_430979
     author = Da Silva, Kévin
      title = StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs
       date = 2021
      pages = 20
  extension = .pdf
       mime = application/pdf
      words = 10624
  sentences = 992
     flesch = 66
    summary = StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs results show that StrainFLAIR was able to distinguish and estimate the abundances of close strains, as approaches to handle multiple similar genomes as with strains use gene clustering and then select the StrainFLAIR assigns and estimates species and strain abundances of a bacterial metagenomic sample graph, called the "node abundance", is computed, first focusing on unique mapped reads (first step). Strain-level abundances are then obtained by exploiting the specific genes of each reference genome from the reference variation graph thus simulating a new strain to be identified and quantified. strains from a sequenced sample, mapped onto this graph. Reference strains relative abundances expected and computed by StrainFLAIR or
      cache = ./cache/10_1101-2021_02_12_430979.pdf
       txt  = ./txt/10_1101-2021_02_12_430979.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_12_430989
     author = Sofer, Tamar
      title = Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies
       date = 2021
      pages = 27
  extension = .pdf
       mime = application/pdf
      words = 8136
  sentences = 626
     flesch = 48
    summary = Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies as well as linear regression-based analyses for studying the association of continuous exposures generation of empirical null distribution of association p-values, and we apply the pipeline to Many studies of phenotypes associated with gene expression from RNA-seq consist of small Residual permutation approach for simulations and for empirical p-value computation covariates, and outcome distributions; and (b) their relationships, aside from the exposure-outcome association, are the same as in the real data, we used a residual permutation approach. association studies applied to residual permutations were included to compute empirical p- approach to study the distribution of p-values under the null of no association between the phenotypes and RNA-seq, and used this approach to further study power, and to compute approaches for transcriptome-wide analysis of RNA-seq in population-based studies, including more comprehensive study of statistical permutation approaches for RNA-seq association
      cache = ./cache/10_1101-2021_02_12_430989.pdf
       txt  = ./txt/10_1101-2021_02_12_430989.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_12_430923
     author = Modi, Vivek
      title = Kincore: a web resource for structural classification of protein kinases and their inhibitors
       date = 2021
      pages = 18
  extension = .pdf
       mime = application/pdf
      words = 7913
  sentences = 666
     flesch = 62
    summary = Kincore: a web resource for structural classification of protein kinases and their inhibitors result, among the DFGin structures, we distinguished between the catalytically active kinase conformation pages for kinase phylogenetic groups, genes, conformational labels, PDBids, ligands and ligand types. options to download data – database tables as a tab separated files; the kinase structures as PyMOL Kincore provides conformational assignments and ligand type labels to protein kinase structures from Figure 1: Representative protein kinase structure (3ETA_A) displaying the residues used to define inhibitor The distribution of different ligand types across kinase conformations is provided in Table 1. Table 1: Distribution of ligand types across protein kinase conformations (Number of chains). including conformational and ligand type labels and C-helix position, kinase family, gene name, Uniprot provides the number of kinase chains in the group across different conformations with their Database table provides the list of all the PDB chains with conformational labels and ligand 
      cache = ./cache/10_1101-2021_02_12_430923.pdf
       txt  = ./txt/10_1101-2021_02_12_430923.txt
=== reduce.pl bib ===
         id = 10_1101-727867
     author = Tangherloni, Andrea
      title = scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data
       date = 2021
      pages = 28
  extension = .pdf
       mime = application/pdf
      words = 15281
  sentences = 2865
     flesch = 72
    summary = scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data This computational tool allows for coupling low-dimensional probabilistic representation of gene expression data with the downstream analysis to consider the Finally, the currently available AEs cannot be directly exploited to obtain the latent space or to generate synthetic cells. to show the cells in this embedded space or as a starting point for other dimensionality reduction approaches (e.g., t-SNE and UMAP) as well as downstream analyses Non-linear approaches for dimensionality reduction can be effectively used to capture the non-linearities among the gene interactions that may exist in the high-dimensional expression space of scRNA-Seq data [16]. be effectively applied to analyse disparate types of single-cell data from different flexible method developed to cluster single-cell data; (ii) a centroid is calculated batch-effect correction methods for single-cell rna sequencing data. Wang, D., Gu, J.: VASC: dimension reduction and visualization of single-cell RNA-seq data by deep
      cache = ./cache/10_1101-727867.pdf
       txt  = ./txt/10_1101-727867.txt
=== reduce.pl bib ===
         id = 10_1101-2020_12_24_424317
     author = Muazzam, Fariha
      title = Multi-class Cancer Classification and Biomarker Identification using Deep Learning
       date = 2021
      pages = 12
  extension = .pdf
       mime = application/pdf
      words = 4252
  sentences = 426
     flesch = 57
    summary = classification, feature extraction and relevant gene identification through deep learning methods for 12 This research picks up from detection of different types of cancer RNA-Seq expressions using deep neural classification of gene expression profiles for different kinds of cancers. Hence, the effectiveness of deep learning models for feature extraction and relevant gene identification is performed revealing substantial results and they produced five high-ranked gene sets and reduced feature This study was aimed at classifying 12 types of cancer and identifying relevant genes and the results show were able to identify cancer-relevant pathways and genes for the sets, that different experiments generated, A deep learning approach for cancer detection and relevant gene Tumor gene expression data classification via sample expansion-based deep learning. Identification of a multi-cancer gene expression Multi-class Cancer Classification and Biomarker Identification using Deep Learning
      cache = ./cache/10_1101-2020_12_24_424317.pdf
       txt  = ./txt/10_1101-2020_12_24_424317.txt
=== reduce.pl bib ===
         id = 10_1101-2020_10_08_327718
     author = Jambor, Helena
      title = Creating Clear and Informative Image-based Figures for Scientific Publications
       date = 2021
      pages = 36
  extension = .pdf
       mime = application/pdf
      words = 12824
  sentences = 1189
     flesch = 56
    summary = journals in three fields; plant sciences, cell biology and physiology (n=580 papers). figures were uncommon (physiology 16%, cell biology 12%, plant sciences 2%). among papers published in top journals in plant sciences, cell biology and physiology. contained images (plant science: 68%, cell biology: 72%, physiology: 55%). in physiology (49%) and cell biology (55%), and 28% of plant science papers provided and 29% of plant sciences papers contained no scale information on any image. Some publications use insets to show the same image at two different scales (cell Figure 1: Image types and reporting of scale information and insets physiology and plant science papers contained some images that were inaccessible to B: Most papers explain colors in image-based figures, however, explanations are less Figure 4: Using scale bars to annotate image size Creating clear and informative image-based figures for scientific publications.
      cache = ./cache/10_1101-2020_10_08_327718.pdf
       txt  = ./txt/10_1101-2020_10_08_327718.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_09_430550
     author = Song, Dongyuan
      title = scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling
       date = 2021
      pages = 37
  extension = .pdf
       mime = application/pdf
      words = 13512
  sentences = 1548
     flesch = 64
    summary = (scPNMF) method to select informative genes from scRNA-seq data in an unsupervised way. Therefore, for scRNA-seq data analysis, informative gene selection Besides scRNA-seq data analysis, informative gene selection is also crucial for designing number and a scRNA-seq dataset, scPNMF selects informative genes based on its weight matrix; First, the informative genes selected by scPNMF lead to the most accurate cell clustering. the informative genes and weight matrix of scPNMF lead to the best cell type prediction accuracy Figure 3: Benchmarking scPNMF against 11 informative gene selection methods on seven scRNA-seq (b) UMAP visualization of cells in the Zheng4 dataset based on 100 informative genes selected by We benchmark scPNMF against the 11 gene selection methods in terms of cell type prediction We propose scPNMF, an unsupervised gene selection and data projection method for scRNA-seq For cell type prediction, we project every targeted gene profiling dataset and its scRNA-seq
      cache = ./cache/10_1101-2021_02_09_430550.pdf
       txt  = ./txt/10_1101-2021_02_09_430550.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_10_430656
     author = Zakeri, Mohsen
      title = A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing
       date = 2021
      pages = 7
  extension = .pdf
       mime = application/pdf
      words = 6557
  sentences = 568
     flesch = 64
    summary = A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing benchmark comparing the kallisto-bustools pipeline (2) for single-cell demonstrate that, when configured to match the computational complexity of kallisto-bustools as closely as possible, alevin-fry processes Alevin-fry (3) is a new pipeline for single-cell RNA-seq benchmarking STARsolo (9), kallisto-bustools (2) and alevin-fry (3), out new tools like alevin-fry for the pre-processing of single-cell data, (1), we have now created a simple-to-follow tutorial for speed-optimized single-cell pre-processing using alevin-fry (https:// by Booeshaghi and Pachter (1) change when a like-for-like comparison between alevin-fry and kallisto-bustools is carried out, we The time and memory used by the relevant steps of the alevin-fry and kallisto-bustools pipelines for pre-processing the 20 diverse tagged-end single-cell RNA-seq datasets used in (1). A comparison of the resulting count matrices obtained from alevin-fry and kallisto-bustools, as run in this manuscript, for the pbmc_10k_v3 dataset. peak memory than alevin-fry, with the kallisto-bustools pipeline using
      cache = ./cache/10_1101-2021_02_10_430656.pdf
       txt  = ./txt/10_1101-2021_02_10_430656.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_10_430619
     author = Schutz, Sacha
      title = Cutevariant: a GUI-based desktop application to explore genetics variations
       date = 2021
      pages = 8
  extension = .pdf
       mime = application/pdf
      words = 4932
  sentences = 632
     flesch = 66
    summary = Cutevariant: a GUI-based desktop application to explore genetics variations Cutevariant is a user-friendly GUI based desktop application for genomic research designed to search for variations in DNA samples collected in annotated files and encoded in the Variant Calling Format. application imports data into a local relational database wherefrom complex filter-queries can be built either Key words: genomics, DNA variant, desktop application, Domain Specific Language, Graphic User Interface applications import the data from VCF files into an indexed Cutevariant imports data from VCF files into a normalized Fig. 2: The Cutevariant main view showing the variants list sub-window (middle), different controllers sub-windows but not all are Just like Variant Tools, Cutevariant supports operations Features Cutevariant BrowseVCF VCF-Miner VCF-Explorer VCF-Server VCF-Filters GEMINI Variant Tools SnpSift Comparaison of time performance between cutevariant and VCF-miner for importation and query execution. 3. Pablo Cingolani, Adrian Platts, Le Lily Wang, Melissa VCF-Miner: GUI-based application for mining variants
      cache = ./cache/10_1101-2021_02_10_430619.pdf
       txt  = ./txt/10_1101-2021_02_10_430619.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_11_430789
     author = Tyagin, Ilya
      title = Accelerating COVID-19 research with graph mining and transformer-based learning
       date = 2021
      pages = 9
  extension = .pdf
       mime = application/pdf
      words = 9408
  sentences = 807
     flesch = 58
    summary = Accelerating COVID-19 research with graph mining and transformer-based learning develop text mining techniques that can help the science community answer high-priority scientific questions related to COVID-19. is currently customized and available in the open domain to massively process COVID-19 related queries. Both systems are the next generation of the AGATHA knowledge network mining transformer model [37]. (1) Most of the existing HG systems are domain-specific (e.g., gene-disease interactions) that is usually expressed in limiting the processed information (e.g., significant filtering vocabulary and papers a trained deep bi-LSTM model for extracting predicates from unstructured text. For instance, the node representing the entity "COVID-19" is connected to every sentence and predicate that The prior AGATHA semantic network only includes UMLS terms that appear in SemMedDB predicates [18] which is a major limitation. obtain embeddings per node in the semantic graph, we train AGATHA system ranking model.
      cache = ./cache/10_1101-2021_02_11_430789.pdf
       txt  = ./txt/10_1101-2021_02_11_430789.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_10_430512
     author = Kim, Catherine
      title = Prediction of adverse drug reactions associated with drug-drug interactions using hierarchical classification
       date = 2021
      pages = 41
  extension = .pdf
       mime = application/pdf
      words = 11859
  sentences = 1137
     flesch = 56
    summary = into DDIs. In this study, a hierarchical machine learning model was created to predict DDI-associated ADRs and pharmacological insight thereof for any drug pair. drugs' chemical structures as inputs to predict their target, enzyme, and transporter (TET) Development of RFCs for Prediction of Target, Enzyme, and Transporter Profiles of Drugs Development of a Model for Prediction of DDI-associated ADRs from TET Profiles of Drugs ADR prediction from Target, Enzyme, and Transporter Profiles of Drug Pairs To predict ADRs of a drug pair from its TET profiles, Random Forest Classifier (RFC), Application of the SVM model for DDI-associated ADRs Involving Three Major Drugs through predicted PRR changes of drug pairs upon removal of each of the targets, enzymes, and changes of drug pairs were predicted by the model upon removal of each of the targets, enzymes, Target, enzyme, and transporter (TET) profiles of atorvastatin and concomitant drugs,
      cache = ./cache/10_1101-2021_02_10_430512.pdf
       txt  = ./txt/10_1101-2021_02_10_430512.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_10_430705
     author = Stassen, Shobana V.
      title = VIA: Generalized and scalable trajectory inference in single-cell omics data
       date = 2021
      pages = 24
  extension = .pdf
       mime = application/pdf
      words = 13590
  sentences = 1383
     flesch = 53
    summary = 1 VIA: Generalized and scalable trajectory inference in single-cell omics data 35 strategy to compute pseudotime, and reconstruct cell lineages based on lazy-teleporting random walks Step 1: Single-cell level graph is clustered such that each node 50 user defined start cell) is first computed by the expected hitting time for a lazy-teleporting random walk along an 57 network topology and single-cell level pseudotime/lineage probability properties onto an embedding using GAMs, as The cell fates and their lineage pathways are then computed by a two-stage probabilistic method, 94 graph-traversal allows it to infer cell fates when the underlying data spans combinations of multifurcating 201 detected cell fates annotated (o) lineage pathway and gene-pseudotime trend shown for the CD41 Megakaryocytic 259 Figure 3 VIA infers trajectories in single-cell multi-omic and image datasets (a) Major lineages of human Single cells are represented by graph nodes that are connected based on
      cache = ./cache/10_1101-2021_02_10_430705.pdf
       txt  = ./txt/10_1101-2021_02_10_430705.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_10_430606
     author = Wei, Zheng
      title = NeuronMotif: Deciphering transcriptional cis-regulatory codes from deep neural networks
       date = 2021
      pages = 31
  extension = .pdf
       mime = application/pdf
      words = 12013
  sentences = 1107
     flesch = 63
    summary = Each point is a decoupled motif generate by a sample set of sequence. Only the max activation value of the decoupled motifs in Fig. 3b are significantly higher than the decoupled motifs of other neurons in layer 3 of Basset-3 model. discovered (q-value < 0.001) from the neuron in convolutional output layer of Basset, BD-5 and BD-10 model. c, The number of motif discovered (q-value < 0.01) from the neuron in layer 3 of Basset model using different sub-patterns in the input feature map of the max pooling layer to split the sequences set of which are DNA-sequence based DCNN models with 3 general convolutional layers for stacking sequences of different synonymous motifs with the maximum activation value In summary, we presented NeuronMotif as an effective algorithm to reveal the cis-regulatory motif grammar learned by DCNN model that use DNA sequence to annotate sequences indicate more synonymous motif mixture in this DCNN model.
      cache = ./cache/10_1101-2021_02_10_430606.pdf
       txt  = ./txt/10_1101-2021_02_10_430606.txt
=== reduce.pl bib ===
         id = 10_1101-698605
     author = Sarantopoulou, Dimitra
      title = Comparative evaluation of full-length isoform quantification from RNA-Seq
       date = 2021
      pages = 37
  extension = .pdf
       mime = application/pdf
      words = 12853
  sentences = 1332
     flesch = 55
    summary = Comparative evaluation of full-length isoform quantification from RNA-Seq Full-length isoform quantification from RNA-Seq is a key goal in transcriptomics analyses benchmarking, isoform quantification, simulated data, pseudo-alignment, RNA-Seq, short Given the difficulty in full-length isoform quantification, many RNA-Seq studies simply analysis performed on the known true isoform quantifications of the simulated data to the For the simulated data we started with 11 real RNA-Seq samples: six liver and six the isoform expression level using idealized and realistic simulated data, with full and true counts), for the set of expressed isoforms in sample 1 in C) idealized and D) realistic data. Method effect on differential expression analysis, using realistic data. RSEM is a gene/isoform abundance tool for RNA-Seq data which uses a generative model S1 Fig. Method effect on full-length isoform quantification using simulated data.
      cache = ./cache/10_1101-698605.pdf
       txt  = ./txt/10_1101-698605.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_10_430563
     author = Bandrowski, Anita
      title = SPARC Data Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data
       date = 2021
      pages = 16
  extension = .pdf
       mime = application/pdf
      words = 10901
  sentences = 1026
     flesch = 48
    summary = investigators across the SPARC consortium that provide key details about organ-specific circuitry, including structural (BIDS), the SDS has been designed to capture the large variety of data generated by SPARC investigators who are description of the SPARC curation process and the automated tools for complying with the SDS, including the SDS validator and Software to Organize Data Automatically (SODA) for SPARC. required to organize their data files and metadata organized according to the SPARC Data Structure data according to the SPARC Dataset Structure. is the preferred file format for tabular data in SPARC, the Data files are organized into 3 different top-level folders, The organization structure of the files and folders for a SPARC dataset. https://github.com/SciCrunch/sparc-curation/releases/tag/dataset-template-1.2.3 investigators include folders that organize data along a from these subjects, data files are organized within fields, the curation team developed a SPARC Dataset files/folders, and share datasets with the SPARC
      cache = ./cache/10_1101-2021_02_10_430563.pdf
       txt  = ./txt/10_1101-2021_02_10_430563.txt
=== reduce.pl bib ===
         id = 10_1101-2020_11_17_386649
     author = Danciu, Daniel
      title = Topology-based Sparsification of Graph Annotations
       date = 2021
      pages = 15
  extension = .pdf
       mime = application/pdf
      words = 8205
  sentences = 774
     flesch = 67
    summary = Experiments on 10,000 RNA-seq datasets show that RowDiff combined with MultiBRWT results in a 30% reduction in annotation footprint over Mantis-MST, the previously known most a binary matrix, where the k-mer set indexes the rows and each annotation label specifies a column. Starting from any vertex in the de Bruijn graph, Algorithm 1 defines a traversal leading to an anchor Each row in a RowDiff-transformed annotation matrix has the same or fewer set bits than A naı̈ve implementation of the RowDiff construction would be to load the matrix A in memory, and gradually replace its rows with their sparsified counterpart, while traversing the graph. We now note that, when querying annotations for paths in the graph, or sets of rows corresponding to vertices We constructed annotated de Bruijn graphs from the RNA-Seq data set in the same We now compare the representation size for RowDiff and other state-of-the-art graph annotation compression methods.
      cache = ./cache/10_1101-2020_11_17_386649.pdf
       txt  = ./txt/10_1101-2020_11_17_386649.txt
=== reduce.pl bib ===
         id = 10_1101-2020_05_15_090266
     author = Zhang, R.
      title = SpacePHARER: Sensitive identification of phages from CRISPR spacers in prokaryotic hosts
       date = 2021
      pages = 6
  extension = .pdf
       mime = application/pdf
      words = 2191
  sentences = 283
     flesch = 64
    summary = Summary: SpacePHARER (CRISPR Spacer Phage-Host Pair Finder) is a sensitive and fast tool for de novo prediction of phage-host relationships via identifying phage genomes that match CRISPR spacers in genomic or metagenomic data. SpacePHARER gains sensitivity by comparing spacers and phages at the protein level, optimizing its scores for matching SpacePHARER by searching a comprehensive spacer list against all complete phage genomes. methods compare individual CRISPR spacers with phage To increase sensitivity, (1) we compare protein coding sequences because phage genomes are mostly coding, and, (0) Preprocess input: scan the phage genome and CRISPR spacers in six ORFs q of CRISPR spacers extracted from one prokaryotic genome, and each target set T comprises the putative protein sequences t from a single phage. The performance of SpacePHARER was evaluated on the spacer test set against a target database predicted the correct host for more phages than BLASTN BLASTN in detecting phage-host pairs, due to searching
      cache = ./cache/10_1101-2020_05_15_090266.pdf
       txt  = ./txt/10_1101-2020_05_15_090266.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_01_429246
     author = Zheng, Hongyu
      title = Sequence-specific minimizers via polar sets
       date = 2021
      pages = 24
  extension = .pdf
       mime = application/pdf
      words = 15440
  sentences = 1407
     flesch = 71
    summary = minimizers focus on sampling fewer k-mers on a random sequence and use universal hitting sets (sets suggests, a UHS is a set of k-mers that "hits" every w-long window of every possible sequence (hence the the elements of the polar sets are in the sequence: the higher the energy, the more spread apart the k-mers have densities upper bounded by |U|/σk, because only k-mers from the universal hitting set can be selected. Section 2.2 gives a formal definition of the link energy of a polar set and Theorem 1 gives upper and lower bounds using this link energy for the density of a minimizer compatible with a polar set. form a link, which in turn is the number of k-mer pairs in the polar set that are exactly w bases away on S. A context is charged if the minimizer selects a different k-mer in the first window than in the second
      cache = ./cache/10_1101-2021_02_01_429246.pdf
       txt  = ./txt/10_1101-2021_02_01_429246.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_10_430623
     author = Aberasturi, Dillon
      title = “Single-subject studies”-derived analyses unveil altered biomechanisms between very small cohorts: implications for rare diseases
       date = 2021
      pages = 9
  extension = .pdf
       mime = application/pdf
      words = 9478
  sentences = 748
     flesch = 58
    summary = published S3-type N-of-1-pathways MixEnrich to two paired samples (e.g., diseased vs unaffected tissues) for determining patient-specific enriched genes sets: Odds Ratios (S3-OR) and S3-variance using these models to derive effect sizes and statistical significance in singlesubject studies of transcriptomes, these samples are isogenic or quasi-isogenic, and thus do not necessarily generalize to a group of subjects (cohort-level signal). The novel bioinformatic method identifies meaningful biomechanism differences between very small cohorts by using single-subject-study-derived effect sizes for gene sets. (B) For the generalized linear model-based analyses, we applied a different filtering process to the raw data where we eliminated all the transcripts with 0 counts for each subject and then calculated the coefficient 2.3 Description of the Generalized Linear Models and application of Inter-N-of-1 methods for small cohort comparison and their evaluation in the Breast Cancer Data the analysis of subsets of the TCGA Breast Cancer data, genes were declared differentially expressed if their abs(log2FC) > log2(1.2) and their FDR-adjusted p-value < 
      cache = ./cache/10_1101-2021_02_10_430623.pdf
       txt  = ./txt/10_1101-2021_02_10_430623.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_09_430405
     author = Quazi, Sameer
      title = In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD
       date = 2021
      pages = 23
  extension = .pdf
       mime = application/pdf
      words = 5941
  sentences = 1038
     flesch = 64
    summary = In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD including structure-based drug-like compounds screening from online databases, molecular The final small molecules of drug-like compounds would have more effective and selected for the molecular docking with FGI-103 antiviral drug-using AutoDock 4.2 software. After that, FGI-103 was set and screen other drug-like compounds from PubChem databases. The finally selected drug-like compounds were docked with the P1 site of VP35 of based on ap1 site for ligand in every dock for VP35 MARV utilizing a grid chart of 50 × 50 × 50 The ADMET properties of finally selected drug-like compounds were checked to utilize 2D molecules structure of selected drug-like compounds (A) represents the 2D The molecule structure of three drug-like compounds is shown in Figure 6. "In-Silico Structural and Molecular Docking-Based Drug Discovery
      cache = ./cache/10_1101-2021_02_09_430405.pdf
       txt  = ./txt/10_1101-2021_02_09_430405.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_10_430367
     author = Chen, Meili
      title = Genome Warehouse: A Public Repository Housing Genome-scale Data
       date = 2021
      pages = 18
  extension = .pdf
       mime = application/pdf
      words = 4875
  sentences = 656
     flesch = 66
    summary = Running title: Chen M et al / Genome Assembly Data Repository 21 Genomics Data Center (NGDC), part of the China National Center for Bioinformation 40 archive high-quality genome sequences and annotations, GWH is equipped with a 46 Collectively, GWH serves as an important resource for genome-scale data 51 https://bigd.big.ac.cn/) [13], the aim of GWH is to accept data submissions worldwide 78 GWH is a centralized resource housing genome-scale data, with the purpose to 105 GWH not only accepts genome assembly associated data through an on-line 111 GWH will assign a unique accession number to the submitted genome assembly upon 149 GWH provides data visualization for both genome 163 Collectively, GWH is a user-friendly portal for genome data submission, release, and 209 Database resources of the National Genomics Data 302 Genome assembly accession number is prefixed with "GWH", followed by four 334
      cache = ./cache/10_1101-2021_02_10_430367.pdf
       txt  = ./txt/10_1101-2021_02_10_430367.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_09_430536
     author = Lin, Cui-Xiang
      title = Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes
       date = 2021
      pages = 47
  extension = .pdf
       mime = application/pdf
      words = 20656
  sentences = 4864
     flesch = 79
    summary = Genome-wide prediction and integrative functional characterization of Alzheimer's disease-associated genes example, a module-trait network approach was proposed and applied to identify gene 63 functional enrichment-based approach to identify negative genes that are not likely 94 associated genes through an optimal selection of networks and machine learning 98 FGN, and prediction of AD-associated genes using machine learning models (Fig. 1). addition, we tested their enrichment in three AD-related gene sets associated with 122 The top-ranked genes are enriched in AD-associated functions and phenotypes 154 These results provide additional evidence that our predicted genes are associated with 194 The top-ranked genes are associated with AD based on miRNA-target networks 227 We investigated whether top-ranked genes were functionally related to AD-associated 229 We tested whether the top-ranked k genes were more likely to interact with AD-associated 576 related to AD-associated genes or miRNAs based on miRNA-target interaction networks.
      cache = ./cache/10_1101-2021_02_09_430536.pdf
       txt  = ./txt/10_1101-2021_02_09_430536.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_09_430363
     author = Bayer, Johanna M. M.
      title = Accommodating site variation in neuroimaging data using hierarchical and Bayesian models
       date = 2021
      pages = 20
  extension = .pdf
       mime = application/pdf
      words = 13439
  sentences = 1891
     flesch = 70
    summary = Accommodating site variation in neuroimaging data using hierarchical and Bayesian models The potential of normative modeling to make individualized predictions has led to structural neuroimaging results that go beyond the case-control approach. in a similar way for multi-site modeling in a pooled neuroimaging data set, which contained 7499 participants that org/abide/) data set to compare a non-linear, Gaussian version of the model, to a linear hierarchical Bayesian version and mathematical description of our approach to include site as predictor in a normative hierarchical Bayesian model. With the aim to create reliable normative models in multi-site neuroimaging data, we developed and compared two model is also able to capture non-linear effects between age and thickness of the cortical region ("Hierarchical Bayesian Gaussian Process term, which allows to model non-linear association between age and cortical thickness measures. The only models that perform better for most regions than the mean of the training data set are the Hierarchical Bayesian
      cache = ./cache/10_1101-2021_02_09_430363.pdf
       txt  = ./txt/10_1101-2021_02_09_430363.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_09_430460
     author = Banerjee, Shayantan
      title = Sequence neighborhoods enable reliable prediction of pathogenic mutations in cancer genomes
       date = 2021
      pages = 39
  extension = .pdf
       mime = application/pdf
      words = 15659
  sentences = 1731
     flesch = 40
    summary = experimentally validated cancer mutation data in this study, we explored various string-based evolutionary features resulted in the development of a pan-cancer mutation effect prediction Distinguishing between driver and passenger mutations from sequenced cancer genomes is a Recent studies have identified specific signatures or patterns of mutations in different cancer than passenger mutations and built probabilistic models to identify driver genes that had this study, missense mutations from 58 genes that were pan-cancer-based were combined from We used the same datasets to judge our model's ability to predict rare driver mutations based Driver and Passenger Mutations' Features Used to Train NBDriver are Significantly Although our method's focus was to identify missense driver mutations from sequenced cancer surrounding driver and passenger mutations obtained from sequenced cancer genomes. computational prediction of driver missense mutations," Cancer Res., vol. functionally validated cancer-related missense mutations," Genome Biology, vol. Figure 7: Differences in the distribution of features between driver and passenger mutations 
      cache = ./cache/10_1101-2021_02_09_430460.pdf
       txt  = ./txt/10_1101-2021_02_09_430460.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_08_430070
     author = Zhang, Yao-zhong
      title = On the application of BERT models for nanopore methylation detection
       date = 2021
      pages = 7
  extension = .pdf
       mime = application/pdf
      words = 5183
  sentences = 586
     flesch = 60
    summary = On the application of BERT models for nanopore methylation detection with deep learning models, have achieved significant performance improvements on nanopore methylation recurrent patterns of positional-signal-shift in the context window surrounding target 5-methylcytosine that the refined BERT model can achieve competitive or even better results than the state-of-the-art biRNN of datasets from the different research groups, BERT models demonstrate a good generalization Fig. 1: Basic BERT's and refined BERT's model structure used for methylation detection. a refined BERT model to take account of signal-shift patterns in the proposed refined BERT model achieves a competitive or even better result explore applying the BERT model for the nanopore methylation detection 2.2 Applying BERT models for nanopore methylation For the cross-sample evaluation, we train models on one dataset and test a BERT model to pay more attention to center positions. In-sample evaluation of different deep learning models on 5mC datasets.
      cache = ./cache/10_1101-2021_02_08_430070.pdf
       txt  = ./txt/10_1101-2021_02_08_430070.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_09_430036
     author = Goldsborough, Thibaut
      title = A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea
       date = 2021
      pages = 7
  extension = .pdf
       mime = application/pdf
      words = 3128
  sentences = 477
     flesch = 70
    summary = A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea is a carnivorous plant that grows on nitrogen-poor waterlogged sandstone aurea's genome, CDS and non-coding DNA 2) Determination of transcriptomic nitrogen content and codon usage bias associated with higher nitrogen content tRNAs (among codons that are coding for the same amino a considerably lower number of nitrogen atoms in its genome than the two other plant species. has higher nitrogen counts per molecular unit in genomic DNA, CDS, Non-Coding DNA, protein, aurea has a higher nitrogen usage in its DNA, RNA and proteins Figure 2: Average number of nitrogen atoms per molecular unit in genomic DNA, CDS, Non-Coding DNA, aurea had lower nitrogen content in tRNA sequences but not in other Figure 3: Bar graph representing the codon usage bias and tRNA nitrogen content in G.
      cache = ./cache/10_1101-2021_02_09_430036.pdf
       txt  = ./txt/10_1101-2021_02_09_430036.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_08_428881
     author = Lu, Yang Young
      title = ACE: Explaining cluster from an adversarial perspective
       date = 2021
      pages = 12
  extension = .pdf
       mime = application/pdf
      words = 7909
  sentences = 790
     flesch = 66
    summary = A common workflow in single-cell RNA-seq analysis is to project the data to a latent space, cluster the cells in that space, and identify sets of marker genes that explain the differences among the nonlinear embedding model which maps the gene expression to the low-dimensional representation where the groups A notable feature of ACE's approach is that, by identifying genes jointly, the method moves away from the notion Input: gene expression matrix Deep autoencoder learns low-dimensional representation Embedding clustering Clustering is neuralized and concatenated with the encoder Differentiation analysis by ACE Output: gene relevance ACE takes as input a single-cell gene expression matrix and learns a low-dimensional representation for each Next, a neuralized version of the k-means algorithm is applied to the learned representation to identify cell groups. input gene expression profile that lead the neuralized clustering model to alter the assignment from one group to the other.
      cache = ./cache/10_1101-2021_02_08_428881.pdf
       txt  = ./txt/10_1101-2021_02_08_428881.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_08_430343
     author = Gibbs, David L
      title = Patient-specific cell communication networks associate with disease progression in cancer
       date = 2021
      pages = 29
  extension = .pdf
       mime = application/pdf
      words = 11335
  sentences = 1445
     flesch = 58
    summary = tumor microenvironment, the method identified ligands, receptors and cells meeting certain criteria of 56 9,234 samples in The Cancer Genome Atlas (TCGA), starting from a network of 64 cell types and 1,894 62 Data sources including TCGA and cell-sorted gene expression, bulk tumor expression, cell type scores, 78 ligands and receptors for each of the 64 cell types in xCell, using the source gene expression data. With this procedure, a network scaffold is induced, where cells produce ligands that bind to receptors on 113 (PFI) and tumor stage for each sample, a matrix of patient-specific edge weights was constructed 206 number of high weight edges in each tumor type did not associate with the number of samples, as might 254 in the tumor stage contrast, a majority of ligand-producing cells include GMP cells, Osteoblasts, MSC 283 In the PFI results, Th1 cells appeared in 13 high scoring edges in SKCM, all with 394 
      cache = ./cache/10_1101-2021_02_08_430343.pdf
       txt  = ./txt/10_1101-2021_02_08_430343.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_08_430275
     author = Zhang, Jianbo
      title = Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes
       date = 2021
      pages = 6
  extension = .pdf
       mime = application/pdf
      words = 6404
  sentences = 694
     flesch = 68
    summary = Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes identified using BSA-Seq, a technology in which next-generation sequencing (NGS) is applied to bulked segregant analysis (BSA). recently developed the significant structural variant method for BSA-Seq data analysis that exhibits higher detection power than standard to analyze BSA-Seq data in which genome sequences of one parent served as the reference sequences in genotype calling, and thus We analyzed a public BSA-Seq dataset using our modified method and the standard allele frequency and Gmethod allows the detection of such associations without sequencing the parental genomes, leading to further lower the the BSA-Seq data with the genome sequences of both the parents 101 when the parental genome sequences are used to aid BSA-Seq data 193 The allele frequency method: The ΔAF value of each SNP in 267 BSA-Seq data analysis using the genome sequences of both the parents and the bulks. BSA-Seq data analysis using only the bulk genome sequences.
      cache = ./cache/10_1101-2021_02_08_430275.pdf
       txt  = ./txt/10_1101-2021_02_08_430275.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_08_430270
     author = Gerard, David
      title = Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty
       date = 2021
      pages = 22
  extension = .pdf
       mime = application/pdf
      words = 7219
  sentences = 1582
     flesch = 69
    summary = Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty Keywords and phrases: attenuation bias, genotype likelihood, linkage disequilibrium, polyploidy, reliability ratio. Let XiA and XiB be the posterior means at loci A and B for individual Equations (5)–(7) take the naive estimators most researchers use in practice (the sample covariance/correlation of posterior means) and inflate these by a multiplicative effect. Gerard and Ferrão, 2019] to obtain the posterior moments for each individual's genotype at each SNP reliability ratios of most SNPs only increase their correlation estimates by less than 10%. To evaluate the LD estimates of high reliability ratio SNPs, we calculated the MLEs for ρ2 applied to simple linear regression with an additive effects model (where the SNP effect is proportional to the dosage), result in the standard ordinary least squares estimates when using the extreme reliability ratio of PotVar0080327, the genotype-error adjusted correlation estimate is -1.
      cache = ./cache/10_1101-2021_02_08_430270.pdf
       txt  = ./txt/10_1101-2021_02_08_430270.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_10_430604
     author = Youngblut, Nicholas D.
      title = Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets
       date = 2021
      pages = 4
  extension = .pdf
       mime = application/pdf
      words = 1409
  sentences = 157
     flesch = 56
    summary = Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets Struo2: efficient metagenome profiling database construction for ever-expanding Mapping metagenome reads to reference databases is the standard approach for reference databases often lack recently generated genomic data such as method for constructing custom databases; however, the pipeline does not scale well with the not allow for efficient database updating as new data are generated. HUMAnN3 databases that can be easily updated with new genomes and/or individual gene Struo2 enables feasible database generation for continually increasing large-scale ● Pre-built databases: http://ftp.tue.mpg.de/ebio/projects/struo2/ ● Utility tools: https://github.com/nick-youngblut/gtdb_to_taxdump Metagenome profiling involves mapping reads to reference sequence databases and is computational resources, which led us to create Struo for straight-forward custom metagenome CPU hours per genome versus ~2.4 for Struo (Figure 1B). taxonomy (available at https://github.com/nick-youngblut/gtdb_to_taxdump). (2020) Struo: a pipeline for building custom databases for
      cache = ./cache/10_1101-2021_02_10_430604.pdf
       txt  = ./txt/10_1101-2021_02_10_430604.txt
=== reduce.pl bib ===
         id = 10_1101-2021_02_08_430280
     author = Kasukurthi, Mohan V
      title = SALTS – SURFR (sncRNA) And LAGOOn (lncRNA) Transcriptomics Suite
       date = 2021
      pages = 23
  extension = .pdf
       mime = application/pdf
      words = 12363
  sentences = 1218
     flesch = 58
    summary = given transcriptome provided as either a raw user-generated RNA-Seq dataset or NCBI SRR file identifier. SURFR identifies all ncRNA fragments (both annotated and novel) and their expressions in up to ten datasets per comprehensively compare all fragment expressions identified in up to 30 individual datasets by entering multiple SURFR session IDs window detailing each fragment identified in the individual, selected small RNA-Seq dataset. of the results page redirects the user to a SURFR window detailing the expressions of all full length sncRNAs in the provided datasets. Fragments" window (Figure 2D) for each fragment identified in the individual, selected small RNA-Seq dataset within its host gene along with the fragment's expression (RPM) in each individual small RNA-Seq dataset, and lncRNAs expressed in a given human transcriptome from either a user-provided RNA-Seq dataset or publicly More importantly, however, LAGOOn identified MALAT1 as the most highly expressed lncRNA in MDA-MB-231 breast cancer cells (Figure 9).
      cache = ./cache/10_1101-2021_02_08_430280.pdf
       txt  = ./txt/10_1101-2021_02_08_430280.txt
=== reduce.pl bib ===
         id = 10_1101-2020_02_04_934216
     author = Kirchoff, Kathryn E.
      title = EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning
       date = 2021
      pages = 13
  extension = .pdf
       mime = application/pdf
      words = 8121
  sentences = 726
     flesch = 59
    summary = EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning task of kinase-motif phosphorylation prediction as a multi-label kinase or substrate, as well as protein scaffolds that facilitate structural orientation and downstream catalysis of the reaction, modify the efficacy of motif phosphorylation. prediction of phosphorylation events), a deep learning approach for predicting multi-label kinase-motif phosphorylation relationships. example, the TLK kinase family only has nine positive labels (verified TLK-motif interactions) and more than 10,000 resulting data set is comprised of 7302 phosphorylatable motifs and their reaction-associated kinase families (Table 1). The final output is a vector, k, of length eight, where each value corresponds to the probability that the motif a was phosphorylated by one of the kinase families indicated in We sought to illuminate the relationship between kinase-family dissimilarity and phosphorylated motif-group dissimilarity described results provide motivation to incorporate both motif dissimilarity and kinase relatedness into the predictive model, as of kinase-motif prediction compared to the single-label approaches.
      cache = ./cache/10_1101-2020_02_04_934216.pdf
       txt  = ./txt/10_1101-2020_02_04_934216.txt
=== reduce.pl bib ===
         id = 10_1101-2020_09_02_279521
     author = Abi Nader, Clément
      title = Simulating the outcome of amyloid treatments in Alzheimer’s disease from imaging and clinical data
       date = 2021
      pages = 32
  extension = .pdf
       mime = application/pdf
      words = 12164
  sentences = 1038
     flesch = 56
    summary = Simulating the outcome of amyloid treatments in Alzheimer's disease from imaging and clinical data When applied to multimodal imaging and clinical data from the Alzheimer's Disease Neuroimaging Initiative our * Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database Keywords : Alzheimer's Disease ; Clinical trials ; Disease progression; Amyloid hypothesis; of large datasets of different data modalities, such as clinical scores, or brain imaging measures to model Alzheimer's disease progression based on specific assumptions on the biochemical combining traditional DPMs with dynamical models of Alzheimer's disease progression. In this work we present a novel computational model of Alzheimer's disease progression to multi-modal imaging and clinical data from the Alzheimer's Disease Neuroimaging To simulate the long-term progression of Alzheimer's disease we first project the AD subjects Figure 3 Model-based progression of Alzheimer's disease. clinical data, based on the estimation of latent biomarkers' relationships governing Alzheimer's 
      cache = ./cache/10_1101-2020_09_02_279521.pdf
       txt  = ./txt/10_1101-2020_09_02_279521.txt
Building ./etc/reader.txt
10_1101-2021_02_09_430460
10_1101-727867
10_1101-2021_02_13_429885
10_1101-2021_02_09_430363
10_1101-2021_02_13_429885
10_1101-2021_02_09_430460
                number of items: 50
                   sum of words: 466,439
          average size in words: 9,328
      average readability score: 61

                          nouns: preprint; data; cell; genes; version; gene; copyright; review; author; holder; funder; peer; license; %; cells; preprintthis; analysis; methods; perpetuity; model; licenseavailable; number; set; sequence; expression; sequences; results; datasets; cancer; values; seq; method; dataset; figure; models; mutations; value; p; information; ad; sample; time; types; structure; sets; approach; k; samples; size; drug
                          verbs: is; are; was; be; has; using; were; posted; certified; made; display; granted; used; based; biorxiv; have; https://www.zotero.org/google-docs/?hsltkm; associated; been; selected; set; given; found; compared; shown; obtained; identified; including; use; generated; show; see; performed; identify; shows; provided; do; known; related; developed; described; defined; applied; following; calculated; provide; expected; expressed; proposed
                     adjectives: -; single; different; available; other; high; such; specific; non; same; genome; human; new; more; multiple; similar; large; first; small; many; low; random; clinical; best; individual; significant; average; linear; functional; higher; most; genomic; original; biological; international; polar; wide; relative; possible; subject; additional; computational; total; multi; standard; real; full; top; various; dimensional
                        adverbs: not; also; only; more; then; however; e.g.; well; thus; most; as; here; first; very; respectively; highly; therefore; out; often; finally; further; even; significantly; so; hence; previously; top; still; fully; up; instead; much; currently; at; less; specifically; recently; now; randomly; least; directly; similarly; rather; next; already; easily; better; usually; generally; moreover
                       pronouns: we; it; our; their; its; they; i; them; us; one; itself; https://doi.org/10.1101/2020.09.21.305516; you; themselves; your; https://doi.org/10.1101/2020.11.17.386649; https://doi.org/10.1101/2020.01.28.923532; he; 𝒙; ​sample​; u; s; https://doi.org/10.1101/2021.02.10.430705; ∆̂′; ours; m′; https://doi.org/10.1101/2021.02.12.430830; his; 𝑙𝑎; λ; ourselves; n; my; il-; https://doi.org/10.1101/2021.02.08.430280; http://paperpile.com/b/5tes3g/x5omi; 𝜟; 𝑒𝑖; 𝑆∗of; ’s; τ2; α; ʻʻuniprotdom_postmodenzʼ; y∗; yij; yes; whole-644; when398; uw; us-
                   proper nouns: al; february; et; international; j.; m.; .; nc; rna; s.; d.; c.; a.; nd; k; m; r.; c; l.; p.; s; j; b.; e.; g.; figure; fig; t.; a; k.; t; by; supplementary; n; h.; µm; r; seq; n.; e; d; http://creativecommons.org/licenses/by-nc/4.0/; data; f.; alzheimer; b; genome; li; y.; y
                       keywords: february; international; rna; cell; gene; seq; figure; supplementary; mutation; fig; data; alzheimer; set; sequence; sars; pca; motif; method; gwas; covid-19; cancer; δaf; vql; vp35; vcf; vaf; utr; usc; umls; type; tet; target; swarm; surfr; subject; struo; strain; stad; ssu; sparc; snp; skcm; single; siamese; sds; sda; scc; rrna; ribovore; remdesivir

       one topic; one dimension: 10
                        file(s): ./cache/10_1101-2020_09_21_305516.pdf
                      titles(s): Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer

    three topics; one dimension: 10; org; 10
                        file(s): ./cache/10_1101-2021_02_09_430536.pdf, ./cache/10_1101-2021_02_09_430460.pdf, ./cache/10_1101-2021_02_11_430762.pdf
                      titles(s): Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes | Sequence neighborhoods enable reliable prediction of pathogenic mutations in cancer genomes | Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation

  five topics; three dimensions: 10 org doi; 10 org 2021; org https google; http com https; 10 sequences org
                        file(s): ./cache/10_1101-727867.pdf, ./cache/10_1101-2021_02_09_430536.pdf, ./cache/10_1101-2021_02_09_430460.pdf, ./cache/10_1101-2021_02_08_430343.pdf, ./cache/10_1101-2021_02_11_430762.pdf
                      titles(s): scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data | Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes | Sequence neighborhoods enable reliable prediction of pathogenic mutations in cancer genomes | Patient-specific cell communication networks associate with disease progression in cancer | Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation

      Type: biorxiv
     title: neuroscience-from-bioarxiv
      date: 2021-02-14
      time: 21:20
  username: emorgan
    patron: Eric Morgan
     email: emorgan@nd.edu
     input: eMBISSE4cL.xml
==== make-pages.sh htm files
==== make-pages.sh complex files
==== make-pages.sh named entities
==== making bibliographics
         id: 10_1101-2021_02_10_430623
     author: Aberasturi, Dillon
      title: “Single-subject studies”-derived analyses unveil altered biomechanisms between very small cohorts: implications for rare diseases
       date: 2021
      words: 9478
  sentences: 748
      pages: 9
     flesch: 58
      cache: ./cache/10_1101-2021_02_10_430623.pdf
        txt: ./txt/10_1101-2021_02_10_430623.txt
    summary: published S3-type N-of-1-pathways MixEnrich to two paired samples (e.g., diseased vs unaffected tissues) for determining patient-specific enriched genes sets: Odds Ratios (S3-OR) and S3-variance using these models to derive effect sizes and statistical significance in single-subject studies of transcriptomes, these samples are isogenic or quasi-isogenic, and thus do not necessarily generalize to a group of subjects (cohort-level signal). The novel bioinformatic method identifies meaningful biomechanism differences between very small cohorts by using single-subject-study-derived effect sizes for gene sets. (B) For the generalized linear model-based analyses, we applied a different filtering process to the raw data where we eliminated all the transcripts with 0 counts for each subject and then calculated the coefficient 2.3 Description of the Generalized Linear Models and application of Inter-N-of-1 methods for small cohort comparison and their evaluation in the Breast Cancer Data the analysis of subsets of the TCGA Breast Cancer data, genes were declared differentially expressed if their abs(log2FC) > log2(1.2) and their FDR-adjusted p-value <

         id: 10_1101-2020_09_02_279521
     author: Abi Nader, Clément
      title: Simulating the outcome of amyloid treatments in Alzheimer’s disease from imaging and clinical data
       date: 2021
      words: 12164
  sentences: 1038
      pages: 32
     flesch: 56
      cache: ./cache/10_1101-2020_09_02_279521.pdf
        txt: ./txt/10_1101-2020_09_02_279521.txt
    summary: Simulating the outcome of amyloid treatments in Alzheimer's disease from imaging and clinical data When applied to multimodal imaging and clinical data from the Alzheimer's Disease Neuroimaging Initiative our * Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database Keywords : Alzheimer's Disease ; Clinical trials ; Disease progression; Amyloid hypothesis; of large datasets of different data modalities, such as clinical scores, or brain imaging measures to model Alzheimer's disease progression based on specific assumptions on the biochemical combining traditional DPMs with dynamical models of Alzheimer's disease progression. In this work we present a novel computational model of Alzheimer's disease progression to multi-modal imaging and clinical data from the Alzheimer's Disease Neuroimaging To simulate the long-term progression of Alzheimer's disease we first project the AD subjects Figure 3 Model-based progression of Alzheimer's disease. clinical data, based on the estimation of latent biomarkers' relationships governing Alzheimer's

         id: 10_1101-2020_01_28_923532
     author: Ahmadi, Saba
      title: The Landscape of Precision Cancer Combination Therapy: A Single-Cell Perspective
       date: 2021
      words: 16705
  sentences: 1572
      pages: 46
     flesch: 66
      cache: ./cache/10_1101-2020_01_28_923532.pdf
        txt: ./txt/10_1101-2020_01_28_923532.txt
    summary: We focus our analysis on genes encoding protein targets that encode receptors on the cell all "modular", including one part that specifically targets the tumor cell via one gene/protein and MadHitter and each patient receives an optimal personalized combination of targeted therapies from a prespecified set (pill bottle). Cohort and Individual Target Set Sizes as Functions of Tumor Killing and Given the single-cell tumor data sets and the ILP optimization framework described above, we filtering as this threshold is decreased), decreases the size of the target cell surface receptor gene heterogeneity of the cancer, number of patients within the data set, size of target gene set, lack of used for filtering the gene set to avoid targeting non-cancerous tissues. the genes in the optimal target set, the expression of that gene in that non-tumor cell exceeds the set of genes which is known to be targetable to cell 𝐶.

         id: 10_1101-2021_02_12_430764
     author: Ascensión, Alex M.
      title: Triku: a feature selection method based on nearest neighbors for single-cell data
       date: 2021
      words: 9518
  sentences: 1135
      pages: 18
     flesch: 64
      cache: ./cache/10_1101-2021_02_12_430764.pdf
        txt: ./txt/10_1101-2021_02_12_430764.txt
    summary: Triku: a feature selection method based on nearest neighbors for single-cell data Triku is a feature selection method that favours genes defining the main Single-cell RNA sequencing (scRNA-seq) is a powerful technology to study the biological heterogeneity of tissues at the individual cell level, allowing the characterization of new cell populations and cell states–i.e. cell types responding to different scRNA-seq datasets are multidimensional, i.e. the expression profile per cell consists of multiple genes. feature selection method: 1) the ability to recover basic dataset structure (main cell low, meaning that features selected with the different methods yielded clustering solutions that were quite similar to the manually-labeled cell types, although there are We first studied the expression pattern of genes selected by triku and other methods, To evaluate the cluster expression of selected genes in benchmarking datasets, for proteins within the genes selected by different FS methods in the two sets of benchmarking datasets.

         id: 10_1101-2021_02_11_430806
     author: Badaczewska-Dawid, Aleksandra
      title: BIAPSS - BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences
       date: 2021
      words: 2698
  sentences: 301
      pages: 3
     flesch: 53
      cache: ./cache/10_1101-2021_02_11_430806.pdf
        txt: ./txt/10_1101-2021_02_11_430806.txt
    summary: BIAPSS BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences web platform named BIAPSS (BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences) which offers the users interactive data analytic tools for facilitating the discovery of statistically significant sequence signals for proteins with Phase-Separating protein Sequences. The objective of BIAPSS is to enable a rapid and on-the-fly deep statistical analysis of LLPS-driver proteins using the pool of sequences with The comparison to benchmarks of various protein groups enables statistical inference of specific phase-separating affinities. Furthermore, the residue-resolution biophysical regularities inferred from BIAPSS will help not only to accurately identify regions prone to phase separation but also to design sequence modifications targeting various biomedical applications. for comprehensive sequence-based analysis of LLPS proteins. the driving forces for phase separation of prion-like RNA binding proteins. disordered protein regions encode a driving force for liquid-liquid phase separation? of proteins driving liquid-liquid phase separation.

         id: 10_1101-2021_02_10_430563
     author: Bandrowski, Anita
      title: SPARC Data Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data
       date: 2021
      words: 10901
  sentences: 1026
      pages: 16
     flesch: 48
      cache: ./cache/10_1101-2021_02_10_430563.pdf
        txt: ./txt/10_1101-2021_02_10_430563.txt
    summary: investigators across the SPARC consortium that provide key details about organ-specific circuitry, including structural (BIDS), the SDS has been designed to capture the large variety of data generated by SPARC investigators who are description of the SPARC curation process and the automated tools for complying with the SDS, including the SDS validator and Software to Organize Data Automatically (SODA) for SPARC. required to organize their data files and metadata organized according to the SPARC Data Structure data according to the SPARC Dataset Structure. is the preferred file format for tabular data in SPARC, the Data files are organized into 3 different top-level folders, The organization structure of the files and folders for a SPARC dataset. https://github.com/SciCrunch/sparc-curation/releases/tag/dataset-template-1.2.3 investigators include folders that organize data along a from these subjects, data files are organized within fields, the curation team developed a SPARC Dataset files/folders, and share datasets with the SPARC

         id: 10_1101-2021_02_09_430460
     author: Banerjee, Shayantan
      title: Sequence neighborhoods enable reliable prediction of pathogenic mutations in cancer genomes
       date: 2021
      words: 15659
  sentences: 1731
      pages: 39
     flesch: 40
      cache: ./cache/10_1101-2021_02_09_430460.pdf
        txt: ./txt/10_1101-2021_02_09_430460.txt
    summary: experimentally validated cancer mutation data in this study, we explored various string-based evolutionary features resulted in the development of a pan-cancer mutation effect prediction Distinguishing between driver and passenger mutations from sequenced cancer genomes is a Recent studies have identified specific signatures or patterns of mutations in different cancer than passenger mutations and built probabilistic models to identify driver genes that had this study, missense mutations from 58 genes that were pan-cancer-based were combined from We used the same datasets to judge our model's ability to predict rare driver mutations based Driver and Passenger Mutations' Features Used to Train NBDriver are Significantly Although our method's focus was to identify missense driver mutations from sequenced cancer surrounding driver and passenger mutations obtained from sequenced cancer genomes. computational prediction of driver missense mutations," Cancer Res., vol. functionally validated cancer-related missense mutations," Genome Biology, vol. Figure 7: Differences in the distribution of features between driver and passenger mutations

         id: 10_1101-2021_02_09_430363
     author: Bayer, Johanna M. M.
      title: Accommodating site variation in neuroimaging data using hierarchical and Bayesian models
       date: 2021
      words: 13439
  sentences: 1891
      pages: 20
     flesch: 70
      cache: ./cache/10_1101-2021_02_09_430363.pdf
        txt: ./txt/10_1101-2021_02_09_430363.txt
    summary: Accommodating site variation in neuroimaging data using hierarchical and Bayesian models The potential of normative modeling to make individualized predictions has led to structural neuroimaging results that go beyond the case-control approach. in a similar way for multi-site modeling in a pooled neuroimaging data set, which contained 7499 participants that org/abide/) data set to compare a non-linear, Gaussian version of the model, to a linear hierarchical Bayesian version and mathematical description of our approach to include site as predictor in a normative hierarchical Bayesian model. With the aim to create reliable normative models in multi-site neuroimaging data, we developed and compared two model is also able to capture non-linear effects between age and thickness of the cortical region ("Hierarchical Bayesian Gaussian Process term, which allows to model non-linear association between age and cortical thickness measures. The only models that perform better for most regions than the mean of the training data set are the Hierarchical Bayesian

         id: 10_1101-2021_02_10_430367
     author: Chen, Meili
      title: Genome Warehouse: A Public Repository Housing Genome-scale Data
       date: 2021
      words: 4875
  sentences: 656
      pages: 18
     flesch: 66
      cache: ./cache/10_1101-2021_02_10_430367.pdf
        txt: ./txt/10_1101-2021_02_10_430367.txt
    summary: Running title: Chen M et al / Genome Assembly Data Repository Genomics Data Center (NGDC), part of the China National Center for Bioinformation archive high-quality genome sequences and annotations, GWH is equipped with a Collectively, GWH serves as an important resource for genome-scale data https://bigd.big.ac.cn/) [13], the aim of GWH is to accept data submissions worldwide GWH is a centralized resource housing genome-scale data, with the purpose to GWH not only accepts genome assembly associated data through an on-line GWH will assign a unique accession number to the submitted genome assembly upon GWH provides data visualization for both genome Collectively, GWH is a user-friendly portal for genome data submission, release, and Database resources of the National Genomics Data Genome assembly accession number is prefixed with "GWH", followed by four

         id: 10_1101-2021_02_12_430979
     author: Da Silva, Kévin
      title: StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs
       date: 2021
      words: 10624
  sentences: 992
      pages: 20
     flesch: 66
      cache: ./cache/10_1101-2021_02_12_430979.pdf
        txt: ./txt/10_1101-2021_02_12_430979.txt
    summary: StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs results show that StrainFLAIR was able to distinguish and estimate the abundances of close strains, as approaches to handle multiple similar genomes as with strains use gene clustering and then select the StrainFLAIR assigns and estimates species and strain abundances of a bacterial metagenomic sample graph, called the "node abundance", is computed, first focusing on unique mapped reads (first step). Strain-level abundances are then obtained by exploiting the specific genes of each reference genome from the reference variation graph thus simulating a new strain to be identified and quantified. strains from a sequenced sample, mapped onto this graph. Reference strains relative abundances expected and computed by StrainFLAIR or

         id: 10_1101-2020_11_17_386649
     author: Danciu, Daniel
      title: Topology-based Sparsification of Graph Annotations
       date: 2021
      words: 8205
  sentences: 774
      pages: 15
     flesch: 67
      cache: ./cache/10_1101-2020_11_17_386649.pdf
        txt: ./txt/10_1101-2020_11_17_386649.txt
    summary: Experiments on 10,000 RNA-seq datasets show that RowDiff combined with MultiBRWT results in a 30% reduction in annotation footprint over Mantis-MST, the previously known most a binary matrix, where the k-mer set indexes the rows and each annotation label specifies a column. Starting from any vertex in the de Bruijn graph, Algorithm 1 defines a traversal leading to an anchor Each row in a RowDiff-transformed annotation matrix has the same or fewer set bits than A naïve implementation of the RowDiff construction would be to load the matrix A in memory, and gradually replace its rows with their sparsified counterpart, while traversing the graph. We now note that, when querying annotations for paths in the graph, or sets of rows corresponding to vertices We constructed annotated de Bruijn graphs from the RNA-Seq data set in the same We now compare the representation size for RowDiff and other state-of-the-art graph annotation compression methods.

         id: 10_1101-2021_02_08_430270
     author: Gerard, David
      title: Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty
       date: 2021
      words: 7219
  sentences: 1582
      pages: 22
     flesch: 69
      cache: ./cache/10_1101-2021_02_08_430270.pdf
        txt: ./txt/10_1101-2021_02_08_430270.txt
    summary: Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty Keywords and phrases: attenuation bias, genotype likelihood, linkage disequilibrium, polyploidy, reliability ratio. Let XiA and XiB be the posterior means at loci A and B for individual Equations (5)–(7) take the naive estimators most researchers use in practice (the sample covariance/correlation of posterior means) and inflate these by a multiplicative effect. Gerard and Ferrão, 2019] to obtain the posterior moments for each individual's genotype at each SNP reliability ratios of most SNPs only increase their correlation estimates by less than 10%. To evaluate the LD estimates of high reliability ratio SNPs, we calculated the MLEs for ρ2 applied to simple linear regression with an additive effects model (where the SNP effect is proportional to the dosage), result in the standard ordinary least squares estimates when using the extreme reliability ratio of PotVar0080327, the genotype-error adjusted correlation estimate is -1.

         id: 10_1101-2021_02_12_430963
     author: Gerber, Stefan
       title: Streamlining differential exon and 3' UTR usage with diffUTR
       date: 2021
      words: 6710
  sentences: 896
      pages: 17
     flesch: 62
      cache: ./cache/10_1101-2021_02_12_430963.pdf
        txt: ./txt/10_1101-2021_02_12_430963.txt
    summary: adenylation site databases to enable differential 3' UTR usage analysis. Conclusions: diffUTR enables differential 3' UTR analysis and more generally facilitates DEU Popular bin-based DEU methods are provided by the limma [25,24], edgeR [23] and DEXSeq [22] Bins are prepared from various types of gene annotations as well as, optionally, additional APA-driven segmentation and extension, then read counts among statistically-significant genes, especially for bins with a higher expression (Figure 3A). diffUTR provides three main plot types to explore differential bin usage analyses, each with a Plotted are the UTR bins found statistically significant (bin- and gene-level FDR deuBinPlot (Figure 4B) provides bin-level statistic plots for a given gene, similar to those than CDS bins, including counts of 3' UTR when calculating overall gene expression could under- diffUTR streamlines DEU analysis and outperforms alternative methods in inferring UTR changes, For differential UTR analysis, gene-level results are ob-

         id: 10_1101-2021_02_12_430830
     author: Gergely, Tibély
      title: Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data
       date: 2021
      words: 8181
  sentences: 793
      pages: 19
     flesch: 68
      cache: ./cache/10_1101-2021_02_12_430830.pdf
        txt: ./txt/10_1101-2021_02_12_430830.txt
    summary: Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data widely available bulk sequencing data where mutations from individual cells are and genomic mutation rate from bulk sequencing data. based on the maximum likelihood estimation of the parameters of a generative model of tumor growth and mutations. human hepatocellular carcinoma sample reveals an elevated per cell division mutation rate and high cell turnover. Due to the limitations of bulk sequencing, which only assays mutation frequencies for a population of cells from each tumor sample and does not The estimation is based on a maximum likelihood fit of the parameters of a birth-death model to the measured mutant and be estimated from readcount data, to separate the effects of the mutation rate We use pre-generated division trees from the ELynx suite at predetermined turnover rate values. Using the turnover rate, we also estimated the number of cell

         id: 10_1101-2021_02_08_430343
     author: Gibbs, David L
      title: Patient-specific cell communication networks associate with disease progression in cancer
       date: 2021
      words: 11335
  sentences: 1445
      pages: 29
     flesch: 58
      cache: ./cache/10_1101-2021_02_08_430343.pdf
        txt: ./txt/10_1101-2021_02_08_430343.txt
    summary: tumor microenvironment, the method identified ligands, receptors and cells meeting certain criteria of 9,234 samples in The Cancer Genome Atlas (TCGA), starting from a network of 64 cell types and 1,894 Data sources including TCGA and cell-sorted gene expression, bulk tumor expression, cell type scores, ligands and receptors for each of the 64 cell types in xCell, using the source gene expression data. With this procedure, a network scaffold is induced, where cells produce ligands that bind to receptors on (PFI) and tumor stage for each sample, a matrix of patient-specific edge weights was constructed number of high weight edges in each tumor type did not associate with the number of samples, as might in the tumor stage contrast, a majority of ligand-producing cells include GMP cells, Osteoblasts, MSC In the PFI results, Th1 cells appeared in 13 high scoring edges in SKCM, all with

         id: 10_1101-2021_02_09_430036
     author: Goldsborough, Thibaut
      title: A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea
       date: 2021
      words: 3128
  sentences: 477
      pages: 7
     flesch: 70
      cache: ./cache/10_1101-2021_02_09_430036.pdf
        txt: ./txt/10_1101-2021_02_09_430036.txt
    summary: A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea is a carnivorous plant that grows on nitrogen-poor waterlogged sandstone aurea's genome, CDS and non-coding DNA 2) Determination of transcriptomic nitrogen content and codon usage bias associated with higher nitrogen content tRNAs (among codons that are coding for the same amino a considerably lower number of nitrogen atoms in its genome than the two other plant species. has higher nitrogen counts per molecular unit in genomic DNA, CDS, Non-Coding DNA, protein, aurea has a higher nitrogen usage in its DNA, RNA and proteins Figure 2: Average number of nitrogen atoms per molecular unit in genomic DNA, CDS, Non-Coding DNA, aurea had lower nitrogen content in tRNA sequences but not in other Figure 3: Bar graph representing the codon usage bias and tRNA nitrogen content in G.

         id: 10_1101-2021_02_11_430695
     author: Gordon-Rodriguez, Elliott
      title: Learning Sparse Log-Ratios for High-Throughput Sequencing Data
       date: 2021
      words: 7973
  sentences: 817
      pages: 12
     flesch: 60
      cache: ./cache/10_1101-2021_02_11_430695.pdf
        txt: ./txt/10_1101-2021_02_11_430695.txt
    summary: Log-ratios are an important class of features for analyzing high-throughput sequencing (HTS) metagenomic data for HTS data, and more generally, high-dimensional CoDa. Unlike existing methods, CoDaCoRe is simultaneously scalable, interpretable, sparse, and accurate. unlabelled datasets, {xi}ni=1, as a method for identiLearning Sparse Log-Ratios for High-Throughput Sequencing Data CoDaCoRe variable selection for the first (most explanatory) log-ratio on the Crohn disease data (Rivera-Pinto et al., 2018). more generally, in the field of CoDa.

         id: 10_1101-2020_09_23_310276
     author: Greenfest-Allen, Emily
      title: NIAGADS Alzheimer's GenomicsDB: A resource for exploring Alzheimer's Disease genetic and genomic knowledge
       date: 2021
      words: 5987
  sentences: 592
      pages: 19
     flesch: 52
      cache: ./cache/10_1101-2020_09_23_310276.pdf
        txt: ./txt/10_1101-2020_09_23_310276.txt
    summary: The NIAGADS Alzheimer's Genomics Database (GenomicsDB) is an interactive knowledgebase for Alzheimer's disease (AD) genetics that provides access to GWAS summary statistics datasets The website makes available >70 genome-wide summary statistics datasets from GWAS and efficient real-time data analysis and variant or gene report generation. Gene reports provide summaries of co-located ADRD risk-associated variants and have pages linking summary statistics to variant and gene annotations, this resource makes these summary statistics available for browsing (on dataset, gene, and variant reports and as genome NIAGADS GenomicsDB variant reports and a track is available on the genome browser. The NIAGADS GenomicsDB includes allele frequency data from 1000 Genomes (phase 3, version visualizations for summarizing search results and annotations in gene and variant reports. compare NIAGADS GWAS summary statistics tracks to each other, against annotated gene or A detailed report is provided for each of the GWAS summary statistics and ADSP meta-analysis

         id: 10_1101-2021_02_13_429885
     author: Househam, Jacob
      title: A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing
       date: 2021
      words: 10584
  sentences: 1257
      pages: 36
     flesch: 49
      cache: ./cache/10_1101-2021_02_13_429885.pdf
        txt: ./txt/10_1101-2021_02_13_429885.txt
    summary: know tumour purity and the ploidy of a CNA segment, then the VAF mutations mapped A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing.

         id: 10_1101-2020_10_08_327718
     author: Jambor, Helena
      title: Creating Clear and Informative Image-based Figures for Scientific Publications
       date: 2021
      words: 12824
  sentences: 1189
      pages: 36
     flesch: 56
      cache: ./cache/10_1101-2020_10_08_327718.pdf
        txt: ./txt/10_1101-2020_10_08_327718.txt
    summary: journals in three fields; plant sciences, cell biology and physiology (n=580 papers). figures were uncommon (physiology 16%, cell biology 12%, plant sciences 2%). among papers published in top journals in plant sciences, cell biology and physiology. contained images (plant science: 68%, cell biology: 72%, physiology: 55%). in physiology (49%) and cell biology (55%), and 28% of plant science papers provided and 29% of plant sciences papers contained no scale information on any image. Some publications use insets to show the same image at two different scales (cell Figure 1: Image types and reporting of scale information and insets physiology and plant science papers contained some images that were inaccessible to B: Most papers explain colors in image-based figures, however, explanations are less Figure 4: Using scale bars to annotate image size Creating clear and informative image-based figures for scientific publications.

         id: 10_1101-2021_02_08_430280
     author: Kasukurthi, Mohan V
      title: SALTS – SURFR (sncRNA) And LAGOOn (lncRNA) Transcriptomics Suite
       date: 2021
      words: 12363
  sentences: 1218
      pages: 23
     flesch: 58
      cache: ./cache/10_1101-2021_02_08_430280.pdf
        txt: ./txt/10_1101-2021_02_08_430280.txt
    summary: given transcriptome provided as either a raw user-generated RNA-Seq dataset or NCBI SRR file identifier. SURFR identifies all ncRNA fragments (both annotated and novel) and their expressions in up to ten datasets per comprehensively compare all fragment expressions identified in up to 30 individual datasets by entering multiple SURFR session IDs window detailing each fragment identified in the individual, selected small RNA-Seq dataset. of the results page redirects the user to a SURFR window detailing the expressions of all full length sncRNAs in the provided datasets. Fragments" window (Figure 2D) for each fragment identified in the individual, selected small RNA-Seq dataset within its host gene along with the fragment's expression (RPM) in each individual small RNA-Seq dataset, and lncRNAs expressed in a given human transcriptome from either a user-provided RNA-Seq dataset or publicly More importantly, however, LAGOOn identified MALAT1 as the most highly expressed lncRNA in MDA-MB-231 breast cancer cells (Figure 9).

         id: 10_1101-2021_02_10_430512
     author: Kim, Catherine
      title: Prediction of adverse drug reactions associated with drug-drug interactions using hierarchical classification
       date: 2021
      words: 11859
  sentences: 1137
      pages: 41
     flesch: 56
      cache: ./cache/10_1101-2021_02_10_430512.pdf
        txt: ./txt/10_1101-2021_02_10_430512.txt
    summary: into DDIs. In this study, a hierarchical machine learning model was created to predict DDI-associated ADRs and pharmacological insight thereof for any drug pair. drugs' chemical structures as inputs to predict their target, enzyme, and transporter (TET) Development of RFCs for Prediction of Target, Enzyme, and Transporter Profiles of Drugs Development of a Model for Prediction of DDI-associated ADRs from TET Profiles of Drugs ADR prediction from Target, Enzyme, and Transporter Profiles of Drug Pairs To predict ADRs of a drug pair from its TET profiles, Random Forest Classifier (RFC), Application of the SVM model for DDI-associated ADRs Involving Three Major Drugs through predicted PRR changes of drug pairs upon removal of each of the targets, enzymes, and changes of drug pairs were predicted by the model upon removal of each of the targets, enzymes, Target, enzyme, and transporter (TET) profiles of atorvastatin and concomitant drugs,

         id: 10_1101-2020_02_04_934216
     author: Kirchoff, Kathryn E.
      title: EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning
       date: 2021
      words: 8121
  sentences: 726
      pages: 13
     flesch: 59
      cache: ./cache/10_1101-2020_02_04_934216.pdf
        txt: ./txt/10_1101-2020_02_04_934216.txt
    summary: EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning task of kinase-motif phosphorylation prediction as a multi-label kinase or substrate, as well as protein scaffolds that facilitate structural orientation and downstream catalysis of the reaction, modify the efficacy of motif phosphorylation. prediction of phosphorylation events), a deep learning approach for predicting multi-label kinase-motif phosphorylation relationships. example, the TLK kinase family only has nine positive labels (verified TLK-motif interactions) and more than 10,000 resulting data set is comprised of 7302 phosphorylatable motifs and their reaction-associated kinase families (Table 1). The final output is a vector, k, of length eight, where each value corresponds to the probability that the motif a was phosphorylated by one of the kinase families indicated in We sought to illuminate the relationship between kinase-family dissimilarity and phosphorylated motif-group dissimilarity described results provide motivation to incorporate both motif dissimilarity and kinase relatedness into the predictive model, as of kinase-motif prediction compared to the single-label approaches.

         id: 10_1101-2021_02_09_430536
     author: Lin, Cui-Xiang
      title: Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes
       date: 2021
      words: 20656
  sentences: 4864
      pages: 47
     flesch: 79
      cache: ./cache/10_1101-2021_02_09_430536.pdf
        txt: ./txt/10_1101-2021_02_09_430536.txt
    summary: Genome-wide prediction and integrative functional characterization of Alzheimer's disease-associated genes example, a module-trait network approach was proposed and applied to identify gene functional enrichment-based approach to identify negative genes that are not likely associated genes through an optimal selection of networks and machine learning FGN, and prediction of AD-associated genes using machine learning models (Fig. 1). addition, we tested their enrichment in three AD-related gene sets associated with The top-ranked genes are enriched in AD-associated functions and phenotypes These results provide additional evidence that our predicted genes are associated with The top-ranked genes are associated with AD based on miRNA-target networks We investigated whether top-ranked genes were functionally related to AD-associated We tested whether the top-ranked k genes were more likely to interact with AD-associated related to AD-associated genes or miRNAs based on miRNA-target interaction networks.

         id: 10_1101-2021_02_08_428881
     author: Lu, Yang Young
      title: ACE: Explaining cluster from an adversarial perspective
       date: 2021
      words: 7909
  sentences: 790
      pages: 12
     flesch: 66
      cache: ./cache/10_1101-2021_02_08_428881.pdf
        txt: ./txt/10_1101-2021_02_08_428881.txt
    summary: A common workflow in single-cell RNA-seq analysis is to project the data to a latent space, cluster the cells in that space, and identify sets of marker genes that explain the differences among the nonlinear embedding model which maps the gene expression to the low-dimensional representation where the groups A notable feature of ACE's approach is that, by identifying genes jointly, the method moves away from the notion Input: gene expression matrix Deep autoencoder learns low-dimensional representation Embedding clustering Clustering is neuralized and concatenated with the encoder Differentiation analysis by ACE Output: gene relevance ACE takes as input a single-cell gene expression matrix and learns a low-dimensional representation for each Next, a neuralized version of the k-means algorithm is applied to the learned representation to identify cell groups. input gene expression profile that lead the neuralized clustering model to alter the assignment from one group to the other.

         id: 10_1101-2021_02_12_430739
     author: Malekian, Negin
      title: Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli
       date: 2021
      words: 7516
  sentences: 1093
      pages: 13
     flesch: 70
      cache: ./cache/10_1101-2021_02_12_430739.pdf
        txt: ./txt/10_1101-2021_02_12_430739.txt
    summary: Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli Here, we systematically screen for candidate quinolone resistance-conferring mutations. coli and performed a genome-wide association study (GWAS) correlating over 200,000 mutations against quinolone resistance phenotypes. significant mutations including one located at the active site of the biofilm dispersal genes bdcA and six silent In summary, we demonstrate that GWAS effectively and comprehensively identifies resistance mutations Keywords: E Coli; Quinolone; Antibiotic Resistance; Genome-Wide Association Study (GWAS) direct route to resistance is mutations in the drug targets gyrA and parC. In summary, we aim to show that a bacterial genomewide association study can effectively and comprehensively identify targets relevant to antibiotic resistance. Based on representative resistance phenotypes, the authors selected 103 isolates for sequencing with Illumina MiSeq, 92 of which are available from coli bdcA may act indirectly on antibiotic resistance.

         id: 10_1101-2021_02_12_430923
     author: Modi, Vivek
      title: Kincore: a web resource for structural classification of protein kinases and their inhibitors
       date: 2021
      words: 7913
  sentences: 666
      pages: 18
     flesch: 62
      cache: ./cache/10_1101-2021_02_12_430923.pdf
        txt: ./txt/10_1101-2021_02_12_430923.txt
    summary: Kincore: a web resource for structural classification of protein kinases and their inhibitors result, among the DFGin structures, we distinguished between the catalytically active kinase conformation pages for kinase phylogenetic groups, genes, conformational labels, PDBids, ligands and ligand types. options to download data – database tables as a tab separated files; the kinase structures as PyMOL Kincore provides conformational assignments and ligand type labels to protein kinase structures from Figure 1: Representative protein kinase structure (3ETA_A) displaying the residues used to define inhibitor The distribution of different ligand types across kinase conformations is provided in Table 1. Table 1: Distribution of ligand types across protein kinase conformations (Number of chains). including conformational and ligand type labels and C-helix position, kinase family, gene name, Uniprot provides the number of kinase chains in the group across different conformations with their Database table provides the list of all the PDB chains with conformational labels and ligand 

         id: 10_1101-2020_12_24_424317
     author: Muazzam, Fariha
      title: Multi-class Cancer Classification and Biomarker Identification using Deep Learning
       date: 2021
      words: 4252
  sentences: 426
      pages: 12
     flesch: 57
      cache: ./cache/10_1101-2020_12_24_424317.pdf
        txt: ./txt/10_1101-2020_12_24_424317.txt
    summary: classification, feature extraction and relevant gene identification through deep learning methods for 12 This research picks up from detection of different types of cancer RNA-Seq expressions using deep neural classification of gene expression profiles for different kinds of cancers. Hence, the effectiveness of deep learning models for feature extraction and relevant gene identification is performed revealing substantial results and they produced five high-ranked gene sets and reduced feature This study was aimed at classifying 12 types of cancer and identifying relevant genes and the results show were able to identify cancer-relevant pathways and genes for the sets, that different experiments generated, A deep learning approach for cancer detection and relevant gene Tumor gene expression data classification via sample expansion-based deep learning. Identification of a multi-cancer gene expression Multi-class Cancer Classification and Biomarker Identification using Deep Learning

         id: 10_1101-2020_09_21_305516
     author: Nikolic, Ana
      title: Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer
       date: 2021
      words: 10376
  sentences: 1280
      pages: 32
     flesch: 67
      cache: ./cache/10_1101-2020_09_21_305516.pdf
        txt: ./txt/10_1101-2020_09_21_305516.txt
    summary: Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer uses single-cell epigenomic data to infer copy number variants (CNVs) that define cancer cells. We have tested the ability of Copy-scAT to use scATAC data to call CNVs with three different approaches genome sequencing (WGS) data for adult GBM (aGBM) surgical resections (n = 4 samples, 3,647 cells). adult GBM samples identified using both methods, versus total numbers of gains detected by scATAC or Number of chromosome-arm level gains detected in adult GBM samples identified using both methods, (c) Multiple myeloma samples were profiled by both scATAC and the single-cell CNV assay. chromosome-arm level gains detected in adult GBM samples identified using both methods, versus total CNVs are detected in scATAC clusters with Copy-scAT in pediatric GBM samples.

         id: 10_1101-2021_02_11_430847
     author: Pinatti, Lisa M.
      title: SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer
       date: 2021
      words: 6849
  sentences: 788
      pages: 26
     flesch: 57
      cache: ./cache/10_1101-2021_02_11_430847.pdf
        txt: ./txt/10_1101-2021_02_11_430847.txt
    summary: SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer squamous cell carcinomas; however, the impact of HPV integration into the host human genome SearcHPV uncovered HPV integration sites adjacent to known cancer-related detection of HPV-human integration sites from targeted capture DNA sequencing data. developed a novel HPV integration detection tool for targeted capture sequencing data, which we SearcHPV showed a high frequency of HPV16 integration with a total of six events in UM-SCCIn this study, SearcHPV also called HPV integration sites within TP63. HPV integration sites have been associated with structural variations in the human genome3, 8, 37, which supports an additional genetic mechanism as to why HPV integration sites Genome-wide analysis of HPV integration in human and their integration sites in host genomes through next generation sequencing data. identify viruses and their integration sites using next-generation sequencing of human cancer 

         id: 10_1101-2021_02_09_430405
     author: Quazi, Sameer
      title: In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD
       date: 2021
      words: 5941
  sentences: 1038
      pages: 23
     flesch: 64
      cache: ./cache/10_1101-2021_02_09_430405.pdf
        txt: ./txt/10_1101-2021_02_09_430405.txt
    summary: In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD including structure-based drug-like compounds screening from online databases, molecular The final small molecules of drug-like compounds would have more effective and selected for the molecular docking with FGI-103 antiviral drug-using AutoDock 4.2 software. After that, FGI-103 was set and screen other drug-like compounds from PubChem databases. The finally selected drug-like compounds were docked with the P1 site of VP35 of based on ap1 site for ligand in every dock for VP35 MARV utilizing a grid chart of 50 × 50 × 50 The ADMET properties of finally selected drug-like compounds were checked to utilize 2D molecules structure of selected drug-like compounds (A) represents the 2D The molecule structure of three drug-like compounds is shown in Figure 6. "In-Silico Structural and Molecular Docking-Based Drug Discovery

         id: 10_1101-698605
     author: Sarantopoulou, Dimitra
      title: Comparative evaluation of full-length isoform quantification from RNA-Seq
       date: 2021
      words: 12853
  sentences: 1332
      pages: 37
     flesch: 55
      cache: ./cache/10_1101-698605.pdf
        txt: ./txt/10_1101-698605.txt
    summary: Comparative evaluation of full-length isoform quantification from RNA-Seq Full-length isoform quantification from RNA-Seq is a key goal in transcriptomics analyses benchmarking, isoform quantification, simulated data, pseudo-alignment, RNA-Seq, short Given the difficulty in full-length isoform quantification, many RNA-Seq studies simply analysis performed on the known true isoform quantifications of the simulated data to the For the simulated data we started with 11 real RNA-Seq samples: six liver and six the isoform expression level using idealized and realistic simulated data, with full and true counts), for the set of expressed isoforms in sample 1 in C) idealized and D) realistic data. Method effect on differential expression analysis, using realistic data. RSEM is a gene/isoform abundance tool for RNA-Seq data which uses a generative model S1 Fig. Method effect on full-length isoform quantification using simulated data.

         id: 10_1101-2020_09_23_308239
     author: Schultz, Bruce T
      title: The COVID-19 PHARMACOME: A method for the rational selection of drug repurposing candidates from multimodal knowledge harmonization
       date: 2021
      words: 8797
  sentences: 1318
      pages: 31
     flesch: 57
      cache: ./cache/10_1101-2020_09_23_308239.pdf
        txt: ./txt/10_1101-2020_09_23_308239.txt
    summary: The COVID-19 PHARMACOME: A method for the rational selection of drug repurposing COVID-19 PHARMACOME, a comprehensive drug-target-mechanism graph generated from a initial version of the COVID-19 PHARMACOME, a comprehensive drug-target-mechanism graph representing COVID-19 pathophysiology mechanisms that includes both drug targets Figure 3: Overlap of compound hits between different drug repurposing screening experiments. space overlap between different COVID-19 drug repurposing screenings. The COVID-19 PHARMACOME associates pathways derived from drug repurposing targets Figure 4 shows the distribution of repurposing drugs in the COVID-19 cause-and-effect graph, overlap analysis allows for the identification of repurposing drugs targeting mechanisms that Virus-response mechanisms are targets for repurposing drugs Figure 5: Visualization of drug repurposing candidates (and their targets) used in combination treatment as our own drug repurposing screening results, we were able to identify mechanisms targeted COVID-19 PHARMACOME, we are now able to link repurposing drugs, their targets and the SARS-CoV-2 protein interaction map reveals targets for drug repurposing.

         id: 10_1101-2021_02_10_430619
     author: Schutz, Sacha
      title: Cutevariant: a GUI-based desktop application to explore genetics variations
       date: 2021
      words: 4932
  sentences: 632
      pages: 8
     flesch: 66
      cache: ./cache/10_1101-2021_02_10_430619.pdf
        txt: ./txt/10_1101-2021_02_10_430619.txt
    summary: Cutevariant: a GUI-based desktop application to explore genetics variations Cutevariant is a user-friendly GUI based desktop application for genomic research designed to search for variations in DNA samples collected in annotated files and encoded in the Variant Calling Format. application imports data into a local relational database wherefrom complex filter-queries can be built either Key words: genomics, DNA variant, desktop application, Domain Specific Language, Graphic User Interface applications import the data from VCF files into an indexed Cutevariant imports data from VCF files into a normalized Fig. 2: The Cutevariant main view showing the variants list sub-window (middle), different controllers sub-windows but not all are Just like Variant Tools, Cutevariant supports operations Features Cutevariant BrowseVCF VCF-Miner VCF-Explorer VCF-Server VCF-Filters GEMINI Variant Tools SnpSift Comparison of time performance between cutevariant and VCF-miner for importation and query execution. 3. Pablo Cingolani, Adrian Platts, Le Lily Wang, Melissa VCF-Miner: GUI-based application for mining variants

         id: 10_1101-2021_02_11_430762
     author: Schäffer, Alejandro A.
      title: Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation
       date: 2021
      words: 16496
  sentences: 1489
      pages: 28
     flesch: 65
      cache: ./cache/10_1101-2021_02_11_430762.pdf
        txt: ./txt/10_1101-2021_02_11_430762.txt
    summary: Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation alignments of SSU, LSU and 5S rRNA from all three domains as well as from organelles, along with secondary structure predictions for selected sequences. Ribovore software package for the analysis of SSU rRNA and LSU rRNA sequences 18S SSU rRNA database of 1091 sequences was updated most recently on September 27, 2018 by running version 0.28 of the Ribovore program ribodbmaker on an input set of 579,279 GenBank sequences returned from the eukaryotic SSU rRNA The results of ribotyper and rRNA sensor are combined and each sequence is separated into one of four outcome classes depending on whether it passed or failed each input a set of candidate sequences and a specified rRNA model (e.g. SSU.Bacteria) two blastn databases: one of 1267 bacterial and archaeal 16S SSU rRNA sequences

         id: 10_1101-2021_02_12_430989
     author: Sofer, Tamar
      title: Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies
       date: 2021
      words: 8136
  sentences: 626
      pages: 27
     flesch: 48
      cache: ./cache/10_1101-2021_02_12_430989.pdf
        txt: ./txt/10_1101-2021_02_12_430989.txt
    summary: Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies as well as linear regression-based analyses for studying the association of continuous exposures generation of empirical null distribution of association p-values, and we apply the pipeline to Many studies of phenotypes associated with gene expression from RNA-seq consist of small Residual permutation approach for simulations and for empirical p-value computation covariates, and outcome distributions; and (b) their relationships, aside from the exposure-outcome association, are the same as in the real data, we used a residual permutation approach. association studies applied to residual permutations were included to compute empirical papproach to study the distribution of p-values under the null of no association between the phenotypes and RNA-seq, and used this approach to further study power, and to compute approaches for transcriptome-wide analysis of RNA-seq in population-based studies, including more comprehensive study of statistical permutation approaches for RNA-seq association

         id: 10_1101-2021_02_09_430550
     author: Song, Dongyuan
      title: scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling
       date: 2021
      words: 13512
  sentences: 1548
      pages: 37
     flesch: 64
      cache: ./cache/10_1101-2021_02_09_430550.pdf
        txt: ./txt/10_1101-2021_02_09_430550.txt
    summary: (scPNMF) method to select informative genes from scRNA-seq data in an unsupervised way. Therefore, for scRNA-seq data analysis, informative gene selection Besides scRNA-seq data analysis, informative gene selection is also crucial for designing number and a scRNA-seq dataset, scPNMF selects informative genes based on its weight matrix; First, the informative genes selected by scPNMF lead to the most accurate cell clustering. the informative genes and weight matrix of scPNMF lead to the best cell type prediction accuracy Figure 3: Benchmarking scPNMF against 11 informative gene selection methods on seven scRNA-seq (b) UMAP visualization of cells in the Zheng4 dataset based on 100 informative genes selected by We benchmark scPNMF against the 11 gene selection methods in terms of cell type prediction We propose scPNMF, an unsupervised gene selection and data projection method for scRNA-seq For cell type prediction, we project every targeted gene profiling dataset and its scRNA-seq

         id: 10_1101-2021_02_10_430705
     author: Stassen, Shobana V.
      title: VIA: Generalized and scalable trajectory inference in single-cell omics data
       date: 2021
      words: 13590
  sentences: 1383
      pages: 24
     flesch: 53
      cache: ./cache/10_1101-2021_02_10_430705.pdf
        txt: ./txt/10_1101-2021_02_10_430705.txt
    summary: VIA: Generalized and scalable trajectory inference in single-cell omics data strategy to compute pseudotime, and reconstruct cell lineages based on lazy-teleporting random walks Step 1: Single-cell level graph is clustered such that each node user defined start cell) is first computed by the expected hitting time for a lazy-teleporting random walk along an network topology and single-cell level pseudotime/lineage probability properties onto an embedding using GAMs, as The cell fates and their lineage pathways are then computed by a two-stage probabilistic method, graph-traversal allows it to infer cell fates when the underlying data spans combinations of multifurcating detected cell fates annotated (o) lineage pathway and gene-pseudotime trend shown for the CD41 Megakaryocytic Figure 3 VIA infers trajectories in single-cell multi-omic and image datasets (a) Major lineages of human Single cells are represented by graph nodes that are connected based on

         id: 10_1101-727867
     author: Tangherloni, Andrea
      title: scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data
       date: 2021
      words: 15281
  sentences: 2865
      pages: 28
     flesch: 72
      cache: ./cache/10_1101-727867.pdf
        txt: ./txt/10_1101-727867.txt
    summary: scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data This computational tool allows for coupling low-dimensional probabilistic representation of gene expression data with the downstream analysis to consider the Finally, the currently available AEs cannot be directly exploited to obtain the latent space or to generate synthetic cells. to show the cells in this embedded space or as a starting point for other dimensionality reduction approaches (e.g., t-SNE and UMAP) as well as downstream analyses Non-linear approaches for dimensionality reduction can be effectively used to capture the non-linearities among the gene interactions that may exist in the high-dimensional expression space of scRNA-Seq data [16]. be effectively applied to analyse disparate types of single-cell data from different flexible method developed to cluster single-cell data; (ii) a centroid is calculated batch-effect correction methods for single-cell rna sequencing data. Wang, D., Gu, J.: VASC: dimension reduction and visualization of single-cell RNA-seq data by deep

         id: 10_1101-2021_02_12_431018
     author: Truong Nguyen, Phuoc
      title: HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences.
       date: 2021
      words: 3786
  sentences: 502
      pages: 14
     flesch: 61
      cache: ./cache/10_1101-2021_02_12_431018.pdf
        txt: ./txt/10_1101-2021_02_12_431018.txt
    summary: HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences. Several new variants of SARS-CoV-2 have emerged globally, of which the based assemblies on raw SARS-CoV-2 sequences in addition to identifying lineages to detect variants of concern, we have developed an open source bioinformatic pipeline called HaVoC monitor the spread of SARS-CoV-2 variants of concern during local outbreaks. currently being used in Finland for monitoring the spread of SARS-CoV-2 variants. SARS-CoV2, variant detection, reference assembly, lineage identification, coronavirus, surveillance of virus variants by sequencing the SARS-CoV-2 genomes would provide a fast to query SARS-CoV-2 fastq sequence libraries and assigns lineages to them individually in processing and a reference genome of SARS-CoV-2 in a separate FASTA file. The likelihood of emergence of novel SARS-CoV-2 variants of concern is increased and Emerging SARS-CoV-2 Variants.

         id: 10_1101-2021_02_11_430789
     author: Tyagin, Ilya
      title: Accelerating COVID-19 research with graph mining and transformer-based learning
       date: 2021
      words: 9408
  sentences: 807
      pages: 9
     flesch: 58
      cache: ./cache/10_1101-2021_02_11_430789.pdf
        txt: ./txt/10_1101-2021_02_11_430789.txt
    summary: Accelerating COVID-19 research with graph mining and transformer-based learning develop text mining techniques that can help the science community answer high-priority scientific questions related to COVID-19. is currently customized and available in the open domain to massively process COVID-19 related queries. Both systems are the next generation of the AGATHA knowledge network mining transformer model [37]. (1) Most of the existing HG systems are domain-specific (e.g., gene-disease interactions) that is usually expressed in limiting the processed information (e.g., significant filtering vocabulary and papers a trained deep bi-LSTM model for extracting predicates from unstructured text. For instance, the node representing the entity "COVID-19" is connected to every sentence and predicate that The prior AGATHA semantic network only includes UMLS terms that appear in SemMedDB predicates [18] which is a major limitation. obtain embeddings per node in the semantic graph, we train AGATHA system ranking model.

         id: 10_1101-2021_02_11_430871
     author: Vadnais, David
      title: ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data
       date: 2021
      words: 10071
  sentences: 1053
      pages: 24
     flesch: 63
      cache: ./cache/10_1101-2021_02_11_430871.pdf
        txt: ./txt/10_1101-2021_02_11_430871.txt
    summary: ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data chromosome and genome structure reconstruction from Hi-C data using Particle Swarm Optimization approach chromosome bin, according to the particle swarm algorithm, and then iterates its position towards a global best This paper presents ParticleChromo3D, a new distance-based algorithm for chromosome 3D structure The structures generated by ParticleChromo3D also shows that the result at swarm size Structures generated by ParticleChromo3D at different swarm size values. obtained by comparing the ParticleChromo3D algorithm's output structure to the simulated dataset's true plot of ParticleChromo3D SCC performance on 500KB GM12878 cell Hi-C data for chromosome 1 to 23. chromosome 3D structure reconstruction algorithms on the GM12878 data set at both the 1MB and 500KB chromosome and genome structures reconstructed from Hi-C data.

         id: 10_1101-2021_02_10_430606
     author: Wei, Zheng
      title: NeuronMotif: Deciphering transcriptional cis-regulatory codes from deep neural networks
       date: 2021
      words: 12013
  sentences: 1107
      pages: 31
     flesch: 63
      cache: ./cache/10_1101-2021_02_10_430606.pdf
        txt: ./txt/10_1101-2021_02_10_430606.txt
    summary: Each point is a decoupled motif generated by a sample set of sequence. Only the max activation value of the decoupled motifs in Fig. 3b are significantly higher than the decoupled motifs of other neurons in layer 3 of Basset-3 model. discovered (q-value < 0.001) from the neuron in convolutional output layer of Basset, BD-5 and BD-10 model. c, The number of motif discovered (q-value < 0.01) from the neuron in layer 3 of Basset model using different sub-patterns in the input feature map of the max pooling layer to split the sequences set of which are DNA-sequence based DCNN models with 3 general convolutional layers for stacking sequences of different synonymous motifs with the maximum activation value In summary, we presented NeuronMotif as an effective algorithm to reveal the cis-regulatory motif grammar learned by DCNN model that use DNA sequence to annotate sequences indicate more synonymous motif mixture in this DCNN model.

         id: 10_1101-2021_02_10_430649
     author: Wen, Zi-Hang
      title: Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data
       date: 2021
      words: 8418
  sentences: 1302
      pages: 11
     flesch: 71
      cache: ./cache/10_1101-2021_02_10_430649.pdf
        txt: ./txt/10_1101-2021_02_10_430649.txt
    summary: Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data Recovering dropout events in a sparse gene expression matrix for scRNA-seq data is a long-standing matrix completion We introduce Bfimpute, a Bayesian factorization imputation algorithm that reconstructs two latent gene and cell matrices to impute final gene expression matrix within each cell group, with or without the aid of cell type labels or bulk Bfimpute achieves better accuracy than other six publicly notable scRNA-seq imputation methods on simulated Key words: single cell; RNA-seq; imputation; Bayesian factorization impute dropout events by adopting the bulk RNA-seq data imputation of single cell RNA-seq data could be applied by Bfimpute recovers dropout values and improves cell type identification in the simulated data. and the imputed data by Bfimpute, scImpute, and DrImpute for the human embryonic stem cell differentiation study. imputation method scimpute for single-cell rna-seq data.

         id: 10_1101-2021_02_10_430604
     author: Youngblut, Nicholas D.
      title: Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets
       date: 2021
      words: 1409
  sentences: 157
      pages: 4
     flesch: 56
      cache: ./cache/10_1101-2021_02_10_430604.pdf
        txt: ./txt/10_1101-2021_02_10_430604.txt
    summary: Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets Mapping metagenome reads to reference databases is the standard approach for reference databases often lack recently generated genomic data such as method for constructing custom databases; however, the pipeline does not scale well with the not allow for efficient database updating as new data are generated. HUMAnN3 databases that can be easily updated with new genomes and/or individual gene Struo2 enables feasible database generation for continually increasing large-scale ● Pre-built databases: http://ftp.tue.mpg.de/ebio/projects/struo2/ ● Utility tools: https://github.com/nick-youngblut/gtdb_to_taxdump Metagenome profiling involves mapping reads to reference sequence databases and is computational resources, which led us to create Struo for straight-forward custom metagenome CPU hours per genome versus ~2.4 for Struo (Figure 1B). taxonomy (available at https://github.com/nick-youngblut/gtdb_to_taxdump ). (2020) Struo: a pipeline for building custom databases for

         id: 10_1101-2021_02_10_430656
     author: Zakeri, Mohsen
      title: A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing
       date: 2021
      words: 6557
  sentences: 568
      pages: 7
     flesch: 64
      cache: ./cache/10_1101-2021_02_10_430656.pdf
        txt: ./txt/10_1101-2021_02_10_430656.txt
    summary: A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing benchmark comparing the kallisto-bustools pipeline (2) for single-cell demonstrate that, when configured to match the computational complexity of kallisto-bustools as closely as possible, alevin-fry processes Alevin-fry (3) is a new pipeline for single-cell RNA-seq benchmarking STARsolo (9), kallisto-bustools (2) and alevin-fry (3), out new tools like alevin-fry for the pre-processing of single-cell data, (1), we have now created a simple-to-follow tutorial for speed-optimized single-cell pre-processing using alevin-fry (https:// by Booeshaghi and Pachter (1) change when a like-for-like comparison between alevin-fry and kallisto-bustools is carried out, we The time and memory used by the relevant steps of the alevin-fry and kallisto-bustools pipelines for pre-processing the 20 diverse tagged-end single-cell RNA-seq datasets used in (1). A comparison of the resulting count matrices obtained from alevin-fry and kallisto-bustools, as run in this manuscript, for the pbmc_10k_v3 dataset. peak memory than alevin-fry, with the kallisto-bustools pipeline using

         id: 10_1101-2021_02_08_430275
     author: Zhang, Jianbo
      title: Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes
       date: 2021
      words: 6404
  sentences: 694
      pages: 6
     flesch: 68
      cache: ./cache/10_1101-2021_02_08_430275.pdf
        txt: ./txt/10_1101-2021_02_08_430275.txt
    summary: Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes identified using BSA-Seq, a technology in which next-generation sequencing (NGS) is applied to bulked segregant analysis (BSA). recently developed the significant structural variant method for BSA-Seq data analysis that exhibits higher detection power than standard to analyze BSA-Seq data in which genome sequences of one parent served as the reference sequences in genotype calling, and thus We analyzed a public BSA-Seq dataset using our modified method and the standard allele frequency and Gmethod allows the detection of such associations without sequencing the parental genomes, leading to further lower the the BSA-Seq data with the genome sequences of both the parents when the parental genome sequences are used to aid BSA-Seq data The allele frequency method: The ΔAF value of each SNP in BSA-Seq data analysis using the genome sequences of both the parents and the bulks. BSA-Seq data analysis using only the bulk genome sequences.

         id: 10_1101-2020_05_15_090266
     author: Zhang, R.
      title: SpacePHARER: Sensitive identification of phages from CRISPR spacers in prokaryotic hosts
       date: 2021
      words: 2191
  sentences: 283
      pages: 6
     flesch: 64
      cache: ./cache/10_1101-2020_05_15_090266.pdf
        txt: ./txt/10_1101-2020_05_15_090266.txt
    summary: Summary: SpacePHARER (CRISPR Spacer Phage-Host Pair Finder) is a sensitive and fast tool for de novo prediction of phage-host relationships via identifying phage genomes that match CRISPR spacers in genomic or metagenomic data. SpacePHARER gains sensitivity by comparing spacers and phages at the protein level, optimizing its scores for matching SpacePHARER by searching a comprehensive spacer list against all complete phage genomes. methods compare individual CRISPR spacers with phage To increase sensitivity, (1) we compare protein coding sequences because phage genomes are mostly coding, and, (0) Preprocess input: scan the phage genome and CRISPR spacers in six ORFs q of CRISPR spacers extracted from one prokaryotic genome, and each target set T comprises the putative protein sequences t from a single phage. The performance of SpacePHARER was evaluated on the spacer test set against a target database predicted the correct host for more phages than BLASTN BLASTN in detecting phage-host pairs, due to searching

         id: 10_1101-2021_02_08_430070
     author: Zhang, Yao-zhong
      title: On the application of BERT models for nanopore methylation detection
       date: 2021
      words: 5183
  sentences: 586
      pages: 7
     flesch: 60
      cache: ./cache/10_1101-2021_02_08_430070.pdf
        txt: ./txt/10_1101-2021_02_08_430070.txt
    summary: On the application of BERT models for nanopore methylation detection with deep learning models, have achieved significant performance improvements on nanopore methylation recurrent patterns of positional-signal-shift in the context window surrounding target 5-methylcytosine that the refined BERT model can achieve competitive or even better results than the state-of-the-art biRNN of datasets from the different research groups, BERT models demonstrate a good generalization Fig. 1: Basic BERT's and refined BERT's model structure used for methylation detection. a refined BERT model to take account of signal-shift patterns in the proposed refined BERT model achieves a competitive or even better result explore applying the BERT model for the nanopore methylation detection 2.2 Applying BERT models for nanopore methylation For the cross-sample evaluation, we train models on one dataset and test a BERT model to pay more attention to center positions. In-sample evaluation of different deep learning models on 5mC datasets.

         id: 10_1101-2021_02_01_429246
     author: Zheng, Hongyu
      title: Sequence-specific minimizers via polar sets
       date: 2021
      words: 15440
  sentences: 1407
      pages: 24
     flesch: 71
      cache: ./cache/10_1101-2021_02_01_429246.pdf
        txt: ./txt/10_1101-2021_02_01_429246.txt
    summary: minimizers focus on sampling fewer k-mers on a random sequence and use universal hitting sets (sets suggests, a UHS is a set of k-mers that "hits" every w-long window of every possible sequence (hence the the elements of the polar sets are in the sequence: the higher the energy, the more spread apart the k-mers have densities upper bounded by |U|/σk, because only k-mers from the universal hitting set can be selected. Section 2.2 gives a formal definition of the link energy of a polar set and Theorem 1 gives upper and lower bounds using this link energy for the density of a minimizer compatible with a polar set. form a link, which in turn is the number of k-mer pairs in the polar set that are exactly w bases away on S. A context is charged if the minimizer selects a different k-mer in the first window than in the second

==== make-pages.sh questions
==== make-pages.sh search
==== make-pages.sh topic modeling corpus
Zipping study carrel