id: cord-310464-lkdkdque
author: Rayko, Mikhail
title: Quality control of low-frequency variants in SARS-CoV-2 genomes
date: 2020-05-07
pages:
extension: .txt
mime: text/plain
words: 1702
sentences: 109
flesch: 58
summary: During the current outbreak of COVID-19, research labs around the globe submit local SARS-CoV-2 genome sequences to the GISAID database to support a comprehensive analysis of the variability and spread of the virus. As a result of the collaborative efforts of researchers worldwide, by April 14, 2020 the database contained over 8,000 SARS-CoV-2 genomes from different countries, sequenced and assembled using various technologies and approaches. GISAID database curators do a tremendous job of filtering submitted sequences, but it is sometimes difficult to distinguish real variants from errors, especially when coverage information is lacking. Dataset: 8,053 full-length (>29,000 bp) SARS-CoV-2 sequences were downloaded from the GISAID database (www.epicov.org) on April 14, 2020, including 5,556 genomes marked as "high coverage". Full table with percentage of singleton-containing genomes depending on sequencing and assembly method.
cache: ./cache/cord-310464-lkdkdque.txt
txt: ./txt/cord-310464-lkdkdque.txt