id: cord-291523-4dtk1kyh
author: Nguyen, Thanh Thi
title: Origin of Novel Coronavirus (COVID-19): A Computational Biology Study using Artificial Intelligence
date: 2020-07-01
pages:
extension: .txt
mime: text/plain
words: 5361
sentences: 313
flesch: 62
summary: Outcomes of a phylogenetic analysis suggest that the virus belongs to the genus Betacoronavirus, sub-genus Sarbecovirus, which includes many bat SARS-like CoVs and SARS CoVs. Another study in [5] confirms this finding by analysing genomes obtained from three adult patients admitted to a hospital in Wuhan on December 27, 2019. With the cut-off parameter C set equal to 0.7, the hierarchical clustering algorithm separates the reference sequences into 6 clusters, in which cluster "5" comprises all examined viruses of the Sarbecovirus sub-genus, including many SARS CoVs, bat SARS-like CoVs and pangolin CoVs (Fig. 7A). With the results obtained in Fig. 7D (and also in the experiments with the DBSCAN method presented next), we support a hypothesis that bats or pangolins are the probable origin of SARS-CoV-2. In this Appendix, we first present results of the hierarchical clustering method applied to the dataset that combines Set 1 of reference sequences (Table 1) with all 334 SARS-CoV-2 sequences (see Fig. 9).
cache: ./cache/cord-291523-4dtk1kyh.txt
txt: ./txt/cord-291523-4dtk1kyh.txt