key: cord-283579-aejbfk3l authors: Hilda, Awoyelu Elukunbi; Kolawole, Oladipo Elijah; Olufemi, Adetuyi Babatunde; Senbadejo, Tosin Yetunde; Oyawoye, Olubukola Monisola; Kola, Oloke Julius title: Phyloevolutionary analysis of SARS-CoV-2 in Nigeria date: 2020-06-14 journal: New Microbes New Infect DOI: 10.1016/j.nmni.2020.100717 sha: doc_id: 283579 cord_uid: aejbfk3l Abstract Background Phyloepidemiologic approaches have given specific insight into the emergence and evolution of infections. Knowledge of the outbreak and spread of SARS-CoV-2 in Nigeria would assist in designing preventive measures to reduce transmission among populations at risk. Therefore, this study aimed to investigate the evolution of SARS-CoV-2 in Nigeria. Materials and Method A total of 39 complete genomes of SARS-CoV-2 were retrieved from the GISAID EpiFlu™ database on March 29, 2020 to investigate its evolution in Nigeria. Sequences were selected based on the travel history of the patient and the collection date. Other sequences were excluded because they were short, contained artefacts, were not from the original source or had insufficient information. Evolutionary history was inferred using the Maximum Likelihood method based on the General Time Reversible model. A phylogenetic tree was constructed to determine the common ancestor of each strain. Results The phylogenetic analysis showed that the strain from Nigeria clustered in a monophyletic clade with a Wuhan sublineage. Nucleotide alignment also showed 100% similarity, indicating a common origin of evolution. Comparative analysis showed 27,972 (93.6%) identical sites and 97.6% pairwise identity with the consensus. Conclusion The study showed that the entire outbreak of COVID-19 infection in Nigeria stemmed from a single introduction sharing consensus similarity with the reference SARS-CoV-2 human genome from Wuhan. Preventive measures that can limit the spread of the infection among populations at risk should be implemented.
The first confirmed case of COVID-19 in Nigeria was announced on February 27, 2020, when an infected traveler from Italy, one of the high-risk countries identified by the WHO, arrived in Lagos by commercial aircraft. Although the traveler's movement was restricted, another positive case was reported in Ewekoro, Ogun State: a Nigerian citizen who had contact with the Italian citizen. Despite lockdowns in some states and several precautionary measures put in place to prevent and contain the spread of the disease, the Nigeria Centre for Disease Control (NCDC) has since reported over 400 cases with 17 deaths in about 20 states (NCDC, 2020). Emerging and reemerging viral infections have significantly affected human health despite extraordinary progress in biomedical knowledge (Parvez and Parveen, 2017). Understanding the emergence and evolution of novel viruses depends on knowledge of the intricate host-pathogen-environment relationship (Susan and Julian, 2011). Understanding the modes of transmission of an emerging infectious disease continues to be a key factor in implementing effective public health measures (Rota et al., 2003). Proper tracking of genome sequences has helped to optimize virus diagnostic tests, to track and trace the ongoing outbreak and to identify potential intervention options. Reports have suggested transmission via the airborne route (Booth et al., 2005), direct contact, droplets and mildly ill or asymptomatic individuals (Omrani et al., 2013; Paules et al., 2020). However, lack of evidence on transmission dynamics can lead to inconsistencies in isolation guidelines. Phyloepidemiologic approaches have given specific insight into the emergence and evolution of emerging and reemerging viruses, particularly SARS-CoV-2 (Avise, 2000).
Knowledge of the outbreak and spread of SARS-CoV-2 in Nigeria would help in designing preventive measures and reducing transmission among populations at risk. Hence, this study aimed to investigate the evolution of SARS-CoV-2 in Nigeria.
A total of 39 complete genomes (Table 1) of the novel SARS-CoV-2 were retrieved from the GISAID database (https://www.epicov.org) on March 29, 2020. The sequences retrieved originated from China, Italy, France, Nigeria, South Africa and Congo. Sequences were selected based on the travel history of the patient and the collection date. The sequences were then aligned to obtain the conserved regions by multiple sequence alignment (MSA) using Clustal W in MEGA X. The sequences were subjected to evolutionary divergence analysis. A phylogenetic tree was constructed using MEGA 5.2 to determine the common ancestor of each strain. Comparative analysis of strains within clades was performed in Geneious Prime (https://www.geneious.com/), based on statistical analysis, to determine positions at which the genomic sequence from Nigeria differed significantly from the other strains. The analysis involved 39 nucleotide sequences. The evolutionary history was inferred using the Maximum Likelihood method based on the General Time Reversible model. The bootstrap consensus tree inferred from 1000 replicates was taken to represent the evolutionary history of the taxa analyzed. Branches corresponding to partitions reproduced in less than 50% of bootstrap replicates were collapsed. Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with the superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories; +G, parameter = 200.0000).
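The bootstrap step described above can be illustrated with a minimal Python sketch. It is not the MEGA implementation: the function names `bootstrap_columns` and `supported_clades` are hypothetical, and the tree-building step that would turn each pseudo-alignment into clade counts is assumed to happen elsewhere. The sketch only shows how alignment columns are resampled with replacement and how partitions seen in fewer than 50% of replicates are dropped.

```python
import random


def bootstrap_columns(alignment, rng):
    """Build one bootstrap replicate by resampling alignment columns
    with replacement. `alignment` maps sequence names to aligned strings
    of equal length."""
    length = len(next(iter(alignment.values())))
    picks = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in picks)
            for name, seq in alignment.items()}


def supported_clades(clade_counts, n_replicates, threshold=0.5):
    """Keep clades recovered in at least `threshold` of the replicates;
    the rest would be collapsed in the consensus tree.
    `clade_counts` maps a frozenset of taxon names to the number of
    replicate trees containing that clade (assumed to be computed by a
    separate tree-inference step)."""
    return {clade: count / n_replicates
            for clade, count in clade_counts.items()
            if count / n_replicates >= threshold}


# Toy data (hypothetical sequences, not from the study).
toy = {"s1": "ACGTAC", "s2": "ACGTTC", "s3": "ATGTAC"}
replicate = bootstrap_columns(toy, random.Random(2020))
counts = {frozenset({"s1", "s2"}): 700, frozenset({"s2", "s3"}): 300}
print(replicate)
print(supported_clades(counts, n_replicates=1000))
```

In a full analysis each replicate would be passed to the tree-inference program, and the counts would come from the 1000 resulting trees.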
The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 0.0000% sites). Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 25459 positions in the final dataset.
The maximum likelihood tree is shown in Figure 1. Multiple sequence alignment using Clustal W (https://www.ebi.ac.uk/Tools/services/rest/muscle) showed that all the genomes that formed a clade with the strain from Nigeria shared more than 70% similarity in genetic sequence. Figures 2a-g show consensus similarities and variants among the 3 strains from Wuhan, China and from Nigeria, together with the reference human SARS-CoV-2 genome. Sequences in the alignment were compared to the consensus to identify polymorphisms. At each position, the consensus is the allele with a frequency greater than 50%; an N ambiguity code is assigned if no allele exceeds 50%.
Knowledge of the transmission chain of an emerging or reemerging virus, combined with sequence data, provides more insight into the occurrence of mutations and the spread of infection (Folarin et al., 2016). In the Nigerian COVID-19 outbreak, the Nigerian sequence data submitted by Okwuraiwe et al. (2020), viewed by themselves, provide a clear picture that the entire outbreak stemmed from a single introduction into the country. This indicates that the first case was imported, and it has since become a serious public health challenge. Given the close proximity of the sample collection dates, it is possible to reconstruct a transmission chain and infer that the patient from Nigeria contracted the virus from a patient infected with the Wuhan strain, as seen in the phylogenetic tree. The two strains fall in the same clade because they share genetic information.
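The consensus rule stated earlier (the allele with a frequency above 50% at each position, N otherwise) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the Geneious Prime procedure: the function name `majority_consensus` is hypothetical, and gap characters are simply treated as one more allele.

```python
from collections import Counter


def majority_consensus(sequences, threshold=0.5):
    """Return the consensus of equal-length aligned sequences.
    A column contributes its most frequent character only when that
    character's frequency is strictly greater than `threshold`;
    otherwise the ambiguity code 'N' is written."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*sequences):
        allele, count = Counter(column).most_common(1)[0]
        consensus.append(allele if count / len(column) > threshold else "N")
    return "".join(consensus)


# Toy alignment (hypothetical, not from the study).
print(majority_consensus(["ACGT", "ACGA", "ACTA", "AGTC"]))  # ACNN
```

In the toy alignment the third and fourth columns each have a top allele at exactly 50%, so they do not exceed the threshold and are reported as N.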
The reconstruction is consistent with the findings of Parvez and Parveen (2017) that the emergence of infectious diseases in naïve regions depends on the movement of pathogens via trade and travel, while local emergence is driven by a combination of environmental and social change. Multiple sequence alignment separated the strains studied into 3 distinct clades according to their divergence from their common ancestor. The strains from Wuhan further subdivided the clades into 3 subclades. The strain from Nigeria was found in Wuhan subclade 3, together with some strains from Congo and France. The strains that formed a monophyletic clade with Wuhan subclade 3 shared more than 70% similarity in genetic sequence. More importantly, the tree confirmed that the outbreak in Nigeria was due to a single introduction from Wuhan, China through an imported case, an Italian traveler. More specifically, the SARS-CoV-2 strain imported into Nigeria is a descendant of the Wuhan, China strain, as likewise described by Zhu et al. (2020). Comparative analysis was performed on the strain from Nigeria, the 2 strains from Wuhan sharing the same clade and the reference human SARS-CoV-2 genome. Results from Geneious Prime showed that the 4 sequences had 27,972 (93.6%) identical sites and 97.6% pairwise identity. The strain from Nigeria had greater genome sequence similarity to Wuhan strain WH05/2020 than to strain WH01/2020. They shared consensus similarity with the reference SARS-CoV-2 human genome, indicating common descent, as observed in other studies by Holshue et al. (2020) and Arima et al. (2020). The strain imported into Nigeria by the Italian traveler shared less than 20% of variant characteristics with Wuhan strain WH01/2020. Compared with the consensus, the strain from Nigeria had 49 gaps, 39 unknowns and 7 point mutations. More than 80% of these differences were unique to Nigeria.
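The alignment statistics reported above (identical sites, pairwise identity, and the tally of gaps, unknowns and point mutations against the consensus) can be illustrated with a short Python sketch. The definitions used here are simplified assumptions and may not match how Geneious Prime computes them (for example in its handling of gap-only pairs); the function names are hypothetical and the sequences are toy data.

```python
from itertools import combinations


def identical_sites(sequences):
    """Count alignment columns in which every sequence has the same character."""
    return sum(1 for column in zip(*sequences) if len(set(column)) == 1)


def pairwise_identity(sequences):
    """Mean percentage of matching columns over all pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    length = len(sequences[0])
    matches = sum(sum(a == b for a, b in zip(s, t)) for s, t in pairs)
    return 100.0 * matches / (len(pairs) * length)


def classify_differences(sequence, consensus):
    """Tally how a sequence departs from the consensus: gaps ('-'),
    unknown bases ('N') and point mutations (any other mismatch)."""
    tally = {"gaps": 0, "unknowns": 0, "point_mutations": 0}
    for base, ref in zip(sequence, consensus):
        if base == ref:
            continue
        if base == "-":
            tally["gaps"] += 1
        elif base.upper() == "N":
            tally["unknowns"] += 1
        else:
            tally["point_mutations"] += 1
    return tally


# Toy alignment (hypothetical, not from the study).
toy = ["ACGT", "ACGA", "AC-T"]
print(identical_sites(toy))
print(round(pairwise_identity(toy), 1))
print(classify_differences("AC-N", "ACGT"))
```

Because identical sites require agreement across all sequences while pairwise identity averages over pairs, the first figure is generally the lower of the two, which is the pattern seen in the 93.6% and 97.6% values reported here.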
In summary, on the basis of the evolutionary analysis, it is evident that human-to-human transmission occurred; hence, preventive measures should be adhered to in order to control the spread of the virus. The study showed that the entire outbreak of COVID-19 infection in Nigeria stemmed from a single introduction sharing consensus similarity with the reference SARS-CoV-2 human genome from Wuhan. Establishing the phyloevolutionary relationship of the reference SARS-CoV-2 sequence obtained from Nigeria could benefit biological study of this virus as well as diagnosis, clinical monitoring and intervention in Nigeria.
Coronavirus Pathogenesis
Detection of airborne severe acute respiratory syndrome (SARS) coronavirus and environmental contamination in 20 SARS outbreak units
Clinical features of patients infected with 2019 novel coronavirus in Wuhan
Coronavirus infections-more than just the common cold
A family cluster of Middle East respiratory syndrome coronavirus infections related to a likely unrecognized asymptomatic or mild case
A complete sequence and comparative analysis of a SARS-associated virus (isolate BJ01)
Decoding the evolution and transmission of the novel pneumonia coronavirus (SARS-CoV-2) using whole genome data
Molecular Evolution and Phylogenetics
MEGA5: Molecular Evolutionary Genetics Analysis using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods
Confidence limits on phylogenies: An approach using the bootstrap
Ebola virus epidemiology and evolution in Nigeria
Evolution and emergence of pathogenic viruses: past, present and future
First African SARS-CoV-2 genome sequence from Nigerian COVID-19 case
Genomic characterization and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
A novel coronavirus from patients with pneumonia in China
For the Washington State 2019-nCoV Case Investigation Team (2020). First case of 2019 novel coronavirus in the United States
Severe Acute Respiratory Syndrome Coronavirus 2 Infection among Returnees to Japan from Wuhan, China, 2020. Emerging Infectious Diseases