key: cord-1039997-860zqov3
authors: Chaillon, A; Smith, D M
title: Phylogenetic analyses of SARS-CoV-2 B.1.1.7 lineage suggest a single origin followed by multiple exportation events versus convergent evolution
date: 2021-03-27
journal: Clin Infect Dis
DOI: 10.1093/cid/ciab265
sha: dbc8d2aec162689a3ddc2905137bb4b30d9b789b
doc_id: 1039997
cord_uid: 860zqov3

The emergence of new variants of SARS-CoV-2 heralds a new phase of the pandemic. This study used state-of-the-art phylodynamic methods to ascertain that the rapid rise of the B.1.1.7 “Variant of Concern” most likely occurred by global dispersal rather than convergent evolution from multiple sources.

Following phylogenetic and epidemiological investigations, the SARS-CoV-2 genetic lineage B.1.1.7 is suspected to be associated with increased human-to-human viral transmissibility [1,2], and was classified as a "Variant of Concern" (VOC B.1.1.7) on December 18, 2020 [3]. The variant was first discovered in Kent, United Kingdom (UK) on September 21, 2020, and has since been identified in over 40 countries across the world, including the United States [3-6]. We sought to evaluate whether the breadth of VOC B.1.1.7 identification represents convergent evolution [7] [...] Figure S1).

We combined these B.1.1.7 sequences with a representative set of non-B.1.1.7 sequences (n=4,768) based on sequence homology. All sequences were aligned using MAFFT, and highly homoplasic sites were masked [10]. To reduce the dataset size while maintaining an appropriate set of epidemiologically relevant background sequences, we used BLAST [11] [...]

The earliest estimated seeding of B.1.1.7 from the UK dates to September 9th, 2020 in Denmark, and the most recent to January 8th, 2021 in Spain (see Supplementary Table 3 and Supplementary Figure S2). The number of weekly introductions outside the UK peaked in mid-December (Figure 2). In the US, the first introduction was estimated on November 14th in Florida.
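The weekly tally of introductions described above can be illustrated with a minimal sketch. It assumes each introduction is summarized by a single estimated date; the function names `week_start` and `weekly_introductions` are hypothetical, and only the three dates marked as reported come from the text. The remaining dates are made up for illustration and are not study results.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Return the Monday of the calendar week containing d."""
    return d - timedelta(days=d.weekday())


def weekly_introductions(intro_dates):
    """Count introductions per calendar week, keyed by that week's Monday."""
    return Counter(week_start(d) for d in intro_dates)


# Illustrative input only: one estimated date per introduction event.
example_dates = [
    date(2020, 9, 9),    # earliest estimate reported in the text (Denmark)
    date(2020, 11, 14),  # first US estimate reported in the text (Florida)
    date(2020, 12, 14),  # hypothetical
    date(2020, 12, 16),  # hypothetical
    date(2020, 12, 19),  # hypothetical
    date(2021, 1, 8),    # most recent estimate reported in the text (Spain)
]

counts = weekly_introductions(example_dates)

# The peak week is the one with the largest number of introductions.
peak_week, peak_count = max(counts.items(), key=lambda kv: kv[1])
print(peak_week, peak_count)
```

Binning by the Monday of each week is only one possible convention; a different week definition would shift the bin boundaries but not the overall approach.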
Five distinct introductions in California were also identified from December 3rd to December 26th, including one cluster of 19 sequences. Of note, 6 international non-UK clusters including ≥2 countries were identified, of which 2 did not include European [...] Table 3).

In response to the rapid increase in viral infections and spread, UK officials announced a lockdown on October 31st that came into force on November 5th and ended on December 5th. Given time to the most recent common ancestor (TMRCA) estimates, we determined that 19% (17/90) of the exportation events that gave rise to detectable non-UK VOC B.1.1.7 transmission lineages occurred during this period (the remaining 81% occurred before or after these dates). The emergence and rapid dispersal of this new VOC led to the implementation of a new strict national lockdown in the UK on January 4, 2021 [26].

As previously described by du Plessis et al. [14], we next used the TMRCA of each non-UK clade to estimate the genomic "detection lag" for each cluster, which represents the duration that a transmission lineage went undetected before it was first sampled by genome sequencing. The mean detection lag was ~10.6 days (IQR = 4-15). This largely agrees with detection lag-time estimates from SARS-CoV-2 importation into the UK in the first months of the pandemic [14].

Acknowledgements. We gratefully acknowledge the Authors from the Originating Laboratories and the Submitting Laboratories who generated and shared via GISAID the data on which this research is based. In particular, we would like to acknowledge the role of the Genomics UK (COG-UK) consortium, which generated the vast majority of sequences from the UK. See supplementary material for the acknowledgment table.

Funding: This work was supported by grants from the NIH (San Diego Center for AIDS Research, CFAR, AI036214). AC was supported by NIH Grant AI131971 (R21).
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Neither of the authors has any potential conflicts to disclose.

Preliminary genomic characterisation of an emergent SARS-CoV-2 lineage in the UK defined by a novel set of spike mutations
Transmission of SARS-CoV-2 Lineage B.1.1.7 in England: Insights from linking epidemiological and genetic data. medRxiv
Variant of Concern 202012/01
New York detects first case of UK variant, California reveals further occurrences - as it happened
Global report investigating novel coronavirus haplotypes (grinch). B.1.1.7 report
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
Evaluating the Effects of SARS-CoV-2 Spike Mutation D614G on Transmissibility and Pathogenicity
Global initiative on sharing all influenza data - from vision to reality
An integrated national scale SARS-CoV-2 genomic surveillance network
Issues with SARS-CoV-2 sequencing data
Basic local alignment search tool
BLAST+: architecture and applications
Contribution of Epidemiological Predictors in Unraveling the Phylogeographic History of HIV-1 Subtype C in Brazil
The multi-faceted dynamics of HIV-1 transmission in Northern Alberta: A combined analysis of virus genetic and public health data. Infection, Genetics and Evolution: journal of molecular epidemiology and evolutionary genetics in infectious diseases
MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular Biology and Evolution
IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era
New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0
A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood
Multiple Comparisons of Log-Likelihoods with Applications to Phylogenetic Inference
Molecular footprint of drug-selective pressure in a human immunodeficiency virus transmission chain
Limitations of using mobile phone data to model COVID-19 transmission in the USA
Association between mobility patterns and COVID-19 transmission in the USA: a mathematical modelling study
The severe acute respiratory syndrome
Updating the accounts: global mortality of the 1918-1920 "Spanish" influenza pandemic
The next influenza pandemic: lessons from Hong Kong
Avian influenza viruses infecting humans
Nipah virus: a recently emergent deadly paramyxovirus
The outbreak of West Nile virus infection in the New York City area in 1999
The origin of the 1918 pandemic influenza virus: a continuing enigma
Epidemic timeline, differential diagnoses, determining factors, and lessons for future response
Increasing importance of European lineages in seeding the hepatitis C virus subtype 1a epidemic in Spain
Cross-country migration linked to people who inject drugs challenges the long-term impact of national HCV elimination programmes