key: cord-0736728-dzyqiyck authors: Leung, K.; Shum, M. H.; Leung, G. M.; Lam, T. T.; Wu, J. T. title: Early empirical assessment of the N501Y mutant strains of SARS-CoV-2 in the United Kingdom, October to November 2020 date: 2020-12-22 journal: nan DOI: 10.1101/2020.12.20.20248581 sha: 8687fdd912f93d9aac71eed36734fc0c3c2667e7 doc_id: 736728 cord_uid: dzyqiyck

Two new SARS-CoV-2 lineages with the N501Y mutation in the receptor binding domain of the spike protein have rapidly become prevalent in the UK. We estimated that the earlier 501Y lineage without amino acid deletion Δ69/Δ70, circulating mainly between early September and mid-November, was 10% (6-13%) more transmissible than the 501N lineage, and that the currently dominant 501Y lineage with amino acid deletion Δ69/Δ70, circulating since late September, was 75% (70-80%) more transmissible than the 501N lineage.

This version of the preprint (which was not certified by peer review) was posted on medRxiv on December 22, 2020; doi: https://doi.org/10.1101/2020.12.20.20248581. The copyright holder is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Two new SARS-CoV-2 lineages carrying the amino acid substitution N501Y in the receptor binding domain (RBD) of the spike protein (S protein) have rapidly become prevalent in the UK. The earlier 501Y lineage (501Y Variant 1) cocirculated with the 501N lineage between early September and mid-November in Wales, where its prevalence never exceeded 2%. However, a later 501Y lineage (501Y Variant 2) has cocirculated with the 501N lineage in England since late September and has been the dominant lineage since late November.
In the UK, the prevalence of 501Y Variant 2 increased from 0.1% in early October to 49.7% in late November. Of note, in 501Y Variant 2 the N501Y mutation co-occurs with several mutations in ORF1a, ORF8, N and the S protein, including two amino acid deletions, Δ69 and Δ70. The rapid spread of 501Y Variant 2 suggests that it may have a transmission advantage over the 501N lineage. Structural studies of the SARS-CoV-2 RBD suggest that 501Y may increase ACE2 binding [1] and that the open-state conformation of the 501Y S protein [2] is associated with more efficient viral entry and infection. Epidemiologically, however, there has been limited assessment to date of whether any of these mutations has changed transmissibility [3]. Here we adopted our previous epidemiological framework for inferring the relative fitness of co-circulating pathogen strains, which has been applied to influenza viruses [4] and the SARS-CoV-2 614D/G strains [5], to characterize the comparative transmissibility of the 501Y Variant 1 and Variant 2 strains.

We downloaded the multiple sequence alignment of complete (and nearly complete) SARS-CoV-2 genomes from the GISAID database (www.gisaid.org) on 14 December 2020. We extracted all viral genomes carrying 501Y in the translated spike protein sequences, and analyzed them with other closely [...] 501Y (without Δ69/Δ70) have been detected in Australia and South Africa circulating during June-July and October-November 2020, respectively.
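The extraction step described above (selecting genomes by the residue at spike position 501 and by the presence of the Δ69/Δ70 deletions) can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that each spike protein sequence has already been translated and aligned to the reference spike so that string column i corresponds to reference residue i, with deleted residues written as `-`. The helper names `classify_spike` and `toy_spike` are hypothetical, and the toy sequences are placeholders rather than real GISAID data.

```python
from collections import Counter

def classify_spike(aligned_spike: str) -> str:
    """Label an aligned spike protein sequence by residue 501 and the 69/70 deletions.

    ASSUMPTION: the sequence is aligned to the reference spike protein, so the
    character at index i-1 is reference residue i and deletions appear as '-'.
    """
    if len(aligned_spike) < 501:
        return "other"  # partial sequence that does not cover position 501
    residue_501 = aligned_spike[500]                  # residue 501 (1-based)
    has_del_69_70 = aligned_spike[68:70] == "--"      # residues 69 and 70 both deleted
    if residue_501 == "N":
        return "501N"  # grouped as 501N whether or not the deletions are present
    if residue_501 == "Y":
        return "501Y Variant 2" if has_del_69_70 else "501Y Variant 1"
    return "other"

def toy_spike(residue_501: str, deletion: bool) -> str:
    """Build a placeholder sequence of spike length (1273 residues); not real data."""
    seq = ["A"] * 1273
    seq[500] = residue_501
    if deletion:
        seq[68] = seq[69] = "-"
    return "".join(seq)

# Tally a small toy batch by strain label.
batch = [toy_spike("N", False), toy_spike("Y", False),
         toy_spike("Y", True), toy_spike("Y", True)]
counts = Counter(classify_spike(s) for s in batch)
print(counts)
```

Tallying such labels by collection date would give the strain-specific sequence counts that a fitness inference needs as input. A real alignment may contain insertions or ambiguous residues, which this sketch does not handle.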
To include more sequences for the study of virus fitness, we extended our search for 501Y Variant 1 and Variant 2 sequences to the latest GISAID dataset, downloaded on 19 December 2020, including both complete genomes and partial ones covering the spike gene. We applied a fitness inference framework to the sequence data collected in the UK between September 29 and November 16, during the cocirculation period of the three strains (see Supplementary Information for details). We assumed that the N501Y mutation and [...] Using confirmed deaths (adjusted for the delay between onset and death [7]) as the proxy for the COVID-19 epidemic curve [8], we estimated that [...] That is, the basic reproductive number of 501Y Variant 1 and Variant 2 was 10% (6-13%) and 75% (70-80%) higher than that of the 501N strain, respectively.

Outside the UK, only a small number of 501Y Variant 2 sequences have been identified (e.g., in Denmark), and it will remain unclear whether they were exported from the UK until more sequence data become available. Although sporadic spread of the 501Y mutation occurred in Wales and elsewhere (e.g., Australia, Spain, and the US), not all variants with 501Y have become prominent. In South Africa, a new variant with 501Y but not [...]

We collated all data from publicly available data sources.
All data included in the analyses are available in the main text or the supplementary materials. We thank colleagues who have shared the SARS-CoV-2 sequences in GISAID (www.gisaid.org). The authors declare no competing interests.

[1] SARS-CoV-2 RBD DMS. 2020.
[2] Modelling conformational state dynamics and its role on infection for SARS-CoV-2 Spike protein variants.
[3] COVID-19 Genomics UK (COG-UK) Consortium. Update on new SARS-CoV-2 variant and how COG-UK tracks emerging mutations.
[4] Monitoring the fitness of antiviral-resistant influenza strains during an epidemic: a mathematical modelling study. The Lancet Infectious Diseases.
[5] Empirical transmission advantage of the D614G mutant strain of SARS-CoV-2. medRxiv.
[6] FastTree 2: approximately maximum-likelihood trees for large alignments.
[7] First-wave COVID-19 transmissibility and severity in China outside Hubei after control measures, and second-wave scenario planning: a modelling impact assessment. The Lancet.
[8] An interactive web-based dashboard to track COVID-19 in real time.

We assumed that the N501Y mutation and the Δ69/Δ70 deletions characterize the three strains 501N, 501Y Variant 1 and 501Y Variant 2, but their differential transmissibility (if any) might be attributable to the combination of N501Y and other mutations, including the Δ69/Δ70 deletions, acquired during the emergence of the 501Y Variant 1 and Variant 2 lineages (Figure 1). For conciseness, we used N, Y1 and Y2 to denote the three strains. We defined the comparative transmissibility of any two strains as the ratio of their basic reproductive numbers. That is, the comparative transmissibility of strains Y1 and Y2 with respect to strain N was $\sigma_1 = R_0^{Y1} / R_0^{N}$ and $\sigma_2 = R_0^{Y2} / R_0^{N}$, respectively, where $R_0^{X}$ denotes the basic reproductive number of strain X.

We formulated a framework to infer $\sigma_1$ and $\sigma_2$ under the following base case assumptions: (1) the three strains co-circulated locally during our study period (September 22 to November 16, 2020); (2) nonpharmaceutical interventions (NPIs) had the same effect on all three strains; (3) the probability that an infected person was selected for viral sequencing did not depend on the infecting strain; (4) recovery from infection with any strain provided protection against reinfection with all strains during our study period; (5) age-specific susceptibility to infection (if any) was the same for all three strains; and (6) after community transmission of strains Y1 and Y2 had been established, the effect of further de novo emergence on their prevalence was negligible.

Under these base case assumptions [4, 5], the next generation matrix (NGM) of infections by strains Y1 [...] were estimated using the following likelihood function: [...]

The statistical inference was performed in a Bayesian framework with non-informative (flat) priors using Markov Chain Monte Carlo.
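To make the inference step concrete, the following is a minimal sketch of Bayesian estimation of a comparative transmissibility with flat priors and a random-walk Metropolis sampler. It is not the authors' NGM-based likelihood, which is not recoverable from the text above. As a simplified stand-in, it assumes two strains with the same fixed generation time, so that the odds of the variant among sequenced infections are multiplied by sigma once per generation, and it uses a binomial likelihood for the sequence counts. The generation time of 5 days, the prior bounds, the proposal step sizes and the data (simulated inside the code) are all assumptions for illustration only.

```python
import math
import random

GEN_TIME = 5.0   # ASSUMPTION: generation time in days (not a value from the paper)
T_REF = 24.5     # reference day: midpoint of the toy sampling days below

def prevalence(sigma, m, day):
    """Variant share on `day`; m is the log-odds of the variant at T_REF."""
    log_odds = m + (day - T_REF) / GEN_TIME * math.log(sigma)
    return 1.0 / (1.0 + math.exp(-log_odds))

def log_posterior(sigma, m, data):
    """Binomial log-likelihood plus flat priors on bounded ranges (assumed bounds)."""
    if not (0.5 < sigma < 3.0 and -10.0 < m < 10.0):
        return -math.inf
    total = 0.0
    for day, n_seq, n_var in data:
        p = prevalence(sigma, m, day)
        total += n_var * math.log(p) + (n_seq - n_var) * math.log(1.0 - p)
    return total

def metropolis(data, n_iter=20000, burn_in=5000, seed=1):
    """Random-walk Metropolis; returns post-burn-in draws of sigma."""
    rng = random.Random(seed)
    sigma, m = 1.0, -2.0                      # arbitrary starting point inside the priors
    lp = log_posterior(sigma, m, data)
    draws = []
    for i in range(n_iter):
        cand_sigma = sigma + rng.gauss(0.0, 0.02)   # symmetric proposals
        cand_m = m + rng.gauss(0.0, 0.06)
        cand_lp = log_posterior(cand_sigma, cand_m, data)
        if rng.random() < math.exp(min(0.0, cand_lp - lp)):
            sigma, m, lp = cand_sigma, cand_m, cand_lp
        if i >= burn_in:
            draws.append(sigma)
    return draws

def simulate(true_sigma, true_m, days, n_seq, seed=0):
    """Simulated (synthetic) sequence counts: (day, sequences, variant sequences)."""
    rng = random.Random(seed)
    data = []
    for day in days:
        p = prevalence(true_sigma, true_m, day)
        n_var = sum(rng.random() < p for _ in range(n_seq))
        data.append((day, n_seq, n_var))
    return data

days = list(range(0, 50, 7))                  # weekly samples on days 0, 7, ..., 49
data = simulate(1.5, -2.6, days, n_seq=500)   # synthetic truth: sigma = 1.5
draws = sorted(metropolis(data))
post_mean = sum(draws) / len(draws)
lo, hi = draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws))]
print(f"sigma: {post_mean:.2f} ({lo:.2f}-{hi:.2f})")
```

In this sketch the posterior for sigma should concentrate near the value used to simulate the data. The framework described above additionally ties strain-specific incidence to an epidemic curve and a next generation matrix, which this simplification omits.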