key: cord-1011267-44p12y2i authors: Li, Xiaoyan; Gao, Xin; Zou, Ming; Zhuang, Zhichao; Tan, Zhaolin; Zheng, Baolu; Yu, Aiping; Su, Xu title: SARS-CoV-2 Genomic Sequencing Revealed N501Y and L452R Mutants of S/A Lineage in Tianjin Municipality, China date: 2021-08-11 journal: Virol Sin DOI: 10.1007/s12250-021-00432-5 sha: 02c728d3bdd2735314504e2a0cb1e90936f95fd3 doc_id: 1011267 cord_uid: 44p12y2i
The Coronavirus Disease 2019 (COVID-19) outbreak, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has become a global pandemic that has resulted in more than 4 million deaths worldwide. In China, the initial transmission of COVID-19 was blocked by strict control strategies and effective treatment of patients. All local COVID-19 outbreaks after April 2020 were related to importation from overseas. Up to February 2021, 207 imported COVID-19 cases had been reported in Tianjin, China (Tianjin Health Commission, 2021).
SARS-CoV-2 is a positive-sense single-stranded RNA virus belonging to the Betacoronavirus genus of the Coronaviridae family. Fourteen open reading frames (ORFs) constitute the majority of the ~29.9-kb SARS-CoV-2 genome, which encodes four structural proteins, namely nucleocapsid protein (N), membrane protein (M), spike protein (S) and envelope protein (E), as well as RNA-dependent RNA polymerase (RdRp) and other non-structural proteins. The intrinsically high error rate of RdRp results in the stochastic introduction of mutations during viral genome replication (Liu et al. 2021). SARS-CoV-2 has evolved into two main lineages according to the genome-based typing method established by China CDC (China CDC Lineage) and the PANGO lineage method (PANGO Lineage) (Rambaut et al. 2020; Wu et al. 2020; Yang and Xu 2021). The S/A-lineage, defined by two nucleotides (8782T, 28144C), mainly circulates in Asia, and the L/B-lineage, defined by 8782C and 28144T, mainly circulates in Europe and America.
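The two-position typing rule described above can be illustrated with a minimal Python sketch. It is not the China CDC or PANGO implementation: the function names are hypothetical, and the genome string is assumed to be already aligned to reference coordinates, so that 1-based positions 8782 and 28144 can be read directly.

```python
def classify_lineage(nt_8782: str, nt_28144: str) -> str:
    """Classify a genome from the nucleotides at positions 8782 and 28144.

    Rule taken from the text: 8782T + 28144C -> S/A-lineage,
    8782C + 28144T -> L/B-lineage. Any other combination is
    reported as "unclassified" (an assumption of this sketch).
    """
    pair = (nt_8782.upper(), nt_28144.upper())
    if pair == ("T", "C"):
        return "S/A"
    if pair == ("C", "T"):
        return "L/B"
    return "unclassified"


def lineage_from_aligned_genome(aligned_genome: str) -> str:
    """Read both marker positions from a genome in reference coordinates.

    Positions in the text are 1-based, Python indices are 0-based.
    """
    return classify_lineage(aligned_genome[8782 - 1], aligned_genome[28144 - 1])
```

A call such as `classify_lineage("T", "C")` returns `"S/A"`. Real typing tools consider many more sites, so this covers only the two defining nucleotides named in the text.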
The B.1.1.7-lineage/501Y.V1 variant showed a rapid increase in its range and incidence, and other variants emerged in succession, namely the B.1.351-lineage/501Y.V2 from South Africa, the P.1-lineage/501Y.V3 from Brazil, and the B.1.207-lineage from Nigeria (Hodcroft et al. 2021; Oude Munnink et al. 2021). In this study, high-throughput whole genome sequencing (WGS) was used to sequence all the imported SARS-CoV-2 samples in order to describe their genome characteristics and help design control strategies.
The study involved 94 throat swabs and one swab collected from the outer packaging of cold-chain food products (sample ID: TJ_20TF_FH158) from March 2020 to February 2021 (Supplementary Table S1). Nucleic acid was extracted using viral RNA extraction kits and instruments (Xi'an Tianlong Science and Technology Co., China). WGS was carried out using the Nextera XT DNA Library Preparation Kit and the MiniSeq System (Illumina, USA). Fifty-six SARS-CoV-2 genomes with coverage higher than 98% and sequencing depth higher than 100× were obtained by de novo assembly.
For nucleotide variation analysis, 309 single nucleotide polymorphisms (SNPs) and 14 deletion/insertion sites were found across all 56 SARS-CoV-2 genomes using Bowtie2 (Supplementary Figure S1). Fewer than ten SNPs were identified in each genome collected from March to June 2020, and more than 17 SNPs were found in genomes collected from October 2020 to February 2021 (Supplementary Table S1). According to China CDC Lineages and PANGO Lineages, C8782T and T28144C were the two specific SNPs of the S/A-lineage. In this study, four sequences fell into the S/A-lineage, and 34 SNPs and 3 deletion/insertion sites were found in these strains. The other 52 sequences all fell into the L/B-lineage.
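The genome inclusion criteria stated above (coverage higher than 98%, with the sequencing depth threshold read here as 100×) could be applied as a simple filter. The sketch below is only illustrative: the `AssemblyStats` record, its field names and the example values are hypothetical and do not come from the study's pipeline or data.

```python
from dataclasses import dataclass


@dataclass
class AssemblyStats:
    """Hypothetical per-sample summary; field names are assumptions."""
    sample_id: str
    coverage_pct: float  # percentage of the reference genome covered
    mean_depth: float    # mean sequencing depth (x)


def passes_qc(stats: AssemblyStats,
              min_coverage_pct: float = 98.0,
              min_depth: float = 100.0) -> bool:
    """Keep genomes with coverage > 98% and depth > 100x (strictly higher)."""
    return stats.coverage_pct > min_coverage_pct and stats.mean_depth > min_depth


def filter_genomes(all_stats: list[AssemblyStats]) -> list[str]:
    """Return the IDs of samples that satisfy both thresholds."""
    return [s.sample_id for s in all_stats if passes_qc(s)]


# Made-up example values, for illustration only.
examples = [
    AssemblyStats("sample_1", coverage_pct=99.6, mean_depth=850.0),
    AssemblyStats("sample_2", coverage_pct=97.2, mean_depth=400.0),
    AssemblyStats("sample_3", coverage_pct=99.9, mean_depth=60.0),
]
kept = filter_genomes(examples)
```

Both comparisons are strict because the text says "higher than"; a pipeline using "at least" thresholds would need `>=` instead.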
Four SNPs (C241T, C3037T, C14408T and A23403G) were detected as the characteristics of the L-lineage European branch (China
For phylogenetic analysis, 1494 high-quality and high-coverage SARS-CoV-2 genomic sequences were downloaded from the Global Initiative on Sharing All Influenza Data (GISAID) (Shu and McCauley 2017) and aligned with MAFFT v7.42, and a phylogenetic tree was built with IQ-TREE v2.1.2. The 895 sequences of the S/A-lineage included 18 strains that all carried the N501Y and L452R mutations of the S protein and could constitute an individual sub-lineage, named the A_501Y lineage, which included TJ_A371, TJ_21TF_FH07 and TJ_A106 (Fig. 1A). Two genomes (EPI_ISL_801441, EPI_ISL_801442) were closely related to TJ_A371, TJ_21TF_FH07 and TJ_A106. The regions of origin of these strains included Belgium, Turkey, Mayotte, Spain and others, and the dates of specimen collection ranged from December 2020 to January 2021. The other 599 sequences of the L/B-lineage included the B.1.1.7/501Y.V1 variant (sample ID: TJ_B449_2), which was the first detection of a major international variant in Tianjin Municipality, in January 2021, and was closely related to two genomes, EPI_ISL_1060597 and EPI_ISL_1018092, collected from France and Ghana, respectively. Four genomes (EPI_ISL_806290, EPI_ISL_428910, EPI_ISL_1383183, EPI_ISL_1111200) were closely related to TJ_20TF_FH158 according to phylogenetic analysis; all fell into the B.1.1.1-lineage and shared four specific SNPs, C4002T, G10097A, C13536T and C23731T. Five other unique SNPs (C1612T, G3606T, C7772T, C25665T and C26600T) were found in the genome of TJ_20TF_FH158.
The S protein comprises the S1 subunit, located at the N terminus, and the S2 subunit, located at the C terminus. The receptor-binding domain (RBD, S protein aa319-541) of the S1 subunit binds to the human angiotensin I converting enzyme 2 (ACE2) receptor, mainly through the receptor-binding motif (RBM, S protein aa438-506), and the S2 subunit mediates fusion of the virus with the host cell (Rathnasinghe et al. 2021).
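The residue ranges given above (RBD aa319-541, RBM aa438-506) make it straightforward to check where a spike substitution falls. The following Python sketch assumes mutations are written in the usual `N501Y` form (reference residue, position, new residue); the function name and the region labels are hypothetical.

```python
import re

# Residue ranges quoted in the text (inclusive, spike protein numbering).
RBD_RANGE = (319, 541)
RBM_RANGE = (438, 506)

# Matches substitutions such as "N501Y": one letter, digits, one letter.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def spike_region(mutation: str) -> str:
    """Return the innermost region containing a spike substitution.

    The RBM lies inside the RBD, so it is tested first.
    """
    match = _SUBSTITUTION.match(mutation.strip().upper())
    if match is None:
        raise ValueError(f"not a simple substitution: {mutation!r}")
    position = int(match.group(2))
    if RBM_RANGE[0] <= position <= RBM_RANGE[1]:
        return "RBM"
    if RBD_RANGE[0] <= position <= RBD_RANGE[1]:
        return "RBD (outside RBM)"
    return "outside RBD"
```

For the two mutations highlighted in this study, `spike_region("N501Y")` and `spike_region("L452R")` both return `"RBM"`. Deletions and insertions are not handled by this sketch and raise `ValueError`.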
The RBD is also a target of SARS-CoV-2 neutralizing antibodies, and mutations in this region may affect the neutralizing titer of antibodies (Massacci et al. 2020). Twenty-nine amino acid (aa) variants and three aa deletion sites of the spike protein were found across all 56 genomes; 22 of these variants were located in the S1 subunit, including four in the RBD region (Fig. 1B). Seven aa variants, L18F, L452R, N501Y, A653V, H655Y, D796Y and G1219V, were identified in TJ_A371, TJ_21TF_FH07 and TJ_A106 of the A_501Y lineage (Fig. 1B). For the B.1.1.7/501Y.V1 variant, 28 characteristic nucleotide variants were identified in the genome of TJ_B449_2, including ten aa variants/deletions of the S gene (Fig. 1B). Six other unique SNPs (C2110T, G2914T, T7984C, G10887A, C14120T and C19390T) and nucleotide deletions at 27792-27794 and 28271 were also found in this genome. Nucleotide deletions at 26160-26167, 27386 and 28248-28253 were found in the genomes of TJ_21TF_FH07 and TJ_A371 (Supplementary Table S2).
The initially low percentage of S/A-lineage sequences increased slightly from December 2020 to the beginning of 2021, and these sequences were discovered in many countries worldwide with several aa mutations of the S protein. In this study, the N501Y and L452R mutations, located in the RBD region of the S protein, were found in three sequences belonging to the A_501Y lineage. L452R may decrease the sensitivity of the virus to neutralizing antibodies. The N501Y mutation may affect the immunogenicity of the virus (Liu et al. 2021; Gu et al. 2020). The N501Y variant of the A lineage has been identified in Turkey, the United Kingdom, France, Denmark, Niger and other countries. The pathogenicity and transmissibility of the A_501Y variant remain unknown.
Two waves of COVID-19 outbreaks emerged in Tianjin Municipality. At the beginning of 2020, the outbreak resulted mainly from small-scale local transmission. In November 2020, an outbreak associated with imported cold-chain products emerged in local communities.
The cumulative number of COVID-19-positive travelers arriving in Tianjin from overseas continues to rise, and the risk of imported COVID-19 outbreaks remains. Sustained SARS-CoV-2 genome sequencing and monitoring of variants can enrich the available SARS-CoV-2 genome data, which is of great significance for implementing prevention and control strategies for COVID-19 outbreaks and for tracing sources of infection (Eden et al. 2020).
COVID-19 CG: tracking SARS-CoV-2 mutations by locations and dates of interest
An emergent clade of SARS-CoV-2 linked to returned travellers from Iran
Adaptation of SARS-CoV-2 in BALB/c mice for testing vaccine efficacy
Emergence in late 2020 of multiple lineages of SARS-CoV-2 Spike protein variants affecting amino acid position 677
The impact of mutations in SARS-CoV-2 spike on viral infectivity and antigenicity
Identification of SARS-CoV-2 spike mutations that attenuate monoclonal and serum antibody neutralization
Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
Design of a companion bioinformatic tool to detect the emergence and geographical distribution of SARS-CoV-2 Spike protein genetic variants
Transmission of SARS-CoV-2 on mink farms between humans and mink and back to humans
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
The N501Y mutation in SARS-CoV-2 spike leads to morbidity in obese and aged mice and is neutralized by convalescent and post-vaccination human sera
GISAID: global initiative on sharing all influenza data - from vision to reality
Tianjin Health Commission (2021) Covid-19 epidemic situation in Tianjin
Genome composition and divergence of the novel coronavirus (2019-nCoV) originating in China
Genomic analysis platforms and typing methods for SARS-CoV-2 genome sequences
Conflict of interest: The authors declare that they have no conflict of interest.
Animal and Human Rights Statement: Additional informed consent was obtained from all patients for whom identifying information is included in this article.