key: cord-0928200-sipy9v3b authors: Chakraborty, Chiranjib; Sharma, Ashish Ranjan; Bhattacharya, Manojit; Agoramoorthy, Govindasamy; Lee, Sang-Soo title: Evolution, Mode of Transmission, and Mutational Landscape of Newly Emerging SARS-CoV-2 Variants date: 2021-08-31 journal: mBio DOI: 10.1128/mbio.01140-21 sha: e2c202e6875e4653155b3e1662607f540e4ed4aa doc_id: 928200 cord_uid: sipy9v3b

The recent emergence of multiple variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has become a significant concern for public health worldwide. New variants have been classified as either variants of concern (VOCs) or variants of interest (VOIs) by the CDC (USA) and WHO. The VOCs include lineages such as B.1.1.7 (20I/501Y.V1 variant), P.1 (20J/501Y.V3 variant), B.1.351 (20H/501Y.V2 variant), and B.1.617.2. In contrast, the VOI category includes B.1.525, B.1.526, P.2, and B.1.427/B.1.429. The WHO issued an alert for the last two variants (P.2 and B.1.427/B.1.429) and labeled them for further monitoring. According to the WHO, these variants can be reclassified depending on their status at a particular time. The CDC (USA), however, has so far continued to classify these two variants as VOIs. This article analyzes the evolutionary patterns of all these emerging variants, as well as their geographical distributions and transmission patterns, including the circulating frequency, entropy diversity, and mutational event diversity throughout the genomes of all SARS-CoV-2 lineages. The highest transmission was observed in the B.1.1.7 lineage. Our frequency evaluation found that this lineage achieved 100% frequency in early October 2020. We also critically evaluated the mutational landscape of these emerging variants and the significant spike protein mutations (E484K, K417T/N, N501Y, and D614G) impacting public health. Finally, the effectiveness of vaccines against newly emerging SARS-CoV-2 variants was also analyzed.
emergence of significant variants affecting public health and producing mutated viral antigens. These mutated antigens may affect vaccine-generated antibodies and thus protection. Here, we analyze critical questions about the SARS-CoV-2 variants in three sections. A summary of the study methodology is provided as a flowchart in Fig. 1. In the first section, we studied the evolutionary patterns of the newly emerging variants, their geographical distributions and transmission patterns, circulating frequencies, entropy diversity, and mutational event diversity throughout the genomes of all SARS-CoV-2 lineages. In the second section, we critically evaluated the mutational landscape of all of the above emerging variants and significant spike protein mutations. We also assessed the significant mutations (E484K, K417T/N, N501Y, and D614G) in emerging variants that may concern public health. Finally, in the third section, we analyzed the efficacy of vaccines against the new SARS-CoV-2 variants. Our analysis will help guide future pandemic control, next-generation vaccine development using alternative viral antigens, and the planning of effective vaccination programs.

Likewise, we depict the phylogenetic tree of lineage P.2; 41 genome samples of the P.2 lineage, collected between December 2020 and June 2021, were observed among the 3,905 genomes (Fig. 4C). The phylogenetic tree suggests that the P.2 lineage has acquired some important mutations since the middle of December 2020. Next, we also created a phylogenetic tree of lineage B.

ones. This regression model shows a heteroscedasticity pattern. We next developed scatterplots using 1,182 genomes of the B.

Fig. 11.

The circulating frequency of the B.1.1.7 lineage is shown in Fig. 12A. The circulating frequency indicated that the lineage originated in September and achieved 100% frequency in early October 2020. We next analyzed the circulating frequency of the P.1 lineage and recorded it in Fig. 12B.
The data indicated that this variant achieved 100% frequency in the middle of October 2020. The circulating frequency of the B.1.351 lineage is represented in Fig. 12C. The lineage achieved 100% frequency between June and July 2020. Similarly, we evaluated the circulating frequency of the B.1.617.2 lineage, as shown in Fig. 12D. The lineage achieved 100% frequency during the last week of January 2021 and early February 2021. Furthermore, the circulating frequency of the B.1.525 lineage is shown in Fig. 13A. It achieved 100% frequency during November 2020. We also mapped the circulating frequency of the B.1.526 lineage (Fig. 13B). This variant achieved 100% frequency during the first week of December 2020. Finally, we plotted the circulating frequency pattern of the P.2 lineage (Fig. 13C). The frequency was recorded as 100% at the end of September 2020 or early October 2020. The circulating frequency of lineage B.1.427/B.1.429 is shown in Fig. 13D. The lineage achieved 100% frequency during early December 2020.

Entropy diversity is a measure of the pattern of mutational changes at a particular position in the genome. It also helps us to understand the tendency of amino acids to swap from wild type to mutant (16). Similarly, mutational event diversity, or mutational profiling, informs us about the mutational events throughout the genome or at a specific position. Such studies can assist us in understanding the mechanisms that drive SARS-CoV-2 evolution (17). Thus, we have depicted the entropy diversity and mutational event pattern throughout the genomes of all circulating lineages in a single frame (Fig. 14). Based on the entropy pattern and mutational event points throughout the genome of the B.1.1.7 lineage (Fig. 15A), the maximum entropy was 0.8, and at two positions in ORF1b the entropy was close to 0.6.
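The per-position entropy described above can be illustrated with a minimal sketch. The text does not state the formula used for the entropy plots, so this example assumes Shannon entropy with the natural logarithm computed over the residues observed at each position of an alignment. The function names and the toy alignment are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (natural log, an assumption) of one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    # Adding 0.0 turns a possible -0.0 into 0.0 for fully conserved columns.
    return -sum((n / total) * math.log(n / total) for n in counts.values()) + 0.0


def entropy_profile(sequences):
    """Per-position entropy for aligned sequences of equal length."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # zip(*sequences) iterates over alignment columns.
    return [site_entropy(column) for column in zip(*sequences)]


# Toy three-residue alignment (hypothetical, not real SARS-CoV-2 data).
toy_alignment = ["NEK", "NEK", "YKK", "YEK"]
profile = entropy_profile(toy_alignment)
for position, value in enumerate(profile, start=1):
    print(f"position {position}: entropy = {value:.3f}")
```

Under this assumption, a fully conserved position has entropy 0, and a position split evenly between two residues has entropy ln 2 (about 0.69), which gives a sense of scale for maxima in the range of 0.4 to 0.8.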
The entropy diversity and event diversity points throughout the genome of the P.1 lineage are depicted in Fig. 15D. In this case also, the maximum entropy was about 0.8, but at different positions (nine positions in ORF1a, one position in ORF1b, one position in S-protein, and one position in N-protein). We have also depicted the entropy diversity and mutational event pattern throughout the genome of the B.1.525 lineage (Fig. 16A). The maximum entropy was about 0.4 in ORF3a. At two positions in ORF1a, the entropy observed was about 0.3. We also mapped the entropy diversity and mutational events throughout the genome of the B.1.526 lineage (Fig. 16B). In this case, the maximum entropy was about 0.6 at different positions. Furthermore, we plotted the entropy diversity and mutational event pattern throughout the genome of the P.2 lineage (Fig. 16C). In this case, the maximum entropy was about 0.6 at four different positions in ORF1a. Finally, we recorded the entropy diversity and mutational event plots throughout the genome of lineage B.1.427/B.1.429 (Fig. 16D). Here, the maximum entropy was about 0.6 at five different positions (two positions in ORF1a, one position in ORF1b, and one position in N-protein).

Viral mutational landscapes of all major VOCs and VOIs, their significant mutations, and significant mutations in spike protein. The mutational landscapes of all emerging variants (VOC and VOI), as reported by the CDC and WHO, are shown in Tables 2 and 3. The mutational landscapes and the significant mutational positions are described through schematic illustrations for all major VOCs, namely lineages B.1.1.7 (Fig. 17A), P.1 (Fig. 17B), B.1.351 (Fig. 17C), and B.1.617.2 (Fig. 17D).
Likewise, the mutational landscapes and the significant mutational positions are described through schematic illustrations for all major VOIs, namely lineages B.1.525 (Fig. 18A), B.1.526 (Fig. 18B), P.2 (Fig. 18C), and B.1.427/B.1.429 (Fig. 18D). The significant mutational positions in the S-protein are illustrated through diagrammatic representations of lineages B.1.1.7 (Fig. 19A and B), P.1 (Fig. 19C), B.1.351 (Fig. 19D), B.1.617.2 (Fig. 19E), B.1.525 (Fig. 20A), B.1.526 (Fig. 20B), P.2 (Fig. 20C), and B.1.427/B.1.429 (Fig. 20D). Three significant mutations have been reported in the P.1 lineage, all located in the spike receptor binding domain (RBD): E484K, K417T/N, and N501Y (7, 18). It has also been reported that the D614G mutation can increase the capability to spread compared to the wild type (10, 18). Another mutation (the nonsynonymous mutation P681H) was observed in the S protein of the B.1.1.7 lineage.

Some significant mutations (E484K, K417T/N, N501Y, and D614G) found in emerging variants and their structural landscapes. We performed structural landscape analysis of some significant mutations, namely E484K, K417T/N, N501Y, and D614G, which are frequently reported in emerging variants. We first analyzed the E484K mutation. Here, the structure changes due to the replacement Glu484→Lys. The structural analysis of E484K is shown in different forms: the interaction abilities of the wild-type residues (Fig. 21A), the interaction abilities of the mutant protein structure (Fig. 21B), and a snapshot taken during the toggle of the molecular interaction, included to show the interactions of the wild-type (dashed lines) and mutant (straight lines) residues (Fig. 21C). The wild-type amino acid configuration shows the interaction of Glu484 with Gly482.
In contrast, the mutant amino acid configuration shows the interaction of Lys484 with Phe486. We next analyzed the K417T mutation, in which the structure is altered due to the replacement Lys417→Thr. The structural analysis of K417T is shown in different forms: interaction abilities of the wild-type residues (Fig. 22A), interaction abilities of the mutant residues (Fig. 22B), and interactions of the wild-type with the mutant residues (Fig. 22C). The wild-type amino acid configuration shows the interaction of Lys417 with Leu455, Tyr421, and Asn370. In contrast, the mutant amino acid configuration shows the interaction of Thr417 with Tyr421, Leu455, and Asp420.

Furthermore, we analyzed the K417N mutation. Here, the structure is changed due to the replacement Lys417→Asn. The structural analysis of K417N is shown in different forms: interaction abilities of the wild-type residues (Fig. 23A), interaction abilities of the mutant residues (Fig. 23B), and interactions of the wild-type and mutant residues (Fig. 23C). The wild-type amino acid configuration shows the interaction of Lys417 with Leu455, Tyr421, and Asn370. In contrast, the mutant amino acid configuration shows the interaction of Asn417 with Tyr421, Leu455, and Asp420.

We also analyzed the N501Y mutation. Here, the structure is altered due to the replacement Asn501→Tyr. The structural analysis of N501Y is shown in different forms: interaction abilities of the wild-type residues (Fig. 24A), interaction abilities of the mutant residues (Fig. 24B), and interactions of the wild-type and mutant residues (Fig. 24C). The wild-type amino acid configuration shows the interaction of Asn501 with Gln506 and Pro499. In contrast, the mutant amino acid configuration shows the interaction of Tyr501 with Gln498 and Gln506.
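The wild-type and mutant contacts listed above can be compared side by side with a short sketch. The contact sets below are transcribed from the text for E484K, K417T, K417N, and N501Y; the helper names (`parse_mutation`, `compare_contacts`) and the data layout are hypothetical and are not part of the COVID-3D resource or of the authors' workflow.

```python
import re

# One-letter to three-letter codes for the residues involved here only.
THREE_LETTER = {"E": "Glu", "K": "Lys", "T": "Thr", "N": "Asn",
                "Y": "Tyr", "D": "Asp", "G": "Gly"}

# (wild-type contacts, mutant contacts), as reported in the text above.
CONTACTS = {
    "E484K": ({"Gly482"}, {"Phe486"}),
    "K417T": ({"Leu455", "Tyr421", "Asn370"}, {"Tyr421", "Leu455", "Asp420"}),
    "K417N": ({"Leu455", "Tyr421", "Asn370"}, {"Tyr421", "Leu455", "Asp420"}),
    "N501Y": ({"Gln506", "Pro499"}, {"Gln498", "Gln506"}),
}


def parse_mutation(label):
    """Split a label such as 'E484K' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild, position, mutant = match.groups()
    return THREE_LETTER[wild], int(position), THREE_LETTER[mutant]


def compare_contacts(wild, mutant):
    """Return contacts lost, gained, and kept after the substitution."""
    return {
        "lost": sorted(wild - mutant),
        "gained": sorted(mutant - wild),
        "kept": sorted(wild & mutant),
    }


for label, (wild, mutant) in CONTACTS.items():
    wt_residue, position, mt_residue = parse_mutation(label)
    summary = compare_contacts(wild, mutant)
    print(f"{wt_residue}{position}->{mt_residue}: {summary}")
```

Read this way, both substitutions at position 417 keep the Leu455 and Tyr421 contacts while exchanging Asn370 for Asp420, and N501Y keeps Gln506 while exchanging Pro499 for Gln498.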
The wild-type and mutant residues also show some other interactions. Finally, the D614G mutation was analyzed. Here, the structure is changed due to the replacement Asp614→Gly. The structural analysis of D614G is shown in different forms: interaction abilities of the wild-type residues (Fig. 25A), interaction abilities of the mutant residues (Fig. 25B), and interactions of the wild-type and mutant residues (Fig. 25C). The wild-type amino acid configuration shows the interaction of Asp614 with Ala647. The mutant amino acid configuration shows the interaction of Gly614 with Ala647. The interaction pattern between the residues of the wild type and mutant therefore did not show any changes.

The new SARS-CoV-2 variants and efficacy of vaccines against these variants. We have tried to understand the new SARS-CoV-2 variants and their recognition by neutralizing antibodies produced by different vaccines. The data were obtained from the available literature (Table 4). The effectiveness of vaccines against newly emerging SARS-CoV-2 variants, as noted in different reports in the literature, is summarized in Table 5. These two tables help us to evaluate the reduction in vaccine efficacy caused by the newly emerged virus variants.

The first section of the data analysis depicts the current scenario of the evolution of emerging variants of SARS-CoV-2, the clusters of genome samples and their divergence, and the geographical distribution and transmission pattern of the variants. Presently, the transmission of the virus has become uncontrolled in different parts of the world. According to published reports, some variants have shown higher transmission potential, such as the B.1.1.7 lineage (19). Similarly, we have found a higher transmission pattern in the B.1.1.7 lineage.
Therefore, the transmission pattern analyzed for these variants is highly significant at present, when transmission continues to increase in several countries. We have also described the circulating frequencies of all the newly emerging lineages. From the analysis, we found that 100% frequency was achieved by lineage B.1.1.7 in early October 2020. We also found entropy diversity and mutational event diversity throughout the genomes of all newly emerging lineages. We have critically evaluated the mutational landscape of all the above emerging variants. The data analysis has shown significant mutations throughout the genomes of all newly emerging lineages. The important mutations were recorded in the spike protein, especially in the RBD, of all newly emerging lineages. There is already recorded evidence that mutations in the spike protein, especially in the RBD, can influence the transmission pattern of this virus (10). We have also evaluated the structural landscape of several significant mutations (E484K, K417T/N, N501Y, and D614G) in the emerging variants. The structural analysis will help us understand the structural contacts of the wild-type protein, the structural contacts of the mutant protein, and the wild-type and mutant interactions, assisting in understanding the mutational landscape. These significant mutations are associated with higher infectivity in most emerging lineages and with death tolls (7, 10, 18). It is known that the E484K mutation affects neutralization by convalescent-phase sera or monoclonal antibodies (MAbs) (19, 20). Similarly, the combination of the K417N and N501Y mutations affects neutralization by MAbs and convalescent-phase sera (19-21). Therefore, our analysis is consistent with this observed phenomenon and is noteworthy at present. Finally, we analyzed the vaccines' efficacy against the new variants of this virus and compared it with the vaccines' reported effectiveness.
We found a reduction in vaccine efficacy for the new variants. However, the efficacy of all approved vaccines has not been analyzed against all newly emerging SARS-CoV-2 variants. Therefore, further evaluation is urgently needed. Furthermore, the year 2021 will be more challenging due to the emergence of numerous variants. Therefore, we raise the following queries. How will the recently emerged variants affect people's health? Will the COVID-19 vaccines protect people against the recently emerged variants? Can the COVID-19 vaccination program be successfully implemented across the developing world?

Conclusion. The emergence of several new lineages of SARS-CoV-2 due to viral mutation is crucial as vaccination programs have started throughout the globe. Scientists have started research to protect against all new significant mutant variants. They have already designed next-generation vaccines against this pandemic virus using different epitopes from all important mutant variants, including the Wuhan variant (22). However, our analysis will help to design future countrywide pandemic planning, focusing on emerging variants, as well as next-generation vaccine development using alternative wild-type antigens and significant viral antigens, and immediate planning for ongoing vaccination programs worldwide.

The socio-economic implications of the coronavirus and COVID-19 pandemic: a review
Race for a COVID-19 vaccine
The COVID-19 vaccine race: challenges and opportunities in vaccine formulation
Vaccine designers take first shots at COVID-19
A comprehensive review of the global efforts on COVID-19 vaccine development
What does 95% COVID-19 vaccine efficacy really mean?
SARS-CoV-2 variants and ending the COVID-19 pandemic
Variant analysis of SARS-CoV-2 genomes
Genetic variants of SARS-CoV-2-what do they mean?
Tracking changes in SARS-CoV-2 spike: evidence that D614G increases infectivity of the COVID-19 virus
Present variants of concern (VOC) and variants of interest (VOI) of SARS-CoV-2: their significant mutations in S-glycoprotein, infectivity, re-infectivity, immune escape, and vaccines activity
New SARS-CoV-2 variants-clinical, public health, and vaccine implications
SARS-CoV-2 Brazil variant in Latin America: more serious research urgently needed on public health and vaccine protection
SARS-CoV-2 vaccines and the growing threat of viral variants
SARS-CoV-2 vaccines: a triumph of science and collaboration
New pathways of mutational change in SARS-CoV-2 proteomes involve regions of intrinsic disorder important for virus replication and release
The mutation profile of SARS-CoV-2 is primarily shaped by the host antiviral defense
New SARS-CoV-2 variants challenge vaccines protection
COVID-19 Genomics UK (COG-UK) Consortium. 2021. Sensitivity of SARS-CoV-2 B.1.1.7 to mRNA vaccine-elicited antibodies
Comprehensive mapping of mutations to the SARS-CoV-2 receptor-binding domain that affect recognition by polyclonal human serum antibodies
Escape from neutralizing antibodies by SARS-CoV-2 spike protein variants
A next-generation vaccine candidate using alternative epitopes to protect against Wuhan and all significant mutant variants of SARS-CoV-2: an immunoinformatics approach
WHO. 2020. SARS-CoV-2 variants
CDC. 2021. SARS-CoV-2 variant classifications and definitions
Bibliometric analysis of coronavirus disease (COVID-19) literature published in Web of Science 2019-2020
How do we share data in COVID-19 research? A systematic review of COVID-19 datasets in PubMed Central articles
Coronavirus variants and mutations
Nextstrain: real-time tracking of pathogen evolution
Nextstrain SARS-CoV-2 resources
PANGO lineages
Exploring the structural distribution of genetic variation in SARS-CoV-2 with the COVID-3D online resource
Author correction: exploring the structural distribution of genetic variation in SARS-CoV-2 with the COVID-3D online resource
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
ENSEMBLE Study Group. 2021. Safety and efficacy of single-dose Ad26.COV2.S vaccine against Covid-19
C4591001 Clinical Trial Group. 2020. Safety and efficacy of the BNT162b2 mRNA Covid-19 vaccine
Efficacy and safety of the mRNA-1273 SARS-CoV-2 vaccine
Sputnik V COVID-19 vaccine candidate appears safe and effective
Safety and efficacy of the ChAdOx1 nCoV-19 vaccine (AZD1222) against SARS-CoV-2: an interim analysis of four randomised controlled trials in Brazil, South Africa, and the UK
Oxford-Cardiff COVID-19 Literature Consortium. 2021. Overview of approved and upcoming vaccines for SARS-CoV-2: a living review
WHO approval of Chinese CoronaVac COVID vaccine will be crucial to curbing pandemic
Emerging SARS-CoV-2 variants and impact in global vaccination programs against SARS-CoV-2/COVID-19

Data collection. We retrieved data sets for new SARS-CoV-2 variants from the WHO (23) and CDC (24). We searched databases such as Web of Science (25), PubMed (26, 27), and Google Scholar. For the database search, we used keywords such as "SARS-CoV-2 variants," "variants of consequence," "VOI," "VOC," and "variants and vaccines." We also searched for the new variants with keywords such as "B.
We also collected data from other sources, such as The New York Times (coronavirus-variant-tracker) (28) and various other resources. For further data collection, we used several databases and servers, such as Nextstrain (SARS-CoV-2 resources) (28, 29), Pango lineages (30), Pango lineages on GitHub (31), and COVID-3D (32, 33). Nextstrain uses data from GISAID. We retrieved and analyzed the data from the Nextstrain server in April 2021. We followed the nomenclature for lineages of this virus proposed by Rambaut et al. (34).

Data analysis and interpretation. For data analysis, we used several servers, such as Nextstrain (SARS-CoV-2 resources) (26, 27), Pango lineages (30), Pango lineages on GitHub (31), and COVID-3D (32). We also used COVID-3D for the structural analysis of significant mutations (E484K, K417T/N, N501Y, and D614G) in emerging variants (18). The study methodology is summarized as a flowchart in Fig. 1.

This study was supported by Hallym University Research Fund and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2020R1C1C1008694 and NRF-2020R1I1A3074575).

We used different webservers and databases (e.g., Nextstrain SARS-CoV-2 resources, Pango lineages, GitHub, COVID-3D, WHO, CDC, Web of Science, PubMed, and Google Scholar) in this study. We are thankful to the researchers who developed these webservers and databases.

The authors have no conflict of interests to declare.