key: cord-0876028-3aytaksh
authors: Sarkar, R.; Lo, M.; Saha, R.; Dutta, S.; Chawla-Sarkar, M.
title: S glycoprotein diversity of the Omicron Variant
date: 2021-12-05
journal: nan
DOI: 10.1101/2021.12.04.21267284
sha: c9134cbc086c78c54ed04f4a029dd49a6ebc24a5
doc_id: 876028
cord_uid: 3aytaksh

Against the backdrop of ongoing Delta variant infection and vaccine-induced immunity, the emergence of a new Variant of Concern, Omicron, has again fuelled fears of COVID-19 around the world. Currently, very little information is available about the S glycoprotein mutations, transmissibility, severity, and immune evasion behaviour of the Omicron variant. In the present study, we performed a comprehensive analysis of the S glycoprotein mutations of 309 strains of the Omicron variant and discuss the probable effects of the observed mutations on several aspects of virus biology, based on available data on how such mutations affect S glycoprotein structure, function, and immune evasion.

The increased transmissibility and high mutation rate of SARS-CoV-2, the causative agent of COVID-19, have led to the emergence of multiple variants of concern (VOCs), characterized by genetic changes known to affect virus characteristics such as transmissibility, disease severity, immune escape, and diagnostic or therapeutic escape. The SARS-CoV-2 genome codes for multiple proteins, including the spike (S) glycoprotein that protrudes from the virus envelope [3]. The S glycoprotein plays a crucial role in the very early stages of the virus life cycle, which include virus attachment to the host cell surface, membrane fusion, and entry into the host cell [3-10]. As a surface protein, the S glycoprotein is the primary target of neutralizing antibodies elicited by the host adaptive immune response [4, 11-14].
In the constant tug of war between the host and the virus, virus strains with S glycoprotein mutations that facilitate virus entry and/or help the virus escape neutralizing antibodies are frequently selected and eventually predominate. In light of the crucial role of the S glycoprotein in virus infection and host immune evasion, scientists have prioritized mutations that have emerged within the S glycoprotein of circulating SARS-CoV-2 strains and investigated their biological significance [15, 16]. In the present study, we performed a comprehensive analysis of the S glycoprotein mutations of the Omicron variant and classified the strains into groups based on their combinations of coexisting S glycoprotein mutations.

To retrieve SARS-CoV-2 genome sequences of the Omicron variant, we accessed the Global Initiative on Sharing All Influenza Data (GISAID) on 2 December 2021 [17]. Applying the Variant filter (VOC Omicron GR/484A) showed that a total of 309 genome sequences of the Omicron variant had been submitted. We downloaded all of these genome sequences from GISAID for further analysis. The names of all the Omicron variant sequences are presented in Table S1. The genome sequence of the prototype SARS-CoV-2 strain hCoV-19/Wuhan/WIV04/2019 (GISAID accession no. EPI_ISL_402124) was also downloaded from the GISAID database for the mutational analysis. For this analysis, the S glycoprotein coding regions of the 309 Omicron genome sequences and of the prototype genome (hCoV-19/Wuhan/WIV04/2019) were translated into amino acid sequences using the TRANSEQ nucleotide-to-protein sequence conversion tool (EMBL-EBI, Cambridgeshire, UK).
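The translation step described above can be illustrated with a minimal Python sketch. This is not TRANSEQ itself and is not the authors' pipeline; it only shows what a nucleotide-to-protein conversion does, assuming the standard genetic code, a coding sequence that already starts in frame, and a hypothetical helper named `translate`. Codons containing ambiguous bases are reported as `X`.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence into a protein string.

    Hypothetical helper for illustration: translation stops at the first
    stop codon, and any codon not in the table (for example one containing
    N) is written as 'X'.
    """
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# The first seven codons of a toy spike-like coding sequence.
print(translate("ATGTTTGTTTTTCTTGTTTTA"))
```

In practice the S coding region would first have to be located in each genome, which this sketch does not attempt.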
Next, the S glycoprotein sequences of all the SARS-CoV-2 strains, including the prototype strain and the Omicron variants, were aligned using MEGA software (Version X) and examined for amino acid substitutions in the S protein of the Omicron variants compared with the prototype strain [18]. Each amino acid substitution observed in the S glycoprotein of the Omicron variant was numbered according to its position relative to the first amino acid of the S glycoprotein of the prototype strain.

The phylogenetic analysis of 155 sequences of the Omicron variant was performed with Ultrafast Sample placement on Existing tRees (UShER), which is integrated into the UCSC SARS-CoV-2 Genome Browser [19]. We accessed the UCSC SARS-CoV-2 Genome Browser (https://genome.ucsc.edu/cgi-bin/hgPhyloPlace) and uploaded the sequence names of the 155 sequences (Table S2) to construct the phylogenetic tree. UShER is a program that rapidly places new samples onto an existing phylogeny using maximum parsimony. It is particularly helpful for understanding the relationships of newly sequenced SARS-CoV-2 genomes with each other and with previously sequenced genomes in a global phylogeny.

were 62 sequences submitted from 11 different countries (United Kingdom, N=18; Portugal, N=13; Netherlands, N=12; Austria, N=5; Germany, N=5; Italy, N=3; Belgium, N=1; Czech Republic, N=1; Spain, N=1; Sweden, N=1; Ireland, N=1) in Europe. Among the remaining 24 sequences, 9 were submitted from three Asian countries (Hong Kong, N=6; Japan, N=2; Israel, N=1), 9 from Australia (Oceania), 3 from Canada (North America), and 3 from Brazil (South America). Mutational analysis of the S glycoprotein of the Omicron variant revealed 37 dominant mutations, which range in frequency from 59% to 100% (Table 2).
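The comparison against the prototype strain and the per-mutation frequencies reported above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the MEGA workflow used in the study: it assumes pairwise-aligned protein strings with `-` as the gap character, and the function names `call_mutations` and `mutation_frequencies` are hypothetical. Positions are counted along the prototype sequence, deletions are written with the ∆ prefix, and insertions are named after the last prototype residue that precedes them (as in ins214EPE).

```python
from collections import Counter

DELETION_PREFIX = "∆"  # same symbol the text uses for deletions


def call_mutations(ref_aln: str, qry_aln: str) -> list[str]:
    """Name differences between an aligned prototype and query protein.

    Hypothetical helper: both inputs must have equal length and use '-'
    for alignment gaps. Numbering follows the prototype sequence.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    mutations = []
    pos = 0  # number of prototype residues seen so far
    i, n = 0, len(ref_aln)
    while i < n:
        ref_res, qry_res = ref_aln[i], qry_aln[i]
        if ref_res == "-":
            # Gap in the prototype: collect the whole inserted run.
            j = i
            while j < n and ref_aln[j] == "-":
                j += 1
            inserted = qry_aln[i:j].replace("-", "")
            if inserted:
                mutations.append(f"ins{pos}{inserted}")
            i = j
            continue
        pos += 1
        if qry_res == "-":
            mutations.append(f"{DELETION_PREFIX}{ref_res}{pos}")
        elif qry_res != ref_res and qry_res != "X":
            mutations.append(f"{ref_res}{pos}{qry_res}")
        i += 1
    return mutations


def mutation_frequencies(per_strain: list[list[str]]) -> dict[str, float]:
    """Percentage of strains carrying each mutation."""
    counts = Counter(m for muts in per_strain for m in set(muts))
    total = len(per_strain)
    return {m: 100 * c / total for m, c in counts.items()}


# Toy alignment, not real spike sequence.
print(call_mutations("ACDE--FGH", "AC-KEPFGY"))
```

Where a deletion sits next to a substitution, the names produced depend on how the aligner places the gap, so calls such as ∆N211 and L212I reflect one particular alignment.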
The S1 subunit of the S glycoprotein contains 29 different mutations: 11 mutations (A67V, ∆H69, ∆V70, T95I, G142D, ∆V143, ∆Y144, ∆Y145, ∆N211, L212I and ins214EPE) within the N-terminal domain (NTD), 15 mutations (G339D, S371L, S373P, S375F, K417N, N440K, G446S, S477N, T478K, E484A, Q493R, G496S, Q498R, N501Y and Y505H) within the receptor-binding domain (RBD), and 3 mutations (T547K, D614G and H655Y) at the C-terminus of the S1 subunit. Interestingly, among the 15 RBD mutations, 10 (N440K, G446S, S477N, T478K, E484A, Q493R, G496S, Q498R, N501Y and Y505H) were observed within the receptor-binding motif (RBM). The S2 subunit of the S glycoprotein was found to have 8 mutations, of which 3 (N679K, P681H and N764K) were present at the N-terminus of the S2 subunit, D796Y was present within the fusion peptide (FP), 3 (Q954H, N969K, and L981F) were present within heptad repeat sequence 1 (HR1), and N856K was found in the region between FP and HR1 (Figure 1). Comparison with the S glycoprotein mutations of the four VOCs Alpha, Beta, Gamma, and Delta showed that Omicron contains 25 unique mutations (A67V, ∆V143, ∆N211, L212I, ins214EPE, G339D, S371L, S373P, S375F, N440K, G446S, S477N, E484A, Q493R, G496S, Q498R, Y505H, T547K, N679K, N764K, D796Y, N856K, Q954H, N969K, and L981F), whereas 12 mutations (∆H69, ∆V70, T95I, G142D, ∆Y144, ∆Y145, K417N, T478K, N501Y, D614G, H655Y, and P681H) are shared with the other four VOCs (Figure 1). Based on coexisting mutations, we also classified the 309 strains of the Omicron variant into 60 different groups, each representing a different set of coexisting S glycoprotein mutations (Table 3).

The presence of more than 35 mutations in the S glycoprotein of the Omicron variant, especially in the NTD and RBD of the S1 subunit, has again fuelled fears of COVID-19 around the world.
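The domain tallies and the unique-versus-shared split reported for the 37 dominant mutations can be checked with simple set arithmetic. The Python sketch below uses the mutation lists exactly as given in the text; the grouping helper `group_by_profile` and the strain names in the usage example are hypothetical and only illustrate how strains with identical sets of coexisting mutations could be collected into groups.

```python
from collections import defaultdict

# Mutation lists as reported in the text.
NTD = ["A67V", "∆H69", "∆V70", "T95I", "G142D", "∆V143", "∆Y144",
       "∆Y145", "∆N211", "L212I", "ins214EPE"]
RBD = ["G339D", "S371L", "S373P", "S375F", "K417N", "N440K", "G446S",
       "S477N", "T478K", "E484A", "Q493R", "G496S", "Q498R", "N501Y",
       "Y505H"]
S1_CTERM = ["T547K", "D614G", "H655Y"]
S2 = ["N679K", "P681H", "N764K", "D796Y", "N856K", "Q954H", "N969K",
      "L981F"]

# Mutations reported as shared with Alpha, Beta, Gamma, or Delta.
SHARED = {"∆H69", "∆V70", "T95I", "G142D", "∆Y144", "∆Y145", "K417N",
          "T478K", "N501Y", "D614G", "H655Y", "P681H"}

ALL_MUTATIONS = NTD + RBD + S1_CTERM + S2
UNIQUE = [m for m in ALL_MUTATIONS if m not in SHARED]


def group_by_profile(strains: dict[str, set[str]]) -> dict[frozenset, list[str]]:
    """Collect strain names that carry exactly the same mutation set.

    Hypothetical helper: each distinct set of coexisting mutations
    becomes one group.
    """
    groups = defaultdict(list)
    for name, mutations in strains.items():
        groups[frozenset(mutations)].append(name)
    return dict(groups)


print(len(NTD) + len(RBD) + len(S1_CTERM), "S1 mutations;", len(S2), "S2 mutations")
print(len(UNIQUE), "unique;", len(SHARED), "shared")

# Usage with made-up strain names and profiles.
toy = {
    "strain_a": {"N501Y", "D614G"},
    "strain_b": {"D614G", "N501Y"},
    "strain_c": {"D614G"},
}
print(len(group_by_profile(toy)), "groups")
```

Using a `frozenset` as the group key makes the grouping independent of the order in which mutations are listed for a strain.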
The S glycoprotein, which mediates viral attachment to the ACE2 receptor and entry into the host cell, is subdivided into two functional subunits, S1 and S2, which remain non-covalently associated after being cleaved by furin during synthesis [3, 4]. The RBD and NTD are the two crucial domains of the S1 subunit, responsible for interacting with the host cell receptor (ACE2) and recognizing various attachment factors, respectively [3-9]. The fusion mechanism is housed in the S2 subunit, which undergoes large-scale conformational changes to force fusion of the virus and host membranes, allowing genome delivery and initiation of infection [10]. As the RBD is immunodominant and also required for ACE2 attachment, any mutation in the RBD could affect both the neutralization efficacy of antibodies generated in convalescent and vaccinated individuals and the binding affinity of the S glycoprotein for the ACE2 receptor [4, 11-14]. There are 37 dominant mutations in the S glycoprotein of the Omicron variant, raising concerns about whether it is more infectious or pathogenic than the other four VOCs and whether it can evade natural or vaccine-induced immunity. Despite the lack of definitive immunological and clinical data, we can provide preliminary indications of the pathogenicity, transmissibility, and immune evasion capabilities of the Omicron variant based on the known impacts of previously identified mutations. Twelve mutations of the Omicron variant (∆H69, ∆V70, T95I, G142D, ∆Y144, ∆Y145, K417N, T478K, N501Y, D614G, H655Y, and P681H) overlap with those in Alpha, Beta, Gamma, and Delta. All of these mutations have previously been linked with high transmissibility, increased viral binding affinity, and immune evasion [15, 20-22]. Higher transmissibility and increased immune evasion are anticipated for the Omicron variant if these overlapping VOC mutations maintain their known effects.
The functional implications of the remaining 25 mutations (A67V, ∆V143, ∆N211, L212I, ins214EPE, G339D, S371L, S373P, S375F, N440K, G446S, S477N, E484A, Q493R, G496S, Q498R, Y505H, T547K, N679K, N764K, D796Y, N856K, Q954H, N969K, and L981F) of the Omicron variant are unknown, leaving many questions about how the whole set of mutations may affect viral fitness and vulnerability to natural and vaccine-mediated immunity. Several epitope mapping and antibody footprinting studies have shown that serum neutralizing antibodies of infected and vaccinated individuals mainly target the RBD of the S glycoprotein [11, 23-26]. The roles of 4 RBD mutations of the Omicron variant (K417N, S477N, T478K, and E484A) have previously been described in the context of immune evasion. K417N, previously detected in Beta and Delta plus, has been shown to reduce the neutralization efficacy of some monoclonal antibodies [27, 28]. Residues E484 and T478 are part of the immunodominant site of the RBD [11-13]. E484K, previously observed in Beta and Gamma, has been shown to escape antibody neutralization and has also been found to emerge as an escape mutation during exposure to monoclonal antibodies and convalescent plasma [29-31]. Four mutant viruses with E484A, E484D, E484G and E484K were also found to be resistant to neutralization by each of the four convalescent sera tested [32]. E484Q has been shown to reduce serum neutralizing antibody titers [33-35]. However, T478K, previously detected in Delta, does not affect neutralization by monoclonal antibodies [16]. S477N has been shown to confer resistance to monoclonal antibodies, but not to convalescent plasma [32]. Therefore, the presence of the three immune escape mutations K417N, S477N and E484A in the RBD is likely to improve the immune evasion ability of the Omicron variant.
Although the RBD is immunodominant, the NTD of the S glycoprotein can also elicit an antibody response upon infection and vaccination [8, 36, 37]. The NTD contains an 'antigenic supersite' comprising the N-terminus (residues 14-20), a β-hairpin (residues 140-158), and a loop (residues 245-264) [8]. Among the 11 NTD mutations of the Omicron variant, 4 (G142D, ∆V143, ∆Y144, ∆Y145) reside within the β-hairpin region of the antigenic supersite and are likely to contribute significantly to immune evasion. A recent study has illustrated the functional significance of all the RBD mutations of SARS-CoV-2 for ACE2 binding [38]. Among the 15 mutations found within the RBD of the Omicron variant, 4 (G339D, N440K, T478K and N501Y) were demonstrated to enhance the affinity of the RBD for ACE2, whereas the remaining 11 (S371L, S373P, S375F, K417N, G446S, S477N, E484A, Q493R, G496S, Q498R, and Y505H) were demonstrated to reduce it. Notably, mutations at Q493, Q498 and N501 are crucial for the RBD-ACE2 interaction because these residues participate in polar contact networks involving the ACE2 interaction hotspot residues K31 and K353. Substitutions with nonpolar amino acids at these sites enhance the affinity of the RBD for ACE2. In the Omicron variant, however, the substitution of glutamine (Q) with the more polar amino acid arginine (R) at positions 493 and 498 is suspected to reduce the affinity of the RBD for ACE2. In contrast, the substitution of asparagine at position 501 with the less polar amino acid tyrosine is expected to enhance the affinity of the RBD for ACE2. The overall affinity of the Omicron RBD for ACE2 will be determined by the combined magnitude of the 4 affinity-enhancing and 11 affinity-reducing mutations.
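The mapping of NTD mutations onto the antigenic supersite described above amounts to a residue-range lookup, which can be sketched in Python as below. The supersite boundaries and the NTD mutation list are taken from the text; the helper names `position` and `supersite_region` are hypothetical, and the sketch assumes that the first run of digits in a mutation name is its residue position.

```python
import re

# Supersite segments as residue ranges (inclusive bounds from the text).
SUPERSITE = {
    "N-terminus": range(14, 21),      # residues 14-20
    "beta-hairpin": range(140, 159),  # residues 140-158
    "loop": range(245, 265),          # residues 245-264
}

NTD_MUTATIONS = ["A67V", "∆H69", "∆V70", "T95I", "G142D", "∆V143",
                 "∆Y144", "∆Y145", "∆N211", "L212I", "ins214EPE"]


def position(mutation: str) -> int:
    """Residue position encoded in a mutation name (first run of digits)."""
    match = re.search(r"\d+", mutation)
    if match is None:
        raise ValueError(f"no position found in {mutation!r}")
    return int(match.group())


def supersite_region(mutation: str):
    """Name of the supersite segment containing the mutation, or None."""
    pos = position(mutation)
    for name, residues in SUPERSITE.items():
        if pos in residues:
            return name
    return None


# Keep only the mutations that fall inside a supersite segment.
hits = {}
for mutation in NTD_MUTATIONS:
    region = supersite_region(mutation)
    if region is not None:
        hits[mutation] = region

for mutation, region in hits.items():
    print(mutation, region)
```

The lookup only reports location; whether a given change actually alters antibody binding has to come from experimental data.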
The Omicron variant is also expected to maintain high transmissibility owing to the presence of the D614G and P681H mutations, which were previously described as key mutations for enhanced transmissibility of the virus [22, 39]. Currently, it is not clear whether Omicron has higher transmissibility than Delta. However, preliminary data suggest that the Omicron variant is spreading rapidly against a backdrop of ongoing Delta variant infection and natural as well as vaccine-induced immunity, indicating high transmissibility and the potential to cause breakthrough infections. If the current trend continues, Omicron will very rapidly supplant Delta as the most common variant in South Africa and other parts of the world and may lead to another wave of COVID-19.

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint. This version posted December 5, 2021. medRxiv preprint doi: https://doi.org/10.1101/2021.12.04.21267284

Health Organization (WHO)
Classification of Omicron (B.1.1.529): SARS-CoV-2 Variant of Concern. World Health Organization (WHO)
Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein
SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell
Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses. Nature Microbiology
A pneumonia outbreak associated with a new coronavirus of probable bat origin
AXL is a candidate receptor for SARS-CoV-2 that promotes infection of pulmonary and bronchial epithelial cells
N-terminal domain antigenic mapping reveals a site of vulnerability for SARS-CoV-2. Cell
Lectins enhance SARS-CoV-2 infection and influence neutralizing antibodies
Tectonic conformational changes of a coronavirus spike glycoprotein promote membrane fusion
Mapping neutralizing and immunodominant sites on the SARS-CoV-2 spike receptor-binding domain by structure-guided high-resolution serology
SARS-CoV-2 RBD antibodies that maximize breadth and resistance to escape
Broad sarbecovirus neutralization by a human monoclonal antibody
Cross-neutralization of SARS-CoV-2 by a human monoclonal SARS-CoV antibody
SARS-CoV-2 variants, spike mutations and immune escape
Molecular basis of immune evasion by the delta and kappa SARS-CoV-2 variants
Global initiative on sharing all influenza data-from vision to reality
Comprehensive analysis of genomic diversity of SARS-CoV-2 in different geographic regions of India: an endeavour to classify Indian SARS-CoV-2 strains on the basis of co-existing mutations
Ultrafast Sample placement on Existing tRees (UShER) enables real-time phylogenetics for the SARS-CoV-2 pandemic
Complete mapping of mutations to the SARS-CoV-2 spike receptor-binding domain that escape antibody recognition. Cell Host & Microbe
Mutation D614G increases SARS-CoV-2 transmission. Signal Transduction and Targeted Therapy

Supplementary Table 1: Names of the Omicron variant sequences downloaded from GISAID
1. hCoV-19/Botswana/R42B5_BHP_AAC25114 11-20
30. hCoV-19/South Africa/NICD
hCoV-19/South Africa/CERI-KRISP-K032190/2021|EPI_ISL_6810482|2021-11-17
129. hCoV-19/South Africa/CERI-KRISP-K032204/2021|EPI_ISL_6699738|2021-11-17
130. hCoV-19/South Africa/CERI-KRISP-K032188/2021|EPI_ISL_6699728|2021-11-17
131. hCoV-19/South Africa/CERI-KRISP-K032214/2021|EPI_ISL_6699744|2021-11-17
132. hCoV-19/South Africa/CERI-KRISP-K032189/2021|EPI_ISL_6699729|2021-11-17
133. hCoV-19/South Africa/CERI 2|2021-11-26
198. hCoV-19/Netherlands/NH-RIVM-71078/2021|EPI_ISL_6841613.2|2021-11-26
199 Reunion/PIMIT_Om1

We would like to acknowledge the scientists, researchers and laboratory staff in India for their valued contribution to SARS-CoV-2 genome sequencing and deposition in GISAID. We would also like to applaud the GISAID consortium for allowing us open access to the deposited SARS-CoV-2 sequences.