key: cord-0258153-18c7d8o9 authors: Rima Hajjo, Dima A. Sabbah; Bardaweel, Sanaa K. title: Emerging SARS-CoV-2 Lineages in Middle Eastern Jordan with Increasing Mutations Near Antibody Recognition Sites date: 2021-02-12 journal: nan DOI: 10.1101/2021.02.09.21251052 sha: 68c68b82ce71f82755f556ca2de3ae743393f3cd doc_id: 258153 cord_uid: 18c7d8o9 nan . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted February 12, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice. Tracking SARS-CoV-2 single nucleotide variants (SNVs) over time and across different geographies is of paramount significance to understand the pandemic dynamics in depth, design better diagnostics and develop effective vaccines. SNVs data is used to construct a phylogenetic tree that to trace changes to the origin (i.e., reference sequence that first appeared in Wuhan, China). In order to identify the dominant SARS-CoV-2 genetic lineages in Jordan, we downloaded 581 complete viral genomes from the Global Initiative on Sharing All Influenza Data (GISAID)(1) database (http://gisaid.org) on 31 January 2020 and estimated a maximum likelihood tree for these data. We processed the data according to the Methods described by Rambaut el al(2) , and retained 556 sequences, which had at least 95% coverage of the reference genome (WIV04/MN996528.1) after trimming the 5′-and 3′-untranslated regions, and excluding sequences with > 5% ambiguous base calls (Ns). All Viral lineages were defined by pangolin(3). In order to identify SNVs on the nucleotide and amino acid level, each nucleotide sequence was aligned to the reference genome using CoVsurver(4). Our analysis revealed that the viral genetic variant landscape is constantly changing with time. New viral clades and lineages appeared in Jordan during the last few months such as the GR strain which now constitutes > 85 % of all analyzed viruses from Jordan. However, some clades which were dominant during the initial weeks of the pandemics such as the V, S and G strains have vanished or became insignificant in numbers. The same observation is true for lineages; lineage B.1.1.312 appeared first in August 2020, and is now the most widely spread viral lineage in Jordan based on public data available at GISAID, and lineage B.1.1.7 is continually growing ( Figure 1A ). However, lineages such as B.28, B.1.457 and B.1 continue to decline. Our analysis also confirmed the presence of 37 viruses of the B.1.1.7 lineage known as UK variant VUI-202012/01 that appeared in the GR clade and contained the N501Y mutation in . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted February 12, 2021. ; https://doi.org/10.1101/2021.02.09.21251052 doi: medRxiv preprint the receptor binding domain. Some viruses had concerning combinations of these mutations with deletions altering the spike protein surface. Other amino acid (AA) mutations have been identified in the spike protein of viruses from Jordan over time, however, 10 mutations in particular ( Figure 1B ) were more concerning than others since they happened in the receptor binding domain (RBD) of the spike protein ( Figure 1C ), which is fundamental for SARS-CoV-2 entry to human cells (i.e., increased or decreased ability to enter human cells), antibody recognition and vaccine development. Other mutations that appeared in the RBD of viruses from Jordan, but disappeared from most recent viruses, were N481K and N501I. Mutation N501I also occurred in an important amino acid that is involved in antibody recognition. Although this amino acid is also mutated in the genetic strains of the UK and South Africa, but the Jordanian mutation resulted in a different amino acid substitution from asparagine (N) to isoleucine (I) instead of tyrosine (Y) that is found in However, evidence is lacking whether the amino acid change from N to I results in the same phenotype caused by Y substitutions. Since both N and Y are polar uncharged amino acids, they may bind to antibodies similarly. However, isoleucine is a non-polar/hydrophobic amino acid that may affect binding interactions with antibodies. These hypotheses should be studied further for their importance. Interestingly, other mutations that have been recently flagged by GISAID and the scientific community as critical mutations such as: S477N (part of large Melbourne outbreak from clade GR and some Central European clade GH clusters), N439K (long lasting UK outbreak with clade G and European spillover), and Y453F (mink adaptation and part of limited outbreaks) were not identified in the viruses from Jordan. All details on materials, methods, . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted February 12, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 genomic analysis and mutational results are provided in Supporting Information and in Supplemental Tables S1 and S2. All raw data and results files are deposited on GitHub and available for researchers to access (https://github.com/rhajjo/Hajjo_EID.git). According to daily official reports published by the Jordanian Ministry of Health, Jordanians were showing more disease symptoms starting from September 2020 with increased need to intensive care and/or ventilators. Jordan also witnessed an increased disease transmission during September through December of 2020. This was accompanied by changes in the clade and lineage during that time too; suggesting that emergent clades and lineages could have changed the virulence and pathogenicity of the virus, or causing different clinical presentation. Nonetheless, this remains an interesting observation that requires follow up studies to verify whether the increased disease transmissibility during the Fall and early Winter has been a mere result of both societal behaviors and governmental relaxed control measures, or due to the appearance of specific viral clades and lineages. We gratefully acknowledge all the Authors from the Originating laboratories responsible for obtaining the specimens and the Submitting laboratories where genetic sequence data were . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted February 12, 2021. ; https://doi.org/10.1101/2021.02.09.21251052 doi: medRxiv preprint RH generated the idea and performed formal analysis of epidemiology and genetic data, and wrote the manuscript. DS and SB participated in data procurement, in critical scientific discussions, and helped in manuscript edits. Raw data files and results of the genomic analyses are provided on github. The authors will make these data publicly available and supply additional details and a readme files upon request. https://github.com/rhajjo/Hajjo_EID.git The authors declare no competing financial interest. Supplementary Information for materials and methods. Global initiative on sharing all influenza data -from vision to reality European Centre for Disease Prevention and Control (ECDC) A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted February 12, 2021. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted February 12, 2021. ; https://doi.org/10.1101/2021.02.09.21251052 doi: medRxiv preprint