key: cord-0846368-s62whlg5 authors: Souza, U.J.B.; Santos, R.N.; Belmok, A.; Melo, F.L.; Galvão, J.D.; Damasceno, S.B.; Rezende, T.C.V.; Andrade, M.S.; Ribeiro, B.M.; Ribeiro Junior, J.C.; Carvalho, R.F.; Santos, I.G.C.; Oliveira, M.S.; Spilki, F.R.; Campos, F.S. title: Detection of potential new SARS-CoV-2 Gamma-related lineage in Tocantins shows the spread and ongoing evolution of P.1 in Brazil date: 2021-06-30 journal: bioRxiv DOI: 10.1101/2021.06.30.450617 sha: 569bab8629e831e6722037fe922c556049a3561d doc_id: 846368 cord_uid: s62whlg5

After more than a year of the COVID-19 pandemic, the United Kingdom (UK), South Africa, and Brazil became epicenters of new lineages of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Through a continuous global genomic surveillance effort, the Variants of Concern (VOCs) B.1.1.7 (Alpha), B.1.351 (Beta), B.1.617.2 (Delta), and P.1 (Gamma) were identified, each harboring a constellation of mutations. This research aims to: (i) report the predominance of the Gamma (P.1) lineage and present the epidemiological situation of SARS-CoV-2 genomic surveillance in the state of Tocantins, and (ii) describe the emergence of possible new mutations and viral variants, including a potential new P.1-related lineage represented by 8 genomes from Tocantins harboring the mutation L106F in ORF3a. At the moment, 6,687 SARS-CoV-2 genomes in GISAID carry this mutation. Whole-genome sequencing plays an important role in understanding the evolution and genomic diversity of SARS-CoV-2; continuous monitoring will therefore support the control measures and restrictions imposed by the state health department to prevent the spread of variants.

Since 2019, the host spillover of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1] has caused the biggest pandemic of the 21st century, with severe impacts on health, the economy, and social life.
To understand SARS-CoV-2 variants, genomic surveillance has been applied to track cases, providing important clues to the complex virus-host relationship [2, 3]. In Brazil in particular, the Corona-ômica-BR network is dedicating efforts and human resources to track viral spread and support public health authorities. The emergence of viruses may be associated with a lack of public interventions, non-scientific communication, social inequality and, for now, slow vaccination [4, 5]. International public efforts have been concerned with the emergence of virus variants that lead to changes in viral fitness. These mutations may increase transmissibility, enhance escape from the human immune response, or otherwise alter biologically important phenotypes [6]. Several variants of concern (VOCs), recently renamed by the WHO [7], have emerged globally. We hereby report the complete genome sequences and phylogenetic analysis of 24 SARS-CoV-2 genomes from Tocantins, in the North region of Brazil, including our P.1-related sequences.

Tocantins is a state located in the north of Brazil. It is a region with intense movement of people, connecting the north, northeast and central-west regions of the country and bordering six other states: Maranhão, Piauí, Bahia, Goiás, Mato Grosso and Pará. Thus, the daily and seasonal travel dynamics favor the introduction and spread of new SARS-CoV-2 lineages in the state. We analyzed, in both spatiotemporal and phylogenetic contexts, 24 SARS-CoV-2 samples collected between May 26, 2021 and June 1, 2021 from 12 municipalities of the state of Tocantins. Previous SARS-CoV-2 genomes available in GISAID [8] [9] [10] indicated that the P.1 variant, which emerged during November 2020, has rapidly increased its prevalence and become dominant in Brazil. In Tocantins state, P.1 was first detected in February 2021.
Since then, the P.1 lineage rapidly replaced other lineages and was present in 75% of the samples sequenced for this report. Recently, new mutations were observed in the P.1 variant, giving rise to a new P.1.2 sublineage initially identified in Rio de Janeiro [12] (Figure 3). Together, the Tocantins genomes harbor the main mutations that characterize the P.1 variant (including 11 mutations in the spike protein, such as E484K, N501Y, D614G) [11]. Furthermore, we detected a non-synonymous mutation in ORF3a (L106F) in 8 genomes (which appears to be increasing in frequency) and a previously unreported non-synonymous mutation in ORF1b (M2260I). Importantly, both mutations appear in the same seven genomes, reinforcing our finding. Moreover, L106F appears in 6,687 GISAID genomes, supporting the convergent character of this change. Another feature of these sequences is a deletion of nine nucleotides in ORF1a (del 11288 to 11296) in all genomes, which removes three amino acids (S-G-F) from the encoded proteins.

Below we show a phylogenetic reconstruction of 851 genomes retrieved from GISAID plus our 24 sequenced genomes from Tocantins (Figure 4). The maximum likelihood tree shows the genomes carrying the L106F mutation mentioned above. Besides the alignment and phylogenetic analysis, we observed that the genomes described here formed a monophyletic branch with samples from São Paulo (the majority) and

In this report, we showed the predominance of the P.1 variant in Tocantins. Moreover, we detected for the first time a potential new SARS-CoV-2 Gamma-related lineage through the sequencing of samples. We also showed the spread and evolution of P.1 in Brazil. These findings reinforce the importance of continuous genomic surveillance in the state of Tocantins to monitor and prevent the dispersion of variants.
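The ORF1a deletion reported above (del 11288 to 11296) can be checked against the statement that it removes exactly three amino acids. The following is a minimal sketch, assuming the coordinates refer to the Wuhan-Hu-1 reference genome (NC_045512.2), in which the ORF1a coding sequence starts at nucleotide 266; the function names are illustrative and are not part of the authors' pipeline.

```python
# Assumption: genome coordinates follow the Wuhan-Hu-1 reference
# (NC_045512.2), where the ORF1a coding sequence starts at nucleotide 266.
ORF1A_START = 266


def codon_index(nt_pos, orf_start):
    """Return (1-based codon number, 1-based position inside that codon)."""
    offset = nt_pos - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the ORF start")
    return offset // 3 + 1, offset % 3 + 1


def deleted_codons(del_start, del_end, orf_start):
    """Summarize a deletion given inclusive genome coordinates."""
    length = del_end - del_start + 1
    first_codon, first_phase = codon_index(del_start, orf_start)
    last_codon, last_phase = codon_index(del_end, orf_start)
    return {
        "length_nt": length,
        # A multiple of three keeps the downstream reading frame intact.
        "in_frame": length % 3 == 0,
        # True when the deletion starts on the first base of a codon
        # and ends on the third base of a codon.
        "whole_codons": first_phase == 1 and last_phase == 3,
        "first_codon": first_codon,
        "last_codon": last_codon,
    }


if __name__ == "__main__":
    print(deleted_codons(11288, 11296, ORF1A_START))
```

Under that assumption, the nine deleted nucleotides cover codons 3675 to 3677 of ORF1a as three whole codons, which is consistent with the loss of three residues (S-G-F) without a frameshift.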
We reinforce the importance of genomic surveillance to support public health decisions and strategies that are advocated by the state health department, in order to prevent the spread of SARS-CoV-2 variants.

and medaka (https://github.com/nanoporetech/medaka) for consensus sequence generation. Pango lineages were assigned to the newly assembled genomes using the Pangolin v3.1.5 software tool (https://pangolin.cog-uk.io/) [14]. To construct a phylogenetic tree, we first aligned the 24 newly sequenced genomes and the 34 genomes deposited in GISAID with MAFFT v.7.480 [15]. The resulting alignment was subjected to Maximum Likelihood phylogenetic analysis with IQ-TREE v.2.1.2 [16] under the Generalized Time Reversible (GTR) model of nucleotide substitution with empirical base frequencies (+F) and invariable sites (+I), as selected by the ModelFinder software. Bootstrap support was calculated with 10,000 replicates. The tree was visualized and edited using iTOL [17].

Maximum likelihood tree generated using IQ-TREE v.2.1.2 [16] with 10,000 bootstraps as branch support and visualized and edited using iTOL [17].

All bioinformatic and phylogenetic analyses were performed using the computational infrastructure of the Bioinformatics and Biotechnology Laboratory (LABINFTEC).

We would like to thank all the authors and administrators of the GISAID database, which allowed this study of genomic epidemiology to be conducted properly.
A full list acknowledging the authors publishing data used in this study can be found in the following file: Supplementary

The proximal origin of SARS-CoV-2
Genomic Epidemiology of SARS-CoV-2 Infection During the Initial Pandemic Wave and Association With Disease Severity
Rapid implementation of SARS-CoV-2 sequencing to investigate cases of health-care associated COVID-19: a prospective genomic surveillance study
Evolution and epidemic spread of SARS-CoV-2 in Brazil
Sequence Analysis of 20,453 Severe Acute Respiratory Syndrome Coronavirus 2 Genomes from the Houston Metropolitan Area Identifies the Emergence and Widespread Distribution of Multiple Isolates of All Major Variants of Concern
Whole-genome sequencing of SARS-CoV-2 reveals the detection of G614 variant in Pakistan
Genomic characterization of a novel SARS-CoV-2 lineage from Rio de Janeiro, Brazil
Genomic surveillance of SARS-CoV-2 tracks early interstate transmission of P.1 lineage and diversification within P.2 clade in Brazil
D155Y Substitution of SARS-CoV-2 ORF3a Weakens Binding with Caveolin-1: An in silico Study
COVID-19 epidemic in the Brazilian state of Amazonas was driven by long-term persistence of endemic SARS-CoV-2 lineages and the recent emergence of the new Variant of Concern P.1
Genomic Surveillance of SARS-CoV-2 in the State of Rio de Janeiro, Brazil: a technical briefing
Minimap2: pairwise alignment for nucleotide sequences
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
MAFFT multiple sequence alignment software version 7: improvements in performance and usability
IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies
Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation
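As a supplementary illustration of the lineage assignment, alignment and tree-building steps described in the methods, the sketch below assembles command lines for Pangolin, MAFFT and IQ-TREE 2. It is not the authors' script. The file names are hypothetical, MAFFT's `--auto` strategy is an assumption (the text does not state the alignment strategy), and `-B 10000` assumes the 10,000 bootstrap replicates were ultrafast bootstraps (the text does not say which bootstrap method was used). The model string `GTR+F+I` is passed directly, whereas the text reports that it was selected by ModelFinder.

```python
import shutil
import subprocess
from pathlib import Path


def pangolin_command(fasta, outfile):
    """Pango lineage assignment; the report is written to `outfile`."""
    return ["pangolin", str(fasta), "--outfile", str(outfile)]


def mafft_command(fasta):
    """Multiple alignment; MAFFT writes the alignment to stdout.
    Assumption: --auto lets MAFFT pick the strategy."""
    return ["mafft", "--auto", str(fasta)]


def iqtree_command(alignment, prefix, model="GTR+F+I", bootstraps=10000):
    """Maximum likelihood tree with IQ-TREE 2.
    Assumption: -B requests ultrafast bootstrap replicates."""
    return ["iqtree2", "-s", str(alignment), "-m", model,
            "-B", str(bootstraps), "--prefix", str(prefix)]


def run_pipeline(fasta, workdir):
    """Run the three steps in order; requires the tools on PATH."""
    for tool in ("pangolin", "mafft", "iqtree2"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    subprocess.run(pangolin_command(fasta, workdir / "lineages.csv"),
                   check=True)
    alignment = workdir / "aligned.fasta"
    with open(alignment, "w") as handle:
        # Capture MAFFT's stdout as the alignment file.
        subprocess.run(mafft_command(fasta), check=True, stdout=handle)
    subprocess.run(iqtree_command(alignment, workdir / "tree"), check=True)


if __name__ == "__main__":
    # Hypothetical input name; only prints the commands, runs nothing.
    fasta = "tocantins_plus_gisaid.fasta"
    for cmd in (pangolin_command(fasta, "lineages.csv"),
                mafft_command(fasta),
                iqtree_command("aligned.fasta", "tree")):
        print(" ".join(cmd))
```

Building the commands as lists keeps the parameters reported in the text (model and number of replicates) in one visible place and avoids shell quoting issues when `run_pipeline` is called on real data.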