key: cord-0255899-wm66fbg8 authors: Naveca, F. G.; Nascimento, V.; Souza, V.; Corado, A. L.; Nascimento, F.; Silva, G.; Mejia, M. C.; Brandao, M. J.; Costa, A.; Duarte, D.; Pessoa, K.; Jesus, M.; Goncalves, L.; Fernandes, C.; Mattos, T.; Abdalla, L.; Santos, J. H.; Martins, A.; Chui, F. M.; Val, F. F.; Melo, G. C. d.; Simao, M. X.; Sampaio, V. d. S.; Mourao, M. P.; Lacerda, M. V.; Batista, E. L.; Magalhaes, A. L.; Dabilla, N.; Pereira, L. C. G.; Vinhal, F.; Miyajima, F.; Dias, F. S.; dos Santos, E. R.; Coelho, D.; Ferraz, M.; Lins, R.; Wallau, G. L.; Delatorre, E.; Gräf, T.; Siqueira, M. M.; Resende, P. C.; Bello, G. title: Spread of Gamma (P.1) sub-lineages carrying Spike mutations close to the furin cleavage site and deletions in the N-terminal domain drives ongoing transmission of SARS-CoV-2 in Amazonas, Brazil date: 2021-09-15 journal: nan DOI: 10.1101/2021.09.12.21263453 sha: 8a62cad0fea76d1f58c0b7589a8101822748a110 doc_id: 255899 cord_uid: wm66fbg8 The Amazonas was one of the most heavily affected Brazilian states by the COVID-19 epidemic. Despite a large number of infected people, particularly during the second wave associated with the spread of the Variant of Concern (VOC) Gamma (lineage P.1), SARS-CoV-2 continues to circulate in the Amazonas. To understand how SARS-CoV-2 persisted in a human population with a high immunity barrier, we generated 1,188 SARS-CoV-2 whole-genome sequences from individuals diagnosed in the Amazonas state from 1st January to 6th July 2021, of which 38 were vaccine breakthrough infections. Our study reveals a sharp increase in the relative prevalence of Gamma plus (P.1+) variants, designated as Pango Lineages P.1.3 to P.1.6, harboring two types of additional Spike changes: deletions in the N-terminal (NTD) domain (particularly {Delta}144 or {Delta}141-144) associated with resistance to anti-NTD neutralizing antibodies or mutations at the S1/S2 junction (N679K or P681H) that probably enhance the binding affinity to the furin cleavage site, as suggested by our molecular dynamics simulations. As lineages P.1.4 (S:N679K) and P.1.6 (S:P681H) expanded (Re > 1) from March to July 2021, the lineage P.1 declined (Re < 1) and the median Ct value of SARS-CoV-2 positive cases in Amazonas significantly decreases. Still, we found no overrepresentation of P.1+ variants among breakthrough cases of fully vaccinated patients (71%) in comparison to unvaccinated individuals (93%). This evidence supports that the ongoing endemic transmission of SARS-CoV-2 in the Amazonas is driven by the spread of new local Gamma/P.1 sub-lineages that are more transmissible, although not more efficient to evade vaccine-elicited immunity than the parental VOC. Finally, as SARS-CoV-2 continues to spread in human populations with a declining density of susceptible hosts, the risk of selecting new variants with higher infectivity are expected to increase. The Amazonas was one of the most heavily affected Brazilian states by the COVID-19 epidemic. Despite a large number of infected people, particularly during the second wave associated with the spread of the Variant of Concern (VOC) Gamma (lineage P.1), SARS-CoV-2 continues to circulate in the Amazonas. To understand how SARS-CoV-2 persisted in a human population with a high immunity barrier, we generated 1,188 SARS-CoV-2 whole-genome sequences from individuals diagnosed in the Amazonas state from 1st January to 6th July 2021, of which 38 were vaccine breakthrough infections. Our study reveals a sharp increase in the relative prevalence of Gamma plus (P.1+) variants, designated as Pango Lineages P.1.3 to P.1.6, harboring two types of additional Spike changes: deletions in the N-terminal (NTD) domain (particularly 144 or 141-144) associated with resistance to anti-NTD neutralizing antibodies or mutations at the S1/S2 junction (N679K or P681H) that probably enhance the binding affinity to the furin cleavage site, as suggested by our molecular dynamics simulations. As lineages P. 1.4 (S:N679K) and P.1.6 (S:P681H) expanded (Re > 1) from March to July 2021, the lineage P.1 declined (Re < 1) and the median Ct value of SARS-CoV-2 positive cases in Amazonas significantly decreases. Still, we found no overrepresentation of P.1+ variants among breakthrough cases of fully vaccinated patients (71%) in comparison to unvaccinated individuals (93%). This evidence supports that the ongoing endemic The Amazonas was one of the most heavily affected Brazilian states by the COVID-19 epidemic, and by 31st July 2021, 13,531 deaths had been reported (1) . The COVID-19 epidemic in Amazonas was characterized by two waves of exponential growth (Figure 1 A) . The first one started in March 2020 and peaked around early May 2020, and was primarily associated with the introduction and dissemination of lineages B.1.195 and B.1.1.28 (2) . The second one started in December 2020 and peaked around early February 2021 and was associated with the local emergence and rapid spread of the Gamma/P.1 lineage, a more transmissible SARS-CoV-2 Variant of Concern (VOC) firstly detected in Japanese travelers returning from the Amazonas State, Brazil (2) (3) (4) . Since mid-February 2021, the number of SARS-CoV-2 deaths dropped and then remained roughly stable (7-day average <20) between May and July 2021. Despite many infected people in the two waves of the COVID-19 pandemic and the relatively high percentage of partial (32%) and fully (18%) vaccinated individuals in the Amazonas state on 31st July 2021 (5) , the virus continues to circulate at a roughly steady-state level of ~500 SARS-CoV-2 positive cases per day (7-day rolling average) from early May to mid-July, 2021 (6). This endemic pattern of virus transmission supports the assumption that population immunity acquired in Amazonas, either through vaccinations or natural SARS-CoV-2 infection, is sufficient to prevent new exponential growth but not to stop the spread of the virus. We hypothesize that the endemic transmission of the VOC Gamma/P.1 in the Amazonas state allowed the rise of secondgeneration variants presenting a higher herd-immunity threshold than the parental virus. To test this hypothesis, we combined phylodynamic approaches with epidemiological data to track mutations accumulated in SARS-CoV-2 whole-genomes recovered from individuals living in the Amazonas state diagnosed between January and July 2021. We demonstrate that persistent SARS-CoV-2 circulation in Amazonas was associated with a sharp increase in the relative prevalence of local P.1 plus (P.1+) variants harboring deletions in the N-terminal domain (NTD) or mutations at the S1/S2 junction of the Spike protein, while the parental P.1 lineage has been through a progressive extinction process since May 2021. New P.1+ lineages are more transmissible than the parental one, but natural and vaccine derived immunity are contributing to limit the spread of these lineages in the Amazonas state. The ongoing evolution of lineage P.1 in the Amazonas state, Brazil. To better understand the recent evolution of VOC P. 1 substitutions and deletions in the S protein in addition to those that define lineage P.1, most of them present in less than ten samples (Supplementary Table 1 ). AA deletions covering in the NTD region ( 144, 141-144, and 138-143) and substitutions at the S1/S2 junction (N679K, P681H, and P681R), however, were particularly prevalent and sharply increase from January to May 2021 (Figure 1 B) but in a much lower prevalence than in the Amazonas (Figure 1 C) . . CC-BY-ND 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. ; Figure 1 . Temporal distribution and genetic diversity of SARS-CoV-2-positive samples from the Amazonas state during 2021. A, Graph depicting the temporal evolution of SARI cases and SARI deaths based on the date of symptom onset (source, http://info.gripe.fiocruz.br) as a proxy for the COVID-19 epidemic curve in Amazonas state, along with the number of SARS-CoV-2 wholegenome sequences generated between January and July 2021. B, Relative frequency of different P.1 lineage variants among SARS-CoV-2 positive cases sequenced in the Amazonas. C, Relative frequency of different P.1 lineage variants among SARS-CoV-2 P.1 Brazilian sequences detected outside the Amazonas states. To determine if the most frequent P.1 mutations detected in Brazil resulted from independent convergent mutations events, we combined the Amazonian P. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. ; supported (aLRT = 76-99%) monophyletic P.1+ sub-clades in Amazonas; some of which received the Pango Lineage designation P.1.3 to P.1.6 ( Figure 2 ). Lineage P.1.3 comprises most P.1+ 141-144 sequences from the Amazonas state (n = 29/33, 88%). Lineage P.1.4 comprises most P.1+N679K sequences from the Amazonas state (n = 187/197, 95%) and three P.1+N679K sequences from Rio de Janeiro. Lineage P.1.5 comprises the remaining P.1+N679K sequences from the Amazonas state (n = 10/197, 5%) and two P.1+N679K sequences from Roraima and São Paulo states. Lineage P.1.6 comprises most P.1+P681H sequences from the Amazonas state (n = 208/209, 99%) and one P.1+P681H sequence from Rio de Janeiro. The fourth lineage, designated as P.1+ 144AM, comprises about half of P.1+ 144 sequences from Amazonas (n = 12/29, 41%). Our analyses also revealed two major well-supported (aLRT = 76-99%) P.1+ lineages designated as P.1.7 and P.1.8 that comprises most P.1+P681H (n = 227/234, 97%) and P.1+P681R (n = 13/20, 65%) sequences detected outside the Amazonas state, respectively. Analysis of the mutational profile reveals a variable number of lineagedefining mutations ranging from one in P.1.4 to nine in P.1.8 (Table 1) is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. ; As expected, phylogeographic reconstructions traced the Amazonas state as the most probable source location (PSP = 1) of lineages P.1.3, P.1.4, P.1.5, P.1.6, and P.1 + 144AM. These P.1 sub-lineages remained mostly restricted to the Amazonas state, except for a few sporadic disseminations to the Rio de Janeiro (lineages P.1.4 and P.1.6), Roraima (lineage P.1.5) and São Paulo (lineage P.1.5) states (Figures 3 B, is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. ; Figure 3 . Temporal structure and phylogeographic reconstruction of the P.1+NTDdel, P.1+N679K, and P.1+P681H clades. A, root-to-tip regression of genetic divergence against dates of sample collection. P.1 sequences were colored green, while each P.1 subclade carrying deletions or additional mutations in S protein was colored following the legend. Time-resolved maximum clade credibility phylogenies of each P.1 subclade defined in the ML analysis: B, P.1.4; C, P.1.6; D, P.1.7; E, minor clades P.1.3, P.1.5, P.1.8 and P.1+ 141-144AM. Tips and branches colors indicate the sampling state and the most probable inferred state of the nodes, respectively, as indicated in the legend for each tree. Bayesian posterior probabilities are indicated in key branches. All horizontal branch lengths are time-scaled, and the tree was automatically rooted under the assumption of the molecular clock model. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The potential impact of S1/S2 mutations on Spike cleavability The S1/S2 furin cleavage site (681PRRAR|S686), which is found in a loop region between residues 675-692, is of great importance for SARS-CoV-2 Spike protein structural changes during the initial steps of host cell infection and that influence the invasion success and transmission of the virus (9, 10). The specificity of furin for a polybasic substrate is partly due to the high negative charge distributed close to the active site and mutations S:P681H/R, that exchange neutral/non-polar residues with basic amino acids close to the cleavage site, enhance the S1/S2 cleavability (11) (12) (13) (14) (15) . We hypothesize that mutation S:N679K may also benefit the enzyme-substrate coupling. To test this hypothesis, metadynamics simulations that modeled the dissociation of the substrateenzyme complex were used to estimate the relative binding affinity between the wildtype (WT)/mutant peptides mimicking the S protein loop and the furin enzyme. Our analyses predict that replacement of a non-polar residue (P) by a polar one (H) provides a modest increase in affinity as compared to the WT peptide, whereas having a positively charged residue at either position 679 (K) or 681 (R) leads to a dramatic enhancement in binding affinity to the furin binding site (Figure 6 ). is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint Most genetic changes identified in P.1+ lineages also appear in other VOCs and are predicted or known to affect virus infectivity, immune escape, or both. Different NTD deletions present in VOCs Alpha ( 144), Beta ( 241-243), and Delta ( 157-158) confer resistance to neutralizing antibodies (NAb) directed against the NTD antigenic supersite; while the parental VOC Gamma that lack NTD deletions is more sensitive to those NAb (16) (17) (18) . The two most common NTD deletions in P.1+ Amazonian lineages ( 144 and 141-144) confer resistance to anti-NTD NAb and have been shown to emerge in vivo in long-term infections following therapy with convalescent plasma (19, 20) and during acute infections following production of autologous anti-NTD antibodies (21) . Interestingly, NTD deletions 141 and 142 were among the selected forecasted mutations that may contribute to evolution of VOCs according to a recent study (22) . (11) (12) (13) (14) (15) . Mutation S:N679K has not been previously associated with VOCs/VOIs, but the change for an additional basic residue close to the furin cleavage site might enhance the S1/S2 cleavability. Furin is an endoproteinase whose substrate is a polybasic structural motif of the type R-X-K/R-R↓ (where ↓ represents the peptide bond to be cleaved) (23, 24) . In the S protein of SARS-CoV-2, furin recognizes the motif 681-PRRAR↓S-686 which is found in a long loop region between residues 675-692. The specificity of furin for a polybasic substrate is due in part to the high negative charge distributed close to the active site (Figure 6 A) . Thus, our hypothesis is that the binding between furin and the loops of variants P681H, P681R and N679K has greater stability due to the exchange of neutral/non-polar residues for residues with a basic character. To test our hypothesis, metadynamics simulations were used to estimate the binding strength between the . CC-BY-ND 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. ; https://doi.org/10.1101/2021.09.12.21263453 doi: medRxiv preprint wildtype (WT)/mutant peptides mimicking the S protein loop and the furin enzyme. These simulations modeled the dissociation of the substrate-enzyme complex and computed their relative binding affinity. The dissociation was freely conducted along the CVs-defined pathway, providing the potential of mean force (PMF) barrier between the WT complex and the pre-complex states. For metadynamics simulations in which the employed CV is protein -peptide distance, the PMF is equivalent to the free energy of binding (ΔG). shows the free energy landscape (FEL) for the unbinding mechanism of the WT and mutant peptides. The FEL was described through the PMF, depicting the phase space sampled from the CVs. As it can be seen, the three PMFs display a welldefined energy minimum, and metastable states during the dissociation process. It is worth noting that these values may differ quantitatively if the full S protein head was used. Additionally, the calculated values refer to energies obtained within the determined Gaussian parameters to the set of compounds under study. Thus, these energies should be used as a measure to compare the binding free energies among them, and not as absolute free energies. From the PMF it is inferred that all mutants present more favorable binding energy, and therefore, higher affinity, as compared to the WT peptide. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. ; https://doi.org/10.1101/2021.09.12.21263453 doi: medRxiv preprint Amazonas probably already exceeded the herd immunity threshold of most P.1 variants, except for those more transmissible lineages P.1.4 and P.1.6. In summary, our study confirms that endemic transmission of SARS-CoV-2 after the second COVID-19 epidemic wave in the Amazonas state has been associated with the continuous evolution of the VOC P.1 through the acquisition of either Spike mutations at the S1/S2 junction (N679K or P681H) or NTD deletions that probably increased viral infectivity or resistance against antiviral immunity. The steady-state level of SARS-CoV- is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. ; was assessed by the approximate likelihood-ratio test based on the Shimodaira-Hasegawa-like procedure (SH-aLRT) with 1,000 replicates. The temporal signal of the P.1 and P.1+ sequences was assessed from the ML tree by performing a regression analysis of the root-to-tip divergence against sampling time using TempEst (31 is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. ; Modeling the relative binding strength of N679K, P681H and P681R variants to furin. The furin cleavage site in the Spike protein of SARS-CoV-2 is found in a disordered region, exposed to the solvent. Due to the high flexibility of this loop, the experimentally resolved structures lack atomic coordinates in this region, even when mutations are introduced to express the protein in its pre-fusion conformation. Starting from the crystallographic structure of an inhibitor bound to human furin (PDB ID: 6HLB) (40) , the 679-NSPRRARS-686 loop region was built analogously in the enzymeinteracting conformation (Supplementary Figure 1 A) . To mimic the complete loop linked to the Spike protein, the modeled motif was used as a folding seed (rotamers kept fixed) and the remaining loop residues were modeled using the Rosetta Remodel protocol (41) (Supplementary Figure 1 B) is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. ; https://doi.org/10.1101/2021.09.12.21263453 doi: medRxiv preprint The authors wish to thank all the health care workers and scientists who have worked hard to deal with this pandemic threat, the GISAID team, and all the EpiCoV database's submitters. The GISAID acknowledgment table containing sequences used in this study is shown in Supplementary Table 2 . We also appreciate the support of Genomic is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted September 15, 2021. ; Fundação de Vigilância em Saúde do Amazonas COVID-19 in Amazonas, Brazil, was driven by the persistence of endemic lineages and P.1 emergence Novel SARS-CoV-2 Variant Identified in Travelers from Brazil to Japan Governo do Estado do Amazonas, Fundação de Vigilância em Saúde do Amazonas. 2021. Vacinômetro -COVID-19 Governo do Estado do Amazonas, Fundação de Vigilância em Saúde do Amazonas Network oboFC-GS. 2021. The Ongoing Evolution of Variants of Concern and Interest of SARS-CoV-2 in Brazil Revealed by Estimating epidemiologic dynamics from cross-sectional viral load distributions Loss of furin cleavage site attenuates SARS-CoV-2 pathogenesis The furin cleavage site of SARS-CoV-2 spike protein is a key determinant for transmission due to enhanced replication in airway cells Functional evaluation of proteolytic activation for the SARS-CoV-2 variant B.1.1.7: role of the P681H mutation The SARS-CoV-2 variants associated with infections in India, B.1.617, show enhanced spike cleavage by furin Delta spike P681R mutation enhances SARS-CoV-2 fitness over Alpha variant SARS-CoV-2 spike P681R mutation, a hallmark of the Delta variant, enhances viral fusogenicity and pathogenicity SARS-CoV-2 B.1.617.2 Delta variant replication and immune evasion Reduced sensitivity of SARS-CoV-2 variant Delta to antibody neutralization Recurrent deletions in the SARS-CoV-2 spike glycoprotein drive antibody escape Analysis of SARS-CoV-2 variant mutations reveals neutralization escape mechanisms and the ability to use ACE2 receptors from additional species Case Study: Prolonged Infectious SARS-CoV-2 Shedding from an Asymptomatic Immunocompromised Individual with Emergence of multiple SARS-CoV-2 antibody escape variants in an immunocompromised host undergoing convalescent plasma treatment High-throughput, single-copy sequencing reveals SARS-CoV-2 spike variants coincident with mounting humoral immunity during acute COVID-19 Predicting the mutational drivers of future SARS-CoV-2 variants of concern Structure of the unliganded form of the proprotein convertase furin suggests activation by a substrate-induced mechanism The crystal structure of the proprotein processing proteinase furin explains its stringent specificity SARS-CoV-2 variants of concern have acquired mutations associated with an increased spike cleavage Genomic and phylogenetic characterisation of an imported case of SARS-CoV-2 in Amazonas State, Brazil A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology MAFFT multiple sequence alignment software version 7: improvements in performance and usability IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies ModelFinder: fast model selection for accurate phylogenetic estimates Exploring the temporal structure of heterochronous sequences using TempEst (formerly Path-O-Gen) Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10 Many-core algorithms for statistical phylogenetics Bayesian coalescent inference of past population dynamics from molecular sequences Bayesian phylogeography finds its roots Bayesian analysis of elapsed times in continuoustime Markov chains Posterior summarisation in Bayesian phylogenetics using Tracer 1.7. Syst Biol Birth-death skyline plot reveals temporal changes of epidemic spread in HIV and hepatitis C virus (HCV) BEAST 2.5: An advanced software platform for Bayesian evolutionary analysis Design, Synthesis, and Characterization of Macrocyclic Inhibitors of the Proprotein Convertase Furin RosettaRemodel: a generalized framework for flexible backbone protein design The Rosetta All-Atom Energy Function for Macromolecular Modeling and Design Prediction of structures of zinc-binding proteins through explicit modeling of metal coordination geometry Design of a novel globular protein fold with atomic-level accuracy Escaping free-energy minima FGN and MMS contributed to laboratory management and obtaining financial support. VN, VS, ALC, FN, GS, MJ, AC, DD, KP, MJB, LG, ELRB, ALAM, ND, LCGP, FV All authors have declared that no conflicts of interest exist. All the SARS-CoV-2 genomes generated and analyzed in this study are available at the EpiCov database in GISAID (https://www.gisaid.org/). The list of accession IDs may be found in the attached supplementary information material file.