key: cord-0881151-wfjmrl80 authors: Colson, P.; Levasseur, A.; Gautret, P.; Fenollar, F.; Hoang, V.-T.; Delerce, J.; Bitam, I.; Saile, R.; Maaloum, M.; Padane, A.; Bedotto, M.; Brechard, L.; Bossi, V.; Ben Khedher, M.; Chaudet, H.; Million, M.; Tissot-Dupont, H.; Lagier, J. C.; Mboup, S.; Fournier, P.-E.; Raoult, D. title: Introduction into the Marseille geographical area of a mild SARS-CoV-2 variant originating from sub-Saharan Africa date: 2020-12-24 journal: nan DOI: 10.1101/2020.12.23.20248758 sha: 21bb92a30dc3020aa84a5b4dc3c6c95bec484fb9 doc_id: 881151 cord_uid: wfjmrl80
Background: In Marseille, France, the COVID-19 incidence evolved unusually, with several successive epidemic episodes. The second outbreak started in July, was associated with North Africa, and involved travelers and an outbreak on passenger ships. This suggested the involvement of a new viral variant.
Methods: We sequenced the genomes from 916 SARS-CoV-2 strains from COVID-19 patients in our institute. The patients' demographic and clinical features were compared according to the infecting viral variant.
Results: From June 26th to August 14th, we identified a new viral variant (Marseille-1). Based on genome sequences (n = 89) or specific qPCR (n = 53), 142 patients infected with this variant were detected. It is characterized by a combination of 10 mutations located in the nsp2, nsp3, nsp12, S, ORF3a, ORF8 and N/ORF14 genes. We identified Senegal and Gambia, where the virus had been transferred from China and Europe in February-April, as the sources of the Marseille-1 variant, which then most likely reached Marseille through Maghreb when French borders reopened. In France, this variant apparently remained almost limited to Marseille. In addition, it was significantly associated with a milder disease compared to clade 20A ancestor strains.
Conclusion: Our results demonstrate that SARS-CoV-2 can genetically diversify rapidly; its variants can diffuse internationally and cause successive outbreaks.
The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. This version posted December 24, 2020. https://doi.org/10.1101/2020.12.23.20248758 (medRxiv preprint)
The SARS-CoV-2 virus emerged in humans in Wuhan, China, in December 2019, prior to spreading worldwide. In China and Europe, the epidemic had a bell shape typical of a respiratory virus (https://covid19-country-overviews.ecdc.europa.eu/) [1, 2]. Surprisingly, in the countries that closed their borders, the evolution varied: in some countries, no other epidemic was detected, whereas in others, new epidemic waves occurred, caused by new variants [3]. Among the new sources that may explain the occurrence of different epidemics according to geographical zones, the role of intensive animal breeding, such as mink farming in Denmark [4], remains unclear. In Marseille, the bell-shaped curve ended in May, but new cases and then an atypically shaped epidemic reappeared when the borders reopened. The reopening of borders with Maghreb occurred despite the fact that a very active COVID-19 outbreak was ongoing in Algeria. Interestingly, the first cases of the July epidemic had direct or indirect contacts with passengers from ferries coming from Tunisia or Algeria, which led us to suspect that this variant had an African origin.
In our institute (Méditerranée Infection Institute [IHU]) in Marseille, Southern France, we investigated the viral genotypes from patients diagnosed in Marseille using genomic sequencing and genotype-specific PCR. We then also tested specimens from Algerian, Moroccan and Senegalese residents for the presence of a new variant that we named Marseille-1.
At IHU in Marseille, France, we have carried out SARS-CoV-2 RNA testing using real-time reverse transcription-PCR (qPCR) since the end of January 2020, as previously described [1, 5]. The numbers of tests and cases have been monitored daily since the first positive diagnosis on 02/27/2020 [2] (https://www.mediterranee-infection.com/covid-19/).
Whole genome sequencing of SARS-CoV-2 was performed from nasopharyngeal samples tested between June and August 2020 at IHU. Specimens with a cycle threshold value (Ct) lower than 20 were selected in priority, and those with a Ct between 20 and 30 were included secondarily to cover the study period more comprehensively. The study was approved by the ethical committee of the University Hospital Institute Méditerranée Infection (N°: 2020-016-2). Viral genomes were obtained using next-generation sequencing (NGS) and the Illumina technology (Illumina Inc., San Diego, CA, USA), as previously described [2, 6]. Viral RNA was extracted from 200 µL of nasopharyngeal swab fluid using the EZ1 Virus Mini Kit v2.0, and was reverse transcribed using SuperScript IV (ThermoFisher Scientific, Waltham, MA, USA) prior to cDNA second strand synthesis with Klenow Fragment DNA polymerase (New England Biolabs, Beverly, MA, USA).
The generated DNA was purified using Agencourt AMPure XP beads (Beckman Coulter, Villepinte, France) and sequenced using the Illumina Nextera XT paired-end strategy on a MiSeq instrument. Genome consensus sequences were generated with the CLC Genomics workbench v.7 by mapping reads to the SARS-CoV-2 reference genome (GenBank accession no. NC_045512.2, Wuhan-Hu-1 isolate) with the following thresholds: 0.8 for coverage and 0.9 for similarity. SARS-CoV-2 sequences obtained in our institute have been submitted to the GISAID database. Sequences from complete genomes were analyzed using the Nextclade web-tool (https://clades.nextstrain.org/) [7]. Clades were defined based on the occurrence of at least five genomes sharing the same pattern of mutations. These genome sequences were compared to those available in the GISAID database (https://www.gisaid.org/). Phylogenetic trees were reconstructed by using Nextclade and visualized with iTOL (https://itol.embl.de/).
For the specimens with Ct values > 30, or those with Ct values < 30 but from which genome sequences were not obtained, we attempted to identify those harboring the Marseille-1 variant using RT-PCR targeting a fragment of the nucleocapsid-encoding gene harboring two mutations, separated by 17 nucleotides, that are concurrently present in the Marseille-1 variant. RT-PCR was performed with the PrimerF1 (forward, 5'-TCTACGCAGAAGGGAGCAGA-3') and PrimerR1 (reverse, 5'-GGAGAAGTTCCCCTACTGCTG-3') primers, and the QuantiNova SYBR Green RT-PCR kit (Qiagen, Hilden, Germany).
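As a quick sanity check on the two primers quoted above, the following Python sketch computes their length and GC fraction, derives the plus-strand binding site of the reverse primer by reverse complementation, and measures the product length on a template. The helper names (`reverse_complement`, `gc_fraction`, `amplicon_length`) are hypothetical and chosen for illustration; this is not the authors' primer design procedure, and any template passed in is supplied by the reader rather than taken from the paper.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences exactly as quoted in the text (5' -> 3').
PRIMER_F1 = "TCTACGCAGAAGGGAGCAGA"
PRIMER_R1 = "GGAGAAGTTCCCCTACTGCTG"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def amplicon_length(template: str, forward: str, reverse: str):
    """Length of the product spanned by the primer pair on a plus-strand
    template, or None if either binding site is absent.

    The forward primer is searched as-is; the reverse primer is searched
    as its reverse complement, downstream of the forward site.
    """
    template = template.upper()
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start
```

For example, `reverse_complement(PRIMER_R1)` gives the plus-strand sequence that the reverse primer anneals opposite to, which is the string one would search for in a consensus genome when checking whether the target region is covered.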
In order to evaluate whether the Marseille-1 variant was also prevalent in these countries, this PCR system was also applied to 97 SARS-CoV-2 specimens from COVID-19-positive residents of Senegal, 278 from Algeria and 94 from Morocco. All specimens had been sampled in October and November 2020.
From February 29th to August 31st, 2020, the demographic and clinical features of the patients infected with the Marseille-1 variant were compared to those of the patients infected during episode 1 with 20A variants. Statistical tests were done using R 4.0.2 (https://cran.r-project.org/bin/windows/base/): Chi-squared or Fisher's exact test for qualitative variables, and Student's t-test for quantitative variables. A p-value < 0.05 was considered statistically significant.
The study was approved by the ethical committee of the Méditerranée Infection Institute (N°: 2020-016-2). Access to the patients' biological and registry data issued from the hospital information system was approved by the data protection committee of Assistance [...] Protection Regulation registry under number RGPD/APHM 2019-73.
The Marseille-1 variant emerged in week 27, when it accounted for 100% of sequenced genomes (Figure 1 [...]
As of December 17th, 2020, 405,070 tests had been performed for SARS-CoV-2 infection for 289,689 patients, of whom 25,446 (8.8%) were positive (Figure 2). We diagnosed 6,855 patients during episode 1 and 18,591 during episode 2. We obtained SARS-CoV-2 full-length genome sequences from 916 patients (submitted to the GISAID database). These included 429 genomes from episode 2, which were added to 487 genomes from episode 1 11. Time-scaled phylogeny enabled differentiating ten clusters, identified as variants Marseille-1
to Marseille-10, encompassing at least 5 genomes each [3]. Between July and August, the [...]
Phylogeny reconstruction using genomic sequences available in GISAID showed that Marseille-1 variants belonged to a cluster that comprised almost only sequences from sub-Saharan Africa, including from Senegal and Gambia, as well as from the Marseille area [...] followed human routes through Mali, Tunisia or Egypt to Europe [10]. In addition, the emergence of Marseille-1 variants in Marseille followed the resumption in week 27 of maritime and air connections between Maghreb and Marseille (Supplementary Table 1). However, despite the spread of the Marseille-1 epidemic within the population of Marseille, the variant seemingly barely spread outside the city; its prevalence decreased rapidly, falling below 10% in week 31, and it disappeared at the end of August.
[...] gene, amino acid substitution S1931I). By comparison with the Wuhan-Hu-1 strain, two additional mutations were noted: A20268G (nsp15, synonymous) and C28833U (N gene, S187L; and ORF14, H34Y) (Table 1). Only amino acid substitutions V126F and L274F in the Nsp2 protein were found in the CoV-GLUE replacements database (http://cov-glue.cvr.gla.ac.uk/#/replacement) [11], in 123 and 308 GISAID genomes, respectively.
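The mutation labels used here pair a nucleotide change on the reference genome with an amino acid position, for example C28833U with S187L in the N gene. The following Python sketch shows how such a label can be parsed and how a genome position maps to a codon number within a gene. The coding-region coordinates for S and N are quoted from memory of the NC_045512.2 GenBank annotation and should be treated as an assumption to verify; the function names are hypothetical.

```python
import re

# CDS coordinates on NC_045512.2 (1-based, inclusive).
# Assumption: taken from the GenBank annotation; verify before reuse.
CDS = {
    "S": (21563, 25384),
    "N": (28274, 29533),
}


def parse_mutation(label: str):
    """Split a label such as 'C28833U' into (reference base, position, new base)."""
    match = re.fullmatch(r"([ACGTU])(\d+)([ACGTU])", label.upper())
    if match is None:
        raise ValueError(f"not a nucleotide substitution label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def codon_number(gene: str, position: int) -> int:
    """Return the 1-based codon (amino acid) number of a genome position in a gene."""
    start, end = CDS[gene]
    if not start <= position <= end:
        raise ValueError(f"position {position} is outside the {gene} coding region")
    return (position - start) // 3 + 1


# Example: the N gene mutation C28833U quoted in the text.
ref, pos, alt = parse_mutation("C28833U")
n_codon = codon_number("N", pos)
```

Under these coordinates, position 28833 falls in codon 187 of the N gene, which agrees with the S187L substitution reported in the text.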
The two additional mutations C22088U, corresponding to substitution L176F in the S protein, and G5378A, corresponding to substitution G887S in the Nsp3 protein, were found in 612 and 9 GISAID genomes, respectively. In addition, we observed the successive occurrence of two additional mutations, [...] respectively, of the 38 full-length genomes obtained in Marseille. Moreover, 28 other mutations were found in between 1 and 6 of these Marseille-1 variant genomes, raising the total number of mutated nucleotide positions to 42 (Table 1). Interestingly, amino acid substitution C102F in ORF8 disrupts a disulfide bond close to the end of this 121-amino-acid-long protein, which might alter its function in an as-yet undetermined way. The most recent identified ancestors of the Marseille-1 genomes are genomes from Senegal and Gambia that harbored either none of the 10 hallmark mutations, C1625U alone, or C1625U together with C25886U (Figure 1). This strongly suggests that Marseille-1 ancestors evolved in these countries through the successive occurrence of these mutations.
We compared the characteristics of 336 patients infected between March and April 2020 with clade 20A strains and 81 patients infected with the Marseille-1 variant (Table 2). The patients infected with the Marseille-1 variant were more frequently male and were younger than those infected with clade 20A strains from episode 1. Of the 417 patients, 56 were hospitalized. The hospitalization rate was lower in patients infected with the Marseille-1 variant. Ten patients died and five were transferred to an intensive care unit, all of whom were infected with 20A variants. Clinical symptoms were available for 320 patients (Table 2).
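Comparisons of qualitative variables between the two groups, such as hospitalization, rely on the Chi-squared or Fisher's exact test, which the authors ran in R 4.0.2. As a rough illustration only, the following Python sketch computes a two-sided Fisher's exact p-value for a 2x2 table from the hypergeometric distribution. The function name `fisher_exact_two_sided` is hypothetical, and the example counts are a toy table that is not taken from Table 2.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups, columns the outcome (yes / no). The p-value
    sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x "yes" outcomes in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    probabilities = [prob(x) for x in range(lowest, highest + 1)]
    observed = prob(a)
    # Small relative tolerance so floating-point ties count as "as extreme".
    return sum(p for p in probabilities if p <= observed * (1 + 1e-7))


# Toy table (hypothetical counts, NOT data from the study).
toy_p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

With real data, the four cells would be the counts of patients with and without the outcome in each variant group, and the resulting p-value would be compared with the 0.05 threshold stated in the methods.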
Patients infected with the Marseille-1 variant suffered less frequently from dyspnea and hypoxemia. In contrast, the frequencies of rhinitis, anosmia and ageusia did not differ significantly between patients infected with either of the two variants. Overall, the Marseille-1 variant exhibited a milder phenotype [13].
Here we demonstrate that two SARS-CoV-2 epidemic phenomena occurred in France. The first one, from March to May 2020, exhibited a usual evolution for a respiratory viral infection, and was similar to the one observed in China. In contrast, following an almost total disappearance of SARS-CoV-2 diagnoses, the second epidemic episode evolved in successive or overlapping waves. These waves resulted from the occurrence of 10 viral variants exhibiting substantial genetic diversity between each other. Among them, the Marseille-1 variant caused a short outbreak that ran from July to August 2020, and remained essentially restricted to the Marseille area in France. In addition, our study demonstrated that the Marseille-1 variant was present in sub-Saharan (Senegal, Gambia) and North Africa (Tunisia, Algeria, Morocco). This variant is most likely a descendant of clade 20A strains transferred to sub-Saharan Africa by French travelers during episode 1 [10, 14], prior to genetically evolving onsite and being later brought back to Maghreb and then to Marseille by travelers. The Marseille-1 variant was associated with a milder clinical outcome, and no associated death was observed. In addition, its epidemic potential was lower, and no case of re-infection with this variant was detected, in contrast with what we observed with other variants [15]. [...] Europe [4].
Under these conditions, what generally occurs is speciation [16]. As a matter of fact, the existence of animal reservoirs, infected during the first episode, may explain the differences in epidemic curves observed among countries. However, the consequences of viral variant selection in massive animal groups, and subsequent human infections, remain to be determined, especially because the immunity acquired by patients during episode 1 may not be protective against a re-infection with another variant [15]. Then, when international borders reopened and travel resumed, the reconnection of these isolated ecosystems where different variants had developed generated new outbreaks in the areas that were the most exposed to incoming populations. This was in particular the case for Marseille, where daily passenger air and boat traffic exchanges with Maghreb occur. Marseille is alas familiar with boat importation of epidemics from the South, notably plague and cholera [17]. Therefore, our results confirm that SARS-CoV-2 is able to genetically diversify rapidly, and that its variants can diffuse internationally with travelers and cause successive outbreaks, even in populations previously exposed to the original virus.
The authors declare no competing interests.
Funding sources had no role in the design and conduct of the study; collection, management, analysis and interpretation of the data; and preparation, review, or approval of the manuscript.
Temporal and age distributions of SARS-CoV-2 and other coronaviruses, southeastern France
Ultrarapid diagnosis, microscope imaging, genome sequencing, and culture isolation of SARS-CoV-2
Genome sequence analysis enabled to decipher the atypical evolution of COVID-19 epidemics in Marseille
Spatial transmission risk during the 2007-2010 Q fever epidemic in The Netherlands: Analysis of the farm-to-farm and farm-to-resident transmission
Outcomes of 3,737 COVID-19 patients treated with hydroxychloroquine/azithromycin and other regimens in Marseille, France: A retrospective analysis
Genomic diversity and evolution of coronavirus (SARS-CoV-2) in France from 309 COVID-19-infected patients
Nextstrain: real-time tracking of pathogen evolution
What could explain the late emergence of COVID-19 in Africa?
Transmission of SARS-COV-2 from China to Europe and West Africa: a detailed phylogenetic analysis
CoV-GLUE: A Web Application for Tracking SARS-CoV-2 Genomic Variation
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
Different pattern of the second outbreak of COVID-19 in Marseille, France
Importations of COVID-19 into African countries and risk of onward spread
Evidence of SARS-CoV-2 reinfection with a different genotype
On the origin of species
Yersinia pestis: the Natural History of Plague
The Phyre2 web portal for protein modeling, prediction and analysis