key: cord-0792435-ry3ytced
authors: Siqueira, Juliana D; Goes, Livia R; Alves, Brunna M; de Carvalho, Pedro S; Cicala, Claudia; Arthos, James; Viola, João P B; de Melo, Andréia C; Soares, Marcelo A
title: SARS-CoV-2 genomic analyses in cancer patients reveal elevated intrahost genetic diversity
date: 2021-02-16
journal: Virus Evol
DOI: 10.1093/ve/veab013
sha: 005069987d6da899ff339d18228661e48ade28b7
doc_id: 792435
cord_uid: ry3ytced

Numerous factors have been identified to influence susceptibility to SARS-CoV-2 infection and disease severity. Cancer patients are more prone to progress to severe COVID-19, but the determinants of this more severe outcome remain largely unknown. We have determined the full-length SARS-CoV-2 genomic sequences of cancer patients and healthcare workers (non-cancer controls) by deep sequencing and investigated the within-host viral population of each infection, quantifying intrahost genetic diversity. Naso- and oropharyngeal SARS-CoV-2(+) swabs from 57 cancer patients and 14 healthcare workers from the Brazilian National Cancer Institute were collected in April–May 2020. Complete genome amplification using ARTIC network V3 multiplex primers was performed, followed by next-generation sequencing. Assemblies were conducted in Geneious R11, where consensus sequences were extracted and intrahost single nucleotide variants were identified. Maximum likelihood phylogenetic analysis was performed using PhyML v.3.0 and lineages were classified using Pangolin and CoV-GLUE. Phylogenetic analysis showed that all but one strain belonged to clade B.1.1. Four genetically linked mutations known as the globally dominant SARS-CoV-2 haplotype (C241T, C3037T, C14408T and A23403G) were found in the majority of consensus sequences. SNV signatures of previously characterized Brazilian genomes were also observed in most samples. Another 85 SNVs were found at a lower frequency (1.4-19.7%) among the consensus sequences. 
Cancer patients displayed a significantly higher intrahost viral genetic diversity compared to healthcare workers. This difference was independent of SARS-CoV-2 Ct values obtained at the diagnostic tests, which did not differ between the two groups. The most common nucleotide changes of intrahost SNVs in both groups were consistent with APOBEC and ADAR activities. Intrahost genetic diversity in cancer patients was not associated with disease severity, use of corticosteroids, or use of antivirals, characteristics that could influence viral diversity. Moreover, the presence of metastasis, either in general or specifically in the lung, was not associated with intrahost diversity among cancer patients. Cancer patients carried significantly higher numbers of minor variants compared to non-cancer counterparts. Further studies on SARS-CoV-2 diversity in especially vulnerable patients will shed light on the basis of the different COVID-19 outcomes in humans.

In December 2019, a new form of pneumonia was described in patients with severe acute respiratory syndrome in the city of Wuhan, province of Hubei, China (Li et al. 2020). Soon after, a new beta-coronavirus was identified as the causative agent of that disease (Wu et al. 2020). The new virus was named Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), and the disease was called Coronavirus Disease 2019 (COVID-19) (Wu et al. 2020). Since its initial discovery, COVID-19 has become a pandemic of catastrophic proportions, with almost 100 million confirmed cases of viral infection and two million deaths worldwide (https://www.worldometers.info/coronavirus/, last accessed on January 20th, 2021).

Numerous demographic, clinical, genetic, and behavioral factors have been identified to influence susceptibility to SARS-CoV-2 infection and, among those infected, the severity of the disease, including the risk of death. Those factors include age, sex (Asselta et al. 
2020), genetic loci of certain cytokines/chemokines and the ABO blood system group (Ellinghaus et al. 2020; Kirtipal and Bharadwaj 2020), smoking history (Gallus et al. 2020), obesity and underlying comorbidities such as diabetes, hypertension, lung diseases (Singh et al. 2020; Tahvildari et al. 2020) and cancer (Lee et al. 2020; Yang et al. 2020; Zhang et al. 2020). Among cancer patients, those with malignancies of hematological origin have been reported as particularly vulnerable to COVID-19 (Willan et al. 2020).

SARS-CoV-2 is a single-stranded RNA virus that replicates using an RNA-dependent RNA polymerase. As such, the virus is subject to nucleotide sequence changes, and has diversified through molecular evolution and founder effects during its explosive spread throughout the globe. Virus replication rates directly impact the accumulation of mutations in the virus genome, enabling the existence of a viral quasispecies (a swarm of different, yet highly related, viral entities) within an infected host. Although within-host variations of SARS-CoV-2 have been documented (Jary et al. 2020; Shen et al. 2020), the impact of underlying comorbidities that promote persistent viral RNA detection and shedding on virus evolution remains to be elucidated. Moreover, viral genetic variation, as a source of novel mutations, may hinder future therapeutic antiviral and vaccine strategies targeting COVID-19, by the selection of drug-resistant and vaccine escape mutants (Fung and Liu 2019).

In the present work, we have determined the full-length SARS-CoV-2 genomic sequences of 57 cancer patients and 14 healthcare workers (non-cancer controls) employing next-generation sequencing (NGS) and analyzed their epidemiological relatedness and lineage classification. 
This approach also allowed us to study the within-host viral population of each infection, quantify intrahost viral genetic diversity and characterize specific genetic changes with potential to impact SARS-CoV-2 biology. Finally, we have also assessed associations between viral diversity and patients' clinical and laboratory characteristics, thereby identifying determinant factors of viral evolution in this particular group of patients.

Fifty-seven cancer patients followed at the Brazilian National Cancer Institute (INCA), Rio de Janeiro, Brazil, and 14 healthcare workers (HCW) diagnosed with COVID-19 between April 7th and May 5th, 2020, early in the COVID-19 pandemic in Rio de Janeiro, were included in this study. SARS-CoV-2 infection was diagnosed through naso- and oropharyngeal swab specimens using RT-qPCR following the U.S. Centers for Disease Control and Prevention (CDC) protocol (Centers for Disease Control and Prevention 2020). All participants agreed to be enrolled in the study and signed an informed consent form. Participants' data were treated anonymously. This study was approved by the Brazilian National Commission for Ethics in Research (CONEP) (approval number: CAAE 30608220.8.0000.5274).

SARS-CoV-2 nucleic acid isolation, amplification and sequencing

Naso- and oropharyngeal swabs were collected and placed into a conical tube containing 2 ml of viral transport medium (VTM, Thermo Fisher Scientific, Waltham, MA) [...] denatured PhiX DNA (sequencing control) and sequenced in a MiSeq platform (2 × 251 cycles paired-end run; Illumina). New PCR reactions using combinations of the primers described above were carried out to cover regions with low coverage for each sample. Positive products were purified and sequenced by Sanger using the BigDye Terminator kit (Thermo Fisher Scientific) in an automated 3130XL Genetic Analyzer (Thermo Fisher Scientific). 
Sequences were edited and assembled with SeqMan v.7.0.0 (DNAStar Inc., Madison, WI).

All analyses were conducted using Geneious R11 software (Biomatters, Auckland, New Zealand), where the reads were trimmed to achieve an error rate below 0.1% and assembled to the Wuhan-Hu-1 reference sequence genome (GenBank number MN908947). A minimum mapping quality of 30 was required, providing a 99.9% confidence level that the mapping is correct. Additionally, all assemblies were visually inspected to evaluate the mapped reads and consequently to ensure the quality of the consensus generated and of the single nucleotide variation (SNV) analysis. Consensus sequences representing SARS-CoV-2 near full-length genomes were extracted for each sample and aligned to the Wuhan-Hu-1 reference sequence genome. Nucleotide variations in relation to the reference sequence were identified and classified as SNVs. An intrahost single nucleotide variation (iSNV) was defined as a variation with a frequency greater than 2% at a position with a depth of coverage of at least 500 reads. iSNVs were manually verified, and the intrahost viral genetic diversity rate was calculated as the number of nucleotide substitutions with a frequency greater than 2% for the given sample divided by the number of positions with a depth of coverage greater than 500×, expressed in units of 10^-4 substitutions/site. The 2% threshold applied in this study was chosen based on previous studies on HIV, in which this threshold was able to distinguish variants consistently detected in different sequencing replicates from spurious variants (Alves et al. 2017; Dudley et al. 2014).

For SARS-CoV-2 lineage classification, consensus genomes were submitted to the Pangolin software (https://github.com/cov-lineages/pangolin, downloaded on June 10th, 2020) and to the CoV-GLUE lineage system (http://cov-glue.cvr.gla.ac.uk/#/home, accessed on June 10th, 2020) (Singer et al. 
2020), both based on the nomenclature proposed by Rambaut et al. (2020). An alignment including the consensus sequences generated and genomes from Brazilian sequences available on the GISAID Database classified as B.1, B.1.1 and the Brazilian clusters B1.1-BR/B1.1-EU/BR (Supplementary Table S1) was submitted to a maximum likelihood phylogenetic reconstruction using PhyML v.3.0, with the best model of nucleotide substitution (GTR) defined with Model Generator, to investigate the sublineage classification of the study sequences (Guindon et al. 2010; Keane et al. 2006; Resende et al. 2020). Furthermore, a phylogenetic analysis that included the generated consensus sequences along with all SARS-CoV-2 sequences from Rio de Janeiro state (Brazil) available at GISAID (https://www.epicov.org/epi3/frontend, accessed on July 27th, 2020, Supplementary Table S1) was performed in order to investigate the epidemiological relatedness of sequences.

Clinical categorical variables were compared between cancer patients and HCWs using chi-square and Fisher's exact tests. The two-tailed Mann-Whitney test was used to compare intrahost diversity (substitutions/site × 10^-4) between cancer patients and HCWs and between [...]

Summarized demographic and clinical characteristics of the patients and healthcare workers from whom SARS-CoV-2 sequences were studied can be seen in Table 1. Among patients, the median age was 61 years and most of them (72%) had solid malignancies; 16% of patients used corticosteroids and 14% used oseltamivir before or during COVID-19 diagnosis specimen collection. Among healthcare workers (HCW), the median age was 40 years and most (86%) were female. The most prevalent COVID-19 symptoms among patients were cough, fever and dyspnea. Death from COVID-19 occurred in 33.3% of the cases. 
For HCW, cough and coryza were the most frequently reported COVID-19 symptoms (85.7% each), and all subjects recovered from the disease, with no deaths reported. No difference was found in sex distribution between the two groups (p = 0.118), but HCW had a lower median age when compared to cancer patients (p < 0.001). Death occurred in 38.6% of the cancer patients, but no deaths occurred among HCWs (p = 0.0029). Some COVID-19-related symptoms at diagnosis also differed between the two groups (Table 1).

A total of 27,433,528 reads were obtained from sequencing, with an average of 382,118 reads per sample, ranging from 217,922 to 631,796 reads. Reads of each sample were assembled to the Wuhan-Hu-1 reference genome with a minimum Phred-scaled mapping quality of 30, and the average depth of coverage obtained was 1,468 (465-2,530). The coverage was heterogeneous across the genome but was similar among the samples (Supplementary Fig. S1). Consensus sequences containing more than 97.9% of the SARS-CoV-2 complete genome were generated from all 57 cancer patient and 14 HCW samples.

Submission of the SARS-CoV-2 genome sequences to the Pangolin and CoV-GLUE algorithms resulted in the same lineage classification in all cases, assigning all but one virus to clade B.1.1, while the remaining sequence was classified as B.1. A phylogenetic analysis of the viruses together with sequences previously defined as the Brazilian circulating strains B1.1-BR and B1.1-EU/BR showed that most B.1.1 genomes generated in this study clustered with B1.1-BR sequences (Fig. 1A) (Resende et al. 2020). A phylogenetic tree including all local SARS-CoV-2 sequences isolated from patients residing in the state of Rio de Janeiro available at the GISAID database (accessed on July 27th, 2020, Supplementary Table S1) was constructed to investigate potential epidemiological linkage between samples (Fig. 1B). 
We noted that some of the viruses sequenced at INCA clustered in clades containing identical sequences, suggesting a transmission link between the study subjects. In some instances, both cancer patients and HCW were involved in those epidemiological clusters. Although in some cases sequences from outside the hospital were also identical to viruses from our series, and community transmission therefore cannot be excluded, the most likely scenario for those cases is nosocomial transmission between patients and/or HCW.

Overall, 95 single nucleotide variations (SNVs) and three deletions were found across the SARS-CoV-2 consensus sequences analyzed (Supplementary Fig. S2). Four genetically linked mutations previously described as the globally dominant haplotype in April 2020 were found in the majority of our consensus sequences: C241T (100%; 5'UTR region), C3037T (98.6%; silent mutation), C14408T (100%; resulting in the P4715L/P323L amino acid change in ORF1ab) and A23403G (100%; resulting in the D614G amino acid change in S) (Korber et al. 2020).

The next-generation sequencing method used allowed us to assess the intrahost SNVs (iSNVs) that compose each subject's within-host viral population. iSNVs were distributed at 160 genome positions, and all iSNVs present in overlapping regions of PCR fragments were concordant in both fragments. Five of them were observed in more than one sample, of which only one was found in epidemiologically linked samples. Of the 160 iSNVs, 140 had already been observed in unrelated strains isolated from different countries/regions of the globe according to the Nextstrain, GESS and CoV-GLUE databases (Fang et al. 2020; Hadfield et al. 2018; Singer et al. 2020). The most frequent variations were missense (96 positions); silent variations were observed at 63 genome positions and one iSNV position was in a non-coding region (Supplementary Fig. S3). Nonsense changes were not observed. 
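The iSNV criteria and the intrahost diversity rate described in the data analysis can be sketched in a few lines of Python. This is an illustrative sketch only, not the Geneious workflow used in the study: the `Variant` record, the function names and the per-position depth list are hypothetical, and the input variants are assumed to be non-consensus (minor) alleles already, since the text states no upper frequency bound.

```python
from dataclasses import dataclass

MIN_FREQ = 0.02   # frequency threshold from the text (greater than 2%)
MIN_DEPTH = 500   # depth-of-coverage threshold from the text

@dataclass
class Variant:            # hypothetical record, for illustration only
    position: int         # 1-based coordinate on Wuhan-Hu-1
    frequency: float      # fraction of reads carrying the variant
    depth: int            # number of reads covering the position

def call_isnvs(variants):
    """Keep variants with frequency > 2% and depth of at least 500 reads."""
    return [v for v in variants
            if v.frequency > MIN_FREQ and v.depth >= MIN_DEPTH]

def diversity_rate(variants, depths):
    """Return iSNVs per well-covered site, in units of 1e-4 substitutions/site.

    depths holds one read depth per genome position; only positions with
    depth greater than 500 enter the denominator, as in the text.
    """
    covered = sum(1 for d in depths if d > MIN_DEPTH)
    if covered == 0:
        raise ValueError("no position passes the depth threshold")
    # Multiplying by 1e4 expresses substitutions/site in units of 1e-4.
    return len(call_isnvs(variants)) / covered * 1e4
```

Under these assumptions, a sample with one qualifying iSNV and 20,000 well-covered positions would have a rate of 0.5 × 10^-4 substitutions/site.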
The absence of nonsense changes, coupled with the observation of missense mutations that appear in unrelated viruses across the globe, suggests biological significance for these changes (e.g., immune escape or increase in fitness). The missense mutations also appear to have arisen independently in different patients of the study and in those from abroad, as these viruses are not related by recent common ancestry. The ratio of nonsynonymous to synonymous intrahost variations was 1.49, and for most ORFs (ORF1a, ORF1b, S, ORF3a, M and N) this ratio was greater than 1.25. All but one iSNV with intrahost frequency greater than 20% were found exclusively in cancer patients' samples (Supplementary Fig. S4 and Supplementary Table S3). The number of iSNVs across the viral genome can be visualized in Fig. 2 and their coverage is shown in Supplementary Fig. S5. Interestingly, patients displayed a significantly higher intrahost viral genetic diversity when compared to HCW (p = 0.009; Fig. 3A), and this difference remained significant even after outlier subjects with higher virus diversity were excluded from the analysis (p = 0.029; Fig. 3B). Viral genetic diversity within each individual ORF was compared between the two groups, but no differences were found after correction for multiple comparisons.

As the within-host genetic diversity of viruses is commonly associated with viral replication, we have evaluated the correlation of the within-host population diversity in our subjects with the Ct values obtained in the RT-PCR swab tests of the same samples. Ct values work as a proxy for SARS-CoV-2 viral load in samples and are expected to be inversely correlated with viral diversity and replication. Surprisingly, however, the Ct of the samples did not inversely correlate with viral diversity, but rather showed a positive correlation, albeit a weak one (r_s below 0.5; Fig. 3C). 
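The r_s notation suggests a Spearman rank correlation between Ct and intrahost diversity. Assuming that is the statistic used, a minimal standard-library sketch is given below; the function names are hypothetical and no p-value is computed.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)

def spearman(ct_values, diversity):
    """Spearman r_s: the Pearson correlation of the average ranks."""
    if len(ct_values) != len(diversity):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(ct_values), average_ranks(diversity))
```

A positive value returned by `spearman` for paired per-sample Ct and diversity values would correspond to the direction of association reported here, with higher Ct accompanying higher diversity.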
The positive correlation was also observed when patients' samples were analysed separately (r_s = 0.490; p = 0.001; data not shown). Of note, no significant differences were observed when Ct values were compared between the two groups (p = 0.175). No correlation was observed between viral genetic diversity and Ct when only Ct values lower than 20 were considered (Fig. 3D).

Despite the above-mentioned age difference observed between HCW and cancer patients, age did not correlate with viral genetic diversity [...] (Table 2). We also assessed the potential association of cancer patients with hematological malignancies compared to those with solid cancers, but no association was found (p = 0.722; [...]

Nucleotide changes across the genome at the intrahost level can be visualized in Fig. 4A. Cytidine-to-thymidine/uridine (C-to-T(U)) and adenosine-to-inosine/guanosine (A-to-I(G)) transitions that are characteristic of APOBEC and ADAR activities (Smith and Sowden 1996; Vieira and Soares 2013) are highlighted. C-to-T was the most frequent iSNV observed in both cancer patients and HCWs (Fig. 4B). The distribution of these changes did not differ between the two groups studied (data not shown).

The biology of SARS-CoV-2 infection in humans is striking to infectious disease clinicians worldwide, because no viral infection has previously been seen with such an enormous range of phenotypic outcomes, from no symptoms to severe respiratory distress and death. Most of this physiological variance, however, has been attributed to host genetic and behavioral factors. Numerous characteristics have been associated with susceptibility to SARS-CoV-2 infection and disease severity among infected subjects, and underlying comorbidities seem to play a major role in unfavorable disease outcomes. Chronic noncommunicable diseases such as cancer are among those conditions. Cancer patients have been reported to be more prone to SARS-CoV-2 infection and to clinically evolve to more severe conditions upon infection (Lee et al. 
2020; Yang et al. 2020; Zhang et al. 2020), but the determinants of these severe outcomes remain largely unknown.

In this study we have evaluated the near full-length sequences of SARS-CoV-2 infecting cancer inpatients in one of the largest public cancer hospitals in South America, the Brazilian National Cancer Institute, and compared these sequences with those generated from healthcare professionals from the same institution. These complete SARS-CoV-2 genomes showed signatures characteristic of the virus that spread globally and is currently the predominant strain (Korber et al. 2020). All but one virus also belonged to clade B.1.1, which is the clade primarily circulating in the Americas. The viral genomes also displayed sequence features of other already characterized Brazilian viruses, consistent with the hypothesis of local, community transmission rather than virus importation from abroad. In fact, the timeframe of the analyzed infections (from April 7th to May 5th, 2020) is consistent with a period in Brazil when community virus transmission was already established and ongoing (Candido et al. 2020). Moreover, as a public and free hospital in the Brazilian Public Health System, INCA is also likely to admit patients with low socioeconomic resources who are mostly unable to travel abroad and most likely acquired viral infections from local sources.

We explored the evolutionary and phylogenetic relationships between the SARS-CoV-2 sequences of the studied samples. Upon a phylogenetic inference with viral sequences isolated from other infected subjects residing in the state of Rio de Janeiro (the same geographic location as the study site), we found that almost half of the sequences from our subjects lie in clusters with sequences from other patients and/or from HCW. Some of the consensus sequences within each cluster were identical, suggesting a direct epidemiological link between those groups of patients/HCW. 
Some sequences retrieved from the database representing subjects from the community outside the hospital were also identical to some hospital-based sequences, so transmission from outside the hospital cannot be completely excluded. However, the most parsimonious explanation is nosocomial transmission in those cases. Indeed, the subjects' samples were collected at a time in Brazil when tests for SARS-CoV-2 infection were not easily accessible, and inpatients and HCW had to wait several days for a test result, thus presenting a risk for further transmission.

Single nucleotide variations were found across the entire SARS-CoV-2 genome. The spike (S) D614G mutation, found in all samples analyzed, has been associated with higher viral titers, suggesting increased viral infectivity (Korber et al. 2020). Other variations were also found in different regions of the spike protein, including a 12-bp in-frame deletion that encompasses part of the signal peptide and the predicted cleavage site at the beginning of S. As expected, the P323L change in the RNA-dependent RNA polymerase (RdRp), genetically linked to D614G, was also found in all our samples. In silico analysis showed that P323L may impact the protein secondary structure, leading to a reduction in its molecular flexibility (Begum et al. 2020). However, the phenotypic impact of these mutations is still poorly understood. Numerous other missense mutations were found that warrant further investigation concerning their phenotypes.

The most striking observation of our intrahost population variation analysis was that cancer patients carried significantly higher numbers of minor variants when compared to non-cancer counterparts. This difference was independent of, and unrelated to, the Ct values obtained at the diagnostic tests, which did not differ between the two groups. Although cancer cases with metastatic sites have been associated with COVID-19-related death (de Melo et al. 
2020), we did not find any association between intrahost diversity and metastasis or disease severity (requirement for ICU, death by any cause or COVID-19-related). Unexpectedly, this difference was also not related to the use of corticosteroids (which could lower the patients' immune status) or the use of oseltamivir (which was used by some patients to treat a potential H1N1 infection before the COVID-19 diagnosis), nor was it associated with the type of primary malignancy (solid vs. hematologic tumors). Despite conflicting data in the literature, the hematologic cancer patients infected with SARS-CoV-2 analyzed herein did not show an increased chance of severe COVID-19 outcomes when compared to those with solid tumors (de Melo et al. 2020).

Surprisingly, Ct values (as a proxy for viral load) not only failed to correlate inversely with virus diversity, but showed a weak positive correlation, with a low r_s coefficient, that did not remain significant when only Ct values below 20 were compared. It is well established that naso- and oropharyngeal swabs are not the best types of sample for detecting SARS-CoV-2 compared to sputum, for example, which contains a larger amount of viral genetic material (Mohammadi et al. 2020). Our data raise the possibility that the variation in the viral population that we see is not generated in the naso- or oropharynx, but rather more distally in the respiratory tract (lungs) or even in other tissues such as the gut. Reports on the comparative expression of the virus' cellular receptor ACE2 support the idea that those other tissues might be relevant sources of viral replication and, consequently, sites where diversity emerges (Hikmet et al. 2020; Lamers et al. 2020).

C-to-T intrahost transition was the most prevalent iSNV found in both groups studied. 
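The tally behind this kind of comparison can be sketched as follows. The sketch is illustrative only: iSNVs are assumed to be available as (reference base, variant base) pairs, the function names are hypothetical, and only the two plus-strand changes highlighted in the text (C-to-T and A-to-G) are labelled; complementary-strand changes are deliberately left unassigned.

```python
from collections import Counter

def tally_changes(isnvs):
    """Count each type of nucleotide change, written as 'ref>alt'."""
    return Counter(f"{ref}>{alt}" for ref, alt in isnvs if ref != alt)

def editing_signature(change):
    """Label the two editing-associated changes highlighted in the text."""
    if change == "C>T":
        return "APOBEC-like"   # cytidine-to-uridine deamination
    if change == "A>G":
        return "ADAR-like"     # adenosine-to-inosine deamination
    return "other"

# Hypothetical input: four iSNVs given as (reference, variant) bases.
example = [("C", "T"), ("C", "T"), ("A", "G"), ("G", "T")]
counts = tally_changes(example)
most_common_change = counts.most_common(1)[0][0]
```

With the hypothetical input above, `most_common_change` is the C-to-T change, observed twice.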
The C-to-T change is characteristic of RNA editing by APOBEC enzymes (Smith and Sowden 1996; Vieira and Soares 2013) and has been reported by other groups when comparing SARS-CoV-2 strains (Simmonds 2020) and other coronaviruses (Di Giorgio et al. 2020).

Although we found SARS-CoV-2 within-host population variation in cancer patients, we do not know the mechanism(s) by which, or the anatomical site(s) where, this variation is generated. In addition, other limitations of our study are evident, such as the age and disease severity differences between the two studied groups. Nevertheless, by generating a higher number of distinct variants, the virus can explore wider areas of the sequence landscape and test variants with different regulatory and structural changes. Variation may impact tissue tropism, protein expression and function, stability, immune escape, drug resistance and pathogenicity. Further studies on SARS-CoV-2 diversity, especially in vulnerable patients with underlying comorbidities, will shed light on our understanding of the wide spectrum of disease outcomes associated with COVID-19 in humans.

shared.

A table containing all genome sequences used in this article and the respective information can be found in Supplementary Table S1. The sequencing data generated in this study are available in GISAID under IDs EPI_ISL_513513-513583 and in the Sequence Read Archive (SRA) under project number PRJNA657032.

This work was supported by grants to MAS from Brazilian Research Council [...] Science Foundation (E-26/202.894/2017) and to JPBV from Brazilian Research Council (307042/2017-0) and Carlos Chagas Filho Rio de Janeiro State Science Foundation (202.640

Author contributions

JDS, LRG, BMA and MAS conceived and designed the study. 
JDS, LRG and BMA optimized all reagents and performed the sequencing and analysis. ACM collected clinical data. CC, JA, JPBV, ACM and MAS provided expert advice on experimental planning and data interpretation.

Alves et al. 2017. Characterization of HIV-1 Near Full-Length Proviral Genome Quasispecies from Patients with Undetectable Viral Load Undergoing First-Line HAART Therapy
Asselta et al. 2020. ACE2 and TMPRSS2 variants and expression as candidates to sex and country differences in COVID-19 severity in Italy
Begum et al. 2020. Specific mutations in SARS-CoV2 RNA dependent RNA polymerase and helicase alter protein structure, dynamics and thus function: Effect on viral RNA replication
Candido et al. 2020. Evolution and epidemic spread of SARS-CoV-2 in Brazil
Centers for Disease Control and Prevention 2020. CDC 2019-Novel Coronavirus (2019-nCoV) Real-Time RT-PCR Diagnostic Panel
de Melo et al. 2020. Cancer inpatient with COVID-19: a report from the Brazilian National Cancer Institute
Di Giorgio et al. 2020. Evidence for host-dependent RNA editing in the transcriptome of SARS-CoV-2
Dudley et al. 2014. Cross-clade simultaneous HIV drug resistance genotyping for reverse transcriptase, protease, and integrase inhibitor mutations by Illumina MiSeq
Ellinghaus et al. 2020. Genomewide Association Study of Severe Covid-19 with Respiratory Failure
Fang et al. 2020. GESS: a database of global evaluation of SARS-CoV-2/hCoV-19 sequences
Fung and Liu 2019. Human Coronavirus: Host-Pathogen Interaction
Gallus et al. 2020. No double-edged sword and no doubt about the relation between smoking and COVID-19 severity
Guindon et al. 2010. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0
Hadfield et al. 2018. Nextstrain: real-time tracking of pathogen evolution
Hikmet et al. 2020. The protein expression profile of ACE2 in human tissues
Jary et al. 2020. Evolution of viral quasispecies during SARS-CoV-2 infection
Keane et al. 2006. Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified
Kirtipal and Bharadwaj 2020. Interleukin 6 polymorphisms as an indicator of COVID-19 severity in humans
Korber et al. 2020. Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus
Lamers et al. 2020. SARS-CoV-2 productively infects human gut enterocytes
Lee et al. 2020. COVID-19 mortality in patients with cancer on chemotherapy or other anticancer treatments: a prospective cohort study
Li et al. 2020. Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia
Mohammadi et al. 2020. SARS-CoV-2 detection in different respiratory sites: A systematic review and meta-analysis
Rambaut et al. 2020. A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
Resende et al. 2020. Genomic surveillance of SARS-CoV-2 reveals community transmission of a major lineage during the early pandemic phase in Brazil
Shen et al. 2020. Genomic Diversity of Severe Acute Respiratory Syndrome-Coronavirus 2 in Patients With Coronavirus Disease 2019
Simmonds 2020. Rampant C-->U Hypermutation in the Genomes of SARS-CoV-2 and Other Coronaviruses: Causes and Consequences for Their Short- and Long-Term Evolutionary Trajectories. mSphere
Singer et al. 2020. CoV-GLUE: A Web Application for Tracking SARS-CoV-2 Genomic Variation
Singh et al. 2020. Prevalence of co-morbidities and their association with mortality in patients with COVID-19: A systematic review and meta-analysis
Smith and Sowden 1996. Base-modification mRNA editing through deamination - the good, the bad and the unregulated
Tahvildari et al. 2020. Clinical Features, Diagnosis, and Treatment of COVID-19 in Hospitalized Patients: A Systematic Review of Case Reports and Case Series
Vieira and Soares 2013. The role of cytidine deaminases on innate immune responses against human viral infections
Willan et al. 2020. Care of haematology patients in a COVID-19 epidemic
Wu et al. 2020. A new coronavirus associated with human respiratory disease in China
Yang et al. 2020. Clinical characteristics, outcomes, and risk factors for mortality in patients with cancer and COVID-19 in Hubei, China: a multicentre, retrospective, cohort study
Zhang et al. 2020. Clinical characteristics of COVID-19-infected cancer patients: a retrospective case study in three hospitals within Wuhan

Intrahost single nucleotide variants (iSNVs) according to the nucleotide change. 
Distribution of nucleotide changes across the genome for all samples (n = 71) is shown (A). Genome coordinates are relative to the SARS-CoV-2 Wuhan-Hu-1 reference sequence (GenBank acc.# MN908947). Average frequency of each type of nucleotide change found for cancer patients (red) [...]

We would like to thank all the participants of the INCA COVID-19 Task Force, clinical staff and patients from the Brazilian National Cancer Institute (INCA) for providing the conditions and samples that enabled this study to be conducted. We also thank Renata Olício, PhD, for providing support with Sanger DNA sequencing. We thank Thiago Muramatsu for the computational analysis support. We kindly acknowledge the GISAID Database (https://www.gisaid.org/), the authors and laboratories for the SARS-CoV-2 genomes data