key: cord-0771801-nan1pxe2 authors: Pandey, U.; Yee, R.; Precit, M.; Bootwalla, M.; Ryutov, A.; Shen, L.; Maglinte, D. T.; Ostrow, D.; Biegel, J. A.; Judkins, A. R.; Bender, J. M.; Gai, X.; Dien Bard, J. title: Pediatric COVID-19 in Southern California: clinical features and viral genetic diversity date: 2020-06-01 journal: nan DOI: 10.1101/2020.05.28.20104539 sha: 7b4ce42075cebc8fbb7e98276194fbb530aa2940 doc_id: 771801 cord_uid: nan1pxe2

Clinical presentation of COVID-19 in children remains under investigation. In this manuscript, we present a summary of clinical findings from the first 23 cases of COVID-19 in children at Children's Hospital Los Angeles. Considering the paucity of genomics data for circulating SARS-CoV-2 isolates in Southern California, we also present an overview of the viral genetic diversity in isolates obtained from these patients and compare them to isolates from other parts of the United States and globally. Our study presents much-needed clinical and viral genomics data pertaining to COVID-19 in children.

children and assessed the genomic variations and epidemiology of the viral isolates using whole-genome sequencing (WGS) directly from clinical specimens. During an 8-week period we identified 35 pediatric patients with confirmed COVID-19, of whom 22 (62.9%) were seen at outpatient clinics and 13 (37.1%) were admitted to the hospital upon presenting to the emergency department. Demographics and clinical presentation are summarized in Table 1. The median age of the 35 patients was 12.5 years (range: 18 days to 18.5 years) with a male predominance (20/35, 57.1%). Median time to discharge of hospitalized patients was 4.0 days. While most reports suggest that SARS-CoV-2 causes asymptomatic to mild infections in children, making them an important link in community-based viral transmission 8, over half of our cohort was symptomatic.
Among 20 patients with available medical history, 14 (66.7%) were symptomatic, the most common symptoms being fever (57%), congestion (36%), cough (36%), and shortness of breath (29%). Other observed symptoms included wheezing, chest pain, rhinorrhea, diarrhea, sore throat, and headache. Three of 4 patients (75%) with chest imaging showed opacities in the lungs. Five patients (14%) required oxygen supplementation, of whom 3 (60%) had a chronic condition. No patients reported travel history; however, four had direct contact with individuals with COVID-19. No deaths were observed in our cohort. The median viral load obtained from all positive results was 1.6 × 10^6 copies/mL (range: 2.7 × 10^2 to 2.8 × 10^7 copies/mL). The median SARS-CoV-2 viral load was higher in symptomatic than in asymptomatic patients (2.4 × 10^7 vs 1.2 × 10^4 copies/mL, p=0.02). All patients <5 years old had higher viral loads (1.5 × 10^8 vs 7.4 × 10^5 copies/mL, p=0.04) and were symptomatic, corroborating the findings of prior studies demonstrating a correlation between disease severity, viral load and younger age in children 8, 9. No difference in viral load was observed between those with chronic underlying conditions and those without (1.6 × 10^6 vs 5.0 × 10^6 copies/mL, p=0.3). Interestingly, one co-infection with human metapneumovirus was observed in a young infant 10. Three of 5 patients with repeated testing were persistently positive for SARS-CoV-2 RNA for up to 16 days. In our cohort, one child with standard-risk B-cell acute lymphoblastic leukemia (ALL) was treated with a 4-day course of hydroxychloroquine. The patient was initially asymptomatic but developed symptoms 3 weeks into his hospital stay and was started on hydroxychloroquine with a plan for close cardiac monitoring. Within four days, his symptoms completely resolved; however, PCR continues to be positive 6 weeks after the initial positive result. Table 1).
These variants were located in the 5'UTR (n=1), pp1a (n=42), pp1ab (n=16), S (n=12), ORF3a (n=4), E (n=1), M (n=3), ORF6 (n=2), ORF7a (n=2), ORF8 (n=2), N (n=9) and stem-loop II of the 3'UTR (n=2) (Fig. 1a, 1b). Of the 97 unique SNVs, 56 were non-synonymous, 30 were synonymous, and 3 were intergenic (Fig. 1b). The predominance of non-synonymous variations across different ORFs has been previously documented, and highlights the evolution of SARS-CoV-2 during the course of the pandemic 12. Five of 7 IN/DELs caused a frame-shift mutation in pp1a while 2 were present in the S protein. Examination of the ORFs with the highest number of SNVs (pp1a/ab and S) for positive selection using the Ka/Ks ratio lacked statistical support (p≥0.05). Notably, the recently described D614G mutation in the S protein 13, caused by a nucleotide A-to-G substitution at position 23,403 in the Wuhan reference strain NC_045512.2 (Fig. 1a), was present in 33/35 (94.3%) isolates. This mutation was shown to be rapidly fixed in isolates from Europe and North America and has been postulated to play an important role in viral egress and in enhancing the interaction between the receptor-binding domain of the S protein and the viral entry receptor ACE2. The estimated evolutionary rate calculated using metadata from each isolate was 6.4 × 10^-4 substitutions per site per year, or 19.1 substitutions per year (Fig. 1c, Extended Data Fig. 1). Our findings are concordant with the mutation rate of 6.0 × 10^-4 substitutions per site per year reported by a recent study that analyzed 7,666 high-quality SARS-CoV-2 genomes from the GISAID database 14. Remarkably, the mutation rate of SARS-CoV-2 is comparable to that of other RNA viruses, despite coronaviruses possessing the ability to encode a 3'-5' exoribonuclease, ExoN (nsp14), to proofread the complementary strand during genome replication, thus enhancing the fidelity of the RNA-dependent RNA polymerase (RdRP) compared to other RNA viruses [14] [15] [16].
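The conversion between the per-site and per-genome rates quoted above can be reproduced with a short calculation. The sketch below assumes the rate is scaled by the full length of the NC_045512.2 reference genome (29,903 nucleotides); the function and variable names are illustrative and are not taken from the study's pipeline.

```python
# Length of the Wuhan reference genome NC_045512.2, in nucleotides.
GENOME_LENGTH_NT = 29_903


def per_genome_rate(per_site_rate: float, genome_length: int = GENOME_LENGTH_NT) -> float:
    """Convert substitutions per site per year into substitutions per genome per year."""
    return per_site_rate * genome_length


# Rate estimated from the CHLA isolates (value quoted in the text).
chla_rate = per_genome_rate(6.4e-4)
# Rate reported for the 7,666 GISAID genomes (value quoted in the text).
gisaid_rate = per_genome_rate(6.0e-4)

print(f"CHLA isolates:  {chla_rate:.1f} substitutions per genome per year")
print(f"GISAID genomes: {gisaid_rate:.1f} substitutions per genome per year")
```

Under this assumption the CHLA estimate works out to about 19.1 substitutions per genome per year, matching the figure in the text, and the GISAID-based estimate to about 17.9.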
The inferred time to most recent common ancestor (TMRCA) based on the molecular clock analysis of these isolates was 2019-12-04, which is consistent with the TMRCA of 2019-12-06 based on the analysis of 4,085 global isolates available in Nextstrain 17, with previously published data 14, and with the start of the pandemic. A pairwise difference of just 8.9 variations per isolate between the Wuhan isolate and our isolates provides further support that these viruses share a recent common ancestor. Comparison of the 35 CHLA isolates to 966 SARS-CoV-2 genomes from the US and globally revealed that CHLA isolates clustered predominantly with other isolates from the US, but also with isolates from Europe and Australia (Fig. 2). Two isolates from a sibling pair clustered together, indicating familial transmission. This diversity among the CHLA isolates points to multiple potential introductions of the virus into Southern California from across the US and the world.

The idea of linking viral genetic diversity to disease severity is intriguing. Studies examining viral genomes during the 2013-2016 Ebola epidemic identified a single non-synonymous mutation in the viral glycoprotein, which increased its infectivity and severity in humans 4. Whether genomic diversity in the SARS-CoV-2 genome predicts disease severity remains to be determined. In our cohort, comparison of viral genomes did not identify variations solely present in symptomatic or asymptomatic patients (Fig. 1a). In fact, only 6 of 97 variations across the viral genome were present in more than 5 isolates, regardless of the disease phenotype of the patient (Extended Data Table 1). The absence of shared variations determining disease phenotype points to host factors being the primary determinant of disease severity. Similar findings have been reported by a recent study examining viral genomes from 112 adult patients 12.
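The comparison described above, looking for variations found only in one phenotype group and for variations shared by many isolates, can be sketched as a set computation. The isolate names, phenotype labels and variant labels below are hypothetical toy data, not the study's results, and the rule that a phenotype-specific variation must recur in at least two isolates of its group is an assumption made here to exclude singletons.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: isolate -> (phenotype, set of variant labels).
isolates = {
    "iso1": ("symptomatic", {"v1", "v2", "v3"}),
    "iso2": ("symptomatic", {"v1", "v2"}),
    "iso3": ("asymptomatic", {"v1", "v3"}),
    "iso4": ("asymptomatic", {"v1", "v4"}),
}


def counts_by_phenotype(data):
    """Count, per phenotype, how many isolates carry each variant."""
    counts = defaultdict(Counter)
    for phenotype, variants in data.values():
        counts[phenotype].update(variants)
    return counts


def exclusive_variants(data, min_isolates=2):
    """Variants seen in >= min_isolates isolates of one phenotype and in no other phenotype."""
    counts = counts_by_phenotype(data)
    result = {}
    for phenotype, counter in counts.items():
        seen_elsewhere = set()
        for other, other_counter in counts.items():
            if other != phenotype:
                seen_elsewhere.update(other_counter)
        result[phenotype] = {
            v for v, n in counter.items()
            if n >= min_isolates and v not in seen_elsewhere
        }
    return result


def recurrent_variants(data, more_than):
    """Variants present in more than `more_than` isolates, regardless of phenotype."""
    totals = Counter()
    for _, variants in data.values():
        totals.update(variants)
    return {v for v, n in totals.items() if n > more_than}


print(exclusive_variants(isolates))
print(recurrent_variants(isolates, more_than=2))
```

In the toy data, `v2` is the only variation that recurs within one group and is absent from the other, while `v1` is the only variation shared by more than two isolates. A result with empty sets for every phenotype would correspond to the finding reported in the text.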
Examination of clinical features in our cohort suggests that presentation of COVID-19 in children is multifaceted. We observed higher disease severity in younger children, and disease manifestation correlated with viral load. Contrary to previous suggestions, the majority of our patients were symptomatic when tested; however, one-third were asymptomatic. These findings have direct implications for infection control within the hospital, as they highlight the importance of screening patients before hospital admission to investigate asymptomatic shedding and to avoid exposures. Sequencing of the viral genomes provided a glimpse into the viral genetic diversity in the circulating strains of SARS-CoV-2 in Southern California, which has thus far been lacking. We observed limited variations between the isolates. Nevertheless, the majority of these variations led to an amino-acid change in the viral protein, possibly indicating an ongoing adaptation of the virus in the human population. Most importantly, no variation was associated with disease manifestation. Our study presents the first pediatric cohort examining clinical, molecular, and epidemiological characteristics of pediatric COVID-19 infections in the US.

This version was posted June 1, 2020. The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

We identified all positive patients tested at Children's Hospital Los Angeles for detection of SARS-CoV-2 from nasopharyngeal swabs submitted between March 13 and May 11, 2020. A total of 35 patients were enrolled in this study. Demographic data including age, gender, location of admission, coexisting conditions, antimicrobial therapy, modes of oxygen supplementation, history of travel and contacts, clinical signs and symptoms (e.g.
fever, congestion, cough, shortness of breath, wheezing, chest pain, rhinorrhea, diarrhea, headache, sore throat, and change in smell and taste), and radiographic findings (e.g. chest X-ray and chest computed tomography) were obtained from the electronic medical record. Nasopharyngeal swabs were sent to the Clinical Virology Laboratory at Children's Hospital Los Angeles. Total nucleic acid was extracted from the samples using the NucliSENS easyMag (bioMérieux, France), and qRT-PCR was performed using the CDC 2019-Novel Coronavirus (2019-nCoV) Real-Time RT-PCR assay, which has been granted emergency use authorization (EUA) by the U.S. Food and Drug Administration. A positive result for SARS-CoV-2 detection was determined by amplification of both N1 and N2 viral targets using a cut-off of Ct value < 40.

CleanPlex SARS-CoV-2 Research and Surveillance NGS Panel 18. Briefly, cDNA was synthesized by combining 11 µL of sample with 3 µL of RT Primer Mix DP, followed by incubation for 5 minutes at 65°C. 5 µL of RT Buffer DP and 1 µL of RT Enzyme were then added to the mix and incubated for 10 minutes at 80°C, then for 80 minutes at 42°C. cDNA was then purified using a 2.2X beads-to-sample ratio of CleanMag Magnetic Beads and 70% ethanol. Separate multiplex-PCR reactions were then set up for primer pools 1 and 2 using 5 µL of purified cDNA, 2 µL of nuclease-free water, 2 µL of 5X mPCR Mix, and 1 µL of 10X SARS-CoV-2 Primer Pool 1/2. PCR conditions used were as follows: initial denaturation at 95°C for 10 minutes, 10 cycles of denaturation (98°C for 15 seconds) and annealing/extension (60°C for 5 minutes), hold at 10°C.
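As a quick consistency check on the multiplex-PCR set-up described above, the component volumes can be summed and the working concentrations of the concentrated stocks derived. This is only an arithmetic sketch: the volumes and stock concentrations are those stated in the text, while the helper function and dictionary names are hypothetical.

```python
def final_concentration(stock_x: float, stock_volume_ul: float, total_volume_ul: float) -> float:
    """Working concentration (in X) of a stock after dilution into the full reaction."""
    return stock_x * stock_volume_ul / total_volume_ul


# Component volumes in microliters for one multiplex-PCR reaction (one primer pool).
mpcr_reaction_ul = {
    "purified cDNA": 5,
    "nuclease-free water": 2,
    "5X mPCR Mix": 2,
    "10X SARS-CoV-2 Primer Pool": 1,
}

total_ul = sum(mpcr_reaction_ul.values())
mix_x = final_concentration(5, mpcr_reaction_ul["5X mPCR Mix"], total_ul)
primers_x = final_concentration(10, mpcr_reaction_ul["10X SARS-CoV-2 Primer Pool"], total_ul)

print(f"Total reaction volume: {total_ul} µL")
print(f"mPCR Mix at {mix_x:g}X, primer pool at {primers_x:g}X")
```

Both concentrated stocks come out at 1X in a 10 µL reaction, which suggests the listed volumes are internally consistent.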
Reactions for each primer pool were then combined, and purification was performed using a 1.3X beads-to-sample ratio of CleanMag Magnetic Beads and 70% ethanol. A digestion reaction was then set up and incubated for 10 minutes at 37°C using 10 µL of purified PCR product, 7 µL of nuclease-free water, 2 µL of CP Digestion Buffer and 1 µL of CP Digestion Reagent. Digested libraries were then purified using a 1.3X beads-to-sample ratio of CleanMag Magnetic Beads and 70% ethanol. A second PCR reaction was then set up using 10 µL of purified libraries, 18 µL of nuclease-free water, 8 µL of 5X Second PCR Mix, and 2 µL each of i5 and i7 Dual-Indexed PCR Primer for Illumina. PCR conditions used were as follows: initial denaturation at 95°C for 10 minutes, 25 cycles of denaturation (98°C for 15 seconds) and annealing/extension (60°C for 75 seconds), hold at 10°C. A 1X beads-to-sample ratio of CleanMag Magnetic Beads and 70% ethanol was then used for purification to obtain the final library. Libraries were quantified using the Agilent TapeStation High Sensitivity D1000 screen tape assay. Libraries were normalized to approximately 7 nM, re-quantified and pooled to a final

Nucleotide sequences were aligned with NovoAlign. Coverage profiles, variant calls and consensus genomes were generated using an in-house software system, LUBA. Consensus sequences were built by adjusting the reference genome at high-allele-frequency SNV and indel loci. A base-quality-adjusted pileup was generated, and the alternative bases and indels that accounted for more than 50% of the pileup were inserted into the reference sequence.
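The consensus-building rule described above (substitute an alternative allele into the reference when it accounts for more than 50% of the pileup) can be illustrated with a minimal sketch. This is not the LUBA implementation, which is in-house and not described further in the text. The pileup representation is an assumption: each 0-based reference position maps to counts of observed alleles, where an allele is the string that replaces the single reference base (an empty string for a deletion, a longer string for an insertion). Base-quality adjustment is not modeled.

```python
from typing import Dict


def build_consensus(
    reference: str,
    pileup: Dict[int, Dict[str, int]],
    min_fraction: float = 0.5,
) -> str:
    """Replace reference bases with alleles supported by more than min_fraction of reads."""
    consensus = []
    for pos, ref_base in enumerate(reference):
        counts = pileup.get(pos, {})
        depth = sum(counts.values())
        chosen = ref_base  # default: keep the reference base
        if depth > 0:
            allele, support = max(counts.items(), key=lambda item: item[1])
            if allele != ref_base and support / depth > min_fraction:
                chosen = allele
        consensus.append(chosen)
    return "".join(consensus)


# Hypothetical toy example, not data from the study.
reference = "ACGTAC"
pileup = {
    1: {"C": 2, "T": 18},   # SNV C>T at 90% allele frequency: accepted
    3: {"T": 5, "TA": 15},  # insertion of A after T at 75%: accepted
    4: {"A": 3, "": 17},    # deletion of A at 85%: accepted
    5: {"C": 10, "G": 10},  # 50% is not "more than 50%": reference kept
}
consensus = build_consensus(reference, pileup)
print(consensus)
```

Positions without pileup entries keep the reference base, so regions with no coverage silently default to the reference in this sketch; a production pipeline would more likely mask them.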
Consensus genomes obtained for the 35 CHLA isolates were compared to the Wuhan isolate

Rate estimation and visualization were performed using bioinformatics tools provided by Nextstrain 17. An MSA of the 35 CHLA isolates and the Wuhan isolate (NC_045512.2) 3 was generated using MAFFT (version 7.453) 19. A maximum-likelihood tree using Bayesian information criteria was generated with IQ-TREE (version 1.15.0) 22 using the GTR substitution model. The resulting rate estimation and phylogeny were then time-resolved using TreeTime 23 and visualized using auspice 17. Differences in Ct values were compared using the Mann-Whitney test. The p-value for selection was calculated using Fisher's exact test for selection in MEGA.

Review Board under IRB CHLA-16-00429. Further information on research design is available in the Nature Research Reporting Summary linked to this article. The data shown in the manuscript are available upon request from the corresponding author. Nucleotide sequences of all 35 CHLA isolates have been submitted to NCBI and GISAID.
T orf1ab missense pp1a 1
NC045512 9214 C T orf1ab synonymous pp1a 1
NC045512 9541 C A orf1ab synonymous pp1a 1
NC045512 9857 C T orf1ab synonymous pp1a 1
NC045512 9858 TA T orf1ab frame-shift pp1a 1
G A orf1ab missense pp1a 1
NC045512 10416 C _ orf1ab frame-shift pp1a 1
NC045512 10540 G T orf1ab missense pp1a 1
NC045512 11224 C T orf1ab synonymous pp1a 2
NC045512 11824 C T orf1ab synonymous pp1a 1
NC045512 11893 G T orf1ab missense pp1a 2
NC045512 11967 C T orf1ab missense pp1a 2
NC045512 12202 G T orf1ab missense
Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72 314 Cases From the Chinese Center for Disease Control and Prevention
Kawasaki-like disease: emerging complication during the COVID-19 pandemic
Gastrointestinal features in children with COVID-19: an observation of varied presentation in eight children. The Lancet Child & Adolescent Health 0
Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak
Ebola Virus Glycoprotein with Increased Infectivity Dominated the 2013-2016 Epidemic
COVID-19 Web version
GISAID -Next hCoV-19 App
COVID-19 in Children: Initial Characterization of the Pediatric Disease
SARS-CoV-2 Viral Load in Upper Respiratory Specimens of Infected Patients
A pneumonia outbreak associated with a new coronavirus of probable bat origin
Viral and host factors related to the clinical outcome of COVID-19
Spike mutation pipeline reveals the emergence of a more transmissible form of SARS-CoV-2
Emergence of genomic diversity and recurrent mutations in SARS-CoV-2. Infection
Discovery of an RNA virus 3′→5′ exoribonuclease that is critically involved in coronavirus RNA synthesis
Nextstrain: real-time tracking of pathogen evolution
Highly sensitive and full-genome interrogation of SARS-CoV-2 using multiplexed PCR enrichment followed by next-generation sequencing
Recent developments in the MAFFT multiple sequence alignment program
Molecular Evolutionary Genetics Analysis across Computing Platforms