key: cord-301251-6f2nzvhz
authors: Cheemarla, N. R.; Brito, A. F.; Fauver, J. R.; Alpert, T.; Vogels, C. B. F.; Omer, S. B.; Ko, A.; Grubaugh, N. D.; Landry, M. L.; Foxman, E. F.
title: Host response-based screening to identify undiagnosed cases of COVID-19 and expand testing capacity
date: 2020-06-05
journal: medRxiv : the preprint server for health sciences
DOI: 10.1101/2020.06.04.20109306
sha:
doc_id: 301251
cord_uid: 6f2nzvhz

The COVID-19 pandemic has created unprecedented challenges in diagnostic testing. At the beginning of the epidemic, a confluence of factors resulted in delayed deployment of PCR-based diagnostic tests, resulting in lack of testing of individuals with symptoms of the disease. Although these tests are now more widely available, it is estimated that a three- to ten-fold increase in testing capacity will be required to ensure adequate surveillance as communities reopen (1). In response to these challenges, we evaluated potential roles of host-response based screening in the diagnosis of COVID-19. Previous work from our group showed that the nasopharyngeal (NP) level of CXCL10, a protein produced as part of the host response to viral infection, is a sensitive predictor of respiratory virus infection across a wide spectrum of viruses (2). Here, we show that NP CXCL10 is elevated during SARS-CoV-2 infection and use a CXCL10-based screening strategy to identify four undiagnosed cases of COVID-19 in Connecticut in early March. In a second set of samples tested at the Yale New Haven Hospital, we show that NP CXCL10 had excellent performance as a rule-out test (NPV 0.99, 95% C.I. 0.985-0.997). Our results demonstrate how biomarker-based screening could be used to leverage existing PCR testing capacity to rapidly enable widespread testing for COVID-19.

businesses reopen (1).
While PCR-based detection of viral genomes is the current gold standard for diagnosis of SARS-CoV-2 and is highly specific, this method requires trained laboratory personnel, has multiple steps, and uses costly reagents subject to supply chain disruption. Based on previous work, we hypothesized that an elevated level of the interferon inducible protein CXCL10 in the nasopharynx could serve as a sensitive screen for patients with an active respiratory virus infection including SARS-CoV-2 (2). If so, this test could be used as a pre-screen to direct PCR testing to patients most likely to be SARS-CoV-2 positive, allowing accurate diagnosis while preserving PCR testing capacity. Protein measurements lend themselves to rapid, high throughput, and point-of-care diagnostic methodologies (3,4), so this combined strategy offers the potential to have a first step which is sensitive and convenient (CXCL10 assay) followed by a more specific test (PCR) for a subset of screen-positive patients.

(Table S1). Of 642 NP samples, 376 were negative for all viruses on the panel, but the SARS-CoV-2 status was unknown.

medRxiv preprint doi: https://doi.org/10.1101/2020.06.04.20109306; this version posted June 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.

Previously, we found that the CXCL10 level in the NP swab viral transport medium was a useful indicator of infection with many common respiratory viruses (2), suggesting that this measurement could be used to indicate which of these 376 samples were most likely to contain a virus not detected by the RVP. NP CXCL10 is expected to be undetectable in healthy subjects.
First, to assess whether CXCL10 could be used as a biomarker for SARS-CoV-2 infection, we measured CXCL10 concentration in 39 confirmed SARS-CoV-2 positive NP swab specimens from the third week of March and compared to levels in the RVP-negative samples from March 3rd to 14th. SARS-CoV-2 positive samples showed an average CXCL10 concentration of 681 pg/ml (Fig 1b). Notably, the two SARS-CoV-2 positive samples with lowest CXCL10 levels also had very low virus RNA concentrations by RT-qPCR (Ct values for N1 were 36.4 and 34.9). In contrast, the majority of RVP-negative samples had CXCL10 levels at or below the limit of detection, with only 8.5% (32/376) of the samples having CXCL10 levels above a threshold value of 100 pg/ml (Fig 1b). We hypothesized that some of these samples might be from patients with undiagnosed COVID-19.

Next, we examined the medical records for patient clinical features associated with the RVP-negative, CXCL10-positive samples (Fig 1c). The majority of samples (18/32) were from patients 65 years of age or older. These patients displayed typical symptoms of COVID-19 and other respiratory infections with the most common being fever, cough, and evidence of pneumonia and/or hypoxemia. Most patients in this age group were admitted to the hospital.

RT-qPCR. Four of the 32 samples contained SARS-CoV-2 RNA, and this result was confirmed by the YNHH clinical virology laboratory (Fig 1d). To evaluate the performance of CXCL10 as a pre-screen for virus-positive samples, we also tested the 344 CXCL10-low samples for SARS-CoV-2 RNA, and all were negative. For completeness, we also tested the 266 RVP-positive samples from this time period for SARS-CoV-2 RNA and all were negative.
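
The screening arithmetic reported above can be checked with a short sketch. The counts are the ones stated in the text (376 RVP-negative samples, 32 above the 100 pg/ml cutoff, 4 PCR-confirmed among those 32, and no positives among the 344 CXCL10-low samples). The function name, argument names, and dictionary keys are illustrative assumptions and are not part of the study's analysis; with only four positives, the resulting proportions should be read as descriptive rather than as precise performance estimates.

```python
def screen_summary(n_screened, n_above_cutoff, pos_above, pos_below):
    """Summarize a two-step screen: biomarker first, PCR only on screen-positives.

    All argument names are hypothetical; the counts come from the text.
    """
    n_below = n_screened - n_above_cutoff
    total_pos = pos_above + pos_below
    return {
        # Share of samples that would still go on to PCR
        "fraction_above_cutoff": n_above_cutoff / n_screened,
        # Negative predictive value: true negatives among screen-negatives
        "npv": (n_below - pos_below) / n_below,
        # Share of PCR-positive samples that the screen flagged
        "sensitivity": pos_above / total_pos if total_pos else float("nan"),
        # Share of samples for which PCR could be skipped
        "pcr_tests_avoided": n_below / n_screened,
    }


# RVP-negative sample set, March 3rd to 14th (counts from the text)
rvp_negative = screen_summary(
    n_screened=376, n_above_cutoff=32, pos_above=4, pos_below=0
)
for name, value in rvp_negative.items():
    print(f"{name}: {value:.3f}")
```

In this sample set about 8.5% of samples exceed the cutoff, so roughly 91.5% of PCR tests could have been skipped without missing any of the four positives.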
Thus, the four samples identified in the screen were the only SARS-CoV-2-positive samples among the 642 NP samples tested by RVP in this time frame, and are among the first SARS-CoV-2-positive samples in our

The identified SARS-CoV-2-positive samples were collected on March 10th-14th. Three of these samples were from adults, two of whom were subsequently hospitalized for COVID-19. Surprisingly, one sample was from a young child seen as an outpatient. Clinical features associated with each patient presentation are indicated by the letters A-D on Figure 1c.

To gain further insight into the epidemiology of these early cases of SARS-CoV-2, we performed whole-genome sequencing and phylogenetic analysis (Fig. 2; data can be visualized at: https://nextstrain.org/community/grubaughlab/CT-SARS-CoV-2/paper3). The full list of genomes included in this analysis and details of genome coverage can be found in Table S2. Interestingly, the four SARS-CoV-2 isolates were genetically distinct. Previous work by our group described a cluster of early cases in Connecticut closely related to strains from Washington State (lineage A1), including the SARS-CoV-2 sequenced from one of these patients (genome Yale-009) (8). The other newly sequenced positive cases from this screen (genomes Yale-011, 040 and 151) belong to a distinct lineage (B1). Yale-040 groups within the sub-lineage B1.11, which has Western Europe, likely the United Kingdom, as the most probable origin. Finally, the genomes Yale-040 and 011 are closely related to isolates from New York state but group in different clades (Fig.
2), indicating that SARS-CoV-2 entered Connecticut via multiple independent lines of transmission by early March.

The fact that all four SARS-CoV-2-positive samples were among the CXCL10-high group (Fig 1d) suggested that a CXCL10-based screening strategy might be an efficient way to leverage existing PCR tests to greatly expand testing capacity, due to the high negative predictive value of a negative NP CXCL10 screen. In other words, we would have found all SARS-CoV-2-positive samples even if we had not performed PCR testing on the >90% of samples that were below the 100 pg/ml CXCL10 cutoff. To further evaluate the performance of this strategy (biomarker based screen followed by PCR test), we next tested a second group of samples: all samples sent to the YNHH laboratory for COVID-19 testing on a single day in March. We chose March 20th, 2020, since we were able to obtain residual samples for all 144 NP swab samples for which SARS-CoV-2 PCR testing was ordered on that date. Since these samples were not pre-filtered to exclude patients with other viral infections, we expected to see a higher rate of CXCL10-positive samples in this sample set than in the RVP-negative samples. We measured CXCL10 (Fig 3c). Considering all 144 samples evaluated for suspected

Finally, to better understand the possible uses of CXCL10 in the diagnosis of COVID-19, we studied the correlation between NP CXCL10 level and biological variables using all 59 of the SARS-CoV-2-positive samples in this study (4 from March 3rd to 14th, 17 diagnosed on March 20th, and 39 positive controls from other dates during the third week of March) (Fig 4a, b).
interferon stimulated gene (ISG), a gene highly induced by interferon signaling or by activation of intracellular sensors for viral RNA within infected cells (9,10). Recent studies of innate immune response to SARS-CoV-2 indicate that the virus triggers a robust antiviral interferon response in the airway (11-14). Therefore, we reasoned that NP CXCL10 level might reflect the level of active viral replication, the stimulus for CXCL10 production. Consistent with this, across all SARS-CoV-2+ samples in the study, viral load was positively associated with NP CXCL10 (Fig 4c).

We also considered that NP CXCL10 level might indicate more robust antiviral responses in the nasal mucosa and therefore might correlate with biological variables associated with lower illness severity. Samples in this study were from patients of a wide age range (Fig 4a, b), enabling analysis of the relationship between viral load and age. Pearson correlation analysis showed a significant negative association between NP CXCL10 and age, with higher NP CXCL10 associated with younger age (Fig 4d). One explanation for the trend towards higher CXCL10 levels in younger patients might be that younger patients had higher NP viral loads; however, this did not appear to be the case as there was no clear correlation between age and viral RNA level in this sample set (Fig 4e). An alternate explanation could be that the antiviral response in the upper respiratory tract is more robust in young patients compared to older patients during SARS-CoV-2 infection.
It is also possible that high NP CXCL10 tracks with lower severity of illness; however, since all older patients in this sample set had more severe symptoms and were more likely to be hospitalized (Fig 4b), it was not possible to unlink age from illness severity in this analysis. Further work will be needed to examine this more fully.

Taken together, the results reported here indicate the potential for host-response based screening to help solve current challenges in diagnostic testing presented by the SARS-CoV-2 pandemic. One major challenge is the need to rapidly expand testing capacity to support surveillance as social distancing measures ease. We show that NP CXCL10 level had a very high negative predictive value for SARS-CoV-2 infection in patients with a range of symptoms, in which the COVID-19 prevalence was 12% (17/144) (Fig 3). This indicates that, if used as a pre-screen in a similar patient population, this test could potentially eliminate the need for ~80% of PCR tests by screening out samples very likely to be negative for the virus. Although this strategy may not capture every positive case, there also may be some cases in which biomarker based testing is more sensitive than PCR. Based on longitudinal studies, the sensitivity of PCR-based testing can vary considerably even on sequential days in the same patient (15-17). It is possible that some of the 29 RVP-negative, CXCL10 positive patients in Figure 1 that tested negative for SARS-CoV-2 did indeed have COVID-19; however, due to study design it is not possible to follow up with

This study focused on symptomatic patients. In future studies, it will be important to assess biomarker performance in other populations, particularly if the intention is to screen populations who may have low NP viral loads (e.g. asymptomatic subjects).
Previous studies have shown that induction of host interferon stimulated genes occurs in the nasal mucosa of asymptomatic subjects with respiratory virus infection, but the level of induction may be lower (2,18-21). For evaluating such subjects, it will be important to use a CXCL10 assay with a lower limit of detection than the one used here.

trigger for CXCL10 production is viral replication, it is possible that this biomarker could serve as a correlate of infectivity, which could be assessed using viral culture. Finding a biomarker to distinguish whether PCR-positivity signifies live/infectious virus will be particularly useful as very sensitive but non-quantitative tests for viral genetic material gain more widespread use, such as in-home point of care tests.

In conclusion, we provide evidence based on early cases of COVID-19 in our region that host biomarker-based screening has tremendous potential to solve some of the current challenges presented by the COVID-19 pandemic, including rapidly increasing testing capacity. While diagnostics that detect viral RNA are highly specific, these tests are complex, relatively expensive, and subject to supply chain interruption. In contrast, immunoassay can provide inexpensive, high throughput testing and can be easily adapted to point of care testing. While we previously identified several proteins highly induced during the antiviral interferon response, we focused on NP CXCL10 for this study as it is a well-known molecule for which validated detection antibodies and automated detection platforms already exist.
While more study is needed, the work presented here demonstrates the great potential for biomarker-based screening to enable rapid expansion of testing capacity by directing existing testing to samples most likely to be virus-positive.

conditions were reverse transcription for 10 minutes at 55°C, initial denaturation for 1 min at 95°C, followed by 40 cycles of 10 seconds at 95°C and 20 seconds at 55°C on the Biorad CFX96 qPCR machine (Biorad, Hercules, CA, USA). PCR-positive samples were confirmed by the YNHH clinical laboratories as described above.

Human CXCL10 was measured in duplicate for each sample using the R&D Human

Network was used to monitor each sequencing run. Runs were stopped when sufficient depth of coverage was achieved to accurately generate a consensus sequence. Following the completion of each sequencing run, raw reads (.fast5 files) were basecalled using Guppy high-accuracy model (v3.5.1, ONT, Oxford, UK). Basecalled FASTQ files were used as input into the ARTIC

To infer the evolutionary history and origins of the early sampled SARS-CoV-2 genomes, we performed phylogenetic analysis using a nextstrain pipeline (32). Within this workflow, sequences were aligned using MAFFT (33), and the phylogeny was inferred using a Maximum Likelihood approach implemented in IQTree (34), with the GTR substitution model. Ancestral continental origins were inferred as discrete characters using TreeTime (35). Finally, the phylogenetic data visualization was obtained using Auspice (32).
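
The alignment and tree-inference steps named above can be sketched as direct command-line calls. This is only an illustration under stated assumptions: the study ran these tools through a nextstrain pipeline rather than invoking them directly, the file paths are hypothetical, the IQ-TREE executable name (`iqtree2`) depends on the installed version, and the TreeTime and Auspice steps are omitted.

```python
import subprocess
from pathlib import Path


def mafft_command(sequences: Path) -> list[str]:
    # MAFFT writes the alignment to standard output;
    # --auto lets it choose an alignment strategy for the input size.
    return ["mafft", "--auto", str(sequences)]


def iqtree_command(alignment: Path, executable: str = "iqtree2") -> list[str]:
    # -s names the alignment file; -m GTR requests the GTR substitution
    # model mentioned in the text. The executable name is an assumption.
    return [executable, "-s", str(alignment), "-m", "GTR"]


def align_and_build_tree(sequences: Path, alignment: Path) -> None:
    """Run both steps; requires MAFFT and IQ-TREE to be installed and on PATH."""
    with alignment.open("w") as handle:
        subprocess.run(mafft_command(sequences), stdout=handle, check=True)
    subprocess.run(iqtree_command(alignment), check=True)


# Hypothetical usage (not executed here because it needs the external tools):
# align_and_build_tree(Path("genomes.fasta"), Path("genomes.aligned.fasta"))
```

Keeping command construction separate from execution makes it possible to inspect or log the exact commands before anything is run.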

Nextstrain: real-time tracking of pathogen evolution
MAFFT multiple sequence alignment software version 7: improvements in performance and usability
IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era
Maximum-likelihood phylodynamic analysis

We would like to thank Maureen Owen, Robin Gardner, Greta Edelman,

Acknowledgements of authors of the genomes used in this study can be found in Table S3.

[Figure date axis: 1/20/20, then 3/1/20 through 3/27/20]