key: cord-1029017-sfexbh15 authors: Johnson, A. J.; Zhou, S.; Hoops, S. L.; Hillmann, B.; Schomaker, M.; Kincaid, R.; Daniel, J.; Beckman, K.; Yohe, S.; Nelson, A. C. title: Saliva testing is accurate for early-stage and presymptomatic COVID-19 date: 2021-03-04 journal: nan DOI: 10.1101/2021.03.03.21252830 sha: b9e2960fe6b820d7b1479b4da9aa080558a43d02 doc_id: 1029017 cord_uid: sfexbh15 Although nasopharyngeal (NP) samples have been considered the gold standard for COVID-19 testing, variability in viral load across different anatomical sites could theoretically cause NP samples to be less sensitive than saliva or nasal samples in certain cases. Self-collected samples also have logistical advantages over NP samples, making them amenable to population-scale screening. To evaluate sampling alternatives for population screening, we collected NP, saliva, and nasal samples from two cohorts with varied levels and types of symptoms. In a mixed cohort of 60 symptomatic and asymptomatic participants, we found that saliva had 88% concordance with NP when tested in the same testing lab (n = 41), and 68% concordance when tested in different testing labs (n = 19). In a second cohort of 20 participants hospitalized for COVID-19, saliva had 74% concordance with NP tested in the same testing lab, but detected virus in two participants that tested negative with NP on the same day. Medical record review showed that the saliva-based testing sensitivity was related to the timing of symptom onset and disease stage. We find that no sample site will be perfectly sensitive for COVID-19 testing in all situations, and the significance of negative results will always need to be determined in the context of clinical signs and symptoms. Saliva retained high clinical sensitivity while allowing easier collection, minimizing the exposure of healthcare workers and need for personal protective equipment, and making it a viable option for population-scale testing. 
Throughout the COVID-19 pandemic, the spread of infection has significantly outpaced laboratory testing to identify SARS-CoV-2. Seroprevalence studies performed in March-May of 2020 in the United States suggested that the number of infections was perhaps 10-fold greater than the number of confirmed laboratory diagnoses 1. This scenario significantly challenged public health efforts to monitor and contain the spread of disease. Nasopharyngeal (NP) samples have been considered the gold standard for COVID-19 testing. However, alternative samples like self-collected saliva offer advantages for population-scale screening and may perform well in specific clinical situations. A number of studies comparing saliva, oral, and/or nasal samples with NP samples have reported heterogeneity in sensitivity or positive percent agreement (PPA), ranging from 66% to 98% [2-12]; this heterogeneous performance is likely influenced by differences in patient populations and in methods of sample collection and processing. For example, optimization of saliva sample processing within one institution improved the performance of this sample type across two sequential studies 3, 5. More importantly, clinical test performance depends on pre-analytic variables such as collection timing relative to the patient's disease course and the anatomic site of collection. A study of inpatients at an advanced disease stage showed that lower respiratory samples (bronchoalveolar lavage) were more frequently positive (93%) than pharyngeal (32%) or nasal (63%) samples 13. Further studies of samples from different anatomic sites at different points in the disease course are necessary to better understand how these variables affect clinical test performance. Here, we acquired patient-collected saliva and anterior nasal research specimens for comparison with provider-collected NP samples in both outpatient and inpatient settings.
The clinical context of specimen collection, the timing of sample collection during disease course, and the analytical performance of different molecular tests were assessed for their impact on the test result agreement between different anatomical sites.

medRxiv preprint doi: https://doi.org/10.1101/2021.03.03.21252830; this version posted March 4, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Cohort 1 study participants, presenting with both symptomatic and asymptomatic concerns for COVID-19, were enrolled opportunistically from a population receiving a nasopharyngeal (NP) COVID-19 test in outpatient screening or emergency department (ED) settings. Due to limited testing resources available during the study, we relied on NP results from routine clinical testing. Nasal and saliva samples were prospectively collected and bio-banked for retrospective testing.

In Cohort 1, liquid saliva samples mixed with buffer and nasal swabs in buffer were transported at room temperature and stored at -20 °C, according to instructions from the manufacturer. Selected research samples were thawed at room temperature immediately prior to analysis on the CLIA-LDT. In Cohort 1, NP swabs were collected, handled, and processed immediately according to the normal course of clinical testing through the healthcare system (1A = CLIA-LDT; 1B = commercial assays). In Cohort 2, all samples were collected simultaneously within 48 hours of admission. The three sample types were transported together at room temperature to the CLIA-LDT testing facility and processed within 24 hours.
A reverse transcription polymerase chain reaction (RT-PCR) assay based on primer-probe sets for the SARS-CoV-2 N gene (N1 and N2 targets) and human control ribonuclease P (RP), published by the United States Centers for Disease Control, was validated for clinical use 14.

Medical records from telehealth, clinic, or hospital visits were reviewed for relevant symptoms including: loss of taste or smell, shortness of breath, cough, sore throat, fatigue, diarrhea, nausea or vomiting, loss of appetite, chest pain, and myalgia or headache. Physician and nursing notes from immediately prior to the initial testing date and over the potentially symptomatic period (approximately 10-20 days after testing positive) were reviewed. Subjective symptoms were coded as either present ("Yes") or absent ("No"), or as "Mild", "Moderate", or "Severe" when the medical record stated severity. In cases where an individual reported at least one symptom, the symptoms that were not reported were considered absent ("No"). In cases where there was no report of symptoms, symptoms were coded as "NA". Objective signs of interest were defined as elevated body temperature (fever) and decreased oxygen saturation based on documentation in the medical record. Data abstraction from the medical records was completed by one study author and reviewed for accuracy by a second author.

Group mean variation between NP, saliva, and nasal samples was assessed with the Kruskal-Wallis rank sum test. Correlations between methods were assessed using the Pearson correlation coefficient. Bland-Altman analysis was used to assess method agreement between saliva or nasal samples relative to NP. PPA was calculated as the proportion of comparative method positives where the test method was positive. Overall percent agreement (OPA) was calculated
as the proportion of tests where the test and comparative method agreed. To generate a symptom heat map, symptoms were re-coded following the COVID-19 probability score (P(COVID)) prediction equation by Menni et al. 15. Mild cough was re-coded as "No" and only severe fatigue was coded as "Yes". Heat map rows are clustered by similarity using the complete linkage method and columns are sorted by cohort, method concordance, P(COVID), age, and sex. Average cycle threshold (Ct) was calculated for NP, saliva, and nasal samples as the mean Ct for N1 and N2. Relative viral load was calculated as ((2^(RP-N1)) + (2^(RP-N2)))/2. Data analysis and data visualization were completed using R version 3.4.3 16.

Two distinct patient cohorts were included in the study [...]

Table fragment (last row and footnote): N2_Saliva tests with Ct values (n): 13, 8, 14. N1 and N2 refer to primer-probe sets for the SARS-CoV-2 N gene (N1 and N2 targets). NP: nasopharyngeal.

Clinical NP testing results in Cohort 1A (n=41) identified 16 positive and 25 negative patients. Banked saliva and nasal samples each showed 87.5% PPA and 95.1% OPA compared to NP (Table 2 and Supplemental Table 1). Saliva and nasal results were 100% concordant in Cohort 1A. [...] Table 2).

Simultaneous collection and processing of all samples for Cohort 2 (n=20) demonstrated 16 positive NP, 14 positive saliva, and 11 positive nasal samples. The PPA ranged from 69% to 82% and the OPA from 65% to 75% for all pairwise comparisons (Table 2).
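The agreement and viral-load formulas defined in the methods can be sketched in Python as a quick check on the Cohort 1A figures. This is an illustrative sketch, not the authors' R code: the function names are hypothetical, and the per-patient results are reconstructed from the reported counts (16 NP-positive, 25 NP-negative, 2 saliva false negatives) under the assumption that no NP-negative patient tested positive by saliva, which is what the reported 95.1% OPA (39/41) implies.

```python
def agreement(test_results, comparator_results):
    """Return (PPA, OPA) for paired boolean results.

    PPA: share of comparator positives that the test method also called positive.
    OPA: share of all pairs where the two methods gave the same result.
    """
    pairs = list(zip(test_results, comparator_results))
    test_among_comparator_pos = [t for t, c in pairs if c]
    ppa = sum(test_among_comparator_pos) / len(test_among_comparator_pos)
    opa = sum(t == c for t, c in pairs) / len(pairs)
    return ppa, opa


def average_ct(n1, n2):
    """Mean cycle threshold across the N1 and N2 targets."""
    return (n1 + n2) / 2


def relative_viral_load(rp, n1, n2):
    """((2^(RP-N1)) + (2^(RP-N2))) / 2, as stated in the methods."""
    return (2 ** (rp - n1) + 2 ** (rp - n2)) / 2


# Cohort 1A, reconstructed from the reported counts (assumption: no
# saliva-positive result among the 25 NP-negative patients).
np_results = [True] * 16 + [False] * 25
saliva_results = [True] * 14 + [False] * 2 + [False] * 25

ppa, opa = agreement(saliva_results, np_results)
print(f"PPA = {ppa:.1%}, OPA = {opa:.1%}")  # PPA = 87.5%, OPA = 95.1%
```

Because saliva and nasal results were fully concordant in Cohort 1A, the same reconstruction applies to the nasal samples.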
Saliva sampling identified 2 positive patients who tested negative by NP, and conversely NP testing identified 4 positive patients who tested negative by saliva (Supplemental Table 3). Nasal samples performed poorly in this patient cohort.

Clinical sensitivity in Cohort 2 was calculated on the full cohort with clinical COVID-19 diagnosis (n=20), showing: NP = 80%, saliva = 70%, and nasal = 55%. Eighteen patients (90%) were positive for SARS-CoV-2 by at least one upper respiratory sample type. Therefore, analytical sensitivity was calculated using this number as a denominator, showing: NP = 89%, saliva = 78%, and nasal = 61%.

Cycle threshold (Ct) values for each anatomic sample site were compared in Cohorts 1A and 2, in which all samples were analyzed on the same analytical platform. The interquartile ranges for both N1 and N2 were largely overlapping [...]

[...] N2 primers. The solid black line shows the mean bias for each comparison. Dashed lines represent the limits of agreement (95% confidence interval) around the mean bias (+/- 1.96 * standard deviation [SD]). The solid grey line shows the linear relationship between the mean and the difference. Pearson's correlation (r) and p-value (p) are reported for the correlation between the mean and the difference.

Symptoms associated with COVID-19 were recorded from the medical record, scored, and used [...] Figure 4).
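The two sets of Cohort 2 sensitivity figures reported above differ only in their denominator, which a short calculation makes explicit. The following is a minimal sketch using the counts given in the text (16 NP, 14 saliva, and 11 nasal positives; 20 clinically diagnosed patients; 18 positive by at least one sample type); the variable and function names are illustrative and not taken from the authors' analysis code.

```python
# Positive results per sample site in Cohort 2, as reported in the text.
positives = {"NP": 16, "saliva": 14, "nasal": 11}

n_clinical_diagnosis = 20   # all patients with a clinical COVID-19 diagnosis
n_any_site_positive = 18    # positive by at least one upper respiratory sample


def sensitivity(detected, total):
    """Fraction of cases detected out of the chosen denominator."""
    return detected / total


clinical = {site: sensitivity(n, n_clinical_diagnosis) for site, n in positives.items()}
analytical = {site: sensitivity(n, n_any_site_positive) for site, n in positives.items()}

for site in positives:
    print(f"{site}: clinical {clinical[site]:.0%}, analytical {analytical[site]:.0%}")
```

Rounded to whole percentages, this reproduces the reported values (NP 80% and 89%, saliva 70% and 78%, nasal 55% and 61%).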
Interestingly, both were being re-tested due to persistent symptoms at 2 and 4 weeks, respectively, after onset of their laboratory-confirmed COVID-19. No clear pattern of symptoms within Cohort 2 was evident in relation to the concordant or discordant test results from the three anatomical sites (Figure 4). In Cohort 1B, discrepancies between elevated P(COVID) scores and negative saliva tests (Supplemental Figure 3A) occurred in three patients without objective fever or oxygen saturation abnormalities who had complete resolution of subjective symptoms in <48 hours. We note that the P(COVID) score is heavily weighted toward loss of taste or smell, a subjective symptom. Saliva and nasal samples demonstrated complete clinical agreement with NP samples in patients with low P(COVID) scores (<0.5) who were tested when initial symptoms developed, suggesting that saliva and nasal samples perform well in the population screening setting.

We observed good PPA in Cohort 1A between the testing of banked saliva and nasal samples and the prior clinical NP result. Analytical variation was minimized in this cohort, with all samples tested using the same RT-PCR assay. Saliva and nasal samples showed complete agreement, and only two patients (of 16 total positive) had false negative saliva and nasal sample results compared to NP. These patients had previous laboratory-confirmed infections and were being re-tested due to persistent symptoms. The clinical NP sample for both of these patients had Ct values consistent with viral loads below the 95% confidence limit of detection for the clinical assay. Therefore, the false negative results in these cases could be due to degradation of the low viral load during the freeze-thaw cycle inherent to the pre-analytical [...]
Biologic heterogeneity regarding persistence and anatomic distribution of viral replication later in the COVID-19 course was apparent in Cohort 2 (inpatient setting). An analytical advantage of [...]

The data from Cohort 1B are difficult to interpret confidently. The unintended use of different molecular tests for the clinical NP test was an analytical confounder. Though the limit of detection for the commercial platforms (250-500 viral copies/mL) is lower than that of the CLIA-LDT (1670 copies/mL), independent clinical quality assurance data from our laboratory (see Supplemental Materials) comparing replicate testing of NP samples between platforms showed PPA of 90-97%, a level of agreement aligned with published data using standard of care NP samples [27-29]. This suggests additional pre-analytical variables impacted the poor concordance [...]

[...] and staff).
The authors thank Beth Jorgenson, Sandra Tekmen, Krista Goldsmith, Andrew Snyder, Stephanie McGlone, and Jill Cordes for their efforts in coordination of sample collection. We thank Drs. Tyler Bold, Peter Southern and their laboratory staff for sample management and safety protocols. We want to acknowledge the significant efforts towards the COVID-19 testing [...] Finally, we want to thank the leadership and countless clinical laboratory staff from M Health Fairview including: Kylene Karnuth [...]

Seroprevalence of Antibodies to SARS-CoV-2 in 10 Sites in the United States
Performance of oropharyngeal swab testing compared to nasopharyngeal swab testing for diagnosis of COVID-19 - United States
Saliva or Nasopharyngeal Swab Specimens for Detection of SARS-CoV
Sensitive detection and quantification of SARS-CoV-2 in saliva
Challenges in use of saliva for detection of SARS CoV-2 RNA in symptomatic outpatients
Performance of Severe Acute Respiratory Syndrome Coronavirus 2 Real-Time RT-PCR Tests on Oral Rinses and Saliva Samples
Self-Collected Anterior Nasal and Saliva Specimens versus Health Care Worker-Collected Nasopharyngeal Swabs for the Molecular Detection of SARS-CoV-2
Self-Collected Oral Fluid and Nasal Swab Specimens Demonstrate Comparable Sensitivity to Clinician-Collected Nasopharyngeal Swab Specimens for the Detection of SARS-CoV-2
Saliva sample as a non-invasive specimen for the diagnosis of coronavirus disease 2019: a cross-sectional study
Saliva as an Alternate Specimen Source for Detection of SARS-CoV-2 in Symptomatic Patients Using Cepheid Xpert Xpress SARS-CoV-2
Salivary Detection of COVID-19
Evaluating the use of posterior oropharyngeal saliva in a point-of-care assay for the detection of SARS-CoV-2
Detection of SARS-CoV-2 in Different Types of Clinical Specimens
Analytical Validation of a COVID-19 qRT-PCR Detection Assay Using a 384-well Format and Three Extraction Methods
Real-time tracking of self-reported symptoms to predict potential COVID-19
R: A Language and Environment for Statistical Computing. (R Foundation for Statistical Computing)
Elegant Graphics for Data Analysis
Streamlined Plot Theme and Plot Annotations for 'ggplot2'
Welcome to the Tidyverse
Reshaping Data with the reshape Package
Default Color Maps from 'matplotlib'
ggpubr: 'ggplot2' Based Publication Ready Plots
Comparison of Saliva and Nasopharyngeal Swab Nucleic Acid Amplification Testing for Detection of SARS-CoV-2: A Systematic Review and Meta-analysis
Saliva viral load is a dynamic unifying correlate of COVID-19 severity and mortality
Comparison of Commercially Available and Laboratory-Developed Assays for In Vitro Detection of SARS-CoV-2 in Clinical Laboratories
A Comparison of Five SARS-CoV-2 Molecular Assays With Clinical Correlations
Comparison of Four Molecular In Vitro Diagnostic Assays for the Detection of SARS-CoV-2 in Nasopharyngeal Specimens
Laboratory Diagnosis of COVID-19: Current Issues and Challenges