key: cord-0824672-y7df0tuo authors: Pollock, Nira R; Tran, Kristine; Jacobs, Jesica R; Cranston, Amber E; Smith, Sita; O’Kane, Claire Y; Roady, Tyler J; Moran, Anne; Scarry, Alison; Carroll, Melissa; Volinsky, Leila; Perez, Gloria; Patel, Pinal; Gabriel, Stacey; Lennon, Niall J; Madoff, Lawrence C; Brown, Catherine; Smole, Sandra C title: Performance and Operational Evaluation of the Access Bio CareStart Rapid Antigen Test in a High-throughput Drive-through Community Testing Site in Massachusetts date: 2021-05-26 journal: Open Forum Infect Dis DOI: 10.1093/ofid/ofab243 sha: 18a9846883bc8b56d5e75de20719fb2e53621f70 doc_id: 824672 cord_uid: y7df0tuo BACKGROUND: To facilitate deployment of point-of-care testing for SARS-CoV-2, we evaluated the Access Bio CareStart COVID-19 Antigen test in a high-throughput, drive-through, free community testing site using anterior nasal (AN) swab RT-PCR for clinical testing. METHODS: Consenting symptomatic and asymptomatic children (≤18 years) and adults received dual AN swabs. CareStart testing was performed with temperature/humidity monitoring. All tests had two independent reads to assess inter-operator agreement. Patients with positive CareStart results were called and instructed to isolate pending RT-PCR results. The paired RT-PCR result was the reference for sensitivity and specificity calculations. RESULTS: Of 1603 participants, 1245 adults and 253 children had paired RT-PCR/CareStart results and complete symptom data. 83% of adults and 87% of children were asymptomatic. CareStart sensitivity/specificity were 84.8% (95% confidence interval [CI] 71.1-93.7)/97.2% (92.0-99.4) and 85.7% (42.1-99.6)/89.5% (66.9-98.7) in adults and children, respectively, within 5 days of symptoms. Sensitivity/specificity were 50.0% (41.0-59.0)/99.1% (98.3-99.6) in asymptomatic adults and 51.4% (34.4-68.1)/97.8% (94.5-99.4) in asymptomatic children. 
Sensitivity in all 234 RT-PCR-positive people was 96.3% with cycle threshold (Ct) ≤25, 79.6% with Ct ≤30, and 61.4% with Ct ≤35. All 21 false positive CareStart tests had faint but normal bands. Inter-operator agreement was 99.5%. Operational challenges included identification of faint test bands and inconsistent swab elution volumes. CONCLUSIONS: CareStart had high sensitivity in people with Ct ≤25 and moderate sensitivity in symptomatic people overall. Specificity was unexpectedly lower in symptomatic versus asymptomatic people. Excellent inter-operator agreement was observed, but operational challenges indicate that operator training is warranted. Although nucleic acid amplification tests (NAATs) for SARS-CoV-2 can be highly sensitive and are being performed at high volumes in centralized laboratories worldwide (1, 2), global testing capacity (4) (5) (6) (7) (8) (9) (10). However, variability in sensitivity estimates yielded from field studies of individual Ag RDTs (e.g., the Abbott BinaxNOW COVID-19 Ag Card (6, 8, 10)) has reinforced the fact that the performance of an Ag RDT must be established in the settings, conditions, and populations of intended use. Although nasopharyngeal (NP) sampling remains the reference method, anterior nasal (AN) sampling substantially increases testing access and acceptability, and a recent comparison study showed that sensitivity with self-collected nasal mid-turbinate swabs was similar to that with professionally collected NP swab samples (11). The Access Bio CareStart COVID-19 Antigen (Ag) test has FDA EUA for AN swab samples (12) and can provide visually read results at POC in 10 minutes. 
The potential for use of this test at large scale, and the paucity of data for test performance in asymptomatic individuals and in children, motivated us to perform an implementation and performance evaluation in a high-volume, high-prevalence community testing site currently using AN swab RT-PCR for clinical testing. The study was performed from January 11 to January 22, 2021, at the Lawrence General Hospital "Stop the Spread" drive-through testing site, which accommodates Massachusetts residents from the surrounding area. CareStart testing was performed under the site's CLIA waiver. No study-specific effort was made to recruit individuals to present to the testing site. Two of seven drive-through lanes were utilized for the study. Verbal consent for dual AN swabbing was obtained from adults and guardians of minors (with verbal assent for ages 7-17). Participants were informed of the Ag RDT results reporting plan (below). Presence or absence of symptoms (sore throat, cough, chills, body aches, shortness of breath, fever, runny nose, congestion, nausea, vomiting, diarrhea, loss of taste or smell) was recorded for each participant, including the date of symptom onset. Participants whose symptoms started on the day of testing were classified as Day 0. The study was reviewed by the Massachusetts Department of Public Health IRB and deemed not human subjects research. Verbal consent was obtained as above. Cars with consenting patients were marked with a glass marker, notifying the specimen collector to collect two AN swabs rather than one. Swab collection details are in Supplementary Methods; in brief, collection involved swabbing both nostrils with each swab, and operators alternated which swab was collected first (for RT-PCR vs. CareStart). CareStart swabs were captured in an empty sterile tube and taken to the testing trailer by a designated "runner." 
Time of sample collection was recorded, and CareStart tests were initiated within an hour of collection. The test was performed by trained operators (Master's or PhD level laboratorians) according to the manufacturer instructions for use (IFU) (12); note that testing of individuals with symptoms >5 days or without symptoms is off-label. Details of kit storage, quality control, and testing and results

Two different lots of CareStart kits were used for the study. Each operator was able to set up and read 20 tests per hour, and two operators were able to manage testing of samples coming from two drive-through lanes. 1493/1498 (99.7%) tests were initiated within 1 hour of collection (a window approved by the test manufacturer prior to study start); the median interval between sample collection and test initiation was 31 minutes (range 12-103 min) and the 5 tests performed at ≥1 hour were all negative (both Ag and paired RT-PCR results). All tests were read within the requisite 5-minute window per the EUA IFU (12). Temperature and humidity in the testing trailer (Supplementary Methods) from 7:30 AM to 6:00 PM ranged from 70.5-74.3°F and 11.7-40.9%, respectively. All testing and kit storage temperatures met manufacturer recommendations (12). Of 1603 participants [excluding those with invalid or missing RT-PCR results (n=48) and those with missing clinical data (n=57)], 1498 had paired RT-PCR/CareStart results and complete symptom data, including 221 asymptomatic children, 1036 asymptomatic adults, 32 symptomatic children, and 209 symptomatic adults. Symptomatic individuals were further classified by days (D) since symptom onset; both cutoffs of ≤5D and ≤7D of symptoms were evaluated given that the CareStart test EUA is for individuals within 5D of symptom onset (12), but 7D is a window used by several other commercial Ag RDTs (3). 
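
The cohort counts reported above can be cross-checked with simple arithmetic. The following is a minimal sketch, not the authors' analysis code; the variable and function names are illustrative assumptions, and every number is copied from the text.

```python
# Counts copied from the text; names are hypothetical and for illustration only.
enrolled = 1603
excluded_rtpcr = 48      # invalid or missing RT-PCR result
excluded_clinical = 57   # missing clinical data

analyzed = enrolled - excluded_rtpcr - excluded_clinical

# Subgroup sizes as reported: (age group, symptom status) -> participants
subgroups = {
    ("child", "asymptomatic"): 221,
    ("adult", "asymptomatic"): 1036,
    ("child", "symptomatic"): 32,
    ("adult", "symptomatic"): 209,
}


def total(age_group=None, status=None):
    """Sum subgroup sizes, optionally filtered by age group and/or status."""
    return sum(
        n
        for (age, stat), n in subgroups.items()
        if (age_group is None or age == age_group)
        and (status is None or stat == status)
    )


adults = total(age_group="adult")
children = total(age_group="child")

# Share of tests initiated within 1 hour of collection (1493 of the analyzed set)
on_time_fraction = 1493 / analyzed

print(f"analyzed participants: {analyzed}")
print(f"adults: {adults}, children: {children}")
print(f"asymptomatic adults: {total('adult', 'asymptomatic') / adults:.0%}")
print(f"asymptomatic children: {total('child', 'asymptomatic') / children:.0%}")
print(f"initiated within 1 hour: {on_time_fraction:.1%}")
```

The subgroup totals agree with the 1245 adults and 253 children given in the abstract, and the exclusions account for the difference between 1603 and 1498.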
Clinical data for the study population are presented in Table 1 (demographics) and Supplementary Tables 1 and 2 (symptoms).

CareStart performance in adults and children (≤18 years old)

Sensitivity, specificity, PPV and NPV calculations for CareStart results vs. RT-PCR results as the reference, for each clinical subgroup, are presented in Table 2. Tables with data for each subgroup are presented in Supplementary Table 3. Sensitivity in adults with symptoms ≤5D was 84.8%, similar to that in the CareStart IFU (87.2%) (12). Sensitivity in children with symptoms ≤5D was 85.7%. Specificity in these symptomatic adults and children was 97.2% and 89.5%, respectively. Relative to symptomatic individuals, sensitivity in asymptomatic adults and children was lower at 50.0% and 51.4%, respectively, while specificity was higher (99.1% and 97.8%, respectively).

Operators noted that the volume of extraction buffer absorbed by the swab head was inconsistent, and that it was sometimes difficult to elute sufficient volume from the head of the swab for testing (a process that requires squeezing the sides of the extraction vial (12)). This issue required careful observation and, ultimately, experience to overcome. Additionally, operators noted that the polyester swab head did not seem completely stable (occasional apparent unravelling of the head surface, at an anecdotal rate of up to 5/200 tests per day). Coincidentally, the same swab brand was already in use for RT-PCR testing at this site, making it possible to confirm that this "unravelling" at the time of patient swabbing had already been observed over an extended time period with this particular swab. The operators found that a good deal of force was required to fit the caps on the extraction vials, and that the cap did not "click" as per the IFU, leading to concerns about spillage; they also found that peeling the foil off the extraction vial was difficult and sometimes led to dripping of buffer and slippery gloves. 
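
The sensitivity, specificity, PPV and NPV figures above all derive from a 2x2 table of CareStart versus RT-PCR results. The sketch below shows one way to compute them with 95% confidence intervals. It rests on two assumptions that are not stated in the text: that the intervals are exact (Clopper-Pearson) binomial intervals, and that the example counts (6 true positives, 1 false negative, 17 true negatives, 2 false positives) are hypothetical values chosen only because they are consistent with the percentages reported for children within 5D of symptoms. The actual counts are in Table 2 and Supplementary Table 3, which are not reproduced here.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root(increasing_fn, iters=100):
    """Bisection on [0, 1] for a function that increases through zero."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if increasing_fn(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(successes, n, alpha=0.05):
    """Clopper-Pearson interval (assumed method; the source does not name one)."""
    if successes == 0:
        lower = 0.0
    else:
        # Lower bound p solves P(X >= successes | p) = alpha / 2
        lower = _root(lambda p: (1 - binom_cdf(successes - 1, n, p)) - alpha / 2)
    if successes == n:
        upper = 1.0
    else:
        # Upper bound p solves P(X <= successes | p) = alpha / 2
        upper = _root(lambda p: alpha / 2 - binom_cdf(successes, n, p))
    return lower, upper


def diagnostic_metrics(tp, fp, fn, tn):
    """Point estimates and exact CIs with RT-PCR as the reference standard."""
    return {
        "sensitivity": (tp / (tp + fn), exact_ci(tp, tp + fn)),
        "specificity": (tn / (tn + fp), exact_ci(tn, tn + fp)),
        "ppv": (tp / (tp + fp), exact_ci(tp, tp + fp)),
        "npv": (tn / (tn + fn), exact_ci(tn, tn + fn)),
    }


# Hypothetical counts for illustration only (see lead-in)
metrics = diagnostic_metrics(tp=6, fp=2, fn=1, tn=17)
for name, (estimate, (low, high)) in metrics.items():
    print(f"{name}: {estimate:.1%} (95% CI {low:.1%}-{high:.1%})")
```

With small denominators such as these, the interval is wide, which is the same point the limitations paragraph makes about the symptomatic pediatric cohort.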
Each skilled laboratorian was able to perform and read ~20 tests per hour; operators felt that throughput was limited by the short read time window (5 minutes). No invalid CareStart test results were observed.

The development of Ag RDTs offers the opportunity to dramatically expand COVID-19 testing capacity and also raises critical questions about how these tests could and should be used. Field evaluation of an Ag RDT at POC in the settings and populations of intended use can add tremendously to the performance data available in manufacturer package inserts and guide test deployment. Gaps in performance data, particularly test performance in asymptomatic adults and both symptomatic and asymptomatic children, must be filled in order to optimally deploy Ag RDTs. Prior to this study, only minimal data for performance of the CareStart test in symptomatic individuals were available in the manufacturer's IFU (12). In order to understand how well the CareStart RDT could perform in both symptomatic and asymptomatic adults and children in a real-world but also best-case testing scenario, we implemented the test at a high-volume community testing site already experienced in collecting AN samples for RT-PCR. The CareStart test was performed by trained laboratory personnel, with careful attention paid to sample collection, results documentation, and quality control. We found that the CareStart test had high sensitivity in individuals with highest viral burden (96.3% sensitive with paired PCR Ct value ≤25) and moderate sensitivity (84.8%/85.7%) in symptomatic adults/children (≤5D of symptoms), respectively (acceptable per the FDA's target of ≥80% (14, 15)). Sensitivity in symptomatic individuals with ≤5D of symptoms (the time frame recommended in the EUA IFU (12)) was comparable to sensitivity in those with ≤7D of symptoms (the time frame recommended for some other Ag RDTs like BinaxNOW (16)). 
Sensitivity in asymptomatic adults and children was substantially lower than that in symptomatic individuals, which may correspond with the broad viral load distribution observed in this population (likely capturing early and late infections given unknown disease onset). Thus, the test does not appear to be optimal for ruling out SARS-CoV-2 infection in asymptomatic adults or children; use in serial testing programs and for testing of contacts of known cases deserves independent study. FDA does provide guidance for consideration of serial Ag testing if the sensitivity is lower, e.g., 70% (14, 15). Unexpectedly, we found that specificity of the CareStart test was lower in symptomatic people than in asymptomatic people: specificity in adults/children within 5D of symptoms was 97.2%/89.5% and in asymptomatic adults/children was 99.1%/97.8%, respectively. This pattern was not observed in our BinaxNOW study (100% specificity in people within 7D of symptoms, and 99.6%/99.0% specificity in asymptomatic adults/children, respectively (6)) nor in the Access Bio prospective AN swab study detailed in the CareStart EUA IFU (100% specificity in symptomatic individuals (12)). This specificity is also lower than that observed in a number of other field studies of visually read Ag RDTs, including the BinaxNOW, SD Biosensor SD Q, and Abbott PanBio RDTs (>99% for all (4-10)). This finding might suggest a pre-analytical issue unique to this test or to this study (e.g., mucus on the swab, or the swab itself), but we did not see any obvious overrepresentation of either nasal congestion/rhinorrhea or visible blood/mucus on the swabs of those with false positive results. Cross-reactivity with another pathogen in symptomatic patients is another possible explanation. No unusual band morphologies were noted in the 21 false positive results, and all were faint positive bands. 
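
To illustrate why the sensitivity and specificity differences discussed above matter in practice, predictive values can be derived from sensitivity, specificity, and prevalence using Bayes' rule. The sketch below is an illustration only: the sensitivity and specificity values are the point estimates reported in the text, but the prevalence values are hypothetical and are not taken from the study, and the function name is an assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Point estimates reported in the text
groups = {
    "adults, symptoms <=5D": (0.848, 0.972),
    "asymptomatic adults": (0.500, 0.991),
}

# Hypothetical prevalence values, chosen only to show the trend
for prevalence in (0.01, 0.05, 0.10, 0.20):
    for label, (sens, spec) in groups.items():
        ppv, npv = predictive_values(sens, spec, prevalence)
        print(f"prevalence {prevalence:.0%} | {label}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Under these assumptions, PPV falls quickly as prevalence drops, even when specificity is above 97%, which is one reason the text calls for further confirmation of test specificity.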
We noted occasional "unravelling" of the swab head which might have contributed; because this issue was infrequent, we did not document when it occurred and thus are unable to correlate this event with false positive results. The overall variability we observed in absorption of extraction buffer by the swab head and subsequent elution volume may or may not have contributed to lower specificity. We note that the swab used for this study [SteriPack Sterile Polyester Spun Swab, 3" (Lakeland, Florida)] is the same swab that was used in the CareStart EUA study and will be included in the AN kit going forward. This same swab has been consistently used for RT-PCR testing over the past year at this site, with the same occasional "unravelling" noted at the time of sample collection, indicating that this does not appear to be a lot issue. Logistics of high-volume sample collection and testing at the site required a window of time between collection and testing (median 31 minutes); it was not possible to put each swab "immediately" into extraction buffer as stated in the IFU (12), and our window of one hour between collection and testing was pre-approved by the test manufacturer. Test specificity will need further confirmation in future studies. Our study yielded some important operational findings relevant to test implementation. Inter-operator agreement on positive/negative results was near 100%, confirming that only one person is needed to read each test result. The main challenge to reading the test was distinguishing a faint positive band from a negative result; operators attributed this in part to the blue color of the faint positive band resembling a shadow, and recommended use of a strong light source in close proximity to the test device during test reading. The requirement for extraction of the swab in buffer introduced multiple operational challenges. 
The volume of extraction buffer absorbed by the swab head appeared to be inconsistent, and operators sometimes had difficulty eluting sufficient volume from the head of the swab for testing [by squeezing the sides of the extraction vial as per the IFU (12)]. This issue required careful observation and over time became easier for the operators. The occasional unravelling of the swab head in buffer (anecdotally, up to 5/200 tests/day) is described above. Operators noted that it was difficult to peel the foil off the extraction vial while wearing gloves. Additionally, a residual, small drop of buffer on the inner lid of the foil sometimes made gloves slippery with buffer, and operators had difficulty fitting the caps securely on the extraction vials, both of which led to concern about dropping vials during the extraction step. Each skilled laboratorian in the study was able to perform and read ~20 tests per hour; although the test only takes 10 minutes to perform, the short read time window (5 minutes) required frequent breaks in test setup and thus decreased throughput. In sum, these operational challenges indicate that dedicated operator training, beyond simply reading the IFU, is warranted for performance of this test to highlight potential failure modes. This recommendation for additional training is consistent with studies that have suggested that specific training in reading positive Ag RDT results may be needed to achieve high specificity (7, 8), and others that have suggested that the level of training of the operator impacts Ag RDT clinical sensitivity (17). Our study had some limitations. We recognize that the comparator in our study was RT-PCR performed on an AN swab, as opposed to an NP swab, which is still considered the reference method by the FDA (18). This dual AN swab study design was also used for our recent BinaxNOW study (6). 
Although AN swabs have had lower sensitivity than NP swabs in some studies, the sensitivity is highly dependent on the sampling technique and assay used (19). The dry AN swab sampling method used in this study has been shown to have similar sensitivity to paired NP swabs in transport media (13). We also note that a recent comparison study demonstrated that Ag RDT performance with nasal mid-turbinate swabs was similar to Ag RDT performance with NP swabs (11). The time interval between sample collection and test initiation in this study is discussed above. Finally, we recognize that our symptomatic pediatric cohort was relatively small and thus the confidence intervals on all performance estimates relatively wide.

In summary, the Access Bio CareStart Ag RDT had high sensitivity in individuals with high viral burden (Ct ≤25) and moderate sensitivity in symptomatic individuals overall. Observed specificity was lower than estimates in the manufacturer IFU, slightly lower than some other visually read Ag RDT products on the market, and unexpectedly lower in symptomatic versus asymptomatic individuals, warranting additional study. Excellent inter-operator agreement was observed, but operational challenges indicate that operator training is warranted to highlight possible test failure modes.

Individual EUAs for Antigen Diagnostic Tests for SARS-CoV-2
Evaluation of the accuracy, ease of use and limit of detection of novel, rapid, antigen-detecting point-of-care diagnostics for SARS-CoV-2
Clinical evaluation of the Roche/SD Biosensor rapid antigen test with symptomatic, nonhospitalized patients in a municipal health service drive-through testing site
Abbott BinaxNOW Rapid Antigen Test in a High-throughput Drive-through Community Testing Site in Massachusetts
Performance characteristics of a rapid SARS-CoV-2 antigen detection assay at a public plaza testing site in San Francisco 2020.
Field performance and public health response using the BinaxNOW™ Rapid SARS-CoV-2 antigen detection assay during community-based testing
Field evaluation of a rapid antigen test (Panbio COVID-19 Ag Rapid Test Device) for COVID-19 diagnosis in primary healthcare centres
Evaluation of Abbott BinaxNOW Rapid Antigen Test for SARS-CoV-2 Infection at Two Community-Based Testing Sites - Pima County
Head-to-head comparison of SARS-CoV-2 antigen-detecting rapid test with self-collected anterior nasal swab versus professional-collected nasopharyngeal swab
Package Insert for the Access Bio CareStart COVID-19 Antigen Test
Policy for Coronavirus Disease-2019 Tests During the Public Health Emergency (Revised)
Package Insert for the Abbott BinaxNOW COVID-19 Ag CARD
Preliminary report from the Joint PHE Porton Down & University of Oxford SARS
and Drug Administration. FAQs on Testing for SARS-CoV-2
Performance of Saliva, Oropharyngeal Swabs, and Nasal Swabs for SARS-CoV-2 Molecular Detection: A Systematic Review and Meta-analysis

Prevalence is the % of positive RT-PCR results in the population described