key: cord-1043706-v78cthr8
authors: Verwilt, J.; Mestdagh, P.; Vandesompele, J.
title: Evaluation of efficiency and sensitivity of 1D and 2D sample pooling strategies for diagnostic screening purposes
date: 2020-07-20
journal: nan
DOI: 10.1101/2020.07.17.20152702
sha: 9cb0780b5aa3c2f0cfbecfe5c1f05c1162ed1272
doc_id: 1043706
cord_uid: v78cthr8

As SARS-CoV-2 continues to spread around the world, testing facilities are forced to massively increase their testing capacity to handle the growing number of samples. While sample pooling methods have been proposed or are effectively implemented in some labs, no systematic and large-scale simulations have been performed using real-life quantitative data from testing facilities. Here, we use anonymous data from 1632 positive cases to simulate and compare 1D and 2D pooling strategies. We show that the choice of pooling method and pool size is an intricate decision with a prevalence-dependent efficiency-sensitivity trade-off.

medRxiv preprint doi: https://doi.org/10.1101/2020.07.17.20152702; this version posted July 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.

becomes undetectable, resulting in false negative observations [9-11]. A second limitation is that an increase in sample manipulations augments the risk of cross-contamination and sample mix-ups, which can lead to false negatives and false positives [6]. Other drawbacks are unique to specific strategies. The P-BEST pooling protocol is very time consuming, even when using a pipetting robot [7]. The repeated pooling method, on the other hand, suffers from a complicated re-pooling scheme [1].

Although the number of preprints and peer-reviewed publications on pooling strategies for COVID-19 PCR-based testing has increased rapidly throughout the pandemic, some important insights are still lacking. First of all, the proposed optimal pooling strategy is often based on a binary classification of samples as either positive or negative. However, this Boolean approach is not in accordance with the real-world situation and does not allow for investigating the dilution effect of pooling. Second, when using a quantitative representation of the viral loads, these values need to reflect real-life data, as patients present a very wide range of viral loads. This is reflected in the wide spectrum of Cq values reported by PCR-based tests [12]. Finally, as pooling is most effective for population-wide screening (where a very low prevalence is expected) it is important to determine the performance of strategies when encountering a low fraction of positive samples.

Here, we evaluate one-time (or 1D) pooling and two-dimensional (2D) pooling (using practical microtiter plate format pool sizes) as promising strategies for massive, low-prevalence population screening using real-life RT-qPCR data from 1632 positive samples.

In order to evaluate the efficiency gain of the adopted pooling strategies, we calculated the number of tests that are needed to analyze all samples and divided the total number of samples by this number, so that an efficiency of n corresponds to n times fewer tests than individual testing. We calculated the median, minimum and maximum prevalence-specific efficiency for each pooling strategy (Figure 1).

First, we observe an inverse relationship between efficiency and prevalence over the evaluated prevalence range from 0.01% to 10% for all pooling strategies. Second, there is no single most efficient strategy, because this depends on the prevalence (notice crossings in Figure 1). Up to a prevalence of 0.36%, 1x24 is the most efficient strategy; from 0.40% to 2.51%, 16x24 is the most efficient; from 2.82% to 4.47%, 12x16; and from 5.01% to 10%, 8x12. However, at high prevalence all strategies show a similar efficiency gain. Third, strategies employing a larger pool size display a higher efficiency when the prevalence is low, but as the prevalence increases, there is a tipping point for each strategy at which its smaller pool size variant becomes more efficient. At very low prevalence, the efficiency of a 1 x n pool approaches n and that of an m x n pool approaches (m x n) / (m + n). As a general trend, 2D pooling methods are less sensitive to changes in prevalence than 1D pooling methods. We conclude that the most efficient pool size very much depends on the prevalence, but 2D pooling methods generally are most efficient when prevalence is higher than 0.4%.

Sensitivity decreases with lower prevalence

primarily see that sensitivity increases with increasing prevalence. There is a non-linear relationship between sensitivity and prevalence. Furthermore, since prevalence is linked with efficiency and as a result indirectly linked with sensitivity, we note that an increase in efficiency comes with a decrease in sensitivity. We also observe that at a prevalence lower than 1%, there is an increased variation in sensitivity for different simulation cohorts. Thus, the increased efficiency at low prevalence comes with a low, pool size-dependent and problematically variable sensitivity.

In the first place, we note that the Cq value at which sensitivity loss starts to occur depends only on the 1D pool size or the largest dimension of the 2D pool, i.e. a Cq value of 35 for 1x4, 34 for 1x8, 33.4 for 1x12 and 8x12, 33 for 1x16 and 12x16, and 32.4 for 1x24 and 16x24. These Cq values are (as expected) identical to


Additionally, when Cq values larger than these cut-off values are systematically included, the sensitivity drops exponentially. The rate at which this reduction happens decreases when prevalence increases. Finally, when the prevalence is 10%, larger pool sizes result in a smaller reduction in sensitivity, but for all other visualized prevalence values the sensitivity decreases with larger pool size. Altogether, the extent to which low viral load samples contribute to the drop in sensitivity depends on pool size and pooling strategy, although the sensitivity decrease is most problematic when prevalence is low.

results confirm the widely accepted idea that sample pooling methods show a higher efficiency when prevalence is low [1-6,13] and that, for 1D and 2D pooling methods, as prevalence increases, a threshold is reached after which smaller pool sizes become more efficient [1,6]. However, appraising the performance of a pooling method exclusively by its efficiency would ignore one of the major drawbacks of pooling: loss of sensitivity due to dilution of the target. This issue becomes most pertinent when the viral load is low [9-11,13,14]. Our results confirm that all tested pooling methods suffer from false negatives, to a variable degree (Figure 2). This loss in sensitivity across all prevalence conditions generally precludes use of pooling for diagnostic testing

of prevalence on efficiency as well as sensitivity presents an important challenge, considering that, in order to make an informed decision on the preferential pooling strategy, the prevalence has to be known. By nature, we cannot know the exact prevalence before testing our samples, and as a result, the prevalence has to be estimated. In general, we show that it is of extreme importance that an optimal equilibrium between efficiency and sensitivity is achieved when deciding on the pooling strategy and corresponding pool size.
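The dilution effect behind this sensitivity loss can be made concrete with a small sketch. It assumes a perfectly efficient PCR, so that an n-fold dilution of a single positive sample raises its Cq by log2(n), and a detection cut-off of Cq 37. The cut-off is an assumption chosen here because it reproduces the reported Cq values at which sensitivity loss begins; the names `ASSUMED_CQ_CUTOFF`, `pooled_cq` and `highest_detectable_cq` are hypothetical.

```python
import math

# Assumption: single-sample detection cut-off. Chosen because it reproduces
# the Cq values reported in the text; it is not stated there.
ASSUMED_CQ_CUTOFF = 37.0

def pooled_cq(sample_cq, pool_size):
    """Expected Cq of a pool holding one positive sample diluted pool_size-fold,
    assuming 100% PCR efficiency (one extra cycle per two-fold dilution)."""
    return sample_cq + math.log2(pool_size)

def highest_detectable_cq(pool_size, cutoff=ASSUMED_CQ_CUTOFF):
    """Highest single-sample Cq that stays at or below the cut-off after pooling."""
    return cutoff - math.log2(pool_size)

# For 2D pools, the largest dimension determines the dilution factor.
strategies = [("1x4", 4), ("1x8", 8), ("1x12 and 8x12", 12),
              ("1x16 and 12x16", 16), ("1x24 and 16x24", 24)]
for name, largest_dimension in strategies:
    print(name, round(highest_detectable_cq(largest_dimension), 1))
# 1x4 35.0
# 1x8 34.0
# 1x12 and 8x12 33.4
# 1x16 and 12x16 33.0
# 1x24 and 16x24 32.4
```

Under these assumptions, a sample with a Cq of 34.5 would still be detected in a 1x4 pool (pooled Cq 36.5) but missed in a 1x8 pool (pooled Cq 37.5).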

The sensitivity is calculated as:
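Assuming the conventional definition of sensitivity (the authors' exact notation may differ), with TP the number of positive samples still identified as positive under a pooling strategy and FN the number of positive samples that are missed, this reads:

```latex
\text{sensitivity} = \frac{TP}{TP + FN}
```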


Application of pooled testing in screening and estimating the prevalence of Covid-19

Expanding Covid-19 Testing: Mathematical Guidelines for the Optimal Sample Pool Size Given Positive Test Rate

Sequential informed pooling approach to detect

Smart Pooled sample Testing for COVID-19: A Possible Solution for Sparsity of Test Kits

Efficient high throughput SARS-CoV-2 testing to detect asymptomatic carriers

Tapestry: A Single-Round Smart Pooling Technique for