key: cord-0530748-hf4m0akq
authors: Cohen, Alejandro; Shlezinger, Nir; Solomon, Amit; Eldar, Yonina C.; Médard, Muriel
title: Multi-Level Group Testing with Application to One-Shot Pooled COVID-19 Tests
date: 2020-10-12
journal: nan
DOI: nan
sha: 675063c4a8d414dbd949b4d8fa6101be05b61e91
doc_id: 530748
cord_uid: hf4m0akq

One of the main challenges in containing the Coronavirus disease 2019 (COVID-19) pandemic stems from the difficulty in carrying out efficient mass diagnosis over large populations. The leading method to test for COVID-19 infection utilizes quantitative polymerase chain reaction, implemented using dedicated machinery which can simultaneously process a limited number of samples. A candidate method to increase the test throughput is to examine pooled samples composed of a mixture of samples from different patients. In this work we study pooling-based COVID-19 tests. We identify the specific requirements of COVID-19 testing, including the need to characterize the infection level and to operate in a one-shot fashion, which limit the application of traditional group-testing (GT) methods. We then propose a multi-level GT scheme, designed specifically to meet the unique requirements of COVID-19 tests, while exploiting the strength of GT theory to enable accurate recovery using far fewer tests than patients. Our numerical results demonstrate that multi-level GT reliably and efficiently detects the infection levels, while achieving improved accuracy over previously proposed one-shot COVID-19 pooled-testing methods.

and was utilized for pooled COVID-19 tests in [10]-[15].

The main difference between classical GT and CS is that GT deals with group detection problems, which results in binary variables, i.e., each subject can be either infected or not infected [8], while CS-based methods result in real-valued variables. Although group testing is traditionally adaptive, requiring multiple sequential tests based on previous ones [16], it can also be applied in a one-shot (non-adaptive) manner [18]. CS focuses on the one-shot recovery of sparse real-valued vectors from lower-dimensional linear projections, and thus each subject can take any real value [17]. The additional domain knowledge of GT, namely, the more restricted domain over which it operates compared to CS, allows it in some applications to achieve improved recovery guarantees over CS, in terms of fewer measurements required, as suggested in [19], [20] for the quantization of sparse signals. Nonetheless, in the context of COVID-19 testing, one is often interested not only in detecting whether or not a patient is infected, but also in some score on the level of the viral load [11], as can be provided using CS tools. The fact that each of these mathematical frameworks has its own pros and cons for COVID-19 testing motivates the design of a recovery method which combines GT with one-shot operation and multi-level detection, as in CS.

In this work we propose a multi-level GT recovery scheme for pooled COVID-19 testing, which is a GT-based method designed to account for the unique characteristics of pooled COVID-19 tests. The proposed technique extends GT schemes to detect multiple levels of viral load, building upon our previous results on combining GT with CS concepts and multi-level discretization in [19] . The resulting multi-level GT scheme operates in a one-shot manner, and is designed to avoid dilution due to mixing too many samples in each pool [21] .

We begin by identifying the specific requirements which arise from the setup of pooled COVID-19 testing. In light of these requirements, we derive the multi-level GT method, which is comprised of a dedicated testing matrix, i.e., pooling pattern, as well as a GT-based recovery method. Our numerical evaluations, which use the model proposed in [11] for pooled RT-qPCR testing, demonstrate that the proposed multi-level GT scheme reliably recovers the infection levels, while operating at a limited computational burden, and achieving improved accuracy over previously proposed CS-based approaches.

The rest of this paper is organized as follows: In Section II we review the pooled COVID-19 testing setup. Section III presents the proposed multi-level GT scheme. Section IV details the simulation study, and Section V provides concluding remarks.

Throughout the paper, we use boldface lower-case letters for vectors, e.g., $\mathbf{x}$. Matrices are denoted with boldface upper-case letters, e.g., $\mathbf{M}$. Sets are expressed with calligraphic letters, e.g., $\mathcal{X}$, and $\mathcal{X}^n$ is the $n$th-order Cartesian power of $\mathcal{X}$.

II. SYSTEM MODEL

In this section we present the system model for which we derive the recovery algorithm described in Section III. As our main application is pooled COVID-19 tests, we begin by identifying the specific characteristics which arise from this application in Subsection II-A, based on which we present our problem formulation in Subsection II-B.

The common approach in testing for the presence of COVID-19 is based on the RT-qPCR method. Here, a sample is collected, most commonly swab-based. The presence of infection is then examined by RNA extraction via RT-qPCR measurements, quantifying the viral load in the sample. The RT-qPCR process is quite time consuming (typically on the order of several hours), and can simultaneously examine up to a given number m of inputs (commonly on the order of several tens of samples). This results in a major bottleneck, particularly when the number of patients, denoted by n, is much larger than m.

A candidate approach to reduce the test duration, which is considered in this paper, utilizes pooling [6] . Pooling is based on mixing the samples of groups of patients together, forming m mixed samples out of the overall n patients. Then, the presence of COVID-19 for each of the tested individuals is recovered from the mixed RT-qPCR measurements, either directly, i.e., in a one-shot fashion, or in an adaptive manner, which involves additional tests [22] . To formulate the problem of designing pooling-based recovery techniques for COVID-19 tests, we note the following characteristics of this testing procedure which have to be accounted for:

A1: The number of infected subjects, denoted by k, is much smaller than the number of tested individuals n. Typically k ≤ 0.1n, i.e., up to 10% of the tested population is infected.
A2: One is interested not only in identifying whether a subject is infected or not, but also in some discrete score on the viral load. For example, possible outputs are no (no virus), low (borderline), mid (infected), and high (highly infected).
A3: The RT-qPCR measurements are noisy, i.e., some level of random distortion is induced in the overall process, encapsulating the randomness in the acquisition of the samples, their mixing, and the RT-qPCR reading.
A4: It is preferable to carry out one-shot tests, i.e., fully identify all subjects from a single RT-qPCR operation, without having to carry out additional tests based on the results.
A5: There is a limit, denoted by L > 1, on the number of subjects which can be mixed together in a single measurement. A typical limit on the number of subjects mixed together is L = 32 [6]. Furthermore, the portion taken from each sample for the pooled measurements is identical, e.g., one cannot mix 30% from one patient with 70% from another patient into a single pool.

An illustration of the overall flow of pooled RT-qPCR-based COVID-19 testing along with the desired one-shot recovery operation is depicted in Fig. 1. On the left side of the figure, we see the true viral loads of all n items. In particular, we see that the first item is infected at a medium level, the fourth item is infected at a low level, and all other items are not infected. Next, pooling is done based on a testing matrix, which is generated prior to obtaining the samples. For example, the first pooled test involves samples from the first, third, and fifth items. This results in a measurement vector, denoted by z. This vector is fed to the recovery algorithm, which is able to tell in one shot that the first item is infected at a medium level, the fourth item is infected at a low level, and all other items are not infected.

Based on the characteristics of pooled COVID-19 tests detailed above, we consider the following problem: Let $\mathbf{x} \in \mathbb{R}_+^n$ be a vector whose entries are the viral loads of the $n$ patients. By A1, $\mathbf{x}$ is $k$-sparse, i.e., $\|\mathbf{x}\|_0 \le k$. The pooling operation is represented by the matrix $\mathbf{A} \in \{0,1\}^{m \times n}$. Let $l(i) \le L$ denote the number of subjects mixed together in the $i$th pool, $i \in \{1, \ldots, m\}$. This implies that the $i$th row of $\mathbf{A}$, denoted $\mathbf{A}_i^T$, is $l(i)$-sparse by A5. The viral loads of the pooled samples are represented by the vector $\mathbf{z} \in \mathbb{R}_+^m$, whose entries are given by

$$z_i = \frac{1}{l(i)} \mathbf{A}_i^T \mathbf{x}, \qquad i \in \{1, \ldots, m\},$$

where the factor $1/l(i)$ and the structure of $\mathbf{A}$ guarantee that identical portions are taken from each sample in a pool-test, as required in A5. The RT-qPCR measurements, denoted by $\mathbf{y} \in \mathbb{R}_+^m$, are given by some stochastic mapping $f : \mathbb{R}_+ \to \mathbb{R}_+$ of $\mathbf{z}$, which represents the distortion detailed in A3. We write the measurement as

$$y_i = f(z_i), \qquad i \in \{1, \ldots, m\}.$$

To formulate our objective, we note that by A2, we are interested in recovering a discrete representation of the viral load. We thus define the discretization mapping $Q : \mathbb{R}_+ \to \mathcal{Q}$, where $\mathcal{Q}$ is a finite set containing the possible decisions. Our goal is thus to design an algorithm which maps the RT-qPCR measurements $\mathbf{y}$ into an estimate of the discretized viral loads, denoted by $\hat{\mathbf{x}} \in \mathcal{Q}^n$, for the objective of minimizing the error probability, defined as

$$\Pr\big(\hat{\mathbf{x}} \neq Q(\mathbf{x})\big),$$

where $Q(\cdot)$ is applied element-wise.

The fact that $\hat{\mathbf{x}}$ is obtained directly from $\mathbf{y}$ indicates that the algorithm operates in a one-shot fashion, as required in A4.

To conclude, given n subjects of which an unknown subset of size k is infected, the goal in multi-level pooled COVID-19 tests is to design an m × n one-shot measurement matrix which guarantees that at most L subjects are mixed in each pool-test, together with a recovery algorithm, such that by observing z we can identify the subset of infected items and the discrete representation of their viral loads.
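The pooling model of Subsection II-B can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the measurement model of [11]: the function names are hypothetical, and the noise mapping `noisy_measurement` is a placeholder for the stochastic mapping f (multiplicative Gaussian jitter chosen only for illustration). The toy matrix mirrors the example of Fig. 1, where the first pool mixes the first, third, and fifth subjects.

```python
import random

def pool_viral_loads(A, x):
    """Noiseless pooled loads: z_i = (1 / l(i)) * sum_j A[i][j] * x[j]."""
    z = []
    for row in A:
        l_i = sum(row)  # l(i): number of subjects mixed into pool i
        total = sum(a * xj for a, xj in zip(row, x))
        z.append(total / l_i if l_i else 0.0)
    return z

def noisy_measurement(z, rel_noise=0.05, rng=None):
    """Hypothetical stand-in for f: multiplicative jitter, clipped at zero."""
    rng = rng or random.Random(0)
    return [max(0.0, zi * (1.0 + rng.gauss(0.0, rel_noise))) for zi in z]

# Toy example: n = 5 subjects, m = 3 pools.
# Subject 0 carries a medium load, subject 3 a low load.
A = [[1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0],
     [0, 0, 0, 1, 1]]
x = [450.0, 0.0, 0.0, 120.0, 0.0]

z = pool_viral_loads(A, x)   # equal portions from every subject in a pool (A5)
y = noisy_measurement(z)     # distorted readings (A3)
print(z)
```

Note that the second pool contains no infected subject, so its pooled load stays exactly zero under this multiplicative noise assumption.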

III. MULTI-LEVEL GROUP TESTING

In this section, we provide an efficient scheme which implements GT with multiple decisions. The sparsity assumption A1 implies that the recovery of pooled RT-qPCR tests can be treated as a sparse recovery problem, as was also noted in previous works on pooled COVID-19 testing [7], [9]-[15]. Sparse recovery is typically studied under either the framework of GT [8], or that of CS [17]. Broadly speaking, GT deals with sparse recovery of binary variables, i.e., it can recover whether a subject is infected or not. In order to evaluate the actual levels of each tested subject, as required in A2, one would have to re-run the RT-qPCR test, violating requirement A4. The alternative approach of CS operates over the domain of real numbers, namely, it attempts to identify the exact cycle threshold or viral load for each subject, and thus tends to be less accurate compared to GT, as it does not exploit the fact that one is only interested in a discrete grade value by A2. This motivates the derivation of a dedicated algorithm for pooled COVID-19 recovery, which harnesses the strength of GT theory while extending it to the multi-level domain. The proposed multi-level GT method, designed in light of the model assumptions detailed in the previous section, is presented in Subsection III-A, followed by a discussion in Subsection III-B.

Multi-level GT is comprised of three components: The design of the testing matrix A; the pooling operation; and the recovery algorithm which determines the discrete level associated with each subject based on the results of the pooled tests. We next elaborate on each of these components.

1) Testing Matrix: To determine the testing matrix A, we first set the number of pool-tests to $m = (1+\varepsilon) k \log_2 n$, for some $\varepsilon > 0$. This is the sufficient number of pool-tests for reliable recovery in GT using the optimal maximum likelihood (ML) decoder [23], [24].

Once m is fixed, we proceed to setting A. The traditional GT method of generating A is to draw its elements in an i.i.d. fashion according to a Bernoulli distribution with parameter p. A common choice for p in GT theory is $p = 1 - 2^{-1/k}$, for which the probability of each element in z to be zero is 1/2. Since k is unknown in practice, p is chosen using a rough approximation of k. A major drawback of this approach is that A5 is not necessarily satisfied. We therefore consider an alternative method of generating A, which forces the columns of A, as well as the rows of A, to be "typical". That is, we want every column to have exactly $p \cdot m$ ones and every row to have exactly $p \cdot n \le L$ ones, where the bound on the row weight follows from A5. Since in practical COVID-19 testing setups one is interested in using a fixed deterministic matrix, rather than having to work with random matrices, we generate A once before the pooling starts. That is, the same testing matrix can be used for multiple pooling experiments.
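The design rules above can be made concrete with a short sketch. The source does not specify how the "typical" matrix is constructed, so the greedy balancing procedure below is an assumption introduced only for illustration: each column receives the same number of ones, always placed in the currently least-loaded pools, which keeps the row weights within one of each other. The function names and the value ε = 0.3 are likewise hypothetical.

```python
import math
import random

def num_pool_tests(n, k, eps):
    """m = (1 + eps) * k * log2(n), rounded up to an integer."""
    return math.ceil((1 + eps) * k * math.log2(n))

def bernoulli_parameter(k):
    """p = 1 - 2^(-1/k), so that a pool is negative with probability 1/2."""
    return 1 - 2 ** (-1 / k)

def typical_testing_matrix(n, m, col_weight, L, seed=0):
    """Assumed construction: fixed column weight, balanced row weights."""
    rng = random.Random(seed)
    A = [[0] * n for _ in range(m)]
    row_load = [0] * m
    for j in range(n):
        # Put subject j into the col_weight least-loaded pools (random ties).
        order = sorted(range(m), key=lambda i: (row_load[i], rng.random()))
        for i in order[:col_weight]:
            A[i][j] = 1
            row_load[i] += 1
    if max(row_load) > L:
        raise ValueError("pool size limit L exceeded; lower p or raise m")
    return A

n, k, eps, L = 105, 5, 0.3, 32
m = num_pool_tests(n, k, eps)
p = bernoulli_parameter(k)
col_weight = max(1, round(p * m))   # ones per column, roughly p * m
A = typical_testing_matrix(n, m, col_weight, L)
rows = [sum(r) for r in A]
print(m, col_weight, min(rows), max(rows))
```

Because the matrix is generated from a fixed seed, the same A can be reused across pooling experiments, in line with the deterministic-matrix requirement stated above.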

2) Pooling: After the testing matrix A is generated, it determines which samples are pooled together. The pooling process is done as detailed in Section II.

3) Recovery Algorithm: The proposed recovery algorithm is given in Algorithm 1. It operates in two main steps as detailed next.

In the first step, the algorithm efficiently identifies all of the definitely defective (DD) items in two stages, without determining the infection level.

In the first stage of the first step, the definitely not defective (DND) algorithm [25] is used (lines 2 and 14-22). Recall that the number of pool-tests m, corresponding to the testing matrix A, is fixed to allow ML detection. The DND algorithm attempts to match the columns of A with the vector y. In particular, if column j of A has a non-zero entry while the corresponding element in y is zero, the column is assumed not to correspond to a defective subject. This algorithm finds most of the subjects that are DND and drastically reduces the number of possible defective (PD) items. In practice, as demonstrated empirically in Fig. 2b, in the non-asymptotic regime that we consider herein for COVID-19, the set of subjects declared as PD after this stage, denoted $\mathcal{P}$, approximately satisfies $|\mathcal{P}| = O(k)$. The remaining $n - |\mathcal{P}|$ subjects are declared not defective.

In the second stage of the first step, the ML algorithm [23] is used only over the smaller set of PD subjects $\mathcal{P}$, to identify exactly the set $\mathcal{D}$ of k DD subjects (lines 3 and 23-25). The ML algorithm looks for a collection of k columns in A for which y is most likely. In the formulation, $Q_{0,1}(\cdot)$ is a binary quantization mapping; in the ML rule, $\mathcal{K}$ denotes the set of actual defective subjects, and $\Omega(\mathcal{P}, k)$ the set of $\binom{|\mathcal{P}|}{k}$ combinations of k defective subjects in $\mathcal{P}$.

In the second step, the algorithm estimates the infection level of each subject in the set of identified defective subjects $\mathcal{D}$, in an iterative fashion (lines 4-13). Let $\mathcal{D}(i)$ be the i-th element of $\mathcal{D}$, and let $\mathbf{A}_{\mathcal{D}}$ denote the matrix formed by the columns of A indexed by $\mathcal{D}$. For a test in which only one infected subject participates according to the testing matrix (lines 7-8), the algorithm can recover the viral load directly from the measurement (lines 9-10). To obtain a discrete score, the measured value is quantized using a threshold-based quantization mapping $Q(\cdot)$. Then the algorithm subtracts the viral load of that subject from all the tests in which it participates (line 11), and repeats until it recovers the infection levels of all declared infected subjects, denoted by $\mathcal{S}$ (line 6).
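The first step of the recovery (DND followed by ML restricted to the PD set) can be sketched as follows. This is a simplified illustration rather than Algorithm 1 itself: the function names are hypothetical, the binary quantization is assumed to be a plain threshold at zero, and the ML rule is replaced by an assumed surrogate that picks the k-subset whose predicted binary outcomes disagree with the observed ones in the fewest pools (which coincides with exact consistency when the readings are noiseless).

```python
from itertools import combinations

def binarize(y, tau=0.0):
    """Assumed binary quantization: a pool is positive if its reading exceeds tau."""
    return [int(v > tau) for v in y]

def dnd(A, y_bin):
    """Column matching: drop every subject that appears in a negative pool."""
    m, n = len(A), len(A[0])
    return [j for j in range(n)
            if all(y_bin[i] for i in range(m) if A[i][j])]

def ml_over_pd(A, y_bin, pd, k):
    """Search only the C(|P|, k) subsets of the PD set (surrogate for ML)."""
    best, best_mismatch = None, None
    for D in combinations(pd, k):
        pred = [int(any(row[j] for j in D)) for row in A]
        mismatch = sum(p != o for p, o in zip(pred, y_bin))
        if best is None or mismatch < best_mismatch:
            best, best_mismatch = D, mismatch
    return set(best) if best is not None else set()

# Toy example: 7 subjects, 6 pools, subjects 0 and 4 infected (k = 2).
A = [[1, 1, 1, 0, 0, 0, 0],
     [0, 0, 1, 1, 1, 0, 0],
     [0, 0, 0, 0, 1, 1, 1],
     [1, 0, 0, 1, 0, 0, 1],
     [0, 1, 0, 0, 0, 1, 0],
     [0, 0, 1, 0, 0, 0, 1]]
y = [150.0, 40.0, 40.0, 150.0, 0.0, 0.0]  # noiseless pooled loads

y_bin = binarize(y)
pd = dnd(A, y_bin)             # possibly defective subjects
D = ml_over_pd(A, y_bin, pd, k=2)
print(pd, sorted(D))
```

In this toy case DND removes four of the seven subjects, so the subset search inspects 3 candidate pairs instead of 21.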

The novelty of our algorithm stems from the efficiency in the first step with few pool-tests, and the iterative process in the second step, which determines the discrete infection level of each item. The ML procedure in lines 23-25 of Algorithm 1 returns

$$\arg\max_{\mathcal{D} \in \Omega(\mathcal{P}, k)} P\big(Q_{0,1}(\mathbf{y}) \mid \mathbf{A}, \mathcal{K} = \mathcal{D}\big).$$

Here, we note a few remarks arising from the proposed multi-level GT scheme.

In Subsection III-A we describe how the testing matrix A is generated. The description involves a random generation procedure, for which the resulting matrix is not guaranteed to satisfy A5. The motivation for using such random procedures as in GT theory stems from their provable performance guarantees [8] . Once a typical testing matrix satisfying A5 is selected, one does not have to generate a new matrix for each group of n patients.

According to GT theory, for n i.i.d. tested individuals, the algorithm which maximizes the probability of finding the infected items (though not necessarily their levels) is the ML algorithm. However, its complexity is burdensome, as it has to consider $\binom{n}{k}$ options [26], [27]. An efficient alternative is the DND algorithm, also known as column matching [18], [25], [28], whose time complexity is O(kn log n) [24]. However, it requires more pooled measurements than the ML algorithm in order to reliably identify the defective items. Our proposed multi-level GT method combines the concepts of DND with the ML algorithm, while extending them to operate over non-binary fields, i.e., to recover multiple levels rather than just identifying which subject is defective. Performing DND on all n items using the number of tests set to allow ML detection, i.e., $m = (1+\varepsilon) k \log_2 n$, results in a smaller set of PD subjects $\mathcal{P}$. Given $\mathcal{P}$, the ML decoder has to consider significantly fewer options, $\binom{|\mathcal{P}|}{k}$, which is likely to be computationally feasible and considerably faster than considering all $\binom{n}{k}$ combinations.

In the second step of the proposed algorithm, we analyze the tests in which only one infected subject participates according to A. This process, when restricted to operate over binary variables, was proposed as a means to identify the DD subjects in GT [25]. Unlike [25], we utilize this process in an iterative fashion after the DD subjects are already detected, in order to determine the viral load over the non-binary field. By subtracting the viral load of each detected subject in the iterative algorithm, we reduce the number of pool-tests required in [25] to that required by the ML algorithm.

In the case where it is possible to identify the number of infected subjects in a pool-test by observing its measurement, we noticed empirically that it is possible to achieve the lower bound on the number of pool-tests suggested for GT using the ML algorithm, i.e., $m \approx \log_2 \binom{n}{k}$, which scales as $k \log_2(n/k)$ [18], [23]. We leave this interesting case, as well as the analytical and complexity analysis of the proposed iterative algorithm, as future work.
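The iterative second step discussed above can be sketched as a peeling procedure. This is an illustrative reading of the description, not the authors' implementation: it assumes noiseless readings, undoes the 1/l(i) dilution factor of the pooling model before subtracting, and uses the four-level thresholds of the numerical study as the quantizer. The names `peel_levels` and `quantize` are hypothetical.

```python
def quantize(load):
    """Threshold quantizer with the four levels used in the numerical study."""
    if load < 50:
        return "no"
    if load < 300:
        return "low"
    if load < 700:
        return "mid"
    return "high"

def peel_levels(A, y, D):
    """Recover loads of the detected subjects D from single-infected pools."""
    m = len(A)
    pool_size = [sum(row) for row in A]
    # Undo the 1/l(i) dilution so each entry is a plain sum of viral loads.
    residual = [y[i] * pool_size[i] for i in range(m)]
    unresolved = set(D)
    loads = {}
    while unresolved:
        progress = False
        for i in range(m):
            members = [j for j in unresolved if A[i][j]]
            if len(members) == 1:          # exactly one unresolved infected subject
                j = members[0]
                loads[j] = residual[i]     # its load is read off directly
                for r in range(m):         # subtract it from every pool it joined
                    if A[r][j]:
                        residual[r] -= loads[j]
                unresolved.remove(j)
                progress = True
        if not progress:
            break  # no single-infected pool left; the rest stay unresolved
    return {j: quantize(v) for j, v in loads.items()}

# Toy example: subjects 0 and 1 are infected and share the first pool.
A = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
y = [450.0, 50.0, 0.0]          # noiseless pooled loads for x = [800, 100, 0]
levels = peel_levels(A, y, D={0, 1})
print(levels)
```

Subject 1 is resolved first from the second pool; only after its load is subtracted does the first pool contain a single unresolved subject, which is the situation the iterative subtraction is meant to create.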

We assessed the performance of our method using the RT-qPCR-based COVID-19 test model of [11]. The elements of the test matrix are chosen such that the rows and columns of A are typical. The number of items selected in the first example is n = 105, out of which k = 5 items are defective, and with 4 infection levels. The viral load of each defective item is drawn from a uniform distribution over [0, 1000] as in [11]. The infection level score is based on the following division of this interval into 4 regions: [0, 50) = no; [50, 300) = low (borderline); [300, 700) = mid; and [700, 1000] = high.
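The four-level division above corresponds to a threshold-based instance of the discretization mapping Q. A minimal sketch follows; the names `THRESHOLDS`, `LEVELS`, and `quantize` are illustrative, and assigning the boundary value 700 to the high level is an assumption about how the interval edges are treated.

```python
from bisect import bisect_right
import random

THRESHOLDS = [50, 300, 700]              # region boundaries on the viral load
LEVELS = ["no", "low", "mid", "high"]

def quantize(load):
    """Map a viral load in [0, 1000] to its discrete infection level."""
    return LEVELS[bisect_right(THRESHOLDS, load)]

# Draw k = 5 defective loads uniformly over [0, 1000], as in the setup above.
rng = random.Random(0)
loads = [rng.uniform(0, 1000) for _ in range(5)]
print([quantize(v) for v in loads])
```

Using `bisect_right` makes each region closed on the left and open on the right, matching the intervals [0, 50), [50, 300), and [300, 700).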

The success probability of Algorithm 1 versus m is depicted in Fig. 2a, compared to the ML lower/upper bound on the number of tests m from GT theory, as described in [23]. We evaluate two forms of the success probability: an error is declared either when at least one defective subject out of the n patients is not detected, or when there is at least one patient whose infection level is not correctly recovered. From the plot, we see that the success probability of finding the defective items and the success probability of finding the infection levels coincide. That is, whenever the defective set was recovered successfully, the correct infection levels were also estimated successfully, indicating the validity of step 2 of Algorithm 1. We also see that when the number of tests is at the ML upper bound, the probability of success approaches one. Fig. 2b shows the number of true positives, i.e., the number of defective subjects that are declared defective. We can see that when the number of tests m matches the upper bound, we have no false negatives with high probability. In the context of COVID-19, false negatives are considered to be the worst outcomes of a test. Fig. 2c illustrates the fraction of non-defective subjects detected after the DND stage in Algorithm 1. This is calculated as the number of non-defective items declared by the DND algorithm, divided by the total number of non-defective items. We see that when the number of tests is as the upper bound of ML dictates, the DND algorithm identifies ≈ 95/100 = 95% of the non-defective subjects as non-defective, so that |P| ≈ 2k. The remaining PD subjects are the candidates examined in the second-stage ML algorithm, demonstrating the notable complexity reduction achieved by the two-step process.
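The complexity reduction can be quantified with simple arithmetic for the setting n = 105, k = 5. The sketch below only restates the figures quoted above (95 of the 100 non-defective subjects cleared by DND) and counts the subsets an exhaustive ML search would visit before and after DND; the variable names are illustrative.

```python
import math

n, k = 105, 5
non_defective = n - k                 # 100 healthy subjects
cleared_by_dnd = 95                   # about 95% of them are ruled out by DND
pd_size = k + (non_defective - cleared_by_dnd)   # |P|, roughly 2k

full_search = math.comb(n, k)         # subsets visited by ML over all n subjects
reduced_search = math.comb(pd_size, k)  # subsets visited by ML over P only

print(pd_size)                        # 10
print(full_search)                    # 96560646
print(reduced_search)                 # 252
print(full_search // reduced_search)  # reduction factor, rounded down
print(math.log2(full_search))         # counting bound log2 C(n, k), in tests
```

The search space shrinks from roughly 10^8 subsets to a few hundred, which is what makes the ML stage tractable in practice.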

The results in Figs. 2a-2c evaluate the error probability, but do not capture which forms of errors are produced when using Algorithm 1. Therefore, we depict in Fig. 3 the confusion matrix for the considered scenario, as well as when repeating the setup with a much larger number of patients, n = 961, using merely m ∈ {70, 93} pool-tests. The values (n, m) ∈ {(105, 45), (961, 93)} were also used in [11], which applied CS tools for recovery. Observing Fig. 3, we note that most of the errors reported in fact correspond to identifying low-level and mid-level subjects as mid and high, respectively. Such errors are much less harmful in COVID-19 tests compared to reporting non-infected subjects as defective, which occurs only ≈ 0.1% of the time for (n, m) = (105, 45) in Fig. 3a, similar to the results achieved in [11] for such setups. This behavior is more notable when jointly testing n = 961 subjects in Figs. 3b-3c. Comparing these results to [11], we note that multi-level GT achieves improved false positive and false negative probabilities with only m = 70 test-pools compared to those achieved using all CS methods examined in [11] with m = 93 test-pools. For instance, for (n, m) = (961, 93), [11] reported false positive probabilities varying from 0.1% to 0.8%, while the corresponding probability in Fig. 3b is 0.0%. This indicates the potential of multi-level GT in facilitating pooled testing of large numbers of subjects.
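A confusion matrix of the kind shown in Fig. 3, together with false positive and false negative rates, can be computed as sketched below. The label lists are made-up toy data for illustration only and are not results from this study. Treating every level other than "no" as infected when computing the rates is also an assumption.

```python
from collections import Counter

LEVELS = ["no", "low", "mid", "high"]

def confusion_matrix(true_levels, est_levels):
    """cm[t][e] counts subjects with true level t that were reported as level e."""
    counts = Counter(zip(true_levels, est_levels))
    return [[counts[(t, e)] for e in LEVELS] for t in LEVELS]

def false_positive_rate(cm):
    """Share of non-infected subjects reported at any infected level."""
    healthy = sum(cm[0])
    return sum(cm[0][1:]) / healthy if healthy else 0.0

def false_negative_rate(cm):
    """Share of infected subjects (any level) reported as 'no'."""
    infected = sum(sum(row) for row in cm[1:])
    missed = sum(row[0] for row in cm[1:])
    return missed / infected if infected else 0.0

# Toy labels (illustrative only): two level over-estimates and one false positive.
true_levels = ["no", "no", "no", "low", "mid", "high"]
est_levels = ["no", "no", "low", "mid", "high", "high"]

cm = confusion_matrix(true_levels, est_levels)
fpr = false_positive_rate(cm)
fnr = false_negative_rate(cm)
for level, row in zip(LEVELS, cm):
    print(level, row)
```

Separating level over-estimates (off-diagonal entries to the right of the diagonal) from false positives and false negatives reflects the distinction drawn above between mild and harmful error types.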

In this work we proposed a scheme coined multi-level GT for one-shot pooled COVID-19 tests. We first identified the unique characteristics and requirements of RT-qPCR-based COVID-19 tests. Based on these requirements, we designed multi-level GT to combine traditional GT methods with one-shot operation and multi-level outputs, while implementing a preliminary DND detection mechanism to facilitate recovery at reduced complexity. Our numerical evaluations demonstrate that multi-level GT reliably identifies the infection levels when examining a much smaller number of samples compared to the number of tested subjects.

[1] COVID-19 epidemic in Switzerland: on the importance of testing, contact tracing and isolation

[2] Fair allocation of scarce medical resources in the time of Covid-19

[3] An ultrasensitive, rapid, and portable coronavirus SARS-CoV-2 sequence detection method based on CRISPR-Cas12

[4] SARS-CoV-2 on-the-spot virus detection directly from patients

[5] Quantification of mRNA using real-time RT-PCR

[6] Evaluation of COVID-19 RT-qPCR test in multi-sample pools

[7] Boosting test-efficiency by pooled testing strategies for SARS-CoV-2

[8] Group testing and sparse signal recovery

[9] Large-scale implementation of pooled RNA extraction and RT-PCR for SARS-CoV-2 detection

[10] Efficient high throughput SARS-CoV-2 testing to detect asymptomatic carriers

[11] A compressed sensing approach to group-testing for COVID-19 detection

[12] Low-cost and high-throughput testing of COVID-19 viruses and antibodies via compressed sensing: System concepts and computational experiments

[13] Practical high-throughput, non-adaptive and noise-robust SARS-CoV-2 testing

[14] Error correction codes for COVID-19 virus and antibody testing: Using pooled testing to increase test reliability

[15] Noisy pooled PCR for virus testing

[16] The detection of defective members of large populations

[17] Compressed sensing: theory and applications

[18] Non-adaptive group testing: Explicit bounds and novel algorithms

[19] Serial quantization for sparse time sequences

[20] Distributed quantization for sparse time sequences

[21] Optimal pooling matrix design for group testing with dilution (row degree) constraints

[22] A two-stage adaptive group-testing procedure for estimating small proportions

[23] Boolean compressed sensing and noisy group testing

[24] Secure group testing

[25] Group testing algorithms: Bounds and simulations

[26] Information-theoretic and algorithmic thresholds for group testing

[27] Group testing: an information theory perspective

[28] Nonrandom binary superimposed codes