The Statistics of Noisy One-Stage Group Testing in Outbreaks
Christoph Schumacher and Matthias Taufer
2020-11-23

Abstract. In one-stage or non-adaptive group testing, instead of testing every sample unit individually, they are split, bundled in pools, and simultaneously tested. The results are then decoded to infer the states of the individual items. This combines advantages of adaptive pooled testing, i.e. saving resources and higher throughput, with those of individual testing, e.g. short detection time and lean laboratory organisation, and might be suitable for screening during outbreaks. We study the COMP and NCOMP decoding algorithms for non-adaptive pooling strategies based on maximally disjunct pooling matrices with constant row and column sums in the linear prevalence regime and in the presence of noisy measurements motivated by PCR tests. We calculate sensitivity, specificity, the probabilities of Type I and II errors, and the expected number of items with a positive result, as well as the expected numbers of false positives and false negatives. We further provide estimates on the variance of the numbers of positive and false positive results. We conduct a thorough discussion of the calculations and bounds derived. Altogether, the article provides blueprints for screening strategies and tools to help decision makers tune them appropriately in an outbreak.

1. Introduction

Group testing addresses the problem of detecting a rare feature in a large population. By pooling sample units, one can often clear large subsets of the sample with a single test. After this first stage, classical group testing proceeds to retest items in positive pools. To this end, one splits the sample units into several pieces beforehand, pools only part of each item, and keeps the rest for retesting in subsequent stages. While this can save resources, these adaptive testing strategies are time-consuming and laborious, which is one reason preventing their widespread implementation [Rob20].

A way to mitigate these drawbacks is one-stage or non-adaptive group testing: in order to avoid a second stage, one includes (parts of) each sample unit in several pools and exonerates every sample unit which appears in a pool that tests negative. This decoding strategy is called Combinatorial Orthogonal Matching Pursuit (COMP). The tradeoff is a non-zero probability of false positive test results, which occur when a negative sample unit is "shadowed" by positive items, that is, when every pool that contains the falsely positive item also contains actually positive items. To minimize shadowing, we commit to particular pool designs referred to as multipools in [Täu20], which are based on maximally disjunct pooling matrices. This means that each pair of sample units meets in at most one pool. Still, the probability of falsely positive results has to be controlled for a reliable interpretation of the test results. To this end, we assume that every sample unit is independently infected with probability ρ. This scenario is often referred to as the linear regime and is a natural assumption in population screening. We also account for measurement errors with a noise model inspired by biomedical testing, which has been argued for in [ZRB20] and is given in formula (1.1). This can introduce false negative results, which we counter with the error-correcting noisy COMP (NCOMP) algorithm.
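To make the shadowing effect concrete, consider the following toy sketch (our illustration; the 3×4 design matrix is ad hoc and not one of the multipool designs studied in this article):

```python
import numpy as np

# Rows are pools, columns are items; entry 1 means the item is part of the pool.
A = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [1, 0, 1, 1]])

x = np.array([1, 0, 1, 0])                  # true states: items 0 and 2 infected
y = (A @ x > 0).astype(int)                 # noiseless pool results: all positive
# COMP: declare an item positive iff every pool containing it tests positive.
z = (A * y[:, None] + (1 - A)).min(axis=0)
print(y, z)                                 # y = [1 1 1], z = [1 1 1 1]
```

Items 1 and 3 are negative, but every pool containing them is contaminated by a positive item, so COMP flags them as (false) positives; this is precisely the shadowing described above.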
In this setting, we provide formulas for sensitivity, specificity, the probabilities of Type I and II errors, and the expected numbers of positive, false positive, and false negative results. We also provide bounds on the variance of the numbers of positive and false positive results in the noiseless case.

1.1. Motivation: screening via PCR. Real-time reverse transcription polymerase chain reaction (RT-PCR, or briefly PCR) is a biochemical procedure to identify certain DNA or RNA sequences and an important tool to detect infectious diseases. Its large-scale use can be constrained by factors such as the availability of collection devices (swab kits), of trained staff to take samples and their protective equipment, and of reagents, as well as by the number of PCR machines, lab staff, and logistics.

In epidemiological scenarios, one can distinguish different regimes of PCR application. Diagnostic testing happens in the clinical context with the goal to precisely measure the viral or bacterial load in a patient and inform clinical treatment. One wants to maximize accuracy and minimize detection time, whereas efficient use of resources and costs are of secondary concern. In contrast, screening takes place in a public health context, and the goal is to maximize the overall epidemiological or public health benefit with given resources. This typically means that one wants to prevent as many transmission events as possible, usually by identifying and isolating infectious carriers who might be pre-symptomatic or asymptomatic. A screening strategy increasing the overall number of people tested could therefore be justified even if it leads to a reduced accuracy of single tests, since this could be compensated by frequent retesting. This might be achieved by a range of measures such as self-swabbing, testing saliva instead of nasopharyngeal swabs [Vog+20], running the PCR for fewer cycles (testing for infectiousness instead of infection), optimization of the use of critical reagents in the lab, and also by adaptive and non-adaptive pooling, which we focus on here.

In the COVID-19 pandemic, large-scale screening has been suggested as an effective measure [Eur20], and pooled testing has been suggested as an approach to deal with scarce resources [Mal20; TRL20]. There has been some emphasis on two-stage, adaptive testing, where, after a first round of pooled tests, individuals in positive pools are assessed again [Liu+20; CC20]. One-stage strategies, where results become available after only one round of testing, have been suggested in [She+20; Täu20; Gho+20; ZRB20; PBJ20]. It has been shown that detection of the SARS-CoV-2 virus in pools of size 100 is possible, which promises massive improvements in throughput [Mut+20]. However, the implementation of pooling strategies will require a thorough understanding of the consequences involved, such as possible tradeoffs in accuracy. In this paper, we aim to contribute towards such an understanding by investigating statistical measures associated with one-stage pooling strategies.

1.2. Non-adaptive group testing. We focus on non-adaptive or one-stage group testing, where every person's sample is put into a number of pools according to a design matrix, all pools are tested in parallel, and the results are then decoded. We consider this preferable to two- or multi-stage strategies, since only one round of PCR is required, which offers shorter detection times and possibly a leaner organization of laboratory processes [Täu20].
In the context of the COVID-19 pandemic this is particularly important, because the viral load of the SARS-CoV-2 virus and the infectiousness have been observed to be high in patients before and around symptom onset [He+20; To+20; Ada+20; Kup20]. Thus, every hour between the sample being taken and the result being returned matters. We consider only binary PCR, where results are "positive" or "not positive". There exist approaches to pool testing for COVID-19 where, using compressed sensing, quantitative results of the PCR are also used in the reconstruction [Gho+20; PBJ20]. If a patient is identified in a screening process, this should probably be followed up by an individual test for clinical purposes, but this belongs to the realm of diagnostics, which is not the topic of this note.

Our pooling strategies will be based on design matrices with constant row and column sums and maximal disjunctness. Such designs have also been studied in combinatorics, where they are known as Steiner systems [CD07], and have been called multipools in [Täu20]. Constant row and column designs have been seen to be practical [Erl+15], and disjunctness is directly related to the maximal number of infected items for which perfect reconstruction is mathematically guaranteed in a noiseless scenario, see [AJS19]. We construct examples of such design matrices using linear Reed-Solomon codes [RS60; KS64] and the Shifted Transversal Design [Thi06], two constructions based on the same underlying algebraic principle.

We will consider noisy measurements where the noise model

(1.1)  P(pool tests negative | pool contains k positive items) = (1 − p_fp) · p_fn^k

depends on the number of truly infected items in a pool and contains two parameters p_fp and p_fn, modulating the false positive and false negative probabilities of a single measurement, respectively. This noise model is for instance argued for in [ZRB20] (see the short sampler sketch below). We will investigate the "simple" COMP decoding algorithm, cf. [AJS19] for an overview, and its error-correcting brother, the NCOMP decoding algorithm [Cha+11; Cha+14]. Both are trivial to implement with minimal run-time and storage.

1.3. Statistical measures of non-adaptive group testing. In order to evaluate a testing strategy, several quantities can be considered. They may depend on p_fn, p_fp, the prevalence ρ ∈ (0, 1), and parameters such as the pool size q, the number m of pools each item participates in, and a parameter δ tuning the NCOMP algorithm. The first quantity is the compression ratio, that is, the inverse of the average number of tests required per item. It describes the savings compared to individual testing. Note that in two-stage strategies, not only the average number of tests, but also the variance or standard deviation of the number of tests used for a given population size are relevant. This is because the number of tests needed in the second round is unknown, and this process of re-assessing pools can create logistical challenges. In one-stage strategies, the number of tests per item is a fixed number and has zero variance. We consider this another advantage of non-adaptive over adaptive testing. The savings in tests are to be compared to possible sacrifices in accuracy. In the literature, one finds investigations of:
• the maximal number of infected items which are guaranteed to be correctly identified if there is no noise [Maz12],
• the minimal number of tests required to asymptotically achieve full reconstruction in the sublinear prevalence regime [AJS19].
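As mentioned above, the noise model (1.1) is simple to simulate; the following sampler is our illustration, with arbitrary parameter values:

```python
import random

def pool_result(k, p_fp, p_fn):
    """Sample a pool outcome under noise model (1.1):
    P(pool tests negative | k positive items in the pool) = (1 - p_fp) * p_fn**k.
    Returns 1 for a positive and 0 for a negative pool result."""
    p_negative = (1 - p_fp) * p_fn ** k
    return 0 if random.random() < p_negative else 1

# A pool with no positives is still positive with probability p_fp, while
# a pool with k = 3 positives is missed with probability (1 - p_fp) * p_fn**3.
print(pool_result(0, p_fp=0.01, p_fn=0.1), pool_result(3, p_fp=0.01, p_fn=0.1))
```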
Both of these metrics might not be ideal for the application to screening: on the one hand because they do not take noise into account, on the other hand because they are either tailored towards worst-case scenarios or work in asymptotic limits and in the so-called sublinear regime, where the portion of infected persons is assumed to tend to zero with growing population size. Instead, one would rather like to study the average performance or the average number of false positive results [Maz12]. In the literature, one finds investigations where, for a random draw of a fixed number of infected items, quantities such as the number T_fp of false positive results and the number T_fn of false negative results are simulated [She+20]. Such a fixed number of infected patients in a pool is a simplification which ignores the true probabilistic structure of the infections. We believe that an approach better suited to inform decision making is to investigate, for a given prevalence ρ:

• the sensitivity, that is, the probability that an infected item is actually picked up by the testing strategy:

(1.2)  sens = P(test result positive | patient infected),

• the specificity, that is, the probability that a non-infected item is correctly identified as negative:

(1.3)  spec = P(test result negative | patient not infected).

In addition to that, there are two more quantities which matter from a public health perspective, since they tell an individual how reliable their result is. Indeed, there exist situations where a testing strategy has both high sensitivity and high specificity, but still most positive results will be false positives, see Table 1 for a synthetic example. This phenomenon is also known as the screening paradox and can be disadvantageous, since patients might be reluctant to comply with public health measures based on these probabilities. Therefore, we also quantify:

• the Type I error, that is, the probability that a positive test result turns out to be a false positive:

P(patient not infected | test result positive),

• the Type II error, that is, the probability that a negative test result is a false negative:

P(patient infected | test result negative).

We also calculate the expected numbers of positive results, false positive results, and false negative results. Finally, we also estimate the variance of the numbers of positive and false positive tests in the noiseless case. We use the Efron-Stein inequality for this bound. Such an estimate is important because the number of false positive results seems to be a heavy-tailed random variable: most of the time, most items will be correctly identified, but in some rare cases, when the random number of infected items in the pools exceeds a certain threshold, a phase transition occurs and an overwhelming number of items in the test will be erroneously flagged as positive. Some authors suggest treating this phenomenon as a "graceful failure" [Gho+20] which might still flag a local outbreak without specifying the infected individuals. However, in order to correctly flag this phenomenon, more knowledge on higher moments of the number of positive and false positive results is useful. Hence we provide this bound on the variance.

The rest of the article is structured as follows: In Section 2.1, the main definitions and notations are introduced. Section 2.2 contains the main results. After that, Section 3 contains the calculations of sensitivity, specificity, and the probabilities of Type I and Type II errors. In Section 4, the bounds on the variance are proved, and Section 5 provides details on the construction of some non-adaptive pooling matrices of the form we consider.
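As a synthetic illustration of the screening paradox in the spirit of Table 1, consider the following quick computation via Bayes' formula; the numbers are ours and chosen purely for illustration:

```python
rho, sens, spec = 0.001, 0.99, 0.99   # illustrative values, not from the paper

# Probability that a positive result is a false positive (the Type I error):
# P(not infected | positive) via Bayes' formula.
p_positive = rho * sens + (1 - rho) * (1 - spec)
type_one = (1 - rho) * (1 - spec) / p_positive
print(f"P(not infected | positive) = {type_one:.3f}")   # about 0.91
```

Although both sensitivity and specificity are 99%, at prevalence ρ = 0.001 roughly 91% of all positive results are false positives.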
2.1. Notation. We use notation inspired by the group testing literature: There are n items (e.g. nasopharyngeal swabs) which can be infected or non-infected. The states of the items form a vector

X = (X_1, . . . , X_n) ∈ {0, 1}^n.

We assume that the X_j are drawn independently from a Bernoulli distribution with infection probability or prevalence ρ ∈ (0, 1), where for practical purposes ρ is assumed small. This is also called the linear prevalence regime in group testing. The items are pooled into pools of size q such that every item participates in exactly m pools and such that no pair of items appears in more than one pool. In particular, the overall number of tests is t = nm/q (so t = mq in the case n = q²), and the compression ratio, the factor of improvement with respect to individual testing, is n/t. Formally, the pooling can be described by the pooling matrix A ∈ {0, 1}^{t×n}, which encodes which item is put into which pool. We write A_{i,j} = 1 if and only if item j enters into pool i. In particular, (AX)_i is the number of positive items in pool i. In terms of the pooling matrix A, the above conditions mean that A is a multipool matrix in the sense of the following definition.

Definition 1. We call the matrix A ∈ {0, 1}^{t×n} an (n, q, m)-multipool matrix [Täu20], or an (m − 1)-disjunct matrix with constant row and column sums, if the following three conditions hold:
(M1) The sum over every row is q.
(M2) The sum over every column is m.
(M3) The scalar product of any two distinct columns is at most one.

In the language of block designs, multipools are known as uniform 1-designs or Steiner systems, and a maximal multipool with n = q² and m = q + 1 is a 2-design, cf. [CD07; Sti04]. Definition 1 imposes constraints on the interplay of n, q, and m. (n, q, m)-multipool matrices exist for instance if q is a prime number or a power of a prime, the overall number of items is n = q², and m is not larger than q + 1. We will provide the details on this particular construction in Theorem 4 and Section 5.

A pool can test positive or negative. The pool test results are a vector

Y = (Y_1, . . . , Y_t) ∈ {0, 1}^t.

The testing process is assumed to be noisy according to the noise model

(2.1)  P(Y_i = 0 | (AX)_i = k) = (1 − p_fp) · p_fn^k.

It depends on the number of positives in a pool as well as on two parameters p_fp, p_fn ∈ [0, 1], the false positive and false negative probability. The error model (2.1) is for instance argued for in [ZRB20], and we note that the false positive probability p_fp and false negative probability p_fn will in practice depend on the pool size q, i.e. the dilution. Since we are reluctant to argue here for an error model incorporating dilution due to pool size, we treat p_fp and p_fn as parameters which will depend on q and need to be inferred from experiments. The results of the pools are then decoded using the COMP or NCOMP decoding algorithm.

Definition 2. Let an (n, q, m)-multipool matrix be given. The Noisy Combinatorial Orthogonal Matching Pursuit decoder with parameter δ ∈ {0, 1, . . . , m}, abbreviated as NCOMP(δ), declares item j as tested positive if and only if at most δ of the m pools which contain item j are not tested positive:

Z_j = 1  if and only if  #{i : A_{i,j} = 1 and Y_i = 0} ≤ δ.

In the special case δ = 0, when an item is declared positive if and only if all of its pools test positive, this decoder is simply called the Combinatorial Orthogonal Matching Pursuit decoder: COMP := NCOMP(0). If we do not want to specify the parameter δ, we simply write NCOMP.

Remark 3. COMP has been described by numerous authors, where [KS64] seems to be the first occurrence. The names COMP and NCOMP themselves seem to have been coined in [Cha+11]. We refer to [AJS19] for a more thorough discussion.
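Definition 2 translates directly into a few lines of code. The following sketch is ours and assumes the pooling matrix A (pools × items) and the pool results Y are given as NumPy arrays:

```python
import numpy as np

def ncomp(A, y, delta=0):
    """NCOMP(delta) from Definition 2: declare item j positive iff at most
    delta of the m pools containing j tested negative; delta = 0 is COMP."""
    negative_pools_per_item = ((1 - y)[:, None] * A).sum(axis=0)
    return (negative_pools_per_item <= delta).astype(int)
```

Both decoders therefore run in time proportional to the number of ones in A, with minimal storage.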
We also note that there exist other decoding algorithms in the mathematical literature, such as the Definite Defective (DD) algorithm [ABJ14] and the algorithm in [Coj+20], which relies on random constant column designs and which has been shown to be information-theoretically optimal in the sublinear prevalence regime. Furthermore, performance guarantees on COMP and DD in the sublinear regime have recently been investigated in [Geb+20].

The decoded results are a vector

Z = (Z_1, . . . , Z_n) ∈ {0, 1}^n,

where Z_j = 1 if COMP or NCOMP declares item j as positive and Z_j = 0 otherwise.

2.2. Results. We first ensure the existence of pooling matrices as in Definition 1.

Theorem 4. Let the pool size q be a prime number or a power of a prime and let the total number of items be n = q². Then, (n, q, m)-multipools exist if and only if the number of pools m an item participates in satisfies m ≤ q + 1.

Such matrices have been studied in the literature, and similar structures have been suggested for pooling strategies. If q is a prime number, such matrices can be constructed by the Shifted Transversal Design [Thi06]. If q is a power of a prime, they can be constructed by Reed-Solomon codes and have been suggested for pooling e.g. in [Erl+15]. We provide details on the construction of such matrices and illustrations in Section 5; a small construction sketch for prime q follows after Corollary 7 below.

Figure 1. The Fano plane with seven points and seven lines, such that every point is contained in exactly three lines and two lines intersect in exactly one point. Thus, interpreting points as items and lines as pools, the Fano plane describes a (7, 3, 3)-multipool which is not of the form provided by Theorem 4.

In the subsequent results, we only rely on the multipool structure outlined in Definition 1. While Theorem 4 ensures that corresponding pooling matrices exist for particular n, q, m, there exist more, see Figure 1. The following theorems are valid beyond the restrictions imposed by the matrices considered in Theorem 4. Here and in the following,

γ_1 := (1 − p_fp) · (1 − ρ(1 − p_fn))^{q−1}

denotes the probability that a pool containing a given non-infected item tests negative, cf. Lemma 16 below.

Theorem 5. Let ρ, q, m, δ, p_fp and p_fn be given. Then, for any suitable n, in any (n, q, m)-multipooling strategy with decoding by NCOMP(δ), the sensitivity is

(2.2)  sens = Σ_{l=0}^{δ} (m choose l) (p_fn γ_1)^l (1 − p_fn γ_1)^{m−l},

and the specificity is

(2.3)  spec = 1 − Σ_{l=0}^{δ} (m choose l) γ_1^l (1 − γ_1)^{m−l}.

Remark 6. Note that when the pool size q and δ, that is, the maximal number of negative pools an item can be in and still be flagged positive, are fixed, then the sensitivity is decreasing in the multiplicity m while the specificity is increasing in m. This follows from the inequality

Σ_{l=δ}^{m} (m choose l) (1 − x)^l x^{m−l} ≤ Σ_{l=δ}^{m+1} (m+1 choose l) (1 − x)^l x^{m+1−l},

which is elementary upon noticing that the left and right hand sides denote the probabilities of obtaining at least δ heads when flipping a (1 − x)-biased coin m or m + 1 times, respectively. This can also be graphically observed in Figures 2 and 3. Figure 2 furthermore illustrates the error-correcting effect of the NCOMP algorithm in the presence of noise. We see that while the sensitivity always decreases with growing multiplicity m, passing from COMP to NCOMP can mitigate this effect. In conclusion, a good strategy in the presence of a non-negligible false negative probability p_fn is to use NCOMP for high sensitivity and then boost the specificity by larger multiplicities.

We can now also provide expressions for the probabilities of Type I and Type II errors.

Corollary 7. Let ρ, q, m, δ, p_fp and p_fn be given. Then, for any suitable n, in any (n, q, m)-multipooling strategy with decoding by NCOMP(δ), the probability of a Type I error is

(2.5)  P(X_j = 0 | Z_j = 1) = (1 − ρ)(1 − spec) / (ρ · sens + (1 − ρ)(1 − spec)),

and the probability of a Type II error is

(2.6)  P(X_j = 1 | Z_j = 0) = ρ(1 − sens) / (ρ(1 − sens) + (1 − ρ) · spec).
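For prime q, the construction indicated after Theorem 4 (straight lines in F_q², detailed in Section 5) can be transcribed as follows. The sketch is ours; the final assertions check conditions (M1) to (M3) of Definition 1.

```python
import numpy as np

def multipool_matrix(q, m):
    """(q*q, q, m)-multipool for prime q and m <= q + 1, built from the
    'straight lines' {(x, a*x + b mod q)} in F_q^2, plus the vertical
    lines {(c, y)} when m = q + 1 (cf. Section 5)."""
    assert m <= q + 1
    n = q * q
    item = lambda x, y: x * q + y            # index items by points of F_q^2
    pools = []
    for a in range(min(m, q)):               # each slope a gives one layer
        for b in range(q):
            pools.append([item(x, (a * x + b) % q) for x in range(q)])
    if m == q + 1:                           # extra layer of vertical lines
        for c in range(q):
            pools.append([item(c, y) for y in range(q)])
    A = np.zeros((len(pools), n), dtype=int)
    for i, pool in enumerate(pools):
        A[i, pool] = 1
    return A

A = multipool_matrix(q=7, m=8)
assert (A.sum(axis=1) == 7).all()            # (M1): every pool has q items
assert (A.sum(axis=0) == 8).all()            # (M2): every item is in m pools
G = A.T @ A                                  # Gram matrix of the columns
assert ((G - np.diag(np.diag(G))) <= 1).all()  # (M3): columns meet at most once
```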
Remark 8. When decoding with the COMP decoder, i.e. δ = 0, (2.5) simplifies to

P(X_j = 0 | Z_j = 1) = (1 − ρ)(1 − γ_1)^m / (ρ(1 − p_fn γ_1)^m + (1 − ρ)(1 − γ_1)^m).

If we require the probability of Type I errors to be bounded by ε > 0, then this condition can be solved for m:

(2.7)  m ≥ ln((1 − ε)(1 − ρ)/(ερ)) / ln((1 − p_fn γ_1)/(1 − γ_1)),

which provides a lower bound on the number of pools an item has to participate in. In the special case δ = 0 and p_fp = p_fn = 0, we have γ_1 = (1 − ρ)^{q−1} and recover the condition that P(X_j = 0 | Z_j = 1) ≤ ε if and only if

m ≥ ln((1 − ε)(1 − ρ)/(ερ)) / ln(1/(1 − (1 − ρ)^{q−1})).

Remark 9. Let us study the Type I error probabilities in more detail. In Figure 4, we see that in noiseless testing, Type I errors emerge with growing prevalence and rapidly grow to approach the curve f(ρ) = 1 − ρ for large ρ. This is due to the fact that in this regime, a majority of pools will contain at least one positive item. Due to these combinatorial false positives, the whole test will become a useless oracle flagging every item positive. In Figure 5, we add noise. In the presence of a non-zero false positive probability p_fp, we observe another phenomenon, namely the screening paradox, in which for small enough ρ, true positives are so rare that they are dominated by false positives arising from noisy measurements. In any case, we observe that both types of false positives can be reduced by a larger multiplicity m. Finally, we also emphasize that larger pool sizes negatively impact Type I error probabilities. However, in the light of (2.7), the necessary m grows only logarithmically with growing pool size, so that after all the compression ratio rapidly improves with larger pool sizes.

Corollary 10. Let ρ, q, m, δ, p_fp and p_fn be given, and denote by T, T_fp and T_fn the numbers of positive, false positive and false negative results, respectively. Then, for any suitable n, in any (n, q, m)-multipooling strategy with decoding by NCOMP(δ),

E[T] = n(ρ · sens + (1 − ρ)(1 − spec)),  E[T_fp] = n(1 − ρ)(1 − spec),  E[T_fn] = nρ(1 − sens).

Remark 11. A noteworthy observation is that Corollary 10 only relies on the three conditions in Definition 1, i.e. constant row sum, constant column sum, and scalar product between columns at most one. Thus, these conditions alone already determine the expected numbers of positives, false positives and false negatives. In particular, imposing further conditions on the pooling matrices will not reduce the expected number of false positives in COMP and NCOMP.

[Figure: expected numbers of positive results for multiplicity m = 8 and decoding with COMP and NCOMP(1), respectively.]

Remark 12. We observe again the phase transition from small ρ, where non-adaptive testing works well, to moderate ρ, where essentially all 256 items are flagged positive. We also see that higher multiplicities help delaying this transition to higher ρ. Focusing on small ρ, we see that the expected number of positives follows the expected number of true positives before it starts diverging. This happens later for larger multiplicities, cf. Figure 9. Finally, Figure 10 illustrates the interplay between the expected number of false positives and COMP and NCOMP(1): passing from COMP to NCOMP(1) will increase the expected number of false positives, but in the presence of a non-negligible false negative probability, NCOMP(1) will also slash the expected number of false negatives close to 0.

In order to better understand the random variables T and T_fp, we provide bounds on their variance. We restrict ourselves to the case δ = 0, i.e. decoding by COMP, and p_fn = p_fp = 0, that is, noiseless testing. While it is possible to fix a particular matrix and write down analytic expressions for this variance, we provide here a universal estimate on the variances which only relies on the multipool structure of Definition 1. Its proof uses the Efron-Stein estimate and is given in Section 4.

Theorem 13. If p_fp = p_fn = 0 (noiseless testing) and δ = 0 (decoding by COMP), in any (n, q, m)-multipool strategy, we have for the expectations of the number T of positive results and of the number T_fp of false positive results

(2.9)  E[T] = n(ρ + (1 − ρ)(1 − (1 − ρ)^{q−1})^m),
(2.10)  E[T_fp] = n(1 − ρ)(1 − (1 − ρ)^{q−1})^m,

and for their variances the bounds

(2.11)  Var[T] ≤ nρ(1 − ρ)(m(q − 1) + 1)(1 − (1 − (1 − ρ)^{q−1})^m + m(q − 1)(1 − ρ)^{q−1}(1 − (1 − ρ)^{q−1})^{m−1}),
(2.12)  Var[T_fp] ≤ nρ(1 − ρ)(m(q − 1) + 1)((1 − (1 − ρ)^{q−1})^m + m(q − 1)(1 − ρ)^{q−1}(1 − (1 − ρ)^{q−1})^{m−1}).
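The expectations in Corollary 10 and Theorem 13 lend themselves to a quick Monte Carlo sanity check. The sketch below is ours, reuses multipool_matrix and ncomp from the earlier sketches, and uses arbitrary parameter values:

```python
import numpy as np

rng = np.random.default_rng(1)
q, m, rho, p_fp, p_fn, delta = 7, 8, 0.02, 0.01, 0.05, 1
A = multipool_matrix(q, m)                  # sketch after Theorem 4
n, runs, T, T_fp = q * q, 2000, 0.0, 0.0
for _ in range(runs):
    x = (rng.random(n) < rho).astype(int)
    k = A @ x                               # number of positives per pool
    y = (rng.random(len(k)) >= (1 - p_fp) * p_fn ** k).astype(int)  # model (2.1)
    z = ncomp(A, y, delta)                  # sketch after Definition 2
    T += z.sum() / runs
    T_fp += ((z == 1) & (x == 0)).sum() / runs
print(f"E[T] ~ {T:.2f}, E[T_fp] ~ {T_fp:.2f}")
```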
Remark 14. Figure 11 illustrates the bounds (2.11) and (2.12) on the variances. Again, we see the transition from low variance at small prevalence ρ to huge variance at medium ρ, until the decoding flags essentially everyone positive and thus, for large ρ, becomes a useless prediction. In this regime, T will have zero variance, and the variance of T_fp is essentially the variance of the binomial distribution of uninfected items.

Figure 11. The graphs illustrate the bounds (2.11) and (2.12) on the variance of T and T_fp for noiseless testing in pools of size 16 taken from a sample with 256 items for different multiplicities m.

More interesting observations can be made by zooming in on small ρ, see Figure 12, where the bound on Var[T] and the variance of the number of true positives are rather close. Since this difference exceeds our upper bound on Var[T_fp], we conclude that our bound on Var[T] is far from sharp. However, the bound on Var[T_fp] seems to be closer to optimal, since it remains near zero for small ρ and then grows with a steeper incline, carrying features of a phase transition. This effect becomes more prominent with increasing multiplicity m, cf. Figure 12.

One could argue that a very large variance is preferable to a moderately large one. Indeed, in the latter case, one will have the odd run with a false positive result which is hard to identify, whereas in the first case, false positive results will come in rare batches with unusually large portions of positives. Such a pattern could be flagged as a "graceful failure" of the testing strategy which points to an outbreak without identifying the infected individuals. For this purpose, one would like to have a clear phase transition between perfect recovery and graceful failure, which, considering Figure 12, requires large multiplicity m.

Remark 15. Let us conclude our remarks with some information theoretic considerations. For the sake of the argument, we consider the case n = q². One can think of two thresholds for ρ above which non-adaptive group testing might break down. The first one is the combinatorial or disjunctness threshold

ρ_disj := (m − 1)/n,

above which the expected number of infected items exceeds the disjunctness m − 1 of the pooling matrix. If the prevalence ρ exceeds ρ_disj, then on average more than half of all test runs will be confronted with a number of infected items for which the pooling matrix cannot guarantee perfect identification any more. However, we see in Figure 13 that identification still works rather well for moderately higher prevalences, i.e. the expected number of false positives remains small, and the good news is that it is perfectly possible to overclock COMP beyond ρ_disj. The second threshold is the information theoretic threshold

ρ_info := h^{−1}(mq/n),

where h(x) = −x log₂(x) − (1 − x) log₂(1 − x) denotes the entropy function, and we take its inverse on the interval (0, 1/2), where it is monotone. For prevalences ρ above ρ_info, the average information contained in a binary string of length n with a portion ρ of ones will exceed the possible information that can be encoded in a binary string of length m · q, i.e. in the number of pools, thus imposing a hard limit on the maximal prevalence ρ for all possible compression ratios. We see that the number of false positives starts to blow up near ρ_info. However, we can observe in the logarithmic plot in Figure 14 that even at the information theoretic maximal prevalence ρ_info, the false positives are far less than one order of magnitude above the true positives.
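Both thresholds from Remark 15 are straightforward to evaluate numerically; the sketch below is ours and follows the description above, with ρ_disj = (m − 1)/n and ρ_info obtained by bisection of the entropy function:

```python
from math import log2

def h(x):
    """Binary entropy h(x) = -x*log2(x) - (1-x)*log2(1-x), for x in (0, 1)."""
    return -x * log2(x) - (1 - x) * log2(1 - x)

def rho_info(n, q, m):
    """Solve h(rho) = m*q/n on (0, 1/2) by bisection: the prevalence at which
    n*h(rho) exceeds the m*q bits that the pool results can encode."""
    target, lo, hi = m * q / n, 1e-12, 0.5
    assert target < 1          # only meaningful if pooling actually compresses
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if h(mid) < target else (lo, mid)
    return lo

n, q, m = 256, 16, 4           # arbitrary example values
print((m - 1) / n, rho_info(n, q, m))   # rho_disj and rho_info
```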
Thus, while we would not recommend running non-adaptive multipool testing at prevalences near the information theoretic maximal prevalence ρ_info, even in this extreme regime the algorithm produces results which are useful as a first stage in a screening strategy.

3. Sensitivity, specificity, and error probabilities for COMP and NCOMP

In this section we prove Theorem 5 and Corollary 7. We start with the following lemma.

[Figure 14. At the information theoretic maximal prevalence ρ_info, the expected number of false negatives is well less than one order of magnitude below the expected number of true positives.]

Lemma 16. Fix k ∈ {0, . . . , q}, a pool Π_i, and items j_1, . . . , j_k ∈ Π_i. Then, for all x_1, . . . , x_k ∈ {0, 1},

P(Y_i = 0 | X_{j_1} = x_1, . . . , X_{j_k} = x_k) = (1 − p_fp) · p_fn^{x_1 + ··· + x_k} · (1 − ρ(1 − p_fn))^{q−k}.

In particular, writing γ_k for the left hand side with x_1 = ··· = x_k = 0, we have γ_k = (1 − p_fp)(1 − ρ(1 − p_fn))^{q−k}.

Proof. We condition on the number s of positive items in Π_i besides the ones we condition on, apply (2.1), and then simplify using Newton's theorem:

P(Y_i = 0 | X_{j_1} = x_1, . . . , X_{j_k} = x_k)
  = Σ_{s=0}^{q−k} (q−k choose s) ρ^s (1 − ρ)^{q−k−s} (1 − p_fp) p_fn^{s + x_1 + ··· + x_k}
  = (1 − p_fp) p_fn^{x_1 + ··· + x_k} (1 − ρ(1 − p_fn))^{q−k}.

Choosing X_{j_1} = ··· = X_{j_k} = 0, the statement on γ_k follows. □

Lemma 16 allows us to calculate the sensitivity and specificity for Theorem 5.

Proof of Theorem 5. For item j to be declared positive, at most δ of the m pools containing item j may test negative. The number of pools with a positive result for item j is (Aᵀ Y)_j. Therefore, we have by conditional independence and Lemma 16

sens = P(Z_j = 1 | X_j = 1) = Σ_{l=0}^{δ} (m choose l) (p_fn γ_1)^l (1 − p_fn γ_1)^{m−l},

which shows (2.2). Identity (2.3) follows analogously. □

We apply Theorem 5 to calculate the probabilities of Type I and Type II errors in Corollary 7.

Proof of Corollary 7. Using Bayes' formula, we have

P(X_j = 0 | Z_j = 1) = P(Z_j = 1 | X_j = 0) P(X_j = 0) / P(Z_j = 1)
  = (1 − spec)(1 − ρ) / (sens · ρ + (1 − spec)(1 − ρ)).

This shows (2.5). Identity (2.6) is established analogously. □

Finally, we derive the expectations given in Corollary 10.

Proof of Corollary 10. We calculate

E[T] = n P(Z_1 = 1) = n(ρ · sens + (1 − ρ)(1 − spec)),
E[T_fp] = n P(X_1 = 0) P(Z_1 = 1 | X_1 = 0) = n(1 − ρ)(1 − spec),

and analogously E[T_fn] = n P(X_1 = 1) P(Z_1 = 0 | X_1 = 1) = nρ(1 − sens). □

4. Bounds on the variance

In this section, we prove Theorem 13. We are in the situation δ = 0, i.e. decoding by COMP, and p_fp = p_fn = 0, which means noiseless testing. We are going to use the Efron-Stein inequality:

Proposition 17 (Efron-Stein inequality). Let X_1, . . . , X_n and X'_1, . . . , X'_n be independent variables, where X'_i has the same distribution as X_i for all i. Denote X := (X_1, . . . , X_n) and X^{(i)} := (X_1, . . . , X_{i−1}, X'_i, X_{i+1}, . . . , X_n). Let f be a function of n variables. Then

Var[f(X)] ≤ (1/2) Σ_{i=1}^{n} E[(f(X) − f(X^{(i)}))²].

We can now prove Theorem 13.

Proof of Theorem 13. The formulas (2.9) and (2.10) follow from Corollary 10, since in the noiseless case sens = 1 and 1 − spec = (1 − (1 − ρ)^{q−1})^m. For the variance bounds, we pick an item i, create an independent copy X'_i of its state X_i, and write X^{(i)} = (X_1, . . . , X_{i−1}, X'_i, X_{i+1}, . . . , X_n) for the vector X where X_i is replaced by X'_i. Since the multipool condition is symmetric under exchanging items, the Efron-Stein inequality and the tower property of the conditional expectation imply

(4.2)  Var[T] ≤ (1/2) Σ_{i=1}^{n} E[E[(T(X) − T(X^{(i)}))² | F_i^c]] = (n/2) E[((Z_1(X) − Z_1(X^{(1)})) + Σ_{j∼1} (Z_j(X) − Z_j(X^{(1)})))²].

Here, F_i^c is the σ-algebra generated by {X_1, . . . , X_{i−1}, X_{i+1}, . . . , X_n}. The notation j ∼ i means that X_i and X_j share a pool, and i being pivotal for Z_j means that flipping X_i from 0 to 1 will flip Z_j from 0 to 1. The item i shares pools with exactly m(q − 1) many other items, whence the sum has m(q − 1) + 1 terms. Using |Σ_{k=1}^{l} a_k|² ≤ l Σ_{k=1}^{l} |a_k|² in the expectation in (4.2), we estimate further

E[(T(X) − T(X^{(1)}))²] ≤ (m(q − 1) + 1) (E[(Z_1(X) − Z_1(X^{(1)}))²] + Σ_{j∼1} P(X_1 ≠ X'_1 and 1 is pivotal for Z_j)).

For i to be pivotal for Z_j, we need X_j to be zero, the q − 2 other items in the unique pool which contains items i and j must be negative, and all other m − 1 pools that item j belongs to must have at least one positive item. This leads to

P(X_1 ≠ X'_1 and 1 is pivotal for Z_j) = 2ρ(1 − ρ) · (1 − ρ)^{q−1} (1 − (1 − ρ)^{q−1})^{m−1},

which, together with E[(Z_1(X) − Z_1(X^{(1)}))²] = 2ρ(1 − ρ)(1 − (1 − (1 − ρ)^{q−1})^m), yields (2.11). The bound (2.12) for T_fp follows in the same way, with the term for j = 1 replaced by E[((1 − X_1)Z_1(X) − (1 − X'_1)Z_1(X^{(1)}))²] = 2ρ(1 − ρ)(1 − (1 − ρ)^{q−1})^m. □
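As a plausibility check, the Efron-Stein bound (2.11) can be compared against an empirical variance. The sketch is ours, reuses multipool_matrix and ncomp from the earlier sketches, and evaluates the right hand side of (2.11) as reconstructed in Theorem 13:

```python
import numpy as np

rng = np.random.default_rng(2)
q, m, rho, runs = 7, 4, 0.02, 4000
A = multipool_matrix(q, m)                 # sketch after Theorem 4
n = q * q
counts = np.empty(runs)
for r in range(runs):
    x = (rng.random(n) < rho).astype(int)
    y = (A @ x > 0).astype(int)            # noiseless pool results
    counts[r] = ncomp(A, y, delta=0).sum() # COMP-decoded number of positives T

g = (1 - rho) ** (q - 1)                   # gamma_1 in the noiseless case
bound = n * rho * (1 - rho) * (m * (q - 1) + 1) * (
    1 - (1 - g) ** m + m * (q - 1) * g * (1 - g) ** (m - 1))
print(counts.var(), "<=", bound)           # empirical Var[T] vs. the bound
```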
5. Construction of multipool matrices

In this section, we prove Theorem 4 on the existence of particular multipools as in Definition 1, that is, of maximally disjunct design matrices with constant row and column sums. We are in the situation where q is a prime or a power of a prime and n = q² items are arranged in pools of size q. The underlying idea is essentially due to [RS60]. Multipools with a prime number of items per pool have been described e.g. in [Thi06; Täu20], and with a prime power as the number of items per pool in [Erl+15].

Proof of Theorem 4. Let us first prove that the maximal number of pools an item participates in, in a multipool arrangement with n = q², is m = q + 1. By definition, two pools in a multipool can share at most one item. Thus, any subset of two items in a pool uniquely determines this pool. Since there are at most (n choose 2) (unordered) pairs of items and each pool accounts for (q choose 2) of them, we have at most

(n choose 2) / (q choose 2) = n(n − 1) / (q(q − 1))

many pools in a multipool. For the case n = q², this yields q(q + 1) = n + q many pools. Since the pools have q items each, they provide (n + q)q = n(q + 1) slots in total, whence every item can participate in at most q + 1 many pools. This proves the bound on the maximal multiplicity m in Theorem 4.

For a construction of such a multipool with maximal multiplicity, let F_q denote the finite field of size q, which exists because q is a prime power. We index the n = q² items by the elements of the finite vector space F_q² and use the "straight lines" in F_q² as pools: For all a, b, c ∈ F_q, let

Π_{a,b} := {(x, ax + b) | x ∈ F_q}  and  Π_{∞,c} := {(c, y) | y ∈ F_q}.

In any case, each line contains exactly q elements, each element of F_q² is contained in exactly q + 1 lines, and different straight lines intersect in at most one point of F_q². Counting the parameters a, b, c, one sees that there are q² + q straight lines in total. This yields an (n, q, m)-multipool of maximal multiplicity m = q + 1. For multipools of lower multiplicity m, note that each a ∈ F_q determines a partition {Π_{a,b} | b ∈ F_q} of F_q². Therefore, for non-empty M ⊆ F_q, the collection Π_M = {Π_{a,b} | a ∈ M, b ∈ F_q} is a multipool in which each item is contained in exactly m = |M| ∈ {1, . . . , q} pools. □

Remark 18. If q is a prime, the arithmetic in F_q is simply arithmetic modulo q, and these lines are indeed periodically continued straight lines with different slopes, cf. Figure 15. The complete (49, 7, 8)-multipool is depicted in the subsequent figures. Note that the design matrix A ∈ {0, 1}^{(M×F_q)×F_q²} = {0, 1}^{t×n} for the multipool Π_M is given by

A_{(a,b),(x,y)} = 1_{Π_{a,b}}((x, y)) = { 1 if (x, y) ∈ Π_{a,b},  0 otherwise },

where a, b ∈ F_q determine the pool and x, y ∈ F_q index the item. See Figure 19 for the design matrix of the (49, 7, 8)-multipool and Figure 20 for the design matrix of the (64, 8, 9)-multipool.

Remark 19. An equivalent, more algebraic construction of pooled tests based on Reed-Solomon codes is described in [Erl+15]. There, the items are indexed by polynomials with coefficients in F_q, more specifically, by the n smallest polynomials according to the lexicographical order. To understand the base ordering in F_q giving rise to this order, we need more details of the standard construction of F_q. Write the prime power as q = p^a. We fix an irreducible polynomial P_{p,a} of degree a over F_p, for example the Conway polynomial that fits the bill, see e.g. [Lüb08]. Then, F_q := F_p[x]/P_{p,a} is the space of all polynomials over F_p modulo P_{p,a}. Consider the representative of an element P ∈ F_q with degree lower than a. Its coefficients can be used to form a number in base p, which we can compute by considering P as a polynomial over Z and evaluating it at p. The pools are again indexed by pairs (a, b) ∈ F_q × F_q, each pool collecting the items whose polynomial f takes the value b at a; the design matrix is given by

A_{(a,b),f} = { 1 if f(a) = b,  0 otherwise }.

If n is a multiple of q and at most q², this construction gives a multipool, too. Indeed, each item is in exactly one pool for each layer, so it is contained in exactly m pools.
Furthermore, by the lexicographical order, the absolute terms of the polynomials cycle through F_q exactly n/q times, so that each pool contains exactly n/q items. And finally, because n ≤ q², all polynomials used to index the items are constant or linear, whence two items can share at most one pool. Indeed, if two polynomials of degree at most 1 agree in two or more places, they are equal and describe the same pool. This gives us multipools with multiplicities up to m = q. The remaining layer in a multipool of maximal multiplicity m = q + 1 consists of lines of slope infinity and cannot be represented as graphs of polynomials, so it needs to be added manually.

The geometric construction and the algebraic construction from Remark 19 produce the same multipools. Indeed, the geometric condition y = ax + b corresponds to the algebraic condition y − ax = b; thus, replacing M by −M, the design matrices agree up to the order of rows.

References

[ABJ14] Group Testing Algorithms: Bounds and Simulations.
[Ada+20] Clustering and superspreading potential of SARS-CoV-2 infections in Hong Kong.
[AJS19] Group Testing: An Information Theory Perspective.
[CC20] Optimal Pool Size for COVID-19 Group Testing.
[CD07] Handbook of Combinatorial Designs.
[Cha+11] Non-adaptive probabilistic group testing with noisy measurements: Near-optimal bounds with efficient algorithms.
[Cha+14] Non-Adaptive Group Testing: Explicit Bounds and Novel Algorithms.
[Coj+20] Optimal Group Testing.
[Erl+15] Biological screens from linear codes: theory and tools.
[Eur20] COVID-19 testing strategies and objectives. Stockholm, 2020.
[Geb+20] Improved bounds for noisy group testing with constant tests per item.
[Gho+20] Tapestry: A Single-Round Smart Pooling Technique for COVID-19 Testing.
[He+20] Temporal dynamics in viral shedding and transmissibility of COVID-19.
[KS64] Nonrandom binary superimposed codes.
[Kup20] Why do some COVID-19 patients infect many others, whereas most don't spread the virus at all?
[Liu+20] Is Pool Testing Method of COVID-19 Employed in Germany and India Effective?
[Lüb08] Conway polynomials for finite fields.
[Mal20] The mathematical strategy that could transform coronavirus testing.
[Maz12] On Almost Disjunct Matrices for Group Testing.
[Mut+20] A strategy for finding people infected with SARS-CoV-2: optimizing pooled testing at low prevalence.
[PBJ20] Practical High-Throughput, Non-Adaptive and Noise-Robust SARS-CoV-2 Testing.
[Rob20] Bericht zur Optimierung der Laborkapazitäten zum direkten und indirekten Nachweis von SARS-CoV-2 im Rahmen der Steuerung von Maßnahmen.
[RS60] Polynomial Codes Over Certain Finite Fields.
[She+20] Efficient high-throughput SARS-CoV-2 testing to detect asymptomatic carriers.
[Sti04] Combinatorial Designs: Constructions and Analysis.
[Täu20] Rapid, large-scale, and effective detection of COVID-19 via non-adaptive testing.
[Thi06] A new pooling strategy for high-throughput screening: the Shifted Transversal Design.
[To+20] Temporal profiles of viral load in posterior oropharyngeal saliva samples and serum antibody responses during infection by SARS-CoV-2: an observational cohort study.
[TRL20] Population-scale testing can suppress the spread of COVID-19.
[Vog+20] SalivaDirect: Simple and sensitive molecular diagnostic test for SARS-CoV-2 surveillance.
[ZRB20] Noisy Pooled PCR for Virus Testing.