key: cord-0463898-h5e1sff4
authors: Brunner, James D.; Chia, Nicholas
title: Confidence in the dynamic spread of epidemics under biased sampling conditions
date: 2020-06-04
journal: nan
DOI: nan
sha: 4c7408033f98e5fe7fa91fe4a4b3a316a22ec254
doc_id: 463898
cord_uid: h5e1sff4

The interpretation of sampling data plays a crucial role in policy response to the spread of a disease during an epidemic, such as the COVID-19 epidemic of 2020. However, this is a non-trivial endeavor due to the complexity of real world conditions and limits to the availability of diagnostic tests, which necessitate a bias in testing favoring symptomatic individuals. A thorough understanding of sampling confidence and bias is necessary in order make accurate conclusions. In this manuscript, we provide a stochastic model of sampling for assessing confidence in disease metrics such as trend detection, peak detection, and disease spread estimation. Our model simulates testing for a disease in an epidemic with known dynamics, allowing us to use Monte-Carlo sampling to assess metric confidence. We use this method to show that trends in the disease may be identified using under $10000$ biased samples each day, and an estimate of disease spread can be made with additional $1000-2000$ unbiased samples each day. We also demonstrate that the model can be used to assess more advanced metrics by finding the precision and recall of a strategy for finding peaks in the dynamics.

Policy decisions in the face of an epidemic rely on perceptions of the population dynamics of an infectious disease, for example whether cases are growing or shrinking. If everyone were tested every day, then this would be a matter of looking at the trend in the total number of positive tests per day. However, this scenario is unrealistic. In actuality, the population sampling needed to track an epidemic in a community will depend on the nature of the question we would like to answer. Sometimes, these questions are in conflict with each other. For instances, the primary goal of healthcare providers is to identify infected patients in the hospital or clinical setting so that appropriate treatment and protective measures may be prescribed, while at the other extreme the epidemiologist concerned with infection prevention within the population may be interested in determining the number of infected individuals in the population in order to focus efforts at limiting disease spread. Clearly, the former is very targeted towards patients showing up at a clinic with certain symptoms while the latter requires broad testing of both symptomatic and asymptomatic populations.

The reality is that testing needs to serve multiple purposes with a finite number of tests. In particular, the COVID-19 pandemic and response from world leaders has shed light on the need for a better understanding of community infection data and how to use it, both for decision making and media reporting. Some of the particular challenges are the significant proportion of asymptomatic carriers of the disease Bai et al. (2020) ; Mizumoto et al. (2020) and the changing availability of the testing. Both have resulted in testing that is strongly biased towards infected individuals and not representative of the proportion of cases in the population. Here, we focus on two intermediate use cases -individuals and businesses that need to estimate risk, i.e., the probability that an infected person will be present in a given situation, and public policy makers that need to understand changing trends in the spread of the disease. We show that this can be done with a combination of biased and unbiased sampling that requires many fewer tests to be performed every day, but importantly must include the number of negative tests in addition to the number of positive cases that is more widely reported. Notably, this is in line with the World Health Organization's global surveillance guidelines, which include reporting of total tests so positive percentage can be determined Organization (2020) .

The purpose of this manuscript is to introduce a method for assessing confidence in conclusions made from biased sampling of the spread of an epidemic. We begin with a calculation of the number of tests needed to identify a significant portion of infected individuals in a given day. We then describe a stochastic dynamical model that simulates testing over the course of an epidemic with known dynamics. We then show how we can use a given model of epidemic dynamics to investigate the amount of testing needed to estimate disease spread and trends in disease. We further use our approach to simulate testing with variable bias and error and investigate the roles of bias and error in testing. We find that amount of testing needed to identify most infected people in a population of 300 million (approximately the population of the United States), is extremely high. On the other hand, we show that trends in the spread of the disease can be accurately identified by sign with less than 10000 biased tests per day. We show that approximately 1000 additional unbiased tests can be used to estimate the bias in testing. This can be used to estimate the extent of disease spread in the community on a daily basis. Our approach can also be used to assess the reliability of many data analysis techniques. We demonstrate this by assessing a strategy for finding peaks in the dynamics of the outbreak by using a smoothed numerical derivative of the data. Finally, we show the importance of understanding bias under the conditions of limited testing by examining COVID-19 data from within the United States of America.

As a special note, we emphasize that while many important efforts are being made to model the spread of COVID-19 and determine how testing can be used to reduce that spread Reich et al. (2020) ; Piguillem and Shi (2020) ; Alvarez et al. (2020) ; Chowdhury et al. (2020) , our approach does not attempt to predict disease spread. Instead, we are testing confidence in data analysis sampled from known dynamics. In other words, rather than trying to predict the future, our work focuses on estimating confidence in current trends under non-ideal conditions.

Sampling the population in order to identify all or some large proportion of the infected individuals will require a large amount of testing. If we assume that testing is done in a single day, the proportion of infected in the population is roughly constant in that day, and no person is tested more than once, the number of positive cases will follow a hypergeometric distribution based on the number of cases in the population, the bias in the testing, and the number of tests performed. We can compute the cumulative distribution function of this distribution, and so compute the number of tests needed so that the probability that some number of cases are found.

Precisely, we use the hypergeometric distribution with population rBT + (1 − r)T and infected rBT , where B is the same biased used in our sampling model below, r is the proportion of the population that is infected, and T is the total population size.

A stochastic model for disease sampling.

We developed a stochastic model to simulate the sampling of a population which is undergoing an epidemic with known dynamics. That is, given an underlying set of dynamics tracking asymptomatic infected individuals, symptomatic infected individuals, and non-infected individuals, we simulate testing members of this population for the disease. Let I 1 (t), I 2 (t), and H(t) represent the number of asymptomatic infected individuals, symptomatic infected individuals, and non-infected individuals in the population, respectively, for t ∈ [0, T ].

Tests are assumed to be carried out according to a Poisson point process with (possibly time varying) intensity function λ (t). The intensity function λ (t) can be interpreted as the rate at which members of the population are tested for the diseased, and so an increase in test availability, for example, would be reflected in this model with an increase in λ (t).

Each time a test is performed, we determine the status of the person tested. Under the assumption of unbiased testing, we categorize a person into one of three pools. The probability of a test result deponds on the respective proportion of the population which belongs to each pool. That is, the probability the test is performed on an asymptomatic infected person is P(this testee is asymptomatic infected) = I 1 (t)

Likewise, P(this testee is symptomatic infected) = I 2 (t)

and P(this testee is non-infected) = H(t) I 1 (t) + I 2 (t) + H(t)

.

(

We also account for the possibility that a test may give a false-positive or false-negative with some constant probability. Let ε 1 ∈ [0, 1] be the false-negative probability of the test, and ε 2 ∈ [0, 1] be the false-postive probability. Then, for any given test we can combine eqs.

(1) to (3) to see the following:

and P(negative test) = ε 1 P(person is infected) + (1 − ε 2 )P(person is not infected)

Accounting for testing bias.

Testing for disease in a population is not done uniformly at random. Instead, an individual displaying symptoms of the disease is much more likely to be tested for it than one who is not. We may model a bias in testing by adjusting eqs.

(1) to (3). Let B(t) be some function of time, with B(t) ≥ 1 for all t ∈ [0, T ]. Then, we can reflect the bias of the testing procedure by re-weighting the population for each test performed. We replace eqs.

(1) to (3) with the following:

and combine eqs. (6) to (8) with eqs. (4) and (5) to determine the probability of a single test result. The result is that the number of positive and negative tests that have been carried out up to time t are each non-homogeneous Poisson point processes with intensity functions

respectively.

We note that the model described above essentially assumes an infinite total population size. Practically, this means that testing is done on an insignificant proportion of the population, or equivalently that members of a population are immediately eligible to be re-tested after being tested. In appendix B, we describe a modification for this model which accounts for small population size and non-immediate retesting.

Both the initial model described in this manuscript and the model adjusted for small populations can be written as the sum of Poisson point processes with time-varying intensities. They can therefore be simulated using a slight adjustment to Gillespie's Algorithm (also called the Stochastic Simulation Algorithm) (Gillespie, 1976 (Gillespie, , 1977 . This adjustment accounts for possible changes in the intensity functions between points in the Poisson processes, for example from variations in λ (t) or B(t). This adjustment is made by choosing event times according to the maximum values of any time-varying functions, and allowing for the possibility of a non-event at each event time, a procedure often called thinning (Asmussen and Glynn, 2007; Anderson and Yuan, 2019) .

Simulation of the models as described allow us to perform Monte-Carlo estimations of the confidence that can be assumed in the calculation of various statistics from data. We perform Monte-Carlo estimation by repeatedly simulating sampling over the course of an epidemic with given dynamics and determining the success rate or average error of in determining a metric from simulated sampling when compared to determining the same metric from the known underlying dynamics.

To assess trends in data simulated according to our model, we discretize the time interval [0, T ] into evenly spaced intervals (e.g. into single day increments) ending at times t 1 ,t 2 , ...,t N = T . We then compute positive-test proportions for these intervals, simply defined as the proportion of tests carried out within the interval that were positive. This allows us to make sense of the simulated data even as testing capacity λ (t) varies in time.

We estimate N-day trends in disease spread using a linear least-squares fit to N consecutive days of simulated positive test proportions. We define the linear trend of the simulated data to be the slope of this fitted line. This can then be compared to a linear fit to the infected proportion

over the same time interval, computed using time-discretized dynamics. 

We attempt to find peaks in the data by estimating the time derivative of the positive-test proportions simulated. We then identify peaks as points at which the derivative crosses from positive to negative. That is, we estimate the change in true positive-test proportion from day to day, and identify when this proportion stopped increasing and began to decrease.

To estimate the time derivative of positive proportions, we first compute a numerical derivative over each time interval. We then blur this discretized derivative to reduce noise (and therefore false peaks) using a one-dimensional Gaussian filter.

Our model is designed to simulate sampling of an epidemic with any non-negative underlying compartmental dynamics. This means that the number of healthy, symptomatic infected, and asymptomatic infected members of a population can be any non-negative known functions of time. As a consequence of this feature, some set of known dynamics must be either chosen directly or generated by another dynamic model. Confidence in metric sampling is then measured against the known dynamics.

To demonstrate our method, we use the SIR model (Edelstein-Keshet, 2005; Hethcote, 2000; Kermack and McKendrick, 1927) with a time-variable rate of disease spread to generate underlying dynamics, as well as a similar compartmental model that allows for asymptomatic individuals, which we refer to as the SAIR model. See appendix A for details. These models represent a popular, simple choice of dynamic epidemic model with parameters often reported by the lay news media.

One obviously crucial role of testing of a disease outbreak is to identify patients for treatment and possibly quarantine. From that perspective, it is crucial to identify all or most infected members of a community. We therefore assess the number of tests this would require. Using the cumulative distribution function of the hypergeometric distribution, we see that in a community of 300 million with 5% infection, we need approximately 27, 009, 300 unbiased tests to have 90% confidence that we have found 90% of the cases. For higher bias, fewer tests are needed, as seen in fig. 1 .

We simulated sampling as described above with underlying dynamics generated by ODE models of outbreaks. We use the positive test proportion per day as our sampled variable, in order to account for variations in testing capacity. Figure 2 demonstrates that day-to-day variations in testing capacity are reflected in positive test count but not proportion, and that only recording positive test count may mask the real dynamics. 

With false positive and false negative rates equal to 0, biased sampling causes an overestimation in the proportion of a large population that is infected. If tests are taken at random in the population (and so not biased), the proportion of positive tests will on average be the same as the proportion of infected individuals r(t), which is the sum of the proportions of symptomatic and asymptomatic infected individuals:

Biasing the tests towards symptomatic individuals is analogous to sampling a population with extra symptomatic individuals added:

and we note that this overestimation will not lessen with a higher number of samples. Including the rate of false positive and negative tests, this becomes

In order to estimate the spread of the disease in a community, we can estimate the bias B if we have unbiased sample data as well. To do this with only positive/negative test data, we must assume that the ratio of symptomatic to asymptomatic infected members of the community is constant (i.e. r 1 = cr 2 ). For a more accurate daily estimate, we need to reduce the variance in our estimate of B. We simulated an estimate of B with various biased and unbiased sample capacities. Variances for these estimates are shown in fig. 3 .

The ability to correctly characterize a linear trend in the dynamics from sampled data depends on the strength of that trend as well as the sampling of the population. Confidence in trends is reported as the proportion of 5 day intervals in 1000 simulations 5/16 Figure 3 . Variance of bias estimate for various sampling capacities. This variance represents the error in estimation from a single day of biased and unbiased tests. In this experiment, we do not have asymptomatic infected individuals.

Simulation shows how biasing testing towards symptomatic individuals overestimates infected proportion of the population, even with no false-positive or false-negative tests, and a high rate of testing. Here, the bias parameter is set to B(t) = 10 and the testing rate is λ (t) = 8100 for all t.

which correctly identified the sign (positive/negative) of the linear fit to the dynamics. In fig. 5 , we test sampling's ability to identify a constant trend (i.e. linear increase or decrease) in the infected proportion. We see that sampling identifies the sign of the trend robustly for large enough absolute slope. We also see that increasing sampling improves identification of trends in data.

In fig. 6 , we show how this dependence on the underlying dynamics effects confidence in trend identification over the course of a simulated outbreak. Here, we see that trends can be identified with good confidence with 8100 samples per day for most time-intervals. Those intervals in which trends could not be confidently identified were those that included local maxima (peaks) or minima (valleys) in the epidemic dynamics.

In fig. 7 , we see that biased sampling actually improves the confidence of an identified trend. This is due to magnification of the trend in infections which are relatively rare. Biased testing allowed for better trend confidence in error free (no false positive/false negative) testing as well as testing with 10% error rate.

We observe that identification of peaks in data using the smoothing method described above has a high chance of finding the peaks in the dynamics, but has very poor precision, providing many false peaks. Figure 8 shows the precision and recall of this peak finding for one set of dynamics and sampling assumptions, while tables 1 and 2 give precision and recall for peak finding with two sets of dynamics (SIR generated and SAIR generated) with various sampling assumptions (with and without bias and errors). Tables 1 and 2 were generated with a smoothing parameter of 5, while fig. 8 uses smoothing from 1 to 10, with 10 giving the highest precision and recall.

Using data from The COVID Tracking Project (Lipton et al., 2020) to compute 5-day trends of the outbreak in the state of Minnesota, shown in fig. 9 , with data from March 6 to April 31, 2020.

We also note a significant negative correlation between positive test percentage and number of tests performed in many states. This can be explained by a reduction in bias as more tests become available (i.e. an increased willingness to test members of the population not showing symptoms). In fact, a strong negative correlation could indicate that testing may have been initially used in a more restrictive, and therefore more heavily biased, manner. We hypothesize that as testing increases, testing bias will approach some limit that represents the preferred policies of healthcare and government organizations. It may be that 7/16 even with high testing capacity, some bias will exist due to these policies. Changes in policies will result in future changes in testing bias. We note that our model can simulate a change in bias with a time-dependent bias parameter B in eqs. (6) to (8).

It is worth noting that some day to day variation may be the result of irregularities in negative test reporting, as evidenced by days with 100% positive rate. Pearson correlations are shown in fig. 10 , with significance computed as p-value of the correlation coefficient.

While the multiple purposes of infectious disease testing would be satisfied if all or almost all infected individuals could be identified, the amount of testing needing to have a high level confidence that almost all infected individuals have been identified is prohibitively high. For example, the hypergeometric distribution suggest that if we have a population of 300 million with 5% infection, then we need about 27 million unbiased tests per day for 90% confidence that we have found 90% of the cases. For COVID-19, it remains very unlikely that case numbers reported represent an accurate estimate of the extent of disease spread. Furthermore, these numbers cannot be compared from place to place or time to time because of changes in testing bias Lipton et al. (2020) . As an example of how testing bias can affect perception of a trend, we simulate of an artificial scenario where testing capacity increases drastically part-way through the course of an infection in fig. 2 , and demonstrate that considering only the number of positive tests per day completely obscures a peak in the dynamics. On the other hand, simply considering the proportion of tests in a day which are positive reveals the true dynamics. This emphasizes the importance of proportion of positive tests over the number of positive tests.

Testing for COVID-19 is clearly biased toward finding infected individuals. While reduced testing has drawbacks for addressing particular scenarios, such as screening healthcare workers, it also has important benefits for tracking the population level trends that inform policy decisions. As an intuitive example, consider a population with a very small proportion of COVID-19 cases, as would be expected in the very early or very late stages of an outbreak. Heavily biased testing helps better Figure 10 . Correlation between positive test percentage and tests performed in each state. Significance indicates p < 0.05, p < 0.01, p < 0.001, and p < 0.0001. detect the infection by focusing on where the cases are rather than spending the vast majority of tests on negative results. In this sense, biased testing is a form of importance sampling. Furthermore, biased testing reduces the number of tests needed to identify all or most infected individuals. In fig. 1 , we show the number of samples needed for bias parameters ranging from B = 1 to B = 50. Bias in testing is the natural result of the role of testing in the healthcare setting, and this confirms the advantages bias has for detecting population trends. However, using bias in testing as shown in eqs. (6) to (8), we see in fig. 4 that the spread of the infection will be overestimated significantly by biased testing. In other words, to accurately estimate the spread of disease, we must estimate the bias parameter B. This can be done by conducting a separate set of unbiased tests and using the relationship given by eq. (13). If we assume that the testing bias is constant (which is reasonable for a single day), this is a Monte-Carlo estimator where the error in this estimation is determined by the variance in the estimate. We simulated with a bias B = 10, and show the variance of single-day estimates for B with various biased and unbiased testing capacities in fig. 3 . From that simulation, we conclude that it is not unreasonable to estimate bias with 1000 unbiased samples per day, in addition to a larger capacity of biased testing.

Policy changes during an outbreak, such as those seen during the COVID-19 outbreak with changes in stay-at-home orders, appear to be based on trends in the disease dynamics, i.e., whether disease spread is accelerating or decelerating, or if there have been changes to the rate of acceleration. Our work shows that we can account for how bias in the testing changes in determining the trend. Moreover, we show that the overall positive or negative trend is not overly sensitive to the bias, meaning that assuming an approximately constant bias may work for most estimates. It is important to note that determining the sign of trends in the disease is easier when the trends are larger in magnitude, as shown in fig. 5 . The less change there is in infection rate, such as those through smaller policy changes, the more testing is needed to identify an effect. As an example, we see in fig. 6 that 8100 samples per day is enough to give good confidence in the estimated sign of a 5-day trend in disease dynamics for most of the course of an outbreak. This confidence is low when the trend is very weak, meaning the true dynamics are at a local maximum (peak) or minimum (valley). Finally, we see in fig. 7 that a constant bias in testing actually improves our ability to detect the sign of a trend in the dynamics. This is because biased testing magnifies trends in the data, as can be seen in fig. 4 . We see again that 8100 tests per day gives high confidence in the sign of five day trends in the data as long as that data is done with a constant high bias.

Policy may also be based on other metrics in sampling data, such as the occurrence of peaks or more complicated model fitting. Our model of sampling provides a method for testing the confidence of these metrics. As an example we show that the exact peaks in an outbreak can be found, as seen in table 2, but there will likely be a large number of false peaks, as seen in table 1. However, with the right smoothing, critical points in the dynamics can be identified with some confidence, as shown in fig. 8 .

Finally, as a relevant test of our approach and an exploration of real data from COVID-19, we use data from the COVID Tracking ProjectLipton et al. (2020) for the state of Minnesota to compute 5-day trends for that data, shown in fig. 9 . For data collected after mid April, 2020, testing capacity was generally over 2000 samples per day, and so these estimates can be seen as somewhat reliable with bias assumed to be approximately constant.

We also show the correlation between positive test percentage and number of tests performed for each state in fig. 10 . We see that in Minnesota there is no significant correlation; we assume that bias has been relatively constant. This was, however, an uncommon situation. In most states, there is a significant anti-correlation between between positive test percentage and number of tests performed. This may be due to changes in the policies of healthcare providers and government organizations as tests become available. We must therefore account for this change in bias when discussing trends in the spread of the disease. Additionally, some of this correlation may be due to the occurrence of days on which negative tests are not reported or under-reported. We may model changes in bias simply by choosing some time-dependent bias functions B(t) in eqs. (6) to (8).

We provide a model of sampling in a disease outbreak in order to simulate data analysis in different outbreak situations, and to assess infection testing strategies. Clearly, we should account for the confidence we have in the measurements of metrics used to set policy. This confidence is affected by testing capacity, errors, and bias. Our model provides a method to assess confidence with time-varying testing capacity and bias by simulating sampling over the course of an epidemic. This model demonstrates the importance of tracking testing capacity, estimating possible changes in bias, and tracking positive test percentage rather than raw number of positive tests. Our model provides an essential tool in designing an effective response to the outbreak of an infectious disease.

We also test dynamics which include an asymptomatic but infected population. We generate these dynamics with the model dx S dt = −(β 11 + β 12)x I 1 x S − (β 21 + β 22 )x I 2 x S dx I 1 dt = β 11 x I 1 x S + β 21 x I 2 x S − γx I 1 − δ x I 1 dx I 2 dt = β 12 x I 1 x S + β 22 x I 2 x S − γx I 2 + δ x I 1 dx R dt = γ(x I 1 + x I 2 )

where x S , x I 1 , x I 2 , x R represent the proportion of the population that is susceptible, asymptomatic infected, symptomatic infected, or recovered from the disease, respectively. Using these dynamics, we have I 1 (t) = x I 1 (t), I 2 (t) = x I 2 (t), and H(t) = x S (t) + x R (t).

The model as described above assumes that members of the population may be re-tested immediately after being tested. This is reflected in the fact that performing a test has no effect on eqs. (9) and (10). In large populations, this assumption is reasonable because the proportion of the population who have been tested is not significant. On the other hand, in small populations we must model the limited availability of untested members of the population.

To account for this effect, we must estimate the proportion of each sub-population (I 1 (t), I 2 (t), H(t)) which is available for testing. We assume that testing removes one person from testable population, and those removed are re-introduced after some exponential wait time. However, we still must approximate how the overall disease dynamics change the tested and untested population. That is, we must account for a healthy individual who has been tested becoming infected before being re-introduced into the testable population, or an infected individual recovering.

We make the following simplifying assumption:

where x = I 1 , x = I 2 , or X = H represent total sub-populations and x T is the number of tested and not-yet reintroduced members of the sub-population. In practice, we use an Euler approximation to estimate the proportion of a sub-population ineligible for testing:

x T (t 2 ) = x T (t 1 ) + x T (t 1 ) x(t 1 ) (x(t 2 ) − x(t 1 )) = x T (t 1 )

x(t 1 ) x(t 2 ).

where t 1 ,t 2 are the times of consecutive stochastic events in the model (i.e. a test performed or population member re-introduced into the testable population). With this model, overestimation of the positive percentage due to biased testing lessens as the rate of testing increases, as shown in fig. 11 . This is due to the limited number of testable infected individuals at any time. Figure 11 . Apparent bias decreases as the sampling rate increases in a limited population. Simulations were done with a population size of 10000.

A simple planning problem for covid-19 lockdown

Low variance couplings for stochastic models of intracellular processes with timedependent rate functions

Stochastic simulation: algorithms and analysis

Presumed asymptomatic carrier transmission of covid-19

Dynamic interventions to control covid-19 pandemic: a multivariate prediction modelling study comparing 16 worldwide countries

Mathematical models in biology

A general method for numerically simulating the stochastic time evolution of coupled chemical reactions

Exact stochastic simulation of coupled chemical reactions. The journal of physical chemistry

The mathematics of infectious diseases

Containing papers of a mathematical and physical character

The COVID tracking project

Estimating the asymptomatic proportion of coronavirus disease 2019 (covid-19) cases on board the diamond princess cruise ship

Global surveillance for covid-19 caused by human infection with covid-19 virus

Optimal covid-19 quarantine and testing policies

Covid-19 forecast hub

ACKNOWLEDGMENTS J.B. and N.C. were funded by the Mayo Clinic Center for Individualized Medicine.

Our code to generate sample paths of the model is written in the language go, and compiled as the executable file disease_confidence. Code to perform Monte-Carlo simulation is available as python scripts. All code is available at https://github.com/jdbrunner/dis

We generate SIR dynamics with the ODE modelwhere x S , x I , x R represent the proportion of the population that is susceptible, infected, or recovered from the disease, respectively. We allow β = β (t) to be a function of time, and in the above simulations takeand γ = 1 15 . This dynamic parameterization allows us to generate an infection with more than one peak time, as can be seen in fig. 4 .Using these dynamics, we take I 1 (t) = x I (t), I 2 (t) = 0, and H(t) = x S (t) + x R (t).