key: cord-0651367-x4c6lfzh authors: Costello, Fintan; Watts, Paul; Howe, Rita title: Homeostatic behavioural response to COVID-19 infections returns R to a set-point of 1 date: 2022-02-23 journal: nan DOI: nan sha: b068481ae4580f5802544a4b895d63ab8e91d846 doc_id: 651367 cord_uid: x4c6lfzh
One clear aspect of behaviour in the COVID-19 pandemic has been people's focus on, and response to, reported or observed infection numbers in their community. We describe a simple model of infectious disease spread in a pandemic situation where people's behaviour is influenced by the current risk of infection and where this behavioural response acts homeostatically to return infection risk to a certain preferred level. This model predicts that the reproduction rate $R$ will be centered around a median value of 1, and that a related measure of relative change in the number of new infections will follow the standard Cauchy distribution. Analysis of worldwide COVID-19 data shows that the estimated reproduction rate has a median of 1, and that this measure of relative change calculated from reported numbers of new infections closely follows the standard Cauchy distribution at both an overall and an individual country level.
In epidemiological models of disease spread, infection numbers at time t are a function of disease transmissibility, incubation and recovery rates (all fixed properties of the disease), of the proportion of infectious and susceptible individuals in the population at time t (functions of the state at time t − 1), and of behaviour: in particular, of the average number of contacts individuals make with others at that time, $K_t$.
In some models (Bertozzi et al., 2020) this contact number $K_t$ is taken to be constant, giving a fixed transmission rate of $\beta$; in others $K_t$ (or $\beta_t$) is treated as a free parameter, varying with time in a way that is not described within the epidemiological model but is instead estimated by fitting the model to data (Ndaïrou et al., 2020; IHME COVID-19 forecasting team, 2020) or by using mobility or contact tracing datasets (Nouvellet et al., 2021; Badr et al., 2020; Russo et al., 2020).
We give a simple model of how people's behaviours (and so contact numbers) change over time in response to their assessment of risk of infection at that time. In this model people's behavioural response to infection balances the risk of infection associated with contact against the various (economic, social and psychological) gains associated with contact. We assume that people can estimate their risk of infection given a certain number of contacts (a risk that depends on infection rates in the community) and that each person has a certain constant risk or probability of infection per day, x, which they are willing to accept (whose value depends on their age, health, financial status, and so on). Each person will set their number of contacts on a given day so that, based on their estimate of the risk per contact, their overall risk that day is approximately x (so maximising their gains from contact without incurring unacceptable risk). We assume that actors such as businesses or governments will behave in a similar way, balancing risk against gain in setting policy responses to infection. In this model people will tend to change their behaviour so that their probability of infection varies around x, reducing contacts when risk is higher than x but increasing contacts when it is below x.
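The contact-setting rule described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that contacts are independent and that each carries the same per-contact infection probability p, so that the daily risk from k contacts is 1 − (1 − p)^k; neither this functional form nor the names `daily_risk` and `contacts_for_target_risk` come from the model itself, and the example numbers are hypothetical.

```python
import math


def daily_risk(contacts: float, per_contact_risk: float) -> float:
    """Probability of at least one infection from `contacts` independent contacts."""
    return 1.0 - (1.0 - per_contact_risk) ** contacts


def contacts_for_target_risk(acceptable_risk: float, per_contact_risk: float) -> float:
    """Number of contacts at which the daily risk equals the acceptable risk x.

    Assumption (illustrative only): contacts are independent and identical,
    so we solve 1 - (1 - p)^k = x for k.
    """
    if not 0.0 < acceptable_risk < 1.0:
        raise ValueError("acceptable_risk must lie strictly between 0 and 1")
    if not 0.0 < per_contact_risk < 1.0:
        raise ValueError("per_contact_risk must lie strictly between 0 and 1")
    return math.log(1.0 - acceptable_risk) / math.log(1.0 - per_contact_risk)


if __name__ == "__main__":
    x = 0.001  # hypothetical acceptable daily risk
    for p in (0.0001, 0.0005, 0.002):  # hypothetical per-contact risks
        k = contacts_for_target_risk(x, p)
        print(f"per-contact risk {p}: {k:.2f} contacts, daily risk {daily_risk(k, p):.6f}")
```

Under this assumption, a rise in the per-contact risk (for example, because community infection rates rise) lowers the number of contacts a person will accept, while the resulting daily risk stays at x.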
Since the population risk of infection is the average of all individual risks, the overall probability of infection will vary over time around some constant X/N (where X is the sum of acceptable risk levels and N is the population size), and so the expected number of new infections per day will vary around X. Finally, since the reproduction rate R is the number of new infections caused by an existing infected individual, with new infections varying around a constant, this model predicts that R will vary around a median value of 1.
Two aspects of this model may be surprising. First, it goes against the common understanding that 'an R above one means an outbreak is growing, and below one means that it is shrinking' (Adam, 2020). In this model an R below 1 does not necessarily mean the outbreak is shrinking: instead an R below 1 leads to an increase in contact numbers, which can cause a subsequent increase in new infections and in R. Second, this model assumes that people are able to accurately judge the probability of infection and adjust their behaviour appropriately, contradicting the common view that 'In making predictions and judgments under uncertainty, people do not appear to follow the calculus of chance or the statistical theory of prediction. Instead they rely on a limited number of heuristics which sometimes yield reasonable judgments and sometimes lead to severe and systematic errors' (Tversky and Kahneman, 1973). This aspect of the model is motivated by our previous work suggesting that people's assessments of probability do in fact follow the statistical theory of prediction, and that observed patterns of systematic error in judgement are caused by the regressive effects of random variation or noise (Costello and Watts, 2014, 2016, 2018, 2019; Howe and Costello, 2020).
We tested this model using data from the Our World in Data COVID hub (Ritchie et al., 2020) (accessed February 22, 2022).
This dataset gives the number of new COVID-19 infections reported each day for 225 countries, from the Johns Hopkins University COVID-19 Data Repository; the reproduction rate each day for 187 countries, estimated using a Kalman Filter approach (Arroyo-Marioli et al., 2021); and the estimated stringency of government pandemic response each day for 173 countries, from the Oxford COVID-19 Government Response Tracker (Hale et al., 2021). The median estimated R across this dataset was 1.0 (Fig 1), with median estimated Rs for each individual country being indistinguishable from 1 in a one-sample t-test (t(186) = 1.77, p = 0.08, 95% confidence interval for the mean: 0.95 to 1.003).
The reproduction rate R on a given day t is a function of $n_t$, the reported number of new cases on that day. Could this R ≈ 1 result be an artefact of the $n_t$ reporting process? One problem with COVID case numbers is the frequent reporting of 0 new cases: just under 25% of $n_t$ values in the dataset were 0, with these often indicating that no reporting took place that day: a number of countries had reliable patterns of $n_t$ = 0 on weekend days only. These reporting gaps are visible as a spike in R values at 0 in the Fig 1 histogram of R values. To eliminate these reporting gaps we reran our analysis on a cleaned dataset including only days with $n_t$ > 0. The median estimated R for $n_t$ > 0 was 1.02, with median estimated Rs for each individual country being indistinguishable from 1 in a one-sample t-test (t(186) = 1.1, p = 0.27, 95% confidence interval: 0.99 to 1.03). All subsequent analyses use this cleaned dataset.
Perhaps this result could be caused by a relationship between R and the number of tests being carried out? If testing increases when R is high, tests would include more cases likely to be negative and would reduce the apparent value of R; similarly, low test numbers could increase the apparent value of R.
We calculated the correlation between R and number of daily tests carried out in the cleaned dataset. There was no significant difference between median R values for countries where this correlation was positive and those where it was negative (t(126.96) = 0.36, p = 0.72).
Perhaps this result is a consequence of government interventions alone, rather than behavioural responses to risk? To check this we compared median R values for countries where the average government stringency level was above the overall mean stringency, and those for which it was below. There was no significant difference between median R values for these groups (t(157.43) = 1.74, p = 0.08); the average median R was slightly higher in the high-stringency group (1.02) than the low-stringency group (1.0).
These results suggest that the number of new infections at time t, $i_t$, varies around some constant X as a consequence of people's behavioural response to infection risk (and so R is distributed around 1). What can we say about the distribution of these values $i_t$? This behavioural response has a natural lag, L, which represents the time between an infection occurring (at time t − L, say) and that infection being observed by others and causing a behavioural response (at time t). This lag falls somewhere between the incubation and recovery period for the infection (an infection becoming observable only after incubation, and not being observable after recovery), and means that the observed rate of new infections at time t is equal to the actual rate of new infections at time t − L. If $i_{t-L}$ > X then the overall behavioural response at time t will reduce contact numbers, pushing $i_t$ downwards, while if $i_{t-L}$ < X then the overall behavioural response at time t will increase contact numbers, pushing $i_t$ upwards, and so the difference $i_t - i_{t-L}$ varies around 0.
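The lagged response just described can be illustrated with a toy simulation. This is a sketch under a strong simplifying assumption that is not part of the model above: the response at time t removes a fixed fraction (`gain`) of the gap between infections at time t − L and the set-point X, with Gaussian noise added. The function names and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_lagged_response(target, lag, days, gain=0.5, noise_sd=10.0, seed=0):
    """Toy series of daily new infections under a lagged corrective response.

    Assumption (illustrative only): the response at time t removes a fraction
    `gain` of the gap between infections at time t - lag and the target X,
    plus Gaussian noise. This linear rule is a stand-in, not the paper's model.
    """
    rng = random.Random(seed)
    series = [float(target)] * lag  # start the first `lag` days at the target
    for t in range(lag, days):
        previous = series[t - lag]
        series.append(previous - gain * (previous - target) + rng.gauss(0.0, noise_sd))
    return series


def lagged_differences(series, lag):
    """Differences i_t - i_{t-lag} for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


if __name__ == "__main__":
    X, L = 1000.0, 6  # hypothetical set-point and lag (in days)
    infections = simulate_lagged_response(X, L, days=5000)
    diffs = lagged_differences(infections, L)
    print("mean infections:", round(statistics.fmean(infections), 2))
    print("mean lagged difference:", round(statistics.fmean(diffs), 2))
```

In this toy setting the simulated infection numbers stay close to X and the lagged differences average out near 0, which is the qualitative pattern the argument relies on.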
Since this overall behavioural response is the sum of all individual responses in the population, from the Central Limit Theorem this difference $i_t - i_{t-L}$ will follow a Normal distribution $i_t - i_{t-L} \sim N(0, \sigma_t^2)$ with some variance $\sigma_t^2$ (which may change over time). The difference $i_{t-2L} - i_{t-L}$ will follow the same distribution (albeit with variance $\sigma_{t-L}^2$). Defining a measure of relative change in new infection numbers from time t − 2L to time t, we see that $D_L$ is the ratio of two standard Normal variables (sums of common standard deviations cancelling) and so this measure $D_L$ will follow the standard Cauchy distribution C (with location 0 and scale 1) for L between the incubation and recovery times. Assuming that changes in $n_t$ are proportional to changes in $i_t$, we predict that $D_L$ values calculated from reported number of positive tests $n_t$ will also follow the standard Cauchy distribution C.
To test this prediction we compare $D_L$ values calculated for our cleaned dataset against the theoretical distribution C. For each country, at each day t we calculated $D_L(t)$ for various values of L. For some days $D_L(t)$ could not be calculated (because one of the component infection numbers was missing), or involved division by 0; these values of $D_L(t)$ were dropped from analysis. Figure 2 (inset) shows a probability-probability plot comparing the cumulative probability of $D_L$ for L = 6 against that of C. Correlation of cumulative probabilities is a measure of goodness of fit between observed and theoretical values; here the correlation was high (r = 0.9997). Since probability-probability plots overweight extreme values, we also analysed the relationship between C and $D_L$ for values near the midpoint of the range, by selecting the subset of $D_L$ values between −15 and 15 (over 95% of the total sample). Figure 2 (main) shows a histogram of these values. The correlation between $D_L$ and C values for this central-region histogram was r = 0.993.
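The distributional claim and the probability-probability comparison above can be checked numerically. The sketch below does not compute $D_L$ from infection data; it assumes two independent zero-mean Normal variables with a common standard deviation (an illustrative simplification), forms their ratio, and correlates its empirical cumulative probabilities with the standard Cauchy CDF, F(d) = 1/2 + arctan(d)/π. Function names, the sample size and the standard deviation are hypothetical.

```python
import math
import random
import statistics


def cauchy_cdf(d):
    """CDF of the standard Cauchy distribution (location 0, scale 1)."""
    return 0.5 + math.atan(d) / math.pi


def simulated_ratios(n, sd=3.0, seed=1):
    """Ratios of two independent N(0, sd^2) draws (assumed common sd)."""
    rng = random.Random(seed)
    ratios = []
    while len(ratios) < n:
        numerator, denominator = rng.gauss(0.0, sd), rng.gauss(0.0, sd)
        if denominator != 0.0:  # skip divisions by zero
            ratios.append(numerator / denominator)
    return ratios


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def pp_correlation(values):
    """Correlation between empirical and standard Cauchy cumulative probabilities."""
    ordered = sorted(values)
    n = len(ordered)
    empirical = [(rank + 0.5) / n for rank in range(n)]
    theoretical = [cauchy_cdf(v) for v in ordered]
    return pearson(empirical, theoretical)


if __name__ == "__main__":
    ratios = simulated_ratios(20000)
    print("median:", round(statistics.median(ratios), 3))
    print("median |ratio|:", round(statistics.median(abs(r) for r in ratios), 3))
    print("P-P correlation:", round(pp_correlation(ratios), 5))
```

For the standard Cauchy distribution the median is 0 and the median absolute value is 1, so these two sample statistics give simple estimates of location and scale that do not depend on the distribution's undefined mean and variance.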
As an additional check we calculated location and scale estimates by taking the median of $D_L$ and the median of the absolute value of $D_L$; these values were 0.02 and 1.02 respectively, confirming the fit to the standard Cauchy distribution.
We carried out the same analysis for individual country data for L = 6. Correlations between cumulative probabilities of C and $D_L$ for individual countries were greater than r = 0.98 for all countries; for binned $D_L$ and C values in the −15 to 15 region the mean correlation was r = 0.97 (with a 2.5% to 97.5% percentile range of 0.83 to 0.99). The 2.5% to 97.5% percentile range for $D_L$ medians (location parameter estimates) across countries was −0.06 . . vidual country data.
We predicted that this agreement should hold for L approximately between 5 and 14 (estimates for the incubation and recovery period for . To test this we carried out the above analysis for all values of L from 2 to 100. Probability-probability correlations, slopes and intercepts did not change noticeably because changes in L primarily affect $D_L$ around the median. Figure 3 plots the correlation between binned values of $D_L$ and C values in the −15 to 15 median region, for each value of L. Correlation was highest for values of L in the 5 to 14 region, as predicted, supporting the behavioural response model.
There are a number of clear limitations to these results. First, our model of homeostasis due to behavioural response assumes that the susceptible population is aware of and responding to the risk of infection, and so applies to epidemic or pandemic situations only: we do not expect this homeostatic effect to hold in narrower outbreak situations. Second, our model depends on the assumption that new infection numbers at time t are a reflection of the probability of infection at that time. This assumption holds for infections with short incubation and recovery periods; for infections where these periods are longer, this assumption does not hold.
Third, our model assumes that people are free to limit their number of contacts to match their acceptable level of risk. For some demographics this is not the case: people in poverty, for example, may be economically unable to limit their contacts in this way, and so will have an estimated risk of infection systematically above their acceptable risk level. Assuming that people's acceptable risk levels are well-calibrated, this predicts increased infections in such demographics (Patel et al., 2020; Little et al., 2021). Fourth, we assume that reported infection numbers are proportional to actual infection rates. If reported infection numbers do not follow actual infection numbers, we do not expect these results to hold.
A final caveat concerns the interpretation of these results. At first glance our results may suggest that government responses to infection have no value or no effect. This is not the case: government restrictions on contact clearly act to reduce the risk of infection. Instead, our model suggests that government restrictions act to reduce the overall level of acceptable risk X, so reducing the number of infections to a lower (but still approximately constant) value. Relaxation of those restrictions then produces an increase in the overall level of acceptable risk, X, and so causes infection rates to rise, but only to that new level.
A guide to R - the pandemic's misunderstood metric
Tracking R of COVID-19: A new real-time estimation using the Kalman filter
Association between mobility patterns and COVID-19 transmission in the USA: a mathematical modelling study. The Lancet Infectious Diseases
The challenges of modeling and forecasting the spread of COVID-19
Surprisingly rational: Probability theory plus noise explains biases in judgment
People's conditional probability judgments follow probability theory (plus noise)
Invariants in probabilistic reasoning
The rationality of illusory correlation
An interactive web-based dashboard to track COVID-19 in real time. The Lancet Infectious Diseases
A global panel database of pandemic policies (Oxford COVID-19 Government Response Tracker)
Random variation and systematic biases in probability estimation
Modeling COVID-19 scenarios for the United States
The impact of socioeconomic status on the clinical outcomes of COVID-19; a retrospective cohort study
Mathematical modeling of COVID-19 transmission dynamics with a case study of Wuhan
Reduction in mobility and COVID-19 transmission
Poverty, inequality and COVID-19: the forgotten vulnerable
Coronavirus pandemic (COVID-19)
Tracing day-zero and forecasting the COVID-19 outbreak in Lombardy, Italy: A compartmental modelling and numerical optimization approach
Availability: A heuristic for judging frequency and probability