key: cord-0855870-yxor6zd2
authors: Britton, T.; Trapman, P.; Ball, F. G.
title: The risk for a new COVID-19 wave -- and how it depends on $R_0$, the current immunity level and current restrictions
date: 2020-10-13
journal: nan
DOI: 10.1101/2020.10.09.20209981
sha: 1a043265f2b692982b724232911c2ad2eee937e9
doc_id: 855870
cord_uid: yxor6zd2

The COVID-19 pandemic has hit different parts of the world differently: some regions are still in the rise of the first wave, other regions are now facing a decline after a first wave, and yet other regions have started to see a second wave. The current immunity level $\hat i$ in a region is closely related to the cumulative fraction infected, which primarily depends on two factors: a) the initial potential for COVID-19 in the region (often quantified by the basic reproduction number $R_0$), and b) the timing, amount and effectiveness of preventive measures put in place. By means of a mathematical model including heterogeneities owing to age, social activity and susceptibility, and allowing for time-varying preventive measures, the risk for a new epidemic wave and its doubling time, and how they depend on $R_0$, $\hat i$ and the overall effect of the current preventive measures, are investigated. Focus lies on quantifying the minimal overall effect of preventive measures $p_{Min}$ needed to prevent a future outbreak. The first result shows that the current immunity level $\hat i$ plays a more influential role than when immunity is obtained from vaccination. Secondly, by comparing regions with different $R_0$ and $\hat i$ it is shown that regions with lower $R_0$ and low $\hat i$ may now need higher preventive measures ($p_{Min}$) compared with other regions having higher $R_0$ but also higher $\hat i$, even when such immunity levels are far from herd immunity.

COVID-19 is currently spreading in many parts of the world. In several regions the spreading has now dropped substantially, and in other regions the first wave has been very small, most often owing to the implementation of effective preventive measures. Regions where spreading is now low face two competing interests: lifting restrictions to normalize society, and keeping or strengthening restrictions in order to avoid a new major COVID-19 outbreak. The minimal overall effect of preventive measures needed to avoid a large future outbreak, denoted $p_{Min}$, and how it depends on the basic reproduction number $R_0$ and the current immunity level $\hat i$, is hence a highly important question, which is investigated here. In addition we consider the doubling time of a new outbreak should it take place, which gives an indication of its impact before additional preventive measures would be implemented.

The basic reproduction number $R_0$ quantifies the initial potential of an epidemic outbreak for a particular disease in a particular region, and is defined as the average number of new infections caused by a typical infected individual in the beginning of the epidemic outbreak (before preventive measures are put in place and before population immunity starts to build up) [3]. For COVID-19, estimates of $R_0$ vary substantially between different regions, e.g. between 2 and 5 among 11 European countries [6].

Preventive measures aim to reduce the average number of infections caused by an infective, by reducing the risk of transmission given a contact (e.g. hand washing, wearing a face mask), reducing the number of daily contacts (e.g. social distancing, school closure) and/or reducing the effective infectious period (e.g. testing and isolating, treatment). Let $p(t)$ denote the overall effect of such preventive measures at time $t$, where $0 \le p(t) \le 1$, with $p(t) = 0$ corresponding to no preventive measures and $p(t) \approx 1$ meaning more or less complete isolation of all individuals.

Let $\hat i(t)$ denote the community fraction that cannot get infected at time $t$, a few of these being currently infectious, but the majority having recovered from the disease and now being immune (waning of immunity is here neglected since our time frame is less than a year). At time $t$, it is the current (or effective) reproduction number $R_t$ of a region that determines if a new main outbreak can take place or not. In particular, a region with low current transmission avoids the risk for a large new outbreak as long as $R_t < 1$, and regions with ongoing transmission will see a decline in transmission whenever $R_t < 1$.

For simple epidemic models, which assume a homogeneous community that mixes homogeneously, it is well known that $R_t = R_0 (1 - p(t))(1 - \hat i(t))$, since $R_0$ is reduced both by the preventive measures and by the fact that some contacts will be with already infected people. This implies that $R_t \le 1$ is equivalent to $p(t) \ge 1 - 1/(R_0 (1 - \hat i(t)))$, thus quantifying, in terms of $R_0$ and the current immunity level $\hat i(t)$, the minimal amount of preventive measures needed to avoid a new large outbreak.
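The homogeneous relation above is easy to evaluate numerically. The following is a minimal sketch, not the authors' code; the function names are illustrative, and the example values $R_0 = 2.5$ and $\hat i = 25\%$ are the ones used for the numerical illustration later in the text.

```python
def effective_reproduction_number(r0: float, p: float, immune_fraction: float) -> float:
    """R_t = R_0 (1 - p)(1 - i) for a homogeneous, homogeneously mixing community."""
    return r0 * (1.0 - p) * (1.0 - immune_fraction)


def minimal_prevention_homogeneous(r0: float, immune_fraction: float) -> float:
    """Smallest p with R_t <= 1, i.e. p = 1 - 1/(R_0 (1 - i)), floored at 0."""
    if immune_fraction >= 1.0:
        return 0.0  # nobody left to infect, so no measures are needed
    return max(0.0, 1.0 - 1.0 / (r0 * (1.0 - immune_fraction)))


p_min = minimal_prevention_homogeneous(2.5, 0.25)
print(round(p_min, 3))  # 0.467
print(round(effective_reproduction_number(2.5, p_min, 0.25), 6))  # 1.0
```

The floor at 0 reflects that no preventive measures are needed once $R_0 (1 - \hat i) \le 1$, which is the herd immunity condition in the homogeneous setting.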

For more realistic epidemic models this simple relation between $R_t$ and $R_0$, $p(t)$ and $\hat i(t)$ does not hold. In fact, for epidemic models acknowledging population heterogeneities it holds that $R_t < R_0 (1 - p(t))(1 - \hat i(t))$. The main reason is that individuals having high social activity and/or high susceptibility are more likely to be infected early in the epidemic, implying that individuals at risk later in the epidemic will on average be less susceptible and socially active, thus also infecting fewer if they become infected [1]. Here we study an epidemic model in which social mixing depends on age structure, and which also allows for variable social activity as well as variable susceptibility within age groups. For this model the aim is to quantify $R_t$ as a function of $R_0$, $p(t)$ and $\hat i(t)$, and in particular to quantify the minimal amount of restrictions $p_{Min}$ for a region having initial basic reproduction number $R_0$ and current immunity level $\hat i$ (the index $t$ is now dropped and implicitly considered as current time).

We illustrate our findings by expressing $p_{Min}$ and the doubling time for different regions in Europe and the US, but these illustrations are by no means exact. First, the model is clearly a simplification of the ongoing COVID-19 pandemic, but even more so the estimates of $R_0$ and the current immunity level $\hat i$ for different regions contain appreciable uncertainty. Nevertheless our results allow, for the first time to our knowledge, a risk comparison between regions having different $R_0$ and different immunity levels $\hat i$.

The epidemic model is based on the model in Britton et al. [1]. Individuals are divided into 6 different age groups, and mixing patterns are taken from the empirical study of Wallinga et al. [11]. Within each age group individuals are divided into three categories: 50% have normal social activity, 25% have low social activity (half as many contacts as those with normal activity) and 25% have high social activity (double activity). It is important to stress that social activity affects both the risk of getting infected and the risk of infecting others, in that socially active individuals have more contacts both when susceptible and when infectious.

To this model with age cohorts and variable social activity, studied in [1], we now also add variable susceptibility [7], which is done similarly to variable social activity. We assume that 50% have normal susceptibility, 25% have half the susceptibility and 25% are twice as susceptible, and the variable susceptibility is assumed to be independent of both social activity level and age group. The choice to divide social activity as well as susceptibility into three groups as above is of course quite arbitrary. This choice of heterogeneity structure is quite moderate in that there is no tail (with individuals having very high social activity or susceptibility) and the coefficient of variation equals 0.48, which is moderate (see the Supplementary Materials, SM, for further comments).
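The quoted coefficient of variation can be checked directly from the three-level distribution described above (relative levels 0.5, 1 and 2 with weights 25%, 50% and 25%). This is a small verification sketch; the variable names are illustrative.

```python
import math

# Relative activity (or susceptibility) levels and the population share of each.
levels = [0.5, 1.0, 2.0]
weights = [0.25, 0.50, 0.25]

mean = sum(w * x for w, x in zip(weights, levels))
variance = sum(w * (x - mean) ** 2 for w, x in zip(weights, levels))
cv = math.sqrt(variance) / mean

print(round(mean, 3))  # 1.125
print(round(cv, 2))    # 0.48
```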

It seems natural to also add variable infectivity for individuals who become infected. However, such variable infectivity will have no effect on our results if it is assumed independent of susceptibility and social activity, and is hence omitted.

We use a deterministic SEIR epidemic model (see SM) with a total of $6 \times 3 \times 3 = 54$ different types of individuals, but very similar results would be obtained from simulations of a corresponding stochastic model assuming a large population (which can be proved using methods in [4]). The latent state "E" (for exposed) is assumed to have mean 3 days, followed by an infectious period ("I") having mean 4 days, thus being quite close to other models for the spread of COVID-19 [6]. Details of the model are given in the SM, where it is explained that our results hold also for the corresponding model in which the latent and infectious periods need not follow exponential distributions, and more generally for the model in which infectives have independent and identically distributed shapes of the infectivity profiles.

It is straightforward to numerically derive properties of the model, such as the basic reproduction number $R_0$, the time dynamics and its final fraction infected when the epidemic stops (see SM for further details).

medRxiv preprint (which was not certified by peer review), this version posted October 13, 2020; doi: https://doi.org/10.1101/2020.10.09.20209981. The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

During the outbreak, preventive measures of varying magnitude may be put in place. We assume that these preventive measures do not affect the latent and infectious periods, but only that they reduce the rate of infectious contacts. More precisely we make the strong and somewhat restrictive assumption that, at time $t$, all contacts (between the 54 types of individuals) are reduced by the same factor $p(t)$. This assumption can easily be relaxed, but to explore all possibilities of contact reduction is infeasible, and among all specific preventions the uniform one, where all contacts are reduced by the same factor, is the most natural choice. Even when assuming such uniform reduction of contact rates, the reduction may vary in time in different ways. However, in the SM it is shown that the exact time allocation of the preventive measures has negligible effect in our model: any time-varying preventive measures $\{p(t); 0 \le t \le t_0\}$ from the start of the epidemic up until some fixed time $t_0$, leading to the same overall fraction infected, will have nearly the same fractions infected among the different types of individuals (see SM for details). Thus early mild preventive measures will result in the same composition of infected individuals as doing nothing and then suddenly going to a full lockdown, assuming the two preventive measures lead to the same overall fraction infected.

Consider a large community in which COVID-19 spreads according to our model with some fixed value of $R_0$, and for which preventive measures $\{p(t); 0 \le t \le t_0\}$ were put in place (the same preventive effect on all types of contacts). We further assume that by $t_0$ the transmission has more or less stopped, resulting in a fraction $\hat i$ having been infected (and now being immune) and the remaining fraction $1 - \hat i$ still being susceptible. Our main scientific question lies in quantifying what the effective reproduction number $R_{t_0}$ equals if all restrictions are lifted at time $t_0$. If $R_{t_0} > 1$ it follows that a new large epidemic outbreak may occur if all restrictions are lifted, as opposed to the case $R_{t_0} \le 1$ when herd immunity has been reached (though smaller local outbreaks are still possible).

In the more common COVID-19 scenario that $R_{t_0} > 1$, the minimal amount of preventive measures necessary to avoid a new large outbreak is given by $p_{Min} = 1 - 1/R_{t_0}$. This amount $p_{Min}$ is thus a measure for the risk of a new large outbreak. In the plots below $p_{Min}$ has been computed as a function of $R_0$ and the current immunity level $\hat i$, and is quantified by a heatmap. The left plot in Figure 1 is for the main model allowing for heterogeneities with respect to age, social activity and susceptibility, and with disease-induced immunity. The right plot in Figure 1 shows the corresponding plot when immunity comes from vaccinating uniformly in the community (which is equivalent to disease-induced immunity for a model assuming a completely homogeneous community). In Figure S1 of the SM we show a similar plot for the model allowing for heterogeneities with respect to age and social activity but not with respect to susceptibility (treated in [1]). The $p_{Min}$-values for its disease-induced immunity are very similar to those of the present model (left plot of Figure 1).

Figure 1: Plot of the minimal amount of preventive measures, $p_{Min}$, necessary to avoid a new large outbreak, as a function of $R_0$ and the current immunity level $\hat i$. The left plot is for disease-induced immunity and the right plot is for vaccine-induced immunity.

In the left plot it is seen that, for a fixed value of $R_0$, the amount of preventive measures needed to avoid a large future outbreak decreases quite rapidly with the amount of disease-induced immunity $\hat i$, and for $\hat i$ sufficiently large the color is deep blue, reflecting herd immunity.

[...] of $R_0$ the minimal amount of preventive measures $p_{Min}$ is plotted as a function of the immunity level, both when immunity comes from disease exposure (solid lines) and when it is achieved by means of vaccination.

When comparing the effect of disease-induced immunity with the effect of vaccine-induced immunity, a difference is clearly observed in each of the two figures. More specifically, for a given $R_0$ and some positive immunity level $\hat i$, the necessary amount of preventive measures is substantially higher if immunity comes from vaccination as compared to disease-induced immunity (most easily seen in Figure 2). As a numerical illustration, in a region with $R_0 = 2.5$ that has experienced an outbreak in a mitigated situation resulting in an immunity level of $\hat i = 25\%$, the required amount of preventive measures is $p_{Min} = 29\%$, whereas if instead the immunity level $\hat i = 25\%$ came from (uniform) vaccination, then the necessary amount of preventive measures is $p_{Min} = 1 - 1/(R_0 \cdot 0.75) = 47\%$.

An alternative way to compare the effect of disease-induced immunity with vaccine-induced immunity is to compare the doubling times in the two cases if all preventive measures are dropped at a time point when transmission is very low. To illustrate this, some further assumptions about the generation time distribution have to be made. These follow from the deterministic SEIR epidemic model and are provided in the SM. If, as above, we consider a region having $R_0 = 2.5$ and immunity level $\hat i = 25\%$, then the doubling time for disease-induced immunity equals 12.7 days, whereas it equals 6.6 days if instead the immunity is vaccine-induced. (Figure S2 in the SM gives heatmaps for the doubling times as functions of $R_0$ and $\hat i$, both when immunity comes from disease exposure and when it is vaccine-induced.) Consequently, if all restrictions were to be lifted the epidemic would start growing quite quickly, but much more quickly if immunity came from vaccination. The same qualitative result applies if restrictions are lifted only partially but still below $p_{Min}$.
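The doubling times quoted above can be approximately reproduced with a short calculation. The sketch below is not the authors' code. It assumes the standard Euler-Lotka relation for an SEIR model with exponentially distributed latent and infectious periods, $1 = R\,\sigma\mu / ((\sigma+\rho)(\mu+\rho))$, with $\sigma = 1/3$ and $\mu = 1/4$ per day (mean latent period 3 days, mean infectious period 4 days, as stated for the model). Here $R$ is the reproduction number after restrictions are lifted, and the function names are illustrative.

```python
import math


def growth_rate(r: float, sigma: float = 1 / 3, mu: float = 1 / 4) -> float:
    """Larger root rho of (sigma + rho)(mu + rho) = r * sigma * mu (assumed relation)."""
    b = sigma + mu
    c = sigma * mu * (1.0 - r)
    return (-b + math.sqrt(b * b - 4.0 * c)) / 2.0


def doubling_time(r: float) -> float:
    """Doubling time ln(2)/rho in days; only defined for a growing epidemic (r > 1)."""
    rho = growth_rate(r)
    if rho <= 0.0:
        raise ValueError("the epidemic is not growing, so there is no doubling time")
    return math.log(2.0) / rho


# Vaccine-induced immunity: R = R_0 (1 - i) = 2.5 * 0.75.
print(round(doubling_time(2.5 * 0.75), 2))  # about 6.55 days
# Disease-induced immunity: R = 1 / (1 - p_Min) with the rounded value p_Min = 29%.
print(round(doubling_time(1.0 / (1.0 - 0.29)), 1))  # about 13 days
```

The vaccine-induced case agrees with the 6.6 days stated in the text. The disease-induced case comes out near 13 days rather than 12.7, which is consistent with using the rounded value $p_{Min} = 29\%$ as input.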

We now use estimates of $R_0$ and current immunity levels $\hat i$ for a few different groups of related regions in order to compare the minimal preventive measures $p_{Min}$ of regions within each group. The regions that are compared are: Madrid vs Cataluna (containing Barcelona) in Spain, Lombardy (containing Milan) vs Lazio (containing Rome) in Italy, New York State vs Washington D.C., and the three Scandinavian capital regions Stockholm, Copenhagen and Oslo.

As mentioned earlier, the model is a simplification of the real disease spreading situation for COVID-19, in that it neglects several heterogeneities (households, spatial aspects, social networks, ...) and assumes that earlier preventive measures acted proportionally in the same way between all types of individuals. However, uncertainty in the estimated $R_0$ and $\hat i$ is believed to be much greater, so the obtained minimal preventive measures $p_{Min}$ for different regions are to be interpreted as illustrations rather than exact numbers. Nevertheless it allows for a comparison between regions with similar $R_0$ but different immunity levels, as well as a comparison of regions having higher $R_0$ and higher immunity levels with regions in which both are lower.

For the sake of illustration we make the slightly unrealistic assumption that all regions have the same mixing between age groups, the same age structure and the same variation in social activity and in susceptibility. The only differences between compared regions are the overall rate of contacts (measured by $R_0$) and the amount of preventive measures, thus leading to different immunity levels $\hat i$.


As described above, it is assumed that there currently is no or low transmission. However, the results apply also to the situation where there is substantial ongoing transmission; the only difference is that the issue is then not to avoid a new wave, but to make transmission start declining.

We focus on comparing regions close to each other (making our assumption of similar mixing, age structure and variable activity and susceptibility more reasonable) and use estimates of $R_0$ and immunity levels $\hat i$ taken from the same literature source. In the SM we explain in detail how the estimates are obtained, but in brief it is as follows. The European country-specific $R_0$ estimates are taken from [6] (except for Sweden, for which the estimate is taken from its preprint [5]; see SM for motivation). To obtain separate estimates for the two Spanish regions we use [2], which estimates that Madrid has about 5% higher $R_0$ than Cataluna (a conservative estimate of the difference). For Italy, Riccardo et al. [8] estimate more or less identical (initial) basic reproduction numbers for Lombardy and Lazio, so here we have not distinguished between the two regions. The $R_0$ estimates for New York and Washington D.C. are taken from [9].

The immunity levels are harder to find estimates of in the literature. For this reason we have used the official number of case fatalities per 100 000 individuals in the separate regions as of October 5, 2020, and assumed that the infection fatality risk (ifr) equals 0.5% (slightly smaller than the estimated ifr for China in [10]). By assuming that all infected individuals become immune, and assuming no prior immunity, this gives an estimated immunity level. Of course, the ifr most likely differs substantially between regions owing to differences in age distribution and health care. Further, the choice to set ifr = 0.5% is a rough approximation, as is the assumption that all infected become immune and that there is no prior immunity. The region-specific immunity levels $\hat i$ should hence be seen as illustrations, but their order relations are most likely correct, and when this holds true the qualitative comparison statements remain true.

Table 1: Estimates of $R_0$ and current disease-induced immunity levels $\hat i$ for different regions, and the corresponding estimated minimal preventive measures $p_{Min}$. For comparison, the minimal preventive measures needed to avoid a large outbreak at the start, $p^{(start)}_{Min}$, and the minimal preventive level when immunity instead is achieved by vaccination, $p^{(Vac)}_{Min}$, are also given.

In Table 1 the estimated $R_0$ and the disease-induced immunity levels $\hat i$ are given first. Then comes $p_{Min}$, computed from the model for the estimated $R_0$ and $\hat i$ and hence taken from the heatmap (Figure 1). As a comparison the initial minimal preventive level required to avoid a large outbreak at the start (when $\hat i = 0$), $p^{(start)}_{Min} = 1 - 1/R_0$, is listed, as is the minimal preventive level required to avoid an outbreak if the current immunity level $\hat i$ was obtained instead from uniform vaccination, $p^{(Vac)}_{Min} = 1 - 1/(R_0 (1 - \hat i))$. (Table S1 in the SM gives the corresponding doubling times if restrictions were lifted.)
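The immunity estimate described above, cumulative deaths per 100 000 divided by an assumed ifr, can be sketched as follows. The input of 100 deaths per 100 000 is a purely hypothetical figure chosen for illustration and is not a value from Table 1; the function name is likewise illustrative.

```python
def immunity_from_deaths(deaths_per_100k: float, ifr: float = 0.005) -> float:
    """Estimated fraction infected (taken as immune) from deaths per 100 000.

    Assumes, as in the text, that all infected individuals become immune and
    that there was no prior immunity. The result is capped at 1.
    """
    infected_fraction = (deaths_per_100k / 100_000) / ifr
    return min(infected_fraction, 1.0)


# Hypothetical region with 100 deaths per 100 000 inhabitants and ifr = 0.5%.
print(round(immunity_from_deaths(100.0), 3))  # 0.2
```

Because the estimate scales as 1/ifr, halving the assumed ifr doubles the estimated immunity level, which is why the region-specific values are best read as illustrations.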

It is seen that no studied region has reached herd immunity, meaning that none of the regions can lift all restrictions without risking a new large outbreak. By comparing $p_{Min}$ with $p^{(start)}_{Min}$ it is further seen that the high levels of required preventive measures at the start of the epidemic have been reduced substantially in regions having suffered from high transmission during the first epidemic wave. More specifically, $p_{Min}$ in Lazio (includes Rome) now clearly exceeds that of Lombardy (includes Milan). Cataluna (includes Barcelona) also seems to require slightly more preventive measures than the Madrid region, but the difference is small. New York still needs more preventive measures than Washington D.C., but the difference has dropped compared to the initial required minimal levels. Among the Nordic capital regions, Stockholm had the highest initial minimal preventive measures to avoid an outbreak, whereas now Copenhagen has the highest minimal preventive measures followed by Oslo, but the differences are small.

By comparing $p_{Min}$ with $p^{(Vac)}_{Min}$, it is seen that disease-induced immunity plays a more significant role as compared with immunity achieved by vaccination. In particular, the regions having the highest immunity levels (New York, Madrid and Lombardy) would clearly have larger minimal preventive requirements had immunity come from vaccination.

The main aim of the paper has been to compare the levels of restrictions needed to avoid new major outbreaks of COVID-19 for different regions having different initial potential ($R_0$) and different current immunity levels $\hat i$. Clearly, regions with high $R_0$ that have not yet experienced much spreading need to be most careful, but perhaps more interesting is a comparison between a region with high $R_0$ having experienced much transmission and another region having smaller $R_0$ but also lower immunity. The main conclusion from our study is that disease-induced immunity reduces the risk for a large future outbreak substantially more than when immunity is achieved from vaccination. Smaller local outbreaks are possible irrespective of region and are not the focus of the present paper.

In the comparison of different regions it is seen that the region now requiring the highest amount of preventive measures may have switched from a region with high $R_0$ that has experienced high transmission, to another region having smaller $R_0$ but which has experienced less transmission.

The epidemic model studied allows for individual variation owing to age, social activity and variable susceptibility. The age effect is taken from an empirical study. However, the variation owing to social activity and variable susceptibility is chosen arbitrarily, but the choice is believed to be less variable (with lighter tails) than reality. Many other heterogeneities are ignored (e.g. households, schools and work places, spatial aspects, travel and commuting, ...) but it is believed that the effect of adding such other heterogeneities is that $p_{Min}$ is shifted close to proportionally.

A greater uncertainty lies in the estimation of $R_0$ and the immunity levels $\hat i$, but this can be reduced once better data become available. The estimates of $p_{Min}$ in Table 1 are hence only to be interpreted as illustrations.

A different extension could be to consider preventive measures acting differently between different types of individuals. The present framework can easily be extended to this situation; the missing information is estimates of how prevention has reduced contacts differently between different pairs of subgroups of individuals.

We conjecture that our two main qualitative results hold true. These are that the effect of disease-induced immunity is greater than that of vaccine-induced immunity, and that regions having suffered from many infections up until now may be in a better situation with regard to future outbreaks as compared to other regions with lower $R_0$ but with no or low immunity levels.

Supplementary Materials: Materials and Methods; Figs. S1 to S3; Table S1.

In this supplementary information we describe the deterministic SEIR (Susceptible, Exposed, Infectious, Removed) epidemic model in a population partitioned by age, activity level and (relative) susceptibility. The model is an extension of the one in [2], and the presentation below hence follows closely (and partly copies) the model presentation in the SI of [2]. For reasons of notational convenience we label the types (the combination of age, activity level and susceptibility level) from 1 to $m$, where $m$ is the product of the number of age cohorts, the number of activity levels and the number of susceptibility levels. In our model $m = 6 \times 3 \times 3 = 54$. A more detailed exposition than the one presented here can be found in [1, Sections 5.5 and 6.2].

We assume that for all $j \in \{1, \dots, m\}$ the population consists of $n_j$ people of type $j$. We set $n = \sum_{j=1}^{m} n_j$ and $\pi_j = n_j/n$. We assume that the population is large and closed, in the sense that we do not consider births, deaths (other than possibly the deaths caused by the infectious disease) and migration. Throughout the epidemic, $n_j$ is fixed, so people who die from the infectious disease are still considered part of the population. For $j, k \in \{1, \dots, m\}$, every given person of type $j$ makes "infectious contacts" with every given person of type $k$ independently at rate $\alpha a_{jk}/n$. If at the time of such a contact the type-$j$ person is infectious and the type-$k$ person is susceptible, then the latter becomes latently infected (Exposed). People of the same type may infect each other, so $a_{jj}$ may be strictly positive (and often is!). Because the definition of an infectious contact includes that the contact leads to transmission of the disease, it is not necessarily the case that $a_{jk}$ is equal to $a_{kj}$. For the same reason it is possible that the relative susceptibility exceeds 1. The parameter $\alpha$ is a scaling parameter, used to quantify the impact of control measures in the main paper; without measures, $\alpha$ is set so that $R_0$ has the desired value. Exposed individuals become Infectious at constant rate $\sigma$ and infectious individuals recover or die (are Removed) at constant rate $\mu$. The rates of becoming infectious and of removal are assumed to be independent of type. It is straightforward to extend the model to make those rates age and/or activity level and/or susceptibility level dependent. However, dependence on these factors would have an impact on the relationship between $p_{Min}$ and the doubling time after the first wave.

In the described multi-type SEIR model, the expected number of people of type $k$ that are infected by an infected person of type $j$ during the early stages of the epidemic is $n_k \times (\alpha a_{jk}/n) \times (1/\mu) = \pi_k \alpha a_{jk}/\mu$, where $1/\mu$ is the expected duration of an infectious period. The next-generation matrix $M$ has (for $j, k \in \{1, \dots, m\}$) as element in the $j$-th row and $k$-th column the quantity $\pi_k \alpha a_{jk}/\mu$. Suppose that the next-generation matrix $M$ is irreducible, i.e. that for any $j, k \in \{1, \dots, m\}$ it is possible for the infection of a type-$j$ individual to lead to the infection of a type-$k$ individual, either directly or through a chain of infectives involving other types. It is easily seen that this condition is satisfied in our model. The basic reproduction number $R_0$ is then defined as the largest eigenvalue of $M$; it is necessarily real and positive. If $R_0 > 1$, then a large outbreak is possible with strictly positive probability, while if $R_0 \le 1$ an outbreak stays small with probability 1.
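The definition of $R_0$ as the largest eigenvalue of the next-generation matrix can be illustrated with a small example. The sketch below uses power iteration in plain Python on a hypothetical two-type population. The values of `pi`, `a`, `alpha` and `mu` are invented for illustration and are not the 54-type parameters of the paper, and the iteration assumes a matrix with strictly positive entries.

```python
def next_generation_matrix(pi, a, alpha, mu):
    """M[j][k] = pi[k] * alpha * a[j][k] / mu, as defined in the text."""
    m = len(pi)
    return [[pi[k] * alpha * a[j][k] / mu for k in range(m)] for j in range(m)]


def largest_eigenvalue(matrix, iterations=500):
    """Power iteration; assumes strictly positive entries so that it converges."""
    m = len(matrix)
    v = [1.0] * m
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[j][k] * v[k] for k in range(m)) for j in range(m)]
        eigenvalue = max(w)               # v is scaled so that max(v) == 1
        v = [x / eigenvalue for x in w]
    return eigenvalue


# Hypothetical two-type example (not the paper's parameters).
pi = [0.5, 0.5]
a = [[2.4, 0.8], [1.2, 1.6]]
r0_unit = largest_eigenvalue(next_generation_matrix(pi, a, alpha=1.0, mu=1.0))
print(round(r0_unit, 4))  # 1.5292

# Without measures, alpha is set so that R_0 has the desired value, e.g. 2.5.
alpha = 2.5 / r0_unit
print(round(largest_eigenvalue(next_generation_matrix(pi, a, alpha, mu=1.0)), 6))  # 2.5
```

Since every entry of $M$ is proportional to $\alpha$, the largest eigenvalue scales linearly in $\alpha$, which is why a single rescaling suffices to reach a target $R_0$.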

In the model under consideration, where the rates $\sigma$ and $\mu$ are independent of the type of a person, there exists a Malthusian parameter $\rho$ such that the number of infectious individuals grows initially as $e^{\rho t}$, where $\rho < 0$ if $R_0 < 1$ and $\rho > 0$ if $R_0 > 1$. In [14] it is shown that the relationship between $\rho$ and $R_0$ is (under the assumed conditions) the same for a multitype population as it is for a homogeneous population, where $\rho$ satisfies

$$1 = \int_0^\infty e^{-\rho t} \beta(t)\, dt,$$

with $\beta(t)$ being the expected rate of new infections caused by a person $t$ time units after he or she was infected (e.g. p. 12 in [3]). (Note that $R_0^{-1} \beta(t)$ gives the probability density function of the generation time of the epidemic.) In our model

$$\beta(t) = R_0 \frac{\sigma\mu}{\sigma - \mu} \left( e^{-\mu t} - e^{-\sigma t} \right).$$

The growth rate $\rho$ is then the unique solution in $(-\min(\mu, \sigma), \infty)$ of

$$1 = R_0 \frac{\sigma\mu}{(\sigma + \rho)(\mu + \rho)}.$$

The doubling time is given by $\ln 2/\rho$.

We set $S_j(t)$ to be the number of people of type $j$ that are susceptible to the disease at time $t$, $E_j(t)$ the number of people of type $j$ that are latently infected, $I_j(t)$ the number of infectious people of type $j$ and $R_j(t)$ the number of removed people of type $j$ ($j \in \{1, \dots, m\}$). Note that $S_j(t) + E_j(t) + I_j(t) + R_j(t) = n_j = \pi_j n$ for all $t \ge 0$, because the population is closed. Again for $j \in \{1, \dots, m\}$, we define $s_j(t) = S_j(t)/n_j$, $e_j(t) = E_j(t)/n_j$, $i_j(t) = I_j(t)/n_j$ and $r_j(t) = R_j(t)/n_j$.

Theory on Markov processes [6, Chapter 11] (see also [1, Section 5.5] for the single type counterpart) gives that for large $n$ the above model can be described well by the following system of differential equations (again for $j \in \{1, \dots, m\}$):

$$\frac{ds_j(t)}{dt} = -\alpha\, s_j(t) \sum_{k=1}^{m} a_{kj} \pi_k i_k(t),$$

$$\frac{de_j(t)}{dt} = \alpha\, s_j(t) \sum_{k=1}^{m} a_{kj} \pi_k i_k(t) - \sigma e_j(t),$$

$$\frac{di_j(t)}{dt} = \sigma e_j(t) - \mu i_j(t),$$

$$\frac{dr_j(t)}{dt} = \mu i_j(t).$$
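A deterministic multi-type SEIR system of this kind can be integrated numerically in a few lines. The sketch below is not the authors' implementation. It assumes the standard form in which a susceptible of type $j$ experiences force of infection $\alpha \sum_k a_{kj} \pi_k i_k(t)$, exposed individuals become infectious at rate `sigma` and infectious individuals are removed at rate `mu`. It uses a simple explicit Euler scheme, and the two-type parameter values are hypothetical.

```python
def seir_step(state, pi, a, alpha, sigma, mu, dt):
    """One explicit Euler step for the per-type fractions (s, e, i, r)."""
    s, e, i, r = state
    m = len(pi)
    new_s, new_e, new_i, new_r = [], [], [], []
    for j in range(m):
        # Force of infection acting on a susceptible person of type j.
        force = alpha * sum(a[k][j] * pi[k] * i[k] for k in range(m))
        infections = force * s[j]
        new_s.append(s[j] - dt * infections)
        new_e.append(e[j] + dt * (infections - sigma * e[j]))
        new_i.append(i[j] + dt * (sigma * e[j] - mu * i[j]))
        new_r.append(r[j] + dt * mu * i[j])
    return new_s, new_e, new_i, new_r


def simulate(pi, a, alpha, sigma=1 / 3, mu=1 / 4, eps=1e-4, days=300.0, dt=0.05):
    """Start every type with s = 1 - eps, e = eps, i = r = 0 and integrate."""
    m = len(pi)
    state = ([1.0 - eps] * m, [eps] * m, [0.0] * m, [0.0] * m)
    for _ in range(int(round(days / dt))):
        state = seir_step(state, pi, a, alpha, sigma, mu, dt)
    return state


# Hypothetical two-type population (not the paper's 54 types).
pi = [0.5, 0.5]
a = [[2.4, 0.8], [1.2, 1.6]]
s, e, i, r = simulate(pi, a, alpha=0.4)
for j in range(len(pi)):
    print(j, round(r[j], 3), round(s[j] + e[j] + i[j] + r[j], 9))
```

The second printed column checks that the four fractions of each type still sum to 1, as they must in a closed population. For production use an adaptive ODE solver would be preferable to fixed-step Euler.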

For completeness: when analysing the time-dependent behaviour of an epidemic in the main text, we use the initial conditions $s_j(0) = 1 - \epsilon$, $e_j(0) = \epsilon$ and $i_j(0) = r_j(0) = 0$ for all $j \in \{1, \dots, m\}$. In the analysis below we do not impose specific assumptions on the initial conditions. The epidemic will ultimately go extinct, because the population is closed, so for all $j \in \{1, \dots, m\}$ we have that $e_j(t) \to 0$ and $i_j(t) \to 0$ as $t \to \infty$. Thus $s_j(t) + r_j(t) \to 1$ as $t \to \infty$. Furthermore $s_j(t)$ is non-increasing, so $s_j(\infty) = \lim_{t \to \infty} s_j(t)$ exists.

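It can be shown in the spirit of [1, Equation (6.2)] that for $j \in \{1, \dots, m\}$,

$$\frac{s_j(\infty)}{s_j(0)} = \exp\left(-\alpha\mu^{-1}\sum_{k=1}^m a_{kj}\pi_k\left[r_k(\infty) - r_k(0)\right]\right), \tag{1}$$

where $r_k(\infty) = 1 - s_k(\infty)$ because $e_k(t) \to 0$ and $i_k(t) \to 0$ as $t \to \infty$.

To understand this identity we observe first that $s_j(\infty)/s_j(0)$ is the fraction of initially susceptible people of type $j$ who escape the epidemic, while the sum in the right-hand side can be written as

$$\sum_{k=1}^m n_k\left[r_k(\infty) - r_k(0)\right] \times \frac{\alpha a_{kj}}{n} \times \frac{1}{\mu}.$$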

In words the summands read as the number of people of type $k$ that were infectious at some moment during the epidemic, times the rate at which a type-$k$ person makes infectious contacts with someone of type $j$, times the expected time an infected person is infectious. In other words, the right-hand side is the cumulative force of infection during the entire epidemic acting on a person of type $j$. Standard theory on epidemics gives that minus the natural logarithm of the probability that a given initially susceptible person of type $j$ avoids infection is the cumulative force of infection acting on that person. Thus (1) gives that the fraction of initially susceptible people that are ultimately still susceptible is equal to the probability that a given initially susceptible person avoids infection. This argument is independent of the Markov SEIR structure of our model, and it is straightforward to generalize the results of the paper to epidemic models in which infected people have a general random infectivity profile, as long as the expected shape of the infectivity profile (i.e. the density of the generation time) does not depend on the type of the infectious person. In particular, note that the calculation of $p_{Min}$ described below holds for this more general model. If $R_0 > 1$ and the epidemic is initiated by few infectives in a large population then, conditional upon a large outbreak occurring, the final fractions of people of the different types that remain susceptible are given by the unique solution of (1), with $s_j(0) = 1$ and $r_j(0) = 0$ for all $j \in \{1, \dots, m\}$, that satisfies $s_j(\infty) < 1$ for all $j \in \{1, \dots, m\}$.
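The final-size relation described above can be solved numerically. The sketch below assumes the form $s_j(\infty) = \exp\bigl(-\alpha\mu^{-1}\sum_k a_{kj}\pi_k[1 - s_k(\infty)]\bigr)$, i.e. $s_j(0) = 1$ and $r_j(0) = 0$, with $a_{kj}$ the contact rate from infector type $k$ to susceptible type $j$. It uses plain fixed-point iteration started from $s = 0$; since the map is increasing in $s$, the iterates increase towards the smallest solution, which is the one corresponding to a large outbreak. The function name and the single-type example are hypothetical.

```python
import numpy as np


def final_susceptible(alpha, a, pi, mu, tol=1e-12, max_iter=200_000):
    """Smallest solution s of s_j = exp(-(alpha/mu) * sum_k a[k,j]*pi[k]*(1-s_k)).

    Assumes s_j(0) = 1 and r_j(0) = 0 for all types. Convergence becomes
    slow when the reproduction number is only just above one.
    """
    a = np.asarray(a, dtype=float)
    pi = np.asarray(pi, dtype=float)
    s = np.zeros(len(pi))  # start below the solution; iterates increase
    for _ in range(max_iter):
        cumulative_force = (alpha / mu) * (a.T @ (pi * (1.0 - s)))
        s_new = np.exp(-cumulative_force)
        if np.max(np.abs(s_new - s)) < tol:
            return s_new
        s = s_new
    raise RuntimeError("final-size iteration did not converge")


# Hypothetical single-type check: alpha / mu = 2 corresponds to R_0 = 2.
s_inf = final_susceptible(alpha=0.5, a=[[1.0]], pi=[1.0], mu=0.25)
print("fraction ultimately infected:", 1.0 - s_inf[0])
```

For a homogeneous population with $R_0 = 2$ this reproduces the classical final-size equation $1 - \tau = e^{-2\tau}$.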

For the age-structured population and contact intensities between different age groups we used [17] (just as was done in [2]). The age groups are 0-5, 6-12, 13-19, 20-39, 40-59 and 60+. The contact matrix $A^\dagger$, i.e. the matrix with elements $\{a^\dagger_{jk};\ j, k \in \{1, \dots, 6\}\}$, is deduced from Table 1 of [17]. Note that the numbers reported in Table 1 of [17] are the expected number of contacts from a person of type $j$ with people of type $k$: $c_{jk} = n_k a^\dagger_{jk}/n = \pi^\dagger_k a^\dagger_{jk}$. (We use $a^\dagger_{jk}$ and $\pi^\dagger$ instead of $a_{jk}$ and $\pi$ because $a$ and $\pi$ already denote the contact rates and population fractions of the types characterised by age cohort, activity level and susceptibility level, while $a^\dagger$ and $\pi^\dagger$ denote the contact rates and population fractions of the age cohorts only.) We then divide the elements of Table 1 by the corresponding $\pi^\dagger_k$ to obtain the matrix $A^\dagger$. (The values of $\pi^\dagger_j$, $j \in \{1, \dots, 6\}$, are obtained using Appendix

As explained in the main text, we can use this matrix to generate the 54 by 54 contact matrix for the model in which we take age, activity level and susceptibility level into account in the following way. For clarity we now denote the type of a person by a three-dimensional vector $(c, a, s)$, where the first entry stands for the age cohort, which takes a value in $\{1, \dots, 6\}$, the second entry stands for the social activity level, which takes values in $\{1/2, 1, 2\}$ depending on whether the level is low, medium or high, and the third entry stands for the level of susceptibility, which in our example also takes values in $\{1/2, 1, 2\}$ depending on whether the relative susceptibility is low, medium or high.

The expected number of type-$(c', a', s')$ people infected by a given infected person of type $(c, a, s)$ is then $C_{c,c'} \times a \times a' \times s'$, where $C_{c,c'} = \alpha A^\dagger_{c,c'} \pi_{c',a',s'}$ and $\pi_{c',a',s'}$ is the fraction of the population with type $(c', a', s')$.
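The construction of the 54 by 54 matrix from the 6 by 6 age matrix can be sketched as follows. The age contact matrix `a_dagger` and the age fractions `age_fracs` below are hypothetical placeholders (the actual values come from [17] and are not reproduced here); the levels $\{1/2, 1, 2\}$ and their probabilities 25%, 50% and 25%, independent of each other and of age, follow the model description. The helper names are illustrative.

```python
from itertools import product

import numpy as np

LEVELS = np.array([0.5, 1.0, 2.0])         # low, medium, high
LEVEL_PROBS = np.array([0.25, 0.5, 0.25])  # same for activity and susceptibility


def build_types(age_fracs):
    """Enumerate types (age cohort, activity index, susceptibility index)."""
    types = list(product(range(len(age_fracs)), range(3), range(3)))
    pi = np.array([age_fracs[c] * LEVEL_PROBS[act] * LEVEL_PROBS[sus]
                   for c, act, sus in types])
    return types, pi


def build_contact_matrix(a_dagger, types):
    """a[j, k] = A_dagger[c, c'] * activity(j) * activity(k) * susceptibility(k).

    Row j is the infector type, column k the type of the susceptible.
    """
    m = len(types)
    a = np.empty((m, m))
    for j, (c, act, _) in enumerate(types):
        for k, (c2, act2, sus2) in enumerate(types):
            a[j, k] = a_dagger[c, c2] * LEVELS[act] * LEVELS[act2] * LEVELS[sus2]
    return a


# Hypothetical placeholder inputs, NOT the values of Table 1 in [17].
a_dagger = np.ones((6, 6)) + np.eye(6)
age_fracs = np.full(6, 1.0 / 6.0)

types, pi = build_types(age_fracs)
a = build_contact_matrix(a_dagger, types)
print(a.shape, pi.sum())
```

Multiplying `a` column-wise by `pi` and by $\alpha\mu^{-1}$ then gives a next-generation matrix of the form used below to calibrate $\alpha_0$.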

Let $\bar R_0$ be the desired value of $R_0$ in the absence of preventive measures and $\alpha_0$ be the corresponding value of $\alpha$. Then $\alpha_0$ is such that the largest eigenvalue of the matrix $[\alpha_0\mu^{-1}a_{jk}\pi_k]$ (i.e. the matrix having element $\alpha_0\mu^{-1}a_{jk}\pi_k$ in its $j$-th row and $k$-th column) is $\bar R_0$. For $\alpha \in (\bar R_0^{-1}\alpha_0, \alpha_0]$, let $\tau(\alpha) = (\tau_1(\alpha), \dots, \tau_m(\alpha))$, where $\tau_j(\alpha) = 1 - s_j(\infty, \alpha)$ and $s_j(\infty, \alpha)$ ($j \in \{1, \dots, m\}$) is the solution of (1), when $s_j(0) = 1$ and $r_j(0) = 0$ for all $j \in \{1, \dots, m\}$, that corresponds to a large epidemic.

Let $\alpha_*$ be such that the largest eigenvalue of the matrix $M_*(\alpha_*) = [\alpha_0\mu^{-1}a_{jk}\pi_k(1 - \tau_k(\alpha_*))]$ is one and $h_D = \sum_{j=1}^m \pi_j\tau_j(\alpha_*)$. Note that $M_*(\alpha_*)$ is the next-generation matrix for an epidemic among the remaining susceptible population when the epidemic with $\alpha = \alpha_*$ has finished and all preventive measures are lifted (so $\alpha$ is then set to $\alpha_0$), whence $h_D$ is the disease-induced herd immunity level for the epidemic with $R_0 = \bar R_0$. (The notation $\alpha_*$ and $h_D$ is as in [2].)

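For fixed $\bar R_0 > 1$ and immunity level $\hat i \in (0, h_D)$, we obtain $p_{Min}$ as follows. Let $\hat\alpha \in (\bar R_0^{-1}\alpha_0, \alpha_*)$ be such that $\sum_{j=1}^m \pi_j\tau_j(\hat\alpha) = \hat i$, and let $\hat R_0$ be the largest eigenvalue of the matrix

$$M_*(\hat\alpha) = [\alpha_0\mu^{-1}a_{jk}\pi_k(1 - \tau_k(\hat\alpha))].$$

Then $\hat R_0$ is the basic reproduction number for an epidemic with no preventive measures (i.e. with $\alpha = \alpha_0$) among the susceptible population remaining when the epidemic with $\alpha = \hat\alpha$ has finished. It follows that $p_{Min}$, the minimum amount of preventive measures needed to prevent a large outbreak among this remaining susceptible population, is given by $1 - \hat R_0^{-1}$.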
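The procedure for $p_{Min}$ can be sketched numerically as follows, under several assumptions that are ours rather than the paper's: the final-size relation is taken in the form $s_j(\infty) = \exp\bigl(-\alpha\mu^{-1}\sum_k a_{kj}\pi_k[1 - s_k(\infty)]\bigr)$, the mitigated contact level is found by bisection on $\alpha$ so that the overall fraction infected equals the target immunity level, and the post-epidemic reproduction number is the largest eigenvalue of $[\alpha_0\mu^{-1}a_{jk}\pi_k(1-\tau_k)]$. All function names and the single-type example are hypothetical.

```python
import numpy as np


def perron_root(mat):
    """Largest (real) eigenvalue of a non-negative matrix."""
    return float(np.max(np.real(np.linalg.eigvals(mat))))


def attack_fractions(alpha, a, pi, mu, tol=1e-12, max_iter=200_000):
    """tau_j = 1 - s_j(inf) for a large outbreak with s_j(0)=1, r_j(0)=0."""
    s = np.zeros(len(pi))
    for _ in range(max_iter):
        s_new = np.exp(-(alpha / mu) * (a.T @ (pi * (1.0 - s))))
        if np.max(np.abs(s_new - s)) < tol:
            return 1.0 - s_new
        s = s_new
    raise RuntimeError("final-size iteration did not converge")


def p_min(r0_bar, i_hat, a, pi, mu):
    """Minimal preventive effect 1 - 1/R after an epidemic leaving immunity i_hat."""
    a = np.asarray(a, dtype=float)
    pi = np.asarray(pi, dtype=float)
    # Calibrate alpha_0 so the next-generation matrix has largest eigenvalue r0_bar.
    alpha0 = r0_bar / perron_root(a * pi / mu)

    def overall(alpha):
        return float(pi @ attack_fractions(alpha, a, pi, mu))

    lo, hi = alpha0 / r0_bar, alpha0  # reproduction number 1 up to r0_bar
    if overall(hi) < i_hat:
        raise ValueError("immunity level is not attainable")
    for _ in range(50):  # bisection: overall fraction infected grows with alpha
        mid = 0.5 * (lo + hi)
        if overall(mid) < i_hat:
            lo = mid
        else:
            hi = mid
    tau = attack_fractions(hi, a, pi, mu)
    r0_hat = perron_root((alpha0 / mu) * a * (pi * (1.0 - tau)))
    return max(0.0, 1.0 - 1.0 / r0_hat)


# Hypothetical homogeneous check: R_0 = 2.5 and 20% immune leave R = 2.0.
p_homogeneous = p_min(2.5, 0.2, a=[[1.0]], pi=[1.0], mu=0.25)
print("p_Min (homogeneous):", p_homogeneous)
```

In the homogeneous case the result reduces to $1 - 1/(R_0(1-\hat i))$; the interest of the multitype calculation is that disease-induced immunity is concentrated in the more active and more susceptible types, so the reproduction number falls faster than this formula suggests.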

The timing of preventive measures affects the overall fraction infected but not its composition

The special case where the deterministic SEIR epidemic model described above has a next-generation matrix which splits up into a product with one factor depending on the type of the infector and the other factor on the type of the susceptible is known as separable mixing [5]. In the separable mixing situation, two preventive measures $\{p_1(t);\ 0 \le t \le t_1\}$ and $\{p_2(t);\ 0 \le t \le t_2\}$ leading to the same overall fraction infected at times $t_1$ and $t_2$, respectively, also have the same fraction infected among all different types of individuals (i.e. the same composition of infected) at those times, as we now show. Suppose that $a_{jk} = f_j g_k$ ($j, k \in \{1, \dots, m\}$), where $f_j > 0$ and $g_j > 0$ for all $j \in \{1, \dots, m\}$, and that $\alpha$ is time-dependent and denoted by $\alpha(t) = \alpha_0 p(t)$, where $\alpha_0$ is the value of $\alpha$ in the absence of any preventive measures. Then the differential equation for $s_j(t)$ becomes

$$\dot s_j(t) = -\alpha(t)\, g_j\, s_j(t) \sum_{k=1}^m f_k \pi_k i_k(t). \tag{2}$$

Assume without loss of generality that $g_1 = 1$. Then solving (2) yields

$$\frac{s_j(t)}{s_j(0)} = \left(\frac{s_1(t)}{s_1(0)}\right)^{g_j}, \qquad j \in \{1, \dots, m\}. \tag{3}$$

Consider a fixed immunity level $\hat i > 0$ that is attainable and suppose it is achieved at time $t_0$. Then $1 - \hat i = \sum_{j=1}^m \pi_j s_j(t_0)$ so, using (3), $s_1(t_0)$ is given by the unique solution of

$$1 - \hat i = \sum_{j=1}^m \pi_j s_j(0)\left(\frac{s_1(t_0)}{s_1(0)}\right)^{g_j}, \tag{4}$$

and then $s_j(t_0)$ ($j \in \{2, \dots, m\}$) are uniquely determined by (3). Note that if the immunity level $\hat i$ is fixed then the solution of (4) is independent of the preventive measure $\{p(t);\ t \ge 0\}$ and hence so are the fractions infected among the different types when the immunity level $\hat i$ is reached.
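The composition argument can be illustrated numerically. Under separable mixing the ratio $s_j(t)/s_j(0)$ equals $(s_1(t)/s_1(0))^{g_j}$ when $g_1 = 1$, so for a given overall immunity level the single unknown $x = s_1(t_0)/s_1(0)$ solves $1 - \hat i = \sum_j \pi_j s_j(0) x^{g_j}$, whose right-hand side is increasing in $x$. The sketch below solves this by bisection; the function name, the two-type susceptibility vector and the type fractions are hypothetical.

```python
import numpy as np


def composition_at_immunity(i_hat, g, pi, s_init=None):
    """Susceptible fractions s_j(t0) when the overall immunity level is i_hat.

    g are the susceptibility factors (rescaled so that g[0] = 1), pi the
    type fractions and s_init the initial susceptible fractions (default 1).
    """
    g = np.asarray(g, dtype=float)
    g = g / g[0]
    pi = np.asarray(pi, dtype=float)
    s_init = np.ones(len(pi)) if s_init is None else np.asarray(s_init, dtype=float)
    target = 1.0 - i_hat
    if not 0.0 < target <= float(pi @ s_init):
        raise ValueError("immunity level is not attainable")

    lo, hi = 0.0, 1.0  # x = s_1(t0) / s_1(0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if float(pi @ (s_init * mid ** g)) < target:
            lo = mid
        else:
            hi = mid
    return s_init * hi ** g


# Hypothetical example: two equally common types, the second twice as susceptible.
g_example = [1.0, 2.0]
pi_example = np.array([0.5, 0.5])
s_t0 = composition_at_immunity(0.3, g_example, pi_example)
print("susceptible fractions at immunity 0.3:", s_t0)
```

Nothing in this calculation involves the time course $p(t)$ or the infectivity factors $f_j$, which is the point of the argument: only the susceptibility factors, the type fractions and the overall immunity level enter.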

The model considered in the present paper is not of the separable mixing form. More precisely, the variable social activity and the variable susceptibility enter the expression in the form of separable mixing, but the age structure does not. However, the effect of this rather small deviation from separable mixing is negligible, as Figure S3 shows. In Figure S3 we have plotted $p_{Min}$ as a function of the disease-induced immunity $\hat i$ (assuming $R_0 = 2.5$) for two very distinct types of preventive measures. The blue curve corresponds to a constant preventive measure $p(t) = p$ until the epidemic stops, where the value $p$ is determined by the final overall fraction infected $\hat i$. The red curve is instead obtained by having no restrictions until the level $\hat i$ is reached, at which time the epidemic is stopped (we can think of a complete lockdown). We hence have two very different preventive measures, one having constant restrictions from the start and the other having no preventive measures until a sudden stop. Nevertheless we see that the two curves in Figure S3 are indistinguishable.

The consequence is that the minimal preventive measures required to prevent future outbreaks, $p_{Min}$, after a mitigated epidemic outbreak leading to an overall immunity level $\hat i$ are for all intents and purposes independent of how the preventive measures have varied over time.

We now argue why adding variable infectivity to the other heterogeneities has no effect on the presented results as long as this variable infectivity is independent of age, social activity and susceptibility. In fact, the rate of infection from individuals of type $(c, a, s)$ acting on susceptibles of type $(c', a', s')$ depends only on the mean infectivity of infectives of type $(c, a, s)$. Further, when considering a deterministic model, corresponding to an infinite population, this mean is deterministic, implying that only the mean infectivity enters the equations. The same conclusion holds in a corresponding stochastic model in the limit as the total population size $n \to \infty$, which can be shown either by appealing to the law of large numbers for density dependent population processes (see e.g. [6, Chapter 11, Theorem 11.2.2] or [3, Theorem 2.2.7] for a version with time-dependent transition rates) or in a more general setting by using the Sellke construction of the epidemic (see e.g. [1, Section 6.1]). Note that the model with separable mixing considered above includes the case of variable infectivity, which enters via $f_j$ ($j \in \{1, \dots, m\}$). However, for given immunity level $\hat i$, the susceptible fractions $s_j(t_0)$ of the different types ($j \in \{1, \dots, m\}$), and hence the fractions infected, when an overall fraction $\hat i$ of the population is infected do not depend on $f_j$ ($j \in \{1, \dots, m\}$) and hence are independent of any variability in infectivity.

The individual heterogeneities of social activity and susceptibility were modelled identically and assumed to be independent of each other and of the age cohort. It was assumed that 50% have the medium (normal) value, 25% have half that value and 25% have double that value. This choice is of course arbitrary and made mainly in order not to have too many different types of individuals (even so there are 54 types).

Besides the specific form, a relevant question is whether the model exaggerates individual heterogeneity. We think this is not the case. For one thing, the distributions of social activity and susceptibility in the model have no heavy right tails, nor do they have values close to 0, and such tails typically have major implications (cf. epidemics on networks, where heavy-tailed degree distributions alter $R_0$ dramatically [11]).

The amount of variation is sometimes quantified by the coefficient of variation c.v., defined for a positive random variable $X$ by c.v. $= \sqrt{Var(X)}/E(X)$. Our model, having values 0.5, 1 and 2 with respective probabilities 0.25, 0.5 and 0.25, hence has c.v. = 0.48 (note that the c.v. is scale-invariant: multiplying the values 0.5, 1 and 2 by any positive constant $k$ results in the same c.v.). A c.v. of the order 0.5 by no means reflects an unusually high individual variation. In fact, Gomes et al. [9], and references therein, report c.v. values between 2 and 4 (for a combination of variable susceptibility and social activity) for different diseases including COVID-19.
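The stated value of the coefficient of variation can be checked directly from the three-point distribution. The short sketch below computes $\sqrt{Var(X)}/E(X)$ for a discrete distribution and confirms the scale invariance; the function name is illustrative.

```python
import math


def coefficient_of_variation(values, probs):
    """sqrt(Var(X)) / E(X) for a discrete positive random variable X."""
    mean = sum(v * p for v, p in zip(values, probs))
    second_moment = sum(v * v * p for v, p in zip(values, probs))
    return math.sqrt(second_moment - mean ** 2) / mean


VALUES = (0.5, 1.0, 2.0)
PROBS = (0.25, 0.5, 0.25)

cv = coefficient_of_variation(VALUES, PROBS)
cv_scaled = coefficient_of_variation([3.0 * v for v in VALUES], PROBS)
print(f"c.v. = {cv:.2f}, after rescaling by 3: {cv_scaled:.2f}")
```

Here $E(X) = 1.125$ and $Var(X) = 0.296875$, which gives a c.v. of about 0.48 as stated.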

Description of $R_0$ and $\hat i$ values in Tables 1 and S1

As explained in the main text, except for Sweden, the country-specific $R_0$ estimates of the European countries given in Table 1 are taken from [8]. The $R_0$ estimate for Sweden (3.9) is taken instead from the corresponding preprint [7]. The reason for making this change is that in the preprint Sweden clearly had the highest $R_0$ among the Scandinavian countries, whereas in the published version ([8]) Sweden's $R_0$ estimate had dropped dramatically from 3.9 to 2.7 while Norway's and Denmark's estimates were almost unchanged (in fact both increased by 0.2). The reason for this change comes from problems with fitting the effects of Sweden's unusual preventive measures later in the epidemic and is an artefact of jointly estimating $R_t$ later in the epidemic (personal communication with Neil Ferguson).

The estimates of $R_0$ for New York (the state) and Washington D.C. were taken from [15]. As for the Spanish and Italian subregions, we have rescaled the country estimates from [8] by the relation between the region-specific estimates of Madrid and Cataluna, and between Lombardia and Lazio. For Spain, Madrid was estimated to have about a 5% higher $R_0$ than Cataluna [4], so Madrid was scaled up by 2.5% and Cataluna scaled down by 2.5%. For Italy the two region-specific estimates differed by only 1%, so these estimates were left unchanged.

The numbers of case fatalities per 100 000 individuals reported in Table 1 were all downloaded from official websites. The figures for the regions in Spain, Italy, the US and Sweden were downloaded from Wikipedia on October 5, 2020. The number of case fatalities per 100k for Denmark is from the Danish Infectious Disease Institute [13] as of September 14, 2020, and for Oslo it comes from [10] with fatalities up to September 23, 2020. By assuming an infection fatality rate (ifr) of 0.5% in all regions considered (slightly smaller than estimated for COVID-19 in China [16]), and assuming that all infected individuals become immune but assuming no prior immunity, this gives the region-specific immunity estimates $\hat i$ of Table 1. We again stress that these are just illustrations since several assumptions are not met. For example, the ifr is most likely higher in New York (and possibly also in Lombardy) as compared to the other regions. Moreover, the true ifr may very well be as large as 1%, which would reduce all $\hat i$ by 50%, though this would not change their relative sizes.
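The conversion from reported fatalities to an immunity estimate amounts to dividing the cumulative fatality fraction by the assumed infection fatality rate. A minimal sketch follows; the function name and the fatality figure of 50 per 100 000 are hypothetical and are not taken from Table 1.

```python
def immunity_estimate(deaths_per_100k, ifr=0.005):
    """Estimated fraction infected (and assumed immune): deaths / ifr.

    Assumes every infected person becomes immune and no prior immunity;
    ifr is the infection fatality rate as a fraction (0.005 means 0.5%).
    """
    return deaths_per_100k / 100_000 / ifr


# Hypothetical region with 50 reported fatalities per 100 000 inhabitants.
i_hat_low_ifr = immunity_estimate(50.0, ifr=0.005)
i_hat_high_ifr = immunity_estimate(50.0, ifr=0.01)
print(i_hat_low_ifr, i_hat_high_ifr)
```

Doubling the assumed ifr from 0.5% to 1% halves every estimate, so the ranking of regions by $\hat i$ is unaffected.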

Figure S1: Plot of the minimal amount of preventive measures $p_{Min}$ for the model in [2] allowing for heterogeneity with respect to age and social activity but not susceptibility.

The left plot is when immunity is disease-induced and the right when immunity is vaccine-induced. The white region is where herd immunity is achieved, meaning that an epidemic will not grow and hence there is no doubling time.

Figure S3: Plot of $p_{Min}$ as a function of $\hat i$ assuming $R_0 = 2.5$. The blue curve is for the situation where preventive measures are kept constant at a level such that the outbreak ceases when a fraction exactly $\hat i$ of the population has been infected, and the red curve is obtained when there are no preventive measures until the level $\hat i$ is reached, at which point the epidemic is stopped. The two curves are indistinguishable.

Table S1: Estimates of $R_0$ and current disease-induced immunity levels $\hat i$ for different regions, and the corresponding doubling times $t_D$ (in days) during the exponential growth in case all restrictions were lifted at a time-point when transmission is very low. For comparison, the initial doubling times if no restrictions were put in place initially, t


Stochastic epidemic models and their statistical analysis

A mathematical model reveals the influence of population heterogeneity on herd immunity to SARS-CoV-2

Stochastic epidemics in a homogeneous community

Mathematical tools for understanding infectious disease dynamics

Markov processes: characterization and convergence

Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in 11 European countries. Imperial College COVID-19 Response Team

Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe

Individual variation in susceptibility or exposure to SARS-CoV-2 lowers the herd immunity threshold

(The Norwegian Public Health Institute)

Epidemic spreading in scale-free networks

Epidemiological characteristics of COVID-19 cases in Italy and estimates of the reproductive numbers one month into the epidemic

Statens Serum Institut (The Danish Infectious Disease Institute)

Inferring $R_0$ in emerging epidemics: the effect of common population structure is small

State-level tracking of COVID-19 in the United States

Estimates of the severity of coronavirus disease 2019: a model-based analysis

Using data on social contacts to estimate age-specific transmission parameters for respiratory-spread infectious agents