key: cord-0746559-zajfrkep authors: Lefèvre, Claude; Picard, Philippe; Simon, Matthieu; Utev, Sergey title: A chain binomial epidemic with asymptomatics motivated by COVID-19 modelling date: 2021-11-01 journal: J Math Biol DOI: 10.1007/s00285-021-01680-5 sha: e14a017031c1302c0559faa36ff1eb022ea8232c doc_id: 746559 cord_uid: zajfrkep Motivated by modelling epidemics like COVID-19, this paper proposes a generalized chain binomial process which integrates two types of infectives, those with symptoms and those without. Testing of infectives and vaccination of susceptibles are then incorporated as preventive protective measures. Our interest relates to the distribution of the state of the population at the end of infection and to the reproduction number [Formula: see text] with the associated extinction condition. The method uses the construction of a family of martingales and a branching approximation for large populations, respectively. A more general branching process for epidemics is also constructed and studied. Finally, some results obtained are illustrated by numerical examples. For many infectious diseases, an unknown fraction of the infected persons has no symptoms but is able to transmit the infection. These cases, called asymptomatic, can have a significant impact on the spread of the disease. The importance of asymptomatic carriers has been studied for various epidemics, including salmonella, Ebola and many other bacterial or viral infections (see e.g. Potasman 2017; Chisholm et al. 2018; Aguilar and Gutierrez 2020) . The respiratory disease COVID-19 is caused by the novel coronavirus SARS-CoV-2. For a person with symptoms, the average incubation period appears to be approximately 5 days. Once the disease is declared, such a symptomatic infective is generally treated at home or in the hospital: it is thus removed from the infection process. A major problem is caused by the number of asymptomatic carriers. Various samples were taken in early 2020 to get an idea of the prevalence of asymptomatic infectives. A large-scale testing in Iceland showed that about 50% of the people who tested positive for COVID-19 were asymptomatic. In an Italian village at the epicentre of the epidemic, the estimated proportion of asymptomatics was around 50 to 75%. A survey in Belgium showed that more than 70% of positive tests in retirement homes were asymptomatic, for both residents and nurses. It is now recognized that the number of asymptomatic cases depends on many factors such as age and genetic factors, and that it may involve 20% and more of infected cases. Furthermore, preliminary analyzes suggest that antibodies for an asymptomatic infectious person provide natural immunity after about 10 days. On the basis of these observations, we propose to build a stochastic epidemic model of SIR type (Susceptible-Infected-Removed) with two classes of infectives, symptomatic and asymptomatic. The corresponding periods of infection can be considered to be approximately constant, of length respectively 5 or 10 days, so that we formulate the model on a discrete time scale. In this context, the traditional SIR model is the chain binomial process, called the Reed-Frost model, which we will modify in order to incorporate asymptomatic carriers as an additional vector for the spread of the disease. There is an abundant mathematical literature on epidemic models and their applications. Much material, pioneering and more recent, can be found in the books of Diekmann and Heesterbeek (2000) and Brauer et al. (2019) for a deterministic approach, and in the books by Daley and Gani (1999) , Andersson and Britton (2000) and Britton and Pardoux (2019) for a stochastic approach. SIR epidemics relate to infectious diseases which lead to a state of final elimination, by death or immunization for example. Their stochastic versions aroused much interest and is now a standard subject of applied probability. For recent works, we refer e.g. to Lefèvre and Picard (2015) , Ball (2019) and Simon (2020) . Models incorporating several types of infectives like here have been studied by different authors. Let us mention e.g. Picard and Lefèvre (1990) , Scalia-Tomba (1990) , Ball and Clancy (1995) , Ball and Britton (2007) , Leung et al. (2018) and Lefèvre and Simon (2020) . The modelling of COVID-19 is currently the subject of much research. A large part concerns the deterministic approach but there are some papers on the study or simulation of stochastic models; see e.g. Britton et al. (2020) and Gouriéroux and Lu (2020) . An originality of the present epidemic model is that it describes the spread of an infectious disease such as COVID-19 quite realistically using a stochastic SIR process of Markovian structure. Our main purpose is to determine the exact distribution of the state of the population when the infection will die out. Approximating branching processes are also discussed and provide the reproduction number R 0 which is a classic tool for outbreak monitoring by health authorities. Note that other questions are of important practical interest. In particular, the evolution of the infection over time would make it possible to measure the pressure exerted on the hospital system. We hope to study this topic in a future work. The paper is organized as follows. In Sect. 2, we introduce a generalized chain binomial epidemic process that explicitly takes into account the existence of asymptomatic infectives. In Sect. 3, the total population is assumed of finite size so that the extinction of the epidemic is certain. We build a family of martingales for the process and determine the distribution of the final state of the population. In Sect. 4, the initial susceptible population is assumed infinite and the number of contacts per infective constant on average. We approximate the process by a branching model and deduce from it the corresponding reproduction number and the associated condition of almost sure extinction. In Sect. 5, possible testing of infected cases is added as a preventive protection against the spread of infection. We adapt the previous reasoning by martingales and obtain the distribution of the new final state of the population as well as the reproduction number under testing. In Sect. 6, vaccination of all susceptible individuals is considered at the start of the epidemic in order to seriously reduce future contaminations. The vaccine efficacy is believed to depend on specific characteristics of the susceptible, so that the initial susceptible population is now subdivided into several distinct homogeneous groups. We generalize the methods used before to obtain the distribution of the numbers of susceptibles escaping infection in the different groups, and to determine the reproduction number under vaccination. In Sect. 7, the population is considered large enough to describe the epidemic by a general branching process with two types of individuals. We give the condition of extinction of the model and we derive, when this condition is satisfied, the probability generating function of the total numbers of infectives, symptomatic or not. In Sect. 8, we numerically illustrate some of the results provided in the paper. The Reed-Frost chain binomial model is the most known SIR stochastic epidemic process in discrete time. It is concerned with a closed and homogeneously mixing population. Initially there are, say, n ≥ 1 susceptibles, m ≥ 1 infectives and 0 removed case. Each infective has a constant latent period and an infectious period that is reduced to a single point in time. At this point, it infects any susceptible present with the prob-ability p, all possible infections being independent. After that, the infective recovers or is detected and isolated, thus being removed from the infection process. Specifically, consider the time scale t ∈ N 0 . We denote by S t and I t the numbers of susceptibles and infectives present at time t. Let R t be the number of removed cases up to time t. The population being closed, we have S t + I t + R t = n + m for all t. The spread of the disease evolves according to the following bivariate Markov chain. Write Bin(k; θ) for a binomial distribution that counts successes in a sequence of k independent experiments where each gives success or failure with the probabilities θ and 1 − θ , respectively. Given (S t , I t ), the population state at time t + 1 is defined by (2.1) Obviously, since the population size is always n + m. The epidemic terminates at time T ≥ 1 when there are no more infectives in the population (i.e. with I T = 0). A key measure of the total extent of the epidemic is given by the ultimate number of susceptibles S T . The study of its distributions has received considerable attention. We refer e.g. to Lefèvre and Picard (1990) for exact results and Barbour and Utev (2004) for approximation results. The Reed-Frost model is also used as a template to build many more elaborate variants (see e.g. Martin-Löf 1986; Picard and Lefèvre 1990; Neal 2006; Ball 2019) . Incorporating asymptomatics For certain diseases, asymptomatic carriers play an important role in the spread of infection. As mentioned in the Introduction, this is the case for the current COVID-19 epidemic. On the one hand, some of those infected show clear, often severe, symptoms usually after about 1 week. Due to the control measures, these people are then treated at home or in the hospital, so that they are considered withdrawn from the population. During this week, however, they are able to transmit the infection to susceptibles. On the other hand, another part of the infected people shows no symptoms, or very mild. They acquire natural immunity after about 2 weeks, which then removes them from the infection process. Before that, they can also infect susceptibles, usually to a lesser extent due to a lower viral load. To simplify the model a little, we suppose that a symptomatic infective contaminates any susceptible present at 1 point of time with the probability p = 1 − q, whereas an asymptomatic infective can do it at 2 points of time with the same probability p a = 1 − q a . In addition, a susceptible infected becomes either symptomatic with the probability π or asymptomatic with the probability 1 − π . Take a week as the time unit, and therefore the time scale t ∈ N 0 . We denote by S t the number of susceptibles at time t, I t the number of infectives with symptoms at time t, J (1) t and J (2) t the numbers of asymptomatics at time t who begin their first and second week of infection and with sum J t = J t , R t the number of removed symptomatics (i.e. declared and treated cases) up to time t and V t the total number of removed asymptomatics (i.e. immunized persons) up to time t. Write Mult(k; θ 1 , . . . , θ j ) for a multinomial distribution resulting from a sequence of k independent experiments where each gives j possible outcomes with the respective probabilities θ 1 , . . . , θ j of sum 1. By referring to (2.1), we assume this time that given t , V t ), the population state at time t + 1 is defined by (2.2) Note that given S t − S t+1 , we thus have where n is often very large and m + m a ≥ 1 to start an epidemic. By definition, In the following, we focus on the end of the epidemic. This will happen at time T as soon as the population no longer contains any infectives (i.e. when I T = J T = 0). At this time, the state of the population is given by (S T , V T ), the final numbers of suceptible and immunized persons. Our objective in Sect. 3 is to determine the distribution of the vector (S T , V T ), and therefore also that of the number R T of removed symptomatics by (2.4) To derive the distribution of the population state at the end of the epidemic, we will apply a martingale method that has proven to be efficient and robust in epidemic theory since the pioneering paper by Lefèvre and Picard (1990) . Consider here the process constitutes a martingale provided that ν k (z) is defined by Proof Choose z arbitrarily in R, and let x, y, h 1 , h 2 be four real numbers to be fixed later. We will condition on F t , the history of the epidemic until time t. From the definition of J (2) t+1 and V t+1 in (2.2), we have In addition, the expectation in the right-hand side of (3.3) can be expanded as Using (2.3), the expectation in the right-hand side of (3.4) is given by so that after insertion in (3.4), (3.5) Thanks to (3.5), formula (3.3) becomes Let us differentiate (3.6) k times, 0 ≤ k ≤ n, with respect to x. This yields after a little rearrangement. Inside (3.7), we now take y = q k , h 2 = zq k a and h 1 = h 2 q k a = zq 2k a , which gives (3.8) To get a martingale from (3.8), it then suffices to take i.e. x = ν k (z) defined by (3.2). This leads to the announced martingale (3.1). From Lemma 3.1, we deduce a system of n + 1 relations which allows to obtain the distribution of (S T , V T ). Remember that R T then follows from (2.4). ∈ R, E( S T k [ν k (z)] S T z V T ) = n k [ν k (z)] n q km (zq 2k a ) m a , 0 ≤ k ≤ n,(3. and these n + 1 relations provide us with the distribution of (S T , V T ). Proof Let us consider the martingale (3.1) at the stopping time T . By the optional stopping theorem, we directly deduce the system of n + 1 identities (3.9). To check that (3.9) gives the distribution of (S T , V T ), we introduce in the left-hand side of (3.9) the factor n s=0 1 S T =s , which is equal to 1 of course. This leads to The relations (3.10) form a triangular system of n + 1 linear equations in the unknown Thus, the w s (z) can be calculated recursively for s = n, . . . , 0 by taking successively k = n, . . . , 0. For example, denoting f (k, z) the right-hand side of (3.10), we get from k = n, n − 1 the two equations Now, by definition, w s (z) is a polynomial in z of degree m a + n − s. Let us differentiate v times w s (z) with respect to z, m a ≤ v ≤ m a + n − s, and then take z = 0. Obviously, this gives the quantities Applying the same operation to (3.10), we then obtain the joint probability mass function P(S T = s, The distribution of the single variable S T follows by choosing z = 1 in (3.9). This gives the n + 1 relations (3.11) below with (3.12). Note that a system of this form has been obtained previously for a variety of SIR epidemic models (see Lefèvre and Picard 2015; Ball 2019 and the references therein). where ν k is defined by This is a triangular system of n +1 linear equations in the n +1 unknown probabilities It is well known that triangular linear systems like (3.11) can lead to numerical problems inherent in the solution. Specific algorithms and methods, however, can help solve the system appropriately (see e.g. Demiris and O'Neill 2006) . The distributions of S T and V T are, of course, closely related. This is particularly clear by taking k = 0 in (3.9), which gives the identity (3.13). By differentiation with respect to z with z = 1, we then obtain the simple relation (3.14) for the means. which implies for the means (3.14) Remark 1 Let us underline that as long as T is concerned, the model (2.2) is equivalent to supposing that each asymptomatic becomes immune after 1 week (instead of 2 weeks) and during this period, it does not contaminate any susceptible present with probability ρ = q 2 a . For this modified model, J (2) t = 0 and J t = J (1) t , t ∈ N 0 , and given (S t , I t , J t , V t ), the population state at time t + 1 is defined by (3.15) starting with (S 0 , I 0 , J 0 , V 0 ) = (n, m, m a , 0). Indeed, by applying to (3.15) an analogous martingale reasoning, we can obtain exactly the same n + 1 relations (3.11). This is also in accordance with intuition. The final state distribution of the model (3.15) could also be derived recursively using more direct relations. Let U t = I t + J t and consider the expectations where a(s) is some real function such as a monomial x s , x ∈ [0, 1], or an indicator ( 3.16) with the initial conditions So, the recursive scheme (3.16), (3.17) allows us to calculate all the expectations f (s, u). Returning to (3.15), we then obtain This method is simple in principle. Its main drawback, however, is that the recursions involved here take a lot of computation time. We will discuss possible improvements and compare the two approaches above in an upcoming research. When the initial number n of susceptibles is very high, it will remain almost constant during the early stages of the epidemic. The spread of infection is then expected to be reasonably estimated through a branching process. Such an approximation has been studied for a variety of SIR epidemic models (see e.g. Ball and Donnelly 1995; Lefèvre and Utev 1999; Ball and Neal 2010) . This leads to the well-established concept of reproduction number R 0 which, in general terms, measures the expected number of secondary cases by each primary case. In fact, the constraint R 0 ≤ 1 is known to be the condition for the almost sure extinction of the epidemic. Note however that R 0 is observed to be very variable in practice, especially for COVID-19 (see the discussions of Elliott and Gouriéroux 2020; Donnat and Holmes 2021). For the model (2.2), we assume that Notice that for large n, This means that the number of susceptibles contaminated per week by any given infective is roughly constant on average. In what follows, we choose to argue by examining the course of time week after week. This has the advantage of mimicking the actual spread of infection in the epidemic model (2.2). An alternative approach would be to consider all the infections caused by an individual when it becomes infected (in accordance with Remark 1). Both methods are reasonable and, as shown below, they lead to the same extinction condition. At the early stages t of the epidemic, S t = S 0 − O(1) because n is very large. Thus, by applying a coupling argument (as in Lefèvre and Utev 2010) , we can derive the following approximation: given (S t , J t ), where {Po(a t )} and {Po(b t )} are two independent Poisson processes of means a t and b t . So, under (4.1), (4.2), we may write that for any finite horizon H , where {X t } is a branching process of dynamics Thanks to this approximation, we will be able to highlight a threshold behavior for the epidemic. The result could be derived from the theory of multi-type branching processes (as used later in Sect. 7). For clarity, we prefer to give a different proof below which is self-contained and less standard. Proposition 4.1 Under the branching hypotheses (4.1), (4.2), the reproduction number R 0 is given by The infection process almost surely dies out iff R 0 ≤ 1, which is equivalent to denote the dot product in the 3-dimensional space R 3 . From the Markovian property, we get conditionally to X t = y that for all a ∈ R 3 , By the tower lemma, substituting Define the matrix We then see that (4.7) can be rewritten as Now, suppose that the finite limit x = E(X ∞ ) exists. From (4.9), it must satisfy the harmonic type identity where M τ denotes the transposed matrix to M. Hence, the limit x satisfies the matricial equation (4.10) Consider 0 < π < 1, the two limit cases π = 0 or 1 being easily discussed. Then, the matrix M τ is non-negative irreducible, and by the Perron-Frobenius theorem, M τ then has a positive eigenvalue R 0 that is simple and greater in absolute value than any other eigenvalue. As a consequence, x = 0 is the only solution to (4.10) when R 0 < 1. Moreover, a suitable coupling argument can show that extinction will also occur when R 0 = 1. Obviously, the eigenvalues z of M τ (or M) are identified by solving the characteristic equation det(M − z I ) = 0, i.e. from (4.8), (4.11) Apart from z = 0, the other non-negative solution to (4.11) is the dominant eigenvalue R 0 whose expression is provided by (4.5) as stated. In addition, a direct calculation shows that R 0 ≤ 1 is indeed equivalent to (4.6). Under this condition, we have E(X ∞ ) = 0, which implies that the infection gets extincted almost surely. Note that for the classic Reed-Frost model (π = 1), (4.5) reduces to R 0 = λ and (4.6) becomes λ ≤ 1, in accordance with standard results. The reproduction number R 0 in (4.5) is the (geometric) growth rate for the initial phase of the epidemic in a large population. It corresponds to the Perron-Frobenius eigenvalue of the next-generation matrix M defined in (4.8). As stated previously, it is obtained by following the time week after week. Indeed, the components M i, j of M give the expected numbers of infections of type j generated by an infective of type i during one week. Thus, within the matrix M, an asymptomatic in the first and second week of its infection are treated as two separate infectives. Another possibility is to adopt the model (3.15) and to define a generation matrixM by considering asymptomatic cases over their entire period of infection (two weeks), which leads toM (4.12) The Perron-Frobenius eigenvalue of (4.12) provides an alternative reproduction numberR 0 , called basic in usual approximation branching processes. This gives R 0 = πλ + 2(1 − π)β, which is precisely the left-hand side of (4.6). Thus, R 0 andR 0 provide, of course, the same extinction condition (4.6). Note that R 0 ≥R 0 in a subcritical epidemic, which means that R 0 then gives a more pessimistic indicator of the force of contagion. Conversely, R 0 ≤R 0 in a supercritical epidemic. Remark 4 A careful coupling argument as in Lefèvre and Utev (2010) allows to extend the branching approximation to the random time when the number of infectives will be o( √ n). The powerful approximation developed by Barbour and Utev (2004) is probably valid here too, but it seems much more technically difficult. One method of preventive protection against infectious diseases is to perform screening tests. Various studies have already proposed epidemic models with tests in which an infected person is isolated by quarantine if it is detected positive (see e.g. Castillo-Chavez et al. 2003; Berger et al. 2020) . For the current model, we assume that once infected, an individual can be tested and possibly detected. These tests are carried out on the basis of signs of infection, therefore for symptomatic cases but also for asymptomatic cases suspected of being infected. If the test is positive, the infective is isolated and removed from the process of contagion in the population. It should be noted that systematic screening of the population, even susceptibles, could also be considered. More precisely, any symptomatic at time t is tested independently with the probability b. The test confirms the presence of infection with the probability c. Thus, this infective escapes isolation through testing with the probability α 0 ≡ (1−b)+b(1−c). We denote by I t the number of symptomatics who escape isolation (and thus remain present) at time t. Let D I t be the number of symptomatics isolated at time t, and W I t the number of symptomatics isolated up to time t. On the other hand, any asymptomatic at time t who shows mild symptoms is also tested independently. For the first (second) infection point, the test is performed with the probability b 1 (b 2 ) and the result is positive with the probability c 1 (c 2 ). Thus, for the first week, an asymptomatic escapes isolation with the probability t ) the number of asymptomatics at time t who escape isolation when t is the first (second) infection point. Let D t . We now modify the epidemic model (2.2) accordingly. The state of the population is given by the vector (S t , I t , J with initial values (n, m, m a , 0, 0, 0, 0). Conditionally to the state at time t, the population state at time t + 1 is defined by with of course, At the end T of infection, the final state is the vector (S T , V T , W I T , W J T ). To get its distribution, we proceed as before by using a martingale reasoning. First, Lemma 3.1 is generalized as follows. Lemma 5.1 Let z, z 1 , z 2 ∈ R. For each integer k ∈ {0, . . . , n}, the process constitutes a martingale provided that ν k (z, z 1 , z 2 ) is defined by Proof Let us take z, z 1 , z 2 arbitrary in R, and let x, y, h 1 , h 2 be real numbers which will be chosen later. By conditioning on the history of the epidemic F t and then on S t+1 , we obtain from (5.1) that Using the definition of J (2) t+1 and D (2) t+1 , we rewrite the expectation in (5.3) as Given (F t , S t+1 −S t = s), we have from (5.1) that the vector (I t+1 , J t+1 ) has a multinomial distribution Mult(s; p 1 , p 2 , p 3 , p 4 ) where (5.6) Thus, the expectation in (5.5) is simply Inserting (5.5) and (5.7) in (5.4) then yields, after some calculations, We now differentiate (5.8) k times, 0 ≤ k ≤ n, with respect to x, which gives To get a martingale from (5.9), just choose y = q k , a and x such that hence x = p 1 y + p 2 h 1 + p 3 z 1 + p 4 z 2 . Remembering (5.6) and y, h 1 , h 2 above, this yields x = ν k (z, z 1 , z 2 ) defined by (5.3), and therefore the announced martingale (5.2). Now, we may apply the stopping time theorem to the martingale (5.2). As for Proposition 3.2, we then obtain a system of n + 1 simple relations which allow us to determine the desired distribution of (S T , V T , W I T , W J T ). Details are omitted here. In particular, by taking z = z 1 = z 2 = 1, we deduce the n + 1 relations (5.10), (5.11) below for the distribution of S T (in place of (3.11), (3.12)). whereν k is defined bỹ (5.11) These n + 1 linear equations provide the n + 1 probabilities P(S T = s), 0 ≤ s ≤ n. Moreover, arguing as for (3.14), we obtain simple expressions for the means of V T , W I T and W J T in terms of E(S T ): R 0 after testing For practice, it is important to know how such a test procedure affects the initial reproduction number. As before, suppose that the initial suceptible size n tends to ∞, and the parameters q and q a are of the form (4.1) and satisfy (4.2). In addition, we choose again to follow the spread of the epidemic week after week. Within this framework, the infection model in (5.1) can be approximated by the branching process the Poisson and binomial distributions involved being independent. Let α = (α 0 , α 1 , α 2 ) be the vector of testing parameters, and denote by R 0 (α) the reproduction number for the model (5.12). The following result is an easy generalization of Proposition 4.1. Under the branching hypotheses (4.1), (4.2), the reproduction number R 0 (α) is given by (5.13) The condition R 0 (α) ≤ 1 for almost sure extinction of the infection process becomes Proof It suffices to adapt the reasoning followed to obtain (4.5) and (4.6). The matrix M defined by (4.8) is now replaced by Its eigenvalues are the roots of the equation det(M − z I ) = 0, which is simplified as The threshold parameter R 0 (α) is the positive root of (5.15). We see that it is given by (5.13) and that the condition R 0 (α) ≤ 1 reduces to (5.14). As before, an approach in terms of the total number of infections generated by each infective is possible here as well. The next-generation matrixM becomes The basic reproduction numberR 0 for (5.16) is again given by the left-hand side of (5.14). The most effective measure of protection against diseases is certainly vaccination. In practice, however, a vaccine does not always exist and its efficacy depends on characteristics pertaining to susceptibles. Epidemic models with vaccination strategies have already been studied in different works (see e.g. Smith et al. 1984; Becker and Utev 2002; Sirl 2018 and the references therein). In this section, we assume that all susceptible people are vaccinated at the very beginning t = 0. The vaccine affects only susceptibility levels and its efficacy depending on characteristics proper to the susceptibles such as age, sex, genetics. In addition, it is of the leaky type, meaning that it does not affect an individual's ability to transmit the disease if they become infected. More general vaccination schemes are possible but in our context the current one is acceptable as a simplification. More precisely, we start with the general infection scheme (2.2) but with all suceptibles vaccinated as specified above. Suppose the vaccine has L possible efficacy levels on individual susceptibilities. These levels are taken fixed (non-random) and have values e l ∈ [0, 1], l = 1, . . . , L. The whole susceptible population is therefore subdivided into L groups according to these values. All possible infections are kept independent. This time, a symptomatic (asymptomatic) infective does not infect any susceptible present in a group l with the probability q l (q l,a ), for 1 ≤ l ≤ L. Thus, the novelty is that the probabilities of non-infection for a susceptible depend on the efficacy of the vaccine in the group to which it belongs. Typically, q l = q + (1 − q)e l , and q l,a = q a + (1 − q a )e l , (6.1) where q (q a ) is the probability for a susceptible of no-encounter with any given symptomatic (asymptomatic) case. Furthermore, a susceptible of group l who is infected becomes a symptomatic (asymptomatic) case with the probability π l (1 − π l ). Let us adopt the same notation as in (2.2) for the sizes of the different types of individuals, except that we add an index l to indicate their group of belonging. So, in group l, there are, at time t, S l,t susceptibles, I l,t infectives with symptoms and J (1) l,t (J As before, R t and V t are the numbers of removed symptomatics and asymptomatics up to time t. The state of the population is given by the vector {(S l,t , I l,t , J (1) l,t , V t , R t ), 1 ≤ l ≤ L}. Initially, (S 1,0 , . . . , S L,0 ) = (n 1 , . . . , n L ) (of sum n), and (I 0 , J , m a , 0, 0) . Given the state at time t, the population state at time t + 1 is defined by where the L conditional multinomial distributions are independent. By definition, S t + I t + J t + V t + R t = n + m + m a for all t. Similarly to (2.3), we see that given the vector (S 1,t − S 1,t+1 , . . . , S L,t − S L,t+1 ), We are interested in the final state of the population. For this, we will first derive a family of martingales for the process {(S 1,t , . . . , S L,t ), constitutes a martingale provided that ν l,k 1 ,...,k L (z) is defined by ν l,k 1 ,...,k L (z) = π l q k 1 1 . . . q k L L + (1 − π l )q 2k 1 1,a . . . q 2k L L,a z. (6.5) Proof We generalize the method followed to establish Lemma 3.1. Take z arbitrary in R and let x 1 , . . . , x L , y, h 1 , h 2 be real numbers not yet fixed. Given the history of the epidemic F t , we obtain from (6.2) that Considering all the possible values for (S 1,t+1 , . . . , S L,t+1 ), the expectation of the right-hand side of (6.6) is written as From (6.3), the expectation in the right-hand side of (6.7) becomes (6.8) Thus, inserting (6.7) with (6.8) in (6.6), we obtain the following formula, in place of (3.6), Now, differentiate (6.9) k l times with respect to x l , 0 ≤ k l ≤ n l , for l = 1, . . . , L. This leads to (6.10) Choose in (6.10) y = q k 1 1 . . . q k L L , h 2 = zq k 1 1,a . . . q k L L,a and h 1 = h 2 q k 1 1,a . . . q k L L,a = zq 2k 1 1,a . . . q 2k L L,a . We then deduce the announced martingale (6.4) provided which yields x l = ν l,k 1 ,...,k L (z) defined by (6.5). At the end time T , the population state reduces to the vector (S 1,T , . . . , S L,T , V T ), with R T resulting. To get its distribution, we apply the stopping time theorem to the martingale (6.4). Arguing as for Proposition 3.2, we obtain this time a system of (n 1 + 1) . . . (n L + 1) linear relations which allow us to determine recursively the final distribution. In particular, taking z = 1 gives the following relations (6.11), (6.12) for the distribution of (S 1,T , . . . , S L,T ). where ν l,k 1 ,...,k L is defined by These linear equations provide the (n 1 + 1) . . . (n L + 1) probabilities P(S 1,T = s 1 , . . . , S L,T = s L ), 0 ≤ s l ≤ n l , 1 ≤ l ≤ L. Moreover, the formula (3.14) for the mean of V T can be generalized as expected: The models with screening or vaccination generalize the initial model (2.2) by introducing several types of infectious or susceptible. They constitute particular multi-type versions of the Reed-Frost model (2.1). This class of models and their final outcome have already been studied by several authors, including recently Lefèvre and Picard (2016) and Ball (2019) . The current specific context allows a simpler reasoning to be used which provides complete results for the models concerned. R 0 after vaccination. Let us show how such a vaccination scheme affects the reproduction number. For this, we assume that all susceptible groups are initially very large, and that on average an infective contaminates roughly a constant number of susceptibles in each group. So, we write that for 1 ≤ l ≤ L, n l = l n, q l = e −λ l /n , q l,a = e −β l /n , (6.13) with n → ∞. The new parameter l above can be interpreted the initial proportion of susceptibles in the group l. As pointed out, for large n, n l (1 − q I t l q J t l,a ) = l (λ l I t + β l J t ) + O(1/n). (6.14) As before, we choose to follow the infections week after week. Then, the branching process which approximates (6.2) is given by (6.15) Arguing as in the case without vaccination, we easily extend Proposition 4.1 to the model (6.15) with vaccination. Let = ( 1 , . . . , L ) , and denote by R 0 ( ) be the corresponding reproduction number. Under the branching hypotheses (6.13), (6.14), the reproduction number R 0 ( ) is given by (6.16) The condition R 0 ( ) ≤ 1 for almost sure extinction of the infection process becomes L l=1 l [π l λ l + 2(1 − π l )β l ] ≤ 1. (6.17) Here too, it is allowed to work with the total number of infections generated by each infective. This leads to a different generation matrixM for which the basic reproduction numberR 0 is given by the left side of (6.17), as in Remarks 3 and 5. Particular case Suppose that initially, each susceptible is either vaccinated or not, and the vaccine provides full protection. In other words, there are L = 2 groups of susceptibles: those who receive a vaccine (group l = 1) in proportion 1 , and those who do not (group l = 2) in proportion 2 = 1 − 1 . Since the efficacy of the vaccine is perfect, we have e 1 = 1 in (6.1), so q 1 = q 1,a = 1 and thus λ 1 = β 1 = 0. Moreover, e 2 = 0 in (6.1), so q 2 = q and q 2,a = q a . From (6.16), the reproduction number is equal to and from (6.17), the infection almost surely goes away when 2 ≤ 1 π 2 λ 2 + 2(1 − π 2 )β 2 . (6.18) Following e.g. Becker and Utev (2002) , we can then consider the minimum vaccination coverage V .C which is necessary to prevent a major epidemic. From (6.18) and as 1 = 1 − 2 , we get V .C = max 0, 1 − 1 π 2 λ 2 + 2(1 − π 2 )β 2 . Let us return to a common situation where the infectious disease spreads moderately in a large susceptible population which remains practically constant. Motivated by the approximation of Sect. 4, we will substitute for the branching model (4.3), (4.4) a more general two-type branching process denoted by {(I t , J (1) t ), t ∈ N 0 }. In this extended model, each symptomatic i present at time t generates at time t + 1 a number C t,i of infectives, symptomatic or not in proportions π 0 and 1 − π 0 . All the variables C t,i are independent identically distributed (i.i.d.) with probability generating function (p.g.f.) φ 0 (z) = E(z C t,i ), z ∈ [0, 1]. In the same way, each new asymptomatic j 1 at time t generates at time t + 1 a number D (1) t, j 1 of infectives, symptomatic or not in proportions π 1 and 1 − π 1 . All the variables D Moreover, each asymptomatic j 2 beginning at t its second week generates a number D (2) t, j 2 of infectives, symptomatic or not in proportions π 2 and 1−π 2 . All the variables t, j 2 ), z ∈ [0, 1]. Thus, there are two novelties compared to the branching process of Sect. 4. On the one hand, the numbers of infections made per unit of time are here distributed arbitrarily (instead of Poisson law). On the other hand, the probabilities of becoming symptomatic or not now depend on the origin of the infection. For clarity, these probabilities satisfy below 0 < π 0 , π 1 , π 2 < 1, but the limit cases 0 or 1 can be treated similarly. As before, we denote by R t (V t ) the total number of symptomatic (asymptomatic) removals up to time t. The state of the epidemic at time t is then given by the vector = m a and J (2) 0 = m b (which can be non-zero). Conditionally to this vector, the state at time t + 1 is defined by where the three conditional multinomial distributions are independent. We start by deriving a martingale for the epidemic branching process. To avoid a possible explosion of infection, it should at least be assumed that each infective may not generate any infection, i.e. φ l (0) > 0, l = 0, 1, 2. Lemma 7.1 Let z, w ∈ (0, 1]. The process Proof Let us fix z, w ∈ (0, 1] and introduce y, g, h to be determined later in (0, 1]. From (7.1), we can write that given the history F t , We now need to evaluate the conditional expectation in the right side of (7.4). Recall that any symptomatic i at time t generates, independently of the others, C t,i infectives at time t + 1, each of them being symptomatic (asymptomatic) with the probability π 0 (1 − π 0 ). Thus, the I t infectives present at t contribute to this expectation for In the same way, the contribution of the J (k) t asymptomatics present at t is equal to Therefore, the desired expectation is given by (7.5) From (7.4) and (7.5), we then deduce that The relation (7.6) leads to the announced martingale (7.2) provided that (y, g, h) ∈ (0, 1] 3 and is solution to the Eq. (7.3). The question imposed by (7.3) is whether these equations admit (at least) such a solution (y, g, h) . We show below that there is a unique solution if w is taken small. Lemma 7.2 For w small enough, the system (7.3) has only one solution in (0, 1] 3 . Proof A natural approach is to use the Banach fixed point theorem, which states that in a complete metric space (E, d) , a contraction mapping F : E → E has a unique fixedpoint e * ∈ E, and from any e 0 ∈ E, the sequence e n = F(e n−1 ) converges to e * as n → ∞. Here, denote u = (y, g, h) so that the system (7.3) is of the form u = F(u), or u l = F l (u), l = 0, 1, 2. Under the usual topology, E = [0, 1] 3 is a complete space, and define the metric d(u, v) = max{|u l − v l |, l = 0, 1, 2}, u, v ∈ E. Since F maps E to E, the theorem is applicable provided that the mapping F is contractive, i.e. there exists k ∈ [0, 1) such that d [F(u), F(v) ] ≤ kd (u, v) . This can be established under the condition that z and w are small in (0, 1], using the finite-increments formulas with one variable for F 0 = wφ 0 and F 2 = zφ 2 , and with two variables for F 1 = zφ 1 φ 2 . Details are omitted here. Instead, we will follow a simple reasoning to get the result with the only condition that w is small in (0, 1]. First, we insert the equation for h into the equation for g and write (7.7) Thus, the system (7.3) for (y, g) becomes Note that φ 0 (0) > 0, φ 0 ≤ 1, ψ(0, 0) > 0, ψ ≤ 1 and φ 0 , ψ are infinitely differentiable with non-negative derivatives. Let us now examine (7.8) for any given w ∈ (0, 1]. Since the function φ 0 is continuous convex with 0 < φ 0 ≤ 1, the function w(y) ≡ w φ 0 [π 0 y + (1 − π 0 )g] − y is continuous convex with w(0) > 0, w(1) ≤ 0. Thus, (7.8) has a unique solution y ≡ y(g) ∈ (0, 1]. We also notice that the first two derivatives of y(g) are Since φ 0 , φ 0 ≥ 0, we see that y , y ≥ 0 when w is close enough to 0. (7.10) Let us then turn to (7.9). Substituting y = y(g), we obtain the equation for any z, w ∈ (0, 1]. The derivative of order 2 of ψ 0 (g) yields ψ 0 = z(ψ y y + ψ g ) = z[ψ yy (y ) 2 + ψ yg y + ψ y + ψ y + ψ gg + ψ gy y ], (7.11) in obvious notation. All the partial derivatives of ψ are non-negative. Using (7.10), we get from (7.11) that ψ 0 ≥ 0, i.e. ψ 0 is convex, if w is small. Since ψ 0 (0) > 0 and ψ 0 (1) ≤ 0, we deduce that the equation ψ 0 (g) = 0 has a unique root in (0, 1]. Finally, having (y, g) gives h by the third equation of (7.3). We focus again on the end time T of the infection process. The final state of the model (7.1) then reduces to (V T , R T ), the total numbers of asymptomatic and symptomatic cases. Let us show that for all z, w ∈ (0, 1], the system (7.3) admits a solution which is expressed simply in terms of (V T , R T ). Lemma 7.3 For z, w ∈ (0, 1], a solution to (7.3) is given by (7.12) Proof Denote by q j (z, w), j = 0, 1, 2, the three expectations in the right-hand side of (7.12). We want to show that these q j (z, w) satisfy the system (7.3). Consider the case where j = 0, for example. We define by E k,l the event that the initial present individual of type 1 generates k infectives of which l are symptomatic. From (7.1), we have Then, we can write that where we used the definition of q 0 (z, w), q 1 (z, w) and the factor w is for the initial symptomatic removed. By combining (7.13) and (7.14), we deduce that This provides the first equation of (7.3) since the variable C 1,1 is of p.g.f. φ 0 . If T is almost surely finite, we can apply the stopping time theorem to the martingale (7.2). This directly gives the p.g.f. (7.15) below for the vector (V T , R T ). Note that the condition T < ∞ a.s. is required for the martingale proof, but that the result remains valid without this restriction. Therefore, we have not indicated the condition in the statement. Proposition 7.4 For z, w ∈ (0, 1], (7.15) where [y(z, w), g(z, w), h(z, w)] is a solution to the system (7.3). Now, we want to determine the condition of extinction of the infectious process. Let us again choose to follow the spread of infection week after week. The types of infectives are labeled 0 for a symptomatic, 1 for a new asymptomatic and 2 for an asymptomatic in second week. Let M be the next-generation matrix whose elements M k,l represent the expected numbers of direct infectives of type l generated by an infective of type k. Writing μ l = φ l (1) for l = 0, 1, 2, we have (7.16) which is to be compared with the matrix (4.8). Proposition 7.5 T is almost surely finite iff R 0 ≤ 1 where R 0 is the Perron-Frobenius eigenvalue of M. Proof Since a new asymptomatic remains present for 2 units of time, there is a generational overlap in the branching process. Given the assumptions made, the matrix M in (7.16) is non-negative irreducible, and extinction will almost surely occur when its Perron-Frobenius eigenvalue R 0 , which is simple and strictly positive, does not exceed 1 (see e.g. Harris 1963, Theorem 7.1, and Haccou et al. 2005 , Section 2.4). This parameter R 0 is the reproduction number. It is obtained by solving the characteristic equation D(z) = 0 with D(z) = det(M − z I ). This leads to a cubic equation in z whose roots, real or complex, can be calculated by the Cardano method. The condition R 0 ≤ 1 implies that the product of the eigenvalues is of modulus ≤ 1, i.e. | det(M)| ≤ 1, and thus |π 2 − π 0 |μ 0 μ 2 ≤ 1. Moreover, from the graph of D(z), we can see that if π 0 μ 0 ≥ 1, the largest root of D(z) = 0 is > 1, therefore R 0 ≤ 1 also requires π 0 μ 0 < 1. Particular case When the probabilities π l are all equal (≡ π ), the spectral radius R 0 is known explicitly. Indeed, the equation D(z) = 0 reduces to which is analogous to (4.11) when μ 1 = μ 2 . Thus, the formula (4.5) is replaced by and the condition (4.6) by πμ 0 + (1 − π )(μ 1 + μ 2 ) ≤ 1. Note that for this case, we can show that the system (7.3) has a unique solution (y, g, h) ∈ (0, 1] 3 whatever z, w ∈ (0, 1]. Remark 8 As in Remark 3, the alternative reasoning in terms of the total numbers of infections per individual leads to the 2 by 2 generation matrix The associated characteristic equation is quadratic, so that the corresponding basic reproduction numberR 0 can be obtained here explicitly. Finally, we derive an expression for the means of (R T , V T ) when extinction is guaranteed. Let m = (m, m a , m b ), and define the vectors e 1 = (1, 0, 0) τ , e 3 = (0, 0, 1) τ . Corollary 7.6 When R 0 < 1, (7.17) Explicitly, this gives Proof Let us derive (7.17) for E(R T ), say. Differentiating the two sides of (7.15) with respect to w and taking z = w = 1, we get (1, 1), (7.18) since y(1, 1) = g(1, 1) = h(1, 1) = 1. For the partial derivatives involved, we now differentiate the two sides of (7.3) with respect to w and fix z = w = 1. So, we see that the vector x = [(∂ y/∂w), (∂ g/∂w), (∂h/∂w)](1, 1) satisfies the equation Since R 0 < 1, the matrix I − M is invertible, so that x = (I − M) −1 e 1 . Inserting this identity in (7.18), we then obtain the desired result. The explicit formulae given after (7.17) easily follow from the expression of (I − M) −1 . We start by considering the chain binomial model with asymptomatics introduced in Sect. 2. Figure 1 gives the distribution of the final number S T of susceptibles when initially there are n = 100 susceptibles, m = 2 symptomatics and m a = 2 asymptomatics, the probability of non-contamination by an infective is q = exp(−λ/n) with λ = 1.2 for a symptomatic and q a = exp(−β/n) with β = 0.5 for an asymptomatic, and the probability π of becoming symptomatic after infection is equal to either 0.6 or 0.8. The probabilities P(S T = s) are calculated using the formulas (3.11), (3.12). We observe that the distribution of S T is bimodal, which is a common feature in SIR epidemics. Moreover, the epidemic is more severe when the proportion π of symptomatic cases is greater (in fact, the mean of S T is lower and its variance is higher). This was expected as λ, the average number of infections per symptomatic, is greater than 2β, the average number of infections per asymptomatic. From the distribution of S T , we deduce the mean E(S T ) and we then calculate the means E(V T ) and E(R T ) using the formulas (3.14) and (2.4). Figure 2 plots these three means as a function of the probability π of becoming symptomatic. It is clear that as π increases, the proportion of new symptoms tends to increase. As these cases are more contagious, it implies that E(S T ) will decrease (left graph), so E(V T ) also while E(R T ) will increase (right graph). Figure 3 compares the reproduction number R 0 for the initial model (Sect. 4) with those obtained for the two extensions with testing of infectives (Sect. 5) and vaccination of susceptibles (Sect. 6). The parameters in (4.1) are q = exp(−λ/n) with λ = 1.2 and q a = exp(−β/n) with β equal to either 0.3 or 0.5. In the model with testing, the probabilities for an infective not to be detected by testing are (α 0 , α 1 , α 2 ) = (0.6, 0.9, 0.9). Note that the probability is naturally taken higher for an asymptomatic in the extended branching model of Sect. 7: in the left graph, as a function of the mean number of infections per asymptomatic μ (= μ 1 = μ 2 ) when μ 0 = 0.9, (π 0 , π 1 , π 2 ) = (0.7, 0.2, 0.2); in the right graph, as a function of π (= π 1 = π 2 ) when π 0 = 0.7, (μ 0 , μ 1 , μ 2 ) = (0.9, 0.4, 0.4) and (m, m a , m b ) = (10, 5, 0) than for a symptomatic. In the model with vaccination, there are L = 2 groups of susceptibles, those who receive the vaccine (group 1 in proportion 1 = 0.3), and those who do not (group 2 in proportion 2 = 0.7). The additional parameters are q l = exp(−λ l /n), l = 1, 2, with (λ 1 , λ 2 ) = (0.2, λ), q l,a = exp(−β l /n), l = 1, 2, with (β 1 , β 2 ) = (0, β), and π 1 = π 2 = π . Observe that λ 1 < λ 2 and β 1 = 0, which means that the vaccine decreases the contamination power of a symptomatic and eliminates that of an asymptomatic case. The three values of R 0 are calculated from (4.5), (5.13), (6.16), respectively, as a function of the probability 1 − π of becoming asymptomatic. Of course, testing or vaccination strategies have the effect of slowing down the spread of the epidemic and therefore of reducing the reproduction number, whatever π . When β = 0.3 (left graph), the three curves obtained for R 0 are decreasing with 1 − π , i.e. a higher probability of becoming asymptomatic weakens the epidemic. Note that 2β, the average number of infections per asymptomatic, is here small. When β = 0.5 (right graph), we see that for the model with testing, R 0 on the contrary increases with 1 − π . The reason is that the average number 2β of infections per asymptomatic is now higher, and asymptomatics are less likely to be tested and detected, making them contribute more to the diffusion of infection. For the model with vaccination, a large value of 1 − π results in a very low R 0 since a vaccinated asymptomatic is no longer contagious. We now move on to the extended branching process for epidemics of Sect. 7. Suppose that μ 1 = μ 2 ≡ μ, i.e. the mean numbers of infections per asymptomatic are equal for both types. Figure 4 shows the curve of R 0 as a function of μ when the mean number of infections per symptomatic is μ 0 = 1. These probabilities of becoming symptomatic are π 0 = 0.7 and π 1 = π 2 for three different values, which gives three curves. The corresponding values of R 0 are obtained as Perron-Frobenius eigenvalue of the matrix (7.16). In all cases, R 0 obviously increases with μ. Observe that the curves change order at their intersection point μ = 0.5. This is because when 2μ becomes greater than μ 0 , an asymptomatic has, on average, more infections than a symptomatic, so smaller values of π 1 = π 2 lead to more serious epidemics. Figure 5 gives the curves of the means E(V T ) and E(R T ) calculated from (7.17). In the left graph, these curves are represented as a function of μ ≡ μ 1 = μ 2 when μ 0 = 0.9, (π 0 , π 1 , π 2 ) = (0.7, 0.2, 0.2) and (m, m a , m b ) = (10, 5, 0). They of course increase with μ and go to ∞ when μ → 0.528 (i.e. R 0 → 1). Note that the speed of increase is higher for E(V T ). This is not surprising since the higher the value of μ, the higher the number of asymptomatics in the population. In the right graph, the curves are represented as a function of π ≡ π 1 = π 2 when π 0 = 0.7, (μ 0 , μ 1 , μ 2 ) = (0.9, 0.4, 0.4) and (m, m a , m b ) = (10, 5, 0). As we have already seen in Fig. 2 , increasing π implies that E(V T ) decreases and E(R T ) increases. An epidemiological model of malaria accounting for asymptomatic carriers Susceptibility sets and the final outcome of collective Reed-Frost epidemics An epidemic model with infector-dependent severity The final outcome of an epidemic model with several different types of infective in a large population Strong approximations for epidemic models Applications of branching processes to the final size of SIR epidemics Evaluation of vaccination strategies for SIR epidemics on random networks incorporating household structure Approximating the Reed-Frost epidemic process Protective vaccine efficacy when vaccine response is random An SEIR infectious disease model with testing and conditional quarantine Stochastic epidemic models with inference A mathematical model reveals the influence of population heterogeneity on herd immunity to SARS-CoV-2 Mathematical models of isolation and quarantine Implications of asymptomatic carriers for infectious disease transmission and control Computation of final outcome probabilities for the generalised stochastic epidemic Mathematical epidemiology of infectious diseases: model building, analysis and interpretation Modeling the heterogeneity in COVID-19's reproductive number and its impact on predictive scenarios Uncertainty on the reproduction ratio in the SIR model. Papers SIR model with stochastic transmission Branching processes: variation, growth, and extinction of populations A non-standard family of polynomials and the final size distribution of Reed-Frost epidemic processes Risk models in insurance and epidemics: a bridge through randomized polynomials Polynomials, random walks and risk processes: a multivariate framework SIR-type epidemic models as block-structured Markov processes Branching approximation for the collective epidemic model Dynamics in Reed-Frost epidemics Who is the infector? Epidemic models with symptomatic and asymptomatic cases Symmetric sampling procedures, general epidemic processes and their threshold limit theorems Multitype randomized Reed-Frost epidemics and epidemics upon random graphs A unified analysis of the final state and severity distribution in collective Reed-Frost epidemic processes Asymptomatic infections: the hidden epidemic On the asymptotic final size distribution of epidemics in heterogeneous populations 2020) SIR epidemics with stochastic infectious periods Assessment of the protective efficacy of vaccines against common diseases using case-control and cohort studies Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations Acknowledgements We thank the referees and the associate editor for their many helpful comments and suggestions. C. Lefèvre received partial support from the Chair DIALog sponsored by CNP Assurances.