key: cord-0946207-5fkk0ws0 authors: Gandolfi, Alberto; Aspri, Andrea; Beretta, Elena; Jamshad, Khola; Jiang, Muyan title: A new threshold reveals the uncertainty about the effect of school opening on diffusion of Covid-19 date: 2022-02-22 journal: Sci Rep DOI: 10.1038/s41598-022-06540-w sha: fb3d60f6faf83ae9eecfeac56e91a19aa1f98b81 doc_id: 946207 cord_uid: 5fkk0ws0 Studies on the effects of school openings or closures during the Covid-19 pandemic seem to reach contrasting conclusions even in similar contexts. We aim at clarifying this controversy. A mathematical analysis of compartmental models with subpopulations has been conducted, starting from the SIR model, and progressively adding features modeling outbreaks or upsurge of variants, lockdowns, and vaccinations. We find that in all cases, the in-school transmission rates only affect the overall course of the pandemic above a certain context dependent threshold. We provide rigorous proofs and computations of the thresdhold through linearization. We then confirm our theoretical findings through simulations and the review of data-driven studies that exhibit an often unnoticed phase transition. Specific implications are: awareness about the threshold could inform choice of data collection, analysis and release, such as in-school transmission rates, and clarify the reason for divergent conclusions in similar studies; schools may remain open at any stage of the Covid-19 pandemic, including variants upsurge, given suitable containment rules; these rules would be extremely strict and hardly sustainable if only adults are vaccinated, making a compelling argument for vaccinating children whenever possible. √ β 12 β 21 S 2 (0)/S 1 (0) to make sense of (2)), analogous to the number of deviations away from a mean, used to describe a transition as plotted in Fig. 4 . We take α = 2 in "Parameter calibration" section. Since in concrete cases, see "Parameter calibration" in "Methods" section, β 12 , β 21 << β 22 the right hand sides of (2) and (3) are close to β * 11 . Calculations are done in a linear approximation of the SIR model, which applies to the Covid-19 pandemic as the numbers of active cases are kept relatively low by containment measures in the early stages of the outbreaks 26 . In the linear approximation it is possible to formally compute the total number of cases up to a certain time t , which corresponds to when a lock down is imposed. If the target is to contain the increase in the number of total cases up to t to a given percentage ε , an explicit formula allows to compute the maximal allowed value of β 11 . In a realistic example with total population and recovery rate γ normalized to 1, setting S 1 (0) = 0.2, S 2 (0) ≈ 0.8, β 12 = β 21 = 0.5, β 22 = 2 , see "Parameter calibration" in "Methods" section, the critical point is β * 11 ≈ 8 . Assuming an initial fraction of 3 × 10 −5 of active cases in Subpopulation 2 and none in Subpopulation 1, a rescaled time frame of t = 5 (corresponding to approximately 50 days), and ε = 0.3 , a suitable value of α gives β 11 ≤ 6.344. The first part of Fig. 1 , for t ∈ [0, 5] , shows a simulation of the active cases with the above values. One can see that school opening has a moderate effect for small values of β 11 , and then the effect becomes dramatic as the values increase past the critical point. It follows that closing schools, i.e. setting β 11 = 0 , is of limited impact if the reproduction rate β 11 S 1 (0) in school is somewhat lower than β 22 S 2 (0) , the external reproduction rate, and of substantial impact otherwise. This provides harmless school opening options, assuming that one has access to the reproduction rates in the subpopulations. The phase transition is preserved under lock-down, albeit with a different critical point. An analogous effect takes place when a lockdown is imposed. If at some time t transmission rates are reduced to values β ij , corresponding to a subcritical reproduction number, then the effect of β 11 on the total number of active cases undergoes the same phase transition as during the outbreak, but with critical point More precisely, let (1) β * 11 = β 22 S 2 (0)/S 1 (0). (2) β 11 < β 22 S 2 (0)/S 1 (0) − α β 12 β 21 S 2 (0)/S 1 (0), (3) β 11 > β 22 S 2 (0)/S 1 (0) + α β 12 β 21 S 2 (0)/S 1 (0), . www.nature.com/scientificreports/ indicate the attack rate of the epidemic, i.e. the fraction of the initially susceptible population that is eventually infected by the disease in the course of the epidemic from t to complete eradication. It turns out that a sufficient condition to ensure that �S(β 11 ) does not exceeds (1 + ε)�S(0) is where F (see (26) below) is a function that depends on the proportions of active cases and susceptible individuals at time t. In a realistic example continuing the one for the outbreak, with β 12 = β 21 = 0.25, β 22 = 1, we get β * 11 ≈ 4.7630 . In addition, we take ε = 0.3 , see "Linear approximation during the initial phase of anoutbreak or new strain upsurge" in "Methods" section; in order to contain the increase in attack rate to no more than 30% one needs now to have Although in a different scenario, this is smaller than the value 6.344 found in the outbreak, as there the aim was just to avoid producing an even more extended diffusion of the infection. The second part of Fig. 1 , for t ∈ [5, 18] , illustrates active cases in the lockdown scenario, with the above values of the model parameters. When considering a complete outbreak-lockdown cycle, the attack rate undergoes a similar transition, depending on the values of the two transmission rates β 11 and β 11 . If the pair is sufficiently closed to (0, 0), then there is little change in �(S) , while there is a drastic change for larger values of the two transmission rates (see Fig. 7 ). out, the total number of cases from a restart of the epidemic to the complete disappearance due to vaccination undergoes an analogous phase transition, with threshold The attack rate is only moderately changed for β 11 below the threshold, while the outcome of the vaccination process is substantially disrupted for larger values of β 11 . Notice that if β 11 < β * 11 then the in-school reproduction number is R S = β 11 S 1 (0) < 1 i.e. R S is subcritical. In a realistic case, continuing with the data from the example above, we now suppose a vaccination program is introduced targeted to a 60% coverage in about 3 months (Israel kept this pace at the time we are writing, with schools almost completely closed). In Fig. 1 , this corresponds to t ∈ [18, 31] , and now S 1 (0) = 0.198872 is the susceptible at the start of this simulation, which is equal to the susceptible population calculated for the end of the previous simulation for lockdown (corresponding to t = 18 in Fig. 1 ). Then R S = 0.198872 × β 11 , and the Figure 1 summarizes the numbers of active cases in the three scenarios we have analyzed: the cyan curve corresponds to closed schools, while the green one is a subcritical pattern; the red curves, instead, show the risk that the pandemic spirals out of control because of insufficiently controlled school opening. Notice that the explicit values that we give in Formulas (1), (4) and (6) are relevant from the theoretical point of view, as they indicate that the critical thresholds are different. Their specific values, however, could be hard to estimate from these formulas. The initial fraction of infected in (1), for instance, is almost impossible to estimate, due to the initial absence of awareness and testing. Other methods and more details models would be needed for a careful estimation of the threshold in concrete cases. An analogous behavior takes place in more elaborate and realistic models, involving presymptomatic, asymptomatic, testing, isolation etc. Critical values appear for the in-school transmission rate, below which the effect of school opening on the epidemic trajectory is extremely contained. We provide simulations in "A SPIAR model" in section (see Fig. 10 ), and evidence from case studies here below. Evidence of phase transition appears in several data driven case studies. The effect of a phase transition seems to appear in all data driven studies (see "Other case studies" in section). Most studies reach a definite conclusion: in some cases, the data or the model after calibration correspond to a subcritical regime, so that the study ends up asserting the almost irrelevance of school opening on the pattern of the epidemic for all the analyzed cases; in other cases, the study determines a supercritical setting, and then comes to the opposite conclusion. Some works include one or more parameters that can be modulated to envision the effect of school reopening. In these cases, one can see the effect of a sharp transition from a subcritical, acceptable reopening, to an excessively impactful one. In España et al., Figs. 4 and 5, for instance, one can see that up to 50% capacity the effect of opening schools is almost negligible, while it becomes substantial above 75%; this is a likely indication of a critical point between these values 17 . For convenience, their Fig. 4 is reproduced here in Fig. 2 . A very detailed study of school opening in The Netherlands is conducted in Rozhnova et al. , and their conclusions are a clear description of the phase transition 12 . Using a data driven, elaborate model, Rozhnova et al. claim that their "analyses suggest that the impact of measures reducing school-based contacts depends on the remaining opportunities to reduce non-school-based contacts 12 . If opportunities to reduce the effective reproduction number ( R e ) with non-school-based measures are exhausted or undesired and R e is still close to 1, www.nature.com/scientificreports/ the additional benefit of school-based measures may be considerable, particularly among older school children. " The first scenario of Rozhnova et al. corresponds to a subcritical in-school transmission rate, so that the effect of closing schools would be very moderate. The second scenario seems to correspond to an in-school transmission rate around the critical value, so that both containment, in-or out-of-school, are effective 12 . Yuan et al. uses a detailed compartmental model and data from the second semester 2020 in Toronto, an outbreak context, to estimate the likely impact of school opening; with parameters estimated and collected from literature, the paper finds that opening school has little effect on the overall course of the pandemic: in all scenarios presented in their Figs. 2 and 4 the difference between school opening and closure is extremely contained 16 . The findings of the research is then consistent with our phase transition scenario. In particular Yuan et al., find that "school reopening was not the key driver in virus resurgence, but rather it was community spread that determined the outbreak trajectory"; in other words, the parameters of the models, although not explicitly given in the paper, are such that the external transmission is preponderant 16 . As an additional finding, it is observed in Yuan et al. that, according to their model, "brief school closures did reduce infections when transmission risk within the home was low" 16 : in this case, a reduced transmission rate outside makes the one in school likely supercritical. The role of phase transitions. Phase transitions, like the one occurring at R 0 = 1 or the one we detect for in-schoool transmission rates, are fundamental in epidemiology 23 . They consist of the fact that changes in one parameter, the in-school transmission rate in our case, produce almost no effects except when the threshold is crossed; at that point, however, a small change in the parameter determines completely different behaviors. The presence of a threshold for the in-school transmission rate can explain the divergent conclusions of data driven studies, as they might have been observing two different phases. The presence of a phase transition can also dramatically disrupt forecasts based on calibrated compartmental models: one calibration might lead to a subcritical phase, in which the transmission in school is irrelevant, and another, based on possibly similar data, might lead to a supercritical phase, in which in-school transmission is the driving factor of the pandemic. This phenomenon is known to affect epidemic forecasts 27 , and we think it might be the reason beyond the mentioned conflicting conclusions of several studies. (a) Synthetic data of daily active cases in the two different scenarios, one with subcritical and one with supercritical in-school transmission rates, and slightly different initial number of cases. Gaussian noise has been added to make the example more realistic. (b) Reduction in daily cases due to school closure in the first scenario (c) Reduction in daily cases due to school closure in the second scenario www.nature.com/scientificreports/ To make things worse, even retrospective studies trying to evaluate the role of school openings or closures on the evolution of the pandemic run the risk of being completely untrustworthy. Covid-19 data are affected by enormous errors, due to the presence of asymptomatic, lack and partial reliability of testing, difficulty in assessing close contacts etc. It follows that estimation of parameters for both statistical and model based studies are affected by large errors. In the presence of a threshold, even small errors can lead to incorrect attribution of the situation under observation to one phase, or to conflicting attributions to two opposite phases by different studies. In such scenario, a retrospective study could misclassify the effect of school opening or closure; and different studies even based on almost the same data might end up reaching opposite conclusions. We explicitly illustrate this phenomenon with a simulation in the next section. Finally, awareness about the presence of a phase transition suggests the type of measurements that could be carried out, analyzed and finally released to the public. In our case, for instance, one could consider adapting our model to specific local situations, and then measuring in-school transmission rates; these can then be used as basis for local policies about school opening, and also as a possible public indicator of the potential risk of interventions on schools. The information that the in-school transmission rate is approaching a critical level would certainly stimulate and justify the reinforcement of containment measures. The findings about vaccination strongly support vaccinating children as well. Confounding effects on retrospective studies. In the noisy, synthetic data in Fig. 3a , the number of daily infected in a population have been generated with the same parameters, except that In school transmission rates are supercritical in the first case, and subcritical in the second. But the different number of initial cases, a value that is subject to errors of various order of magnitudes and is quite arbitrarily assigned in the various studies, makes the two trajectory basically indistinguishable. In a retrospective study one is forced to assign an initial value to the number of infected, and then estimate other parameters from the observations. Both scenarios are then plausible, depending on the chosen initial values. As Fig. 3b confirms, the research would conclude in the first scenario, that closing schools would have been basically useless. In the second scenario, however, the opposite conclusion would be drawn, as illustrated by Fig. 3c . We have identified the presence of different phases for the effect of the in-school transmission rates on the course of a Covid-19 like epidemic. Such results provide evidence in favor of keeping the schools open when specific epidemiological conditions are met and preventive measures are respected: the key condition is that the transmission rate in schools must be kept below a certain threshold that depends on the situation. As the threshold might not be easily determinable nor achievable, however, there can be contingent motives for school closure if policy makers, as it happened in most locations during the Covid-19 pandemic, are not aware or able to exploit the critical threshold. Awareness about the critical threshold can in any case suggest directions for the analysis of locally adapted models, data collection and exploration, and public release, and can give policy makers sound instruments for containing school closures. In addition, the presence of a threshold is the likely cause of the opposing views that many studies have presented, some asserting almost irrelevance of school opening, and others pointing to its significant effects. Minimal changes in the overall conditions, or in values of the estimated parameters may determine one phase of the other; this may result in different attributions of responsibility to school opening, and creates the possibility of an arbitrary identification of the phase due to parameter estimation in the presence of very noisy data. www.nature.com/scientificreports/ Finally, we have seen that with a vaccination campaign being carried out largely for out-of-school individuals only, there is a threshold below which schools can still be opened; it corresponds, however, to an internal reproduction number that would eradicate the virus if schools were completely isolated. As measure to achieve such a low reproduction number are highly demanding, this level of containment seems to be sustainable for brief periods only, after which vaccination for children becomes the only viable possibility to return to normality. Although the presence of a phase transition in the effect of the school transmission rate in the overall course of the epidemic seems to have been unnoticed so far in the literature, there are many works related to ours. For a different perspective, focused on the sustainability of opening from the point of containing the number of cases of a single school, one can see 25 . Compartmental models with two subpopulations are discussed in many works in general terms 28 ; and then applied to the school opening issue in data driven analyses 7, 11, 16, 17 : we discuss the relation of some of these results with our work in "Other case studies" in "Methods" section. Finally, other papers 6-8 make a purely statistical evaluation of the effect of school opening (see "Other case studies" in "Methods" section). Our work has several limitations. Our results are based on abstract, simplified models, and, although they seem to be stable when more detailed features are included, we cannot ensure that they always take place in more complex models. Even when a critical value can be estimated, ensuring that the transmission rates are below their relative thresholds is clearly a matter of distancing, testing, and other measures 29 . We do not elaborate here on how to develop a set of possible interventions, and on how to measure their success in containing the transmission rates in schools. There are several directions for future work and research. From the practical point of view, our analysis needs to be adapted to local and contingent situations adding specific details to the model, and collecting and analyzing appropriate data before becoming a viable tool for policy makers. From the mathematical point of view, it would be interesting to ascertain the presence and behavior of the phase transition in the non linear models. Also, it would be relevant to explore the analogous phenomenon when the cross terms are not small with respect to those in the main diagonal, a situation which could explain the dynamics of vaccinated vs. unvaccinated population. Finally, one could evaluate the presence of similar phenomena with more than two subpopulations, in order to detect which combinations drive the pandemic, and which internal transmissions can be disregarded up to a certain threshold. Compartmental models. In order to evaluate the effect of school opening on the course of the epidemic we use compartmental models, as they proved capable of predicting the courses of outbreaks in many instances 30 . We start with the simplest SIR model with unit total population, and two subpopulations i = 1, 2 where Subpopulation 1 refers to students (in-school) while Subpopulation 2 includes the remaining population including teachers and staff. The size of Subpopulation i is n i ; we indicate by S i , I i , R i the susceptible, infected, and recovered individuals, respectively, within Subpopulation i. By definition, Subpopulations 1 and 2 are complementary, so we avoid any double counting. The transmission rate from Subpopulation j to i is indicated by β ij . More features are added later on. We make a sequence of theoretical claims concerning the effect of the contact rate in the subclass representing schools. Most of the claims are verified in suitable linear approximations of the SIR model; each result is then complemented with numerical simulations. The linear approximations give very close approximations of the non linear model as in the entire course of the current COVID-19 pandemic the proportion of active cases I = I 1 + I 2 is kept relatively low by either containment measures, lockdowns, or vaccinations: until the time of writing this work the average, taken from publicly available data, of the maximum number of active cases in the most exposed countries is around 1%, SIR model with two subclasses. We first consider the simplest model of interest, represented in terms of a coupled SIR system Notice that the recovery rate is the same in the two subpopulations as for COVID-19 they seem to depend on the severity of the infection but not directly on age [31] [32] [33] , and time is rescaled so that it is equal to 1. This makes time unit of about 10-14 days 34 . In addition, β 12 , β 21 are generally smaller than β 11 , β 22 28, 35 . We intend to compare the attack rates �S( between two suitable times t a < t b , as function of the in-school transmission rate β 11 ; here β 11 = 0 corresponds to schools being closed, and β 11 > 0 corresponds to schools being open with varying degrees of physical distancing and other containment measures in place. Parameter calibration. Alongside the rigorous proofs for the linearized models, we perform several simulations with realistic parameters, which are calibrated as follows. Time is rescaled so that γ = 1 , a unit being approximately 10 days. To calibrate transmission rates β i,j , we start from the ratio of contact rates as can be extracted from Prem et al. 35 . This is a pre Covid-19 accurate study of contact rates, and we assume that the ratios of contact rates has remained approximately the same during the pandemic, with absolute values modified by awareness and measures. There are no equivalent studies for the pandemic period, and the values identified in some local studies 28, 36 are in agreement with what we find. According to this scheme, the cross transmissions rates β 12 and β 21 are calibrated in relation to β 22 ; Soyoung et al. estimate β 12 /β 22 ≈ 7/47 in the early times of the pandemic 28 . A slightly larger value of this ratio is obtained by considering the typical social contacts 35, 36 , in which the cross contacts are about 1/2 of the contacts among adults, and the reduced susceptibility of children is estimated to be about 1/2 that of adults 37 : this gives β 12 /β 22 ≈ 1/4 . As our considerations work better with β 12 /β 22 small, we use the last more conservative estimate. For β 11 , it can be extracted from Prem et al. 35 that β 11 ≈ 6β 22 can be taken as first reference value, to be later varied according to school opening policies. Using the reproduction matrix A in (12) below, one can write its largest eigenvalue in terms of β 22 . Since the largest eigenvalue equals R t , the overall reproduction number of the pandemic, one can use the estimated values of R t to get an evaluation of β 22 . With the reference values above, it turns out that in fact β 22 ≈ R t ; this is another indication of the phase transition phenomenon, as for the above reference values the school transmission rate is almost irrelevant. For the outbreak and vaccination scenarios we take β 22 ≈ R t ≈ 2 , and for the lockdown β 22 ≈ R t ≈ 1. Linear approximation during the initial phase of an outbreak or new strain upsurge. A suitable linear approximation for the initial period of the first outbreak, or of any of the possible infection waves taking place after a successful lockdown, is the following from which we extract the second and the fourth equations for I 1 , I 2 . In vector form we have where Id is the 2 × 2 identity matrix and is the reproduction matrix 38 . The proof is in Appendix A. The largest eigenvalue of A is the overall reproduction number R 0 38 , and the early evolution of the epidemics depends on the size of max , and on the spectral gap max − min . This is the so-called slaved phase, in which the active cases of both populations are both lead by approximately the same exponential growth 38 . Dependence of the largest eigenvalue of 2 × 2 matrices from the first entry. To get a first indication of a sudden change in the effect of the in-school transmission rate β 11 , we study the behavior of the largest eigenvalue of a quite general 2 × 2 matrix as function of its first entry. Let then the following estimate holds: Theorem A.2 Let A = (a ij ) be a 2 × 2 matrix with positive entries and let max (a 11 ) > min (a 11 ) be its eigenvalues. We have that for any α ∈ 0, a 22 √ a 21 a 12 The proof is in Appendix B. Figure 4 shows an example of the change in derivatives. As a consequence, if, for instance, α = 2 as we will assume from now on, then Simple calculations in Appendix D show then that if 0 < a 11 ≤ a 22 − α √ a 12 a 21 , which is realistic since The change in the largest eigenvalue is illustrated in Fig. 5 . A phase transition for school opening during an outbreak. We now want to see a similar behavior in the active cases I 1 + I 2 in the coupled SIR model (9) . Let and denote by � , for i = 0, 1, 2 , the vectors W and V in (13) corresponding to 0 , 1 , 2 , and 0 , 1 , 2 , respectively. Note that, without loss of generality, we can assume that V i 1 > 0 , for i = 0, 1, 2. Let � I j (t), j = 0, 1, 2 denote the infected at time t corresponding to eigenvalues j , j = 0, 1, 2 respectively. Notice that, to the contrary of what happens in the next section for the lockdown case, the value of β 11 for which the denominator is zero does not correspond to a singularity: this is to be expected as it differs from β * 11 in (1), and we confirmed it numerically. Taking I as in (13), we get an explicit expression for �S(β 11 ) . For a fixed ε , representing the allowed fractional increment in the number of cases when the school is open, the allowed bound for β 11 is given by With the parameters used in "There is a phase transition in the effect of school transmissionrates on the overall epidemic course during anoutbreak (or a variant upsurge)" in "Results" section, this gives the mentioned value β 11 ≤ 6.344. (9) is suitable to model lockdown as well, provided that the reproduction rate, which is the largest eigenvalue of (12), satisfies R 0 < 1 , and that initial conditions taken at time t have a more substantial number of cases and recovered. A linear approximation of the system is possible as the overall number of active cases is never allowed to grow beyond relatively small fractions of the population, never more than 1% in most countries. With these conditions, a linear approximation is with S 1 (t) + S 2 (t) + I 1 (t) + I 2 (t) < 1. Allowed level of school transmission for a successful lockdown. Let us assume that a lockdown is applied from time t to t that successfully eradicates the virus; hence with I i (t) ≈ 0 for i = 1, 2 . From the mathematical point of view, we can take the eradication time to be +∞ as the dynamical system reaches an equilibrium with no active cases and does not change afterwards. Hence we consider Formulas (16) -(19) apply, with 0 and t replaced by t and ∞ , respectively; the quantities I i , β i,j , for i, j = 1, 2 , decorated by an overscore; and I 1 (∞) = I 2 (∞) = 0. The denominator of (19) with the above changes is singular for The numerator at β 11 = β * 11 , on the other hand, is not identically zero; this is seen by substituting the value (22) for β 11 in (19) , with the adaptations listed above: after some algebra, carried out in Mathematica TM , the numerator is seen to equal (17) I 1 (t) − I 1 (0) = (β 11 S 1 (0) − 1)I 1 + β 12 S 1 (0)I 2 I 2 (t) − I 2 (0) = β 21 S 2 (0)I 1 + (β 22 S 2 (0) − 1)I 2 , (18) I 1 = − I 1 (0)−I 1 (t)+β 12 (I 2 (0)−I 2 (t))S 1 (0)−β 22 (I 1 (0)−I 1 (t))S 2 (0) −1+β 11 S 1 (0)(1−β 22 S 2 (0))+β 22 S 2 (0)+β 12 β 21 S 1 (0)S 2 (0) I 2 = − I 2 (0)−I 2 (t)−β 11 (I 2 (0)−I 2 (t))S 1 (0)+β 21 (I 1 (0)−I 1 (t))S 2 (0) −1+β 11 S 1 (0)(1−β 22 S 2 (0))+β 22 S 2 (0)+β 12 β 21 S 1 (0)S 2 (0) . A(β 11 ) := 1 In addition, if we require that school opening does not affect more than a certain percentage the overall incidence proportion by asking that www.nature.com/scientificreports/ for some ε > 0 , then after some algebra, carried out in Mathematica TM ,we get that With the values as in (24) and (25) A complete outbreak-lockdown cycle. A confirmation of the behavior of the effect of school opening on one outbreak-lockdown cycle is shown here via a direct simulation. Continuing the numerical example of "The phase transition is preserved under lock-down, albeit with a different critical point" in "Results" section, suppose a lockdown is imposed starting from t = 5 , and a 50% reduction is achieved in the transmission rates different from β 11 ; Fig. 7 shows that as the outbreak is resolved after the lockdown, the cumulative number of cases is close to that at β 11 = 0 when β 11 and β 11 are close enough to the origin, and sharply deviates otherwise. It turns out, however, that we can make a great simplification to (27) in order to ease the mathematical analysis of the upcoming sections. Instead of assuming, as realistic, that individuals, who are vaccinated at rate v , are still partly susceptible, we can introduce a fictitious smaller vaccination rate v which gives complete immunity. Consider, in fact, the system It turns out that if v =v(1 − small (for all t) , then the curves of infected I(t) = I 1 (t) + I 2 (t) in (27) and in (28) are very close. We verify this perhaps slightly counter intuitive statement via simulations for the coupled system, and in a Lemma in Appendix E stated for a single population for simplicity. Simulating the total number of infected I(t) in (27) and in (28) , starting at t = 0 for simplicity, we see an almost perfect overlap in Fig. 8a . For comparison, we also simulated the case of β ♯ 22 = 0.5 × β 22 in Fig. 8b , which now shows a substantial divergence. For these reasons, we adopt from now on model (28) to analyze vaccination. A linear approximation to SIR with vaccination. In order to analyze (28) we develop a linearization. Notice that in the linear approximation for the initial phase of an SIR model, the terms β i1 S i I 1 + β i2 S i I 2 are taken to be zero for both i = 1, 2 . With the same assumption in the vaccination case, we get the equation S ′ 2 = −vS 2 : we therefore use the solution to this equation as linear approximation of S 2 (t) . This leads to the following linearization (26) Evidence of a phase transition in β 11 with critical point 1/S 1 (0) during vaccination. We proceed by using the linear approximation to evaluate the attack rates as function of β 11 ; in particular, we focus on the one for the external Population 2. Our calculation is done recursively, as shown in Appendix F. The following theorem summarizes the calculation. Theorem A.4 Assume that I 1 , I 2 are integrable in [0, +∞) and let for k = 0, 1, . . . . We have that The proof is in Appendix F, where we also give an explicit expression for I 2 in terms of hypergeometric functions. (30) I ′ 1 = β 11 S 1 (0)I 1 + β 12 S 1 (0)I 2 − I 1 I ′ 2 = β 21 S 2 (0)e −vt I 1 + β 22 S 2 (0)e −vt I 2 − I 2 . (31) www.nature.com/scientificreports/ To compute the attack rate for the vaccination case observe that the change in active cases is given by I ′ i , and the change in recovered cases is I i , i = 1, 2 ; the change in infected is then I ′ + I , and the attack rate is The attack rate A is divergent as β 11 approaches β * 11 = 1/S 1 (0) , see (34) , which is an indication that the linear approximation breaks down, and that this value is likely to be the critical point. In order to contain the increase in the total cases by no more than a proportion ε we need β 11 satisfying Estimate of the peak time during vaccination. A further evidence of the critical point is obtained by an estimate of the peak time of the infection from (30) . Assuming that the active cases in the two subpopulations peak at approximately the same time t , we set I ′ 1 (t) = I ′ 2 (t) = 0 . The solution is Hence, the peak time also diverges at β 11 = β * 11 . With the realistic values of the parameters used previously, and we have Figure 1 for t > 18 illustrates a simulation of the differential system, where it is seen that β * 11 = 5.02836 is the critical point for the influence of school opening on the overall epidemic. To achieve a sensible containment take ε = 0.3 , in which case (37) gives β 11 < 3.03111 , a bound also visible in Fig. 1 for t > 18. A SPIAR model. To illustrate how a phase transition mechanism also appears in more elaborate and realistic models, we develop and simulate one example. We introduce the compartments of susceptibles S i (not subjected to any virus transmission), presymptomatic P i (infected in incubation period), asymptomatic A i (infected not showing symptoms after incubation), infected where the parameters s, ξ represent the fractions of asymptomatic encountered at school and of undetected infected individuals, respectively; ε 1 , ε 2 are the fractions of symptomatic in Subclass 1 and Subclass 2, respectively; κ the rate of exit from latency period. The recovery rate γ is normalized to 1 as before, and v = 0 if there is no ongoing vaccination. Parameters have been calibrated as given in Table 1 following standard estimations appearing in literature and data studies 37, 41, 42 of 2020 and an estimation of the percentage of asymptomatic and infected people 43 . Figure 10 shows how phase transitions appear also in the SPIAR model. Here, the active cases are given by the sum P(t) + I(t) + A(t) := P 1 (t) + I 1 (t) + A 1 (t) + P 2 (t) + I 2 (t) + A 2 (t) , and three scenarios are considered, as before: outbreak, lockdown, and vaccination. In Fig. 10 , the cyan curve corresponds to closed schools, while the green one is a subcritical pattern; the red curves, instead, show the risk that the pandemic spirals out of control because of insufficiently controlled school opening. Other case studies. A survey of many detailed, data driven studies related to the effect of school opening during the pandemic shows traces of phases transition in all of them. España et al. use a detailed compartmental model calibrated on mortality and other estimated and observed data in Bogotà, Colombia, during the whole 2020 17 . The study develops various scenarios of school reopening, and evaluates its impact; the phase transition described in our work appears clearly in Figs. 4 and 5 their: one can see that up to 50% capacity the effect of opening schools is almost negligible, while it becomes substantial above 75% capacity. This leads to the conclusion that there has to be a critical point between these values. The appearance of a phase transition phenomenon in Rozhnova et al. 12 has been discussed at length in "Evidence of phase transition appears in several datadriven case studies" in "Discussion" section. Yuan et al. use a detailed compartmental model and data from the second semester 2020 in Toronto, an outbreak context, to estimate the likely impact of school opening, also discussed in "Evidence of phase transition appears in several datadriven case studies" section 16 . Di Domenico at al. analyse French data for late Spring 2020 in order to predict the effect of various forms of school reopening after the end of the lockdown; the paper only makes predictions for short periods (see their , and the exact details of the contacts and transmission rates, which are partially estimated and partly obtained from previous measurements, are not provided; still, one can see, especially in their Fig. 5 , that the transmission rates are supercritical, and school opening determines a sharp increase in the overall epidemic spreading 11 . Among the statistical papers, Iwata et al. perform a Time-series analyses using the Bayesian method on Japanese data collected during the initial lockdown, and suggests that school closure did not appear to decrease the incidence of COVID-19 6 . Matzinger et al., on the other hand, use US data from the early stages of the outbreak, and regression analysis; the study finds empirical evidence suggesting that school closings dropped infection rate to half: we can interpret this as a sign that transmission rate in schools was supercritical at that time 10 . An analysis of data gathered by a surveillance of COVID-19 cases in students and staff after reopening of schools across England showed that in-school infections were much less influential than external ones 7 ; a study of Italian data from early Fall 2020, a period of low epidemic incidence, also showed very little transmission taking place in schools 8 where W > 0 . Then, the assertion of the theorem, which is related to I , follows immediately from (40) . The rate of growth in the initial exponential phase depends on the largest eigenvalue of the reproduction matrix (12) . The result is an immediate consequence of the Perron Frobenius theorem. In fact, since all the elements of A are positive, then the eigenvector ξ associated to max has positive components while the eigenvector η associated to min has at least one negative component. Since Observe now that � H(t) = e max t 1 α (η 2 ξ 1 I 1 (0) − η 1 ξ 1 I 2 (0)) 1 α (η 2 ξ 2 I 1 (0) − η 1 ξ 2 I 2 (0)) + e min t 1 α (−η 1 ξ 2 I 1 (0) + η 1 ξ 1 I 2 (0)) 1 α (−η 2 ξ 2 I 1 (0) + η 2 ξ 1 I 2 (0)) α = det ξ 1 η 1 ξ 2 η 2 = ξ 1 η 2 − ξ 2 η 1 . α 2 a 12 a 21 + 4a 12 a 21 Proof We show the result for H , which is defined in (39) . Then, the result for I follows straightforwardly using (40) . Note that 2 > 1 > 0 and ˜ 2 >˜ 1 >˜ 0 . We proceed componentwise. Using the fact that 1 − 1 < 0 , 0 − 1 < 0 and 0 − 1 (42) and (44) we then get that is hence Therefore, for t ≥ max(t 0 1 , t 1 1 ) Analogously one can show that So picking up t > max(t 0 1 , t 1 1 , t 0 2 , t 1 2 ) and using (40) , the claim follows. Inequality (14) indicates a phase transition. In fact from now on let a 2 22 >> a 12 a 21 and α = 2 . In this case and so by (14) we have Given and if ′ max (0) << ′ max (a 22 − α √ a 12 a 21 )/4.8 then from (14) Hence, by (45) and since ′ max is increasing To give a mathematical justification of our statement consider the two systems (28) and (27) School closure and management practices during coronavirus outbreaks including COVID-19: a rapid systematic review What we know about COVID-19 transmission in schools: the latest on the COVID-19 global situation & the spread of COVID-19 in schools The psychological impact of the COVID-19 epidemic on college students in China Impact of COVID-19 and lockdown on mental health of children and adolescents: a narrative review with recommendations A cross-sectional and prospective cohort study of the role of schools in the SARS-CoV-2 second wave in Italy Was school closure effective in mitigating coronavirus disease 2019 (COVID-19)? Time series analysis using Bayesian inference SARS-CoV-2 infection and transmission in educational settings: a prospective, cross-sectional analysis of infection clusters and outbreaks in England Secondary transmission of COVID-19 in preschool and school settings after their reopening in northern Italy: a population-based study Inferring the effectiveness of government interventions against COVID-19 Strong impact of closing schools, closing bars and wearing masks during the COVID-19 pandemic: results from a simple and revealing analysis Modelling safe protocols for reopening schools during the COVID-19 pandemic in France Model-based evaluation of school-and non-school-related measures to control the COVID-19 pandemic Modeling the impact of school reopening on SARS-CoV-2 transmission using contact structure data from Shanghai Do school closures reduce community transmission of COVID-19? A systematic review of observational studies Determining the optimal strategy for reopening schools, the impact of test and trace interventions, and the risk of occurrence of a second COVID-19 epidemic wave in the UK: a modelling study School and community reopening during the COVID-19 pandemic: a mathematical modeling study The impact of school reopening on COVID-19 dynamics in Evaluating scenarios for school reopening under COVID19 COVID-19, children and schools: overlooked and at risk Transmission of SARS-CoV-2 in Australian educational settings: a prospective cohort study The challenges of modeling and forecasting the spread of COVID-19 Why is it difficult to accurately predict the COVID-19 epidemic? Phase transitions in virology Clusters of SARS-CoV-2 infection among elementary school educators and students in one school district-Georgia Planning of school teaching during Covid-19 Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe The turning point and end of an expanding epidemic cannot be precisely forecast School opening delay effect on transmission dynamics of Coronavirus disease 2019 in Korea: Based on mathematical modeling and simulation study Reopening schools during COVID-19 Mathematical epidemiology: past, present, and future Inferred duration of infectious period of SARS-CoV-2: rapid scoping review and analysis of available evidence for asymptomatic and symptomatic COVID-19 cases Duration of infectiousness and correlation with RT-PCR cycle threshold values in cases of COVID-19 Effects of Latency and Age Structure on the Dynamics and Containment of COVID-19 Effects of age and sex on recovery from COVID-19: Analysis of 5769 Israeli patients Projecting social contact matrices in 152 countries using contact surveys and demographic data Changes in contact patterns shape the dynamics of the COVID-19 outbreak in China Age-dependent effects in the transmission and control of COVID-19 epidemics Modelling Infectious Diseases in Humans and Animals Determination of an optimal control strategy for vaccine administration in COVID-19 pandemic treatment On an SE(Is)(Ih)AR epidemic model with combined vaccination and antiviral controls for COVID-19 pandemic Prevalence of Asymptomatic SARS-CoV-2 Infection Epidemiology of COVID-19 Among Children in China Assessment of the SARS-CoV-2 Basic Reproduction Number, R0, Based on the Early Phase of COVID-19 Outbreak in Italy Higher Transcendental Functions Volume I All authors contributed equally to preparing the paper. All authors reviewed the manuscript. The authors declare no competing interests. Correspondence and requests for materials should be addressed to A.G.Reprints and permissions information is available at www.nature.com/reprints.Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.