key: cord-0462902-rz4opgw3
authors: Barmparis, G. D.; Tsironis, G. P.
title: Physics-informed machine learning for the COVID-19 pandemic: Adherence to social distancing and short-term predictions for eight countries
date: 2020-08-18
sha: 79b822a725d4eeef397d704d29669f15e8cd5ce3
doc_id: 462902
cord_uid: rz4opgw3

The spread of COVID-19 during the initial phase of the first half of 2020 was curtailed to a larger or lesser extent through measures of social distancing imposed by most countries. In this work, we link directly, through machine learning techniques, infection data at a country level to a single number that signifies social distancing effectiveness. We assume that the standard SIR model gives a reasonable description of the dynamics of spreading, and thus the social distancing aspect can be modeled through time-dependent infection rates that are imposed externally. We use an exponential ansatz to analyze the SIR model, find an exact solution for the time-independent infection rate, and derive a simple first-order differential equation for the time-dependent infection rate as a function of the infected population. Using infected number data from the "first wave" of the infection from eight countries, and through physics-informed machine learning, we extract the degree of linear dependence in social distancing that led to the specific infections. We find that the two extremes are Greece, with the highest decay slope, on one side, and the US, with a practically flat "decay", on the other. The hierarchy of slopes is compatible with the effectiveness of the pandemic containment in each country. Finally, we train our network with data after the end of the analyzed period, and we make week-long predictions for the current phase of the infection that appear to be very close to the actual infection values.

The COVID-19 pandemic started in December 2019 in China and subsequently spread fast in the rest of the world.
After an initial "hesitant" approach, most countries essentially adopted rules of social distancing that also originated in China. Several countries delayed the imposition of measures and, as a result, saw large numbers of infected persons and deaths. Other countries acted very swiftly and managed to control the infected numbers and especially the mode of spreading. There was an initial discussion related to "herd immunity", which was in part attempted by some countries, but soon the basic global approach became the one introduced by China, i.e. social distancing. However, the degree and swiftness of social distancing were different in each country; in Italy and Spain, for instance, there was an initial delay, while Greece acted very quickly and with strong measures.

Although ultimately the effectiveness of any measure is reflected in the number of deaths, in this work we use the more error-prone infection data for a number of reasons. The infection data are representative of the dynamics of the disease at the country level, even though they clearly depend on the number of tests performed. At the initial phase, test availability was limited, and thus tests were used on an as-needed basis and, as a result, targeted more closely individuals with symptoms. Additionally, since the COVID-19 pandemic primarily affects older people, the death data are strongly age-biased and thus do not reflect the true dynamics of the spreading that leads to these deaths.

In an earlier publication, the first version of which appeared on the arXiv on March 31, 2020, i.e. right in the middle of the pandemic, we used a Gaussian hypothesis for the spreading of the disease and the number of infected persons and predicted the spreading and the horizon of the first wave [1]. We showed that this specific functional dependence originated from the imposed measures and, in particular, from an approximately linear reduction in the infection rate α(t) as a result of the imposed measures.
This hypothesis proved to have two-fold usefulness: On one hand, it gave a good prediction for the horizon of the epidemic in countries like Greece, Italy, and Spain while the measures were in effect. On the other hand, for countries such as the US and UK, where measures either were not enforced in full strength or were not applied fast enough, the prediction of the model based on the Gaussian hypothesis was rather poor. Although this was expected, it nevertheless provides a very good way to assess now, i.e. after the fact, how efficient the measures were in these and other countries. This may be done by evaluating from the real data an effective number that quantifies the harshness of the imposed measures, their timing, etc. This number, denoted by σ, is the slope of the assumed linear decay of the infection rate coefficient. A large σ means that the effective measures were drastic and applied on time while, at the other extreme, σ ≈ 0 signifies the practical absence of measures.

In order to perform the present study, we use the following approach: We start with an SIR model [2] and use analytics in order to derive a differential equation for the infection rate α(t); this equation contains the information on the individual infection percentage in the population. We then take the data for the country's infected population and estimate the infection rate α(t). This step is done by using Machine Learning (ML) techniques and, in particular, physics-informed neural networks (PINN). We pre-train the latter on simulated SIR data and subsequently train it on each country's reported infected data. Instead of assessing a general α(t) curve, we explicitly assume a linear functional dependence; its slope σ is the result of the ML procedure we apply. Once the infection rate is known, we validate the resulting SIR model against the country's data and then vary σ to see the changes in the epidemic.
This procedure gives a clear picture both of the effective measures in each country and of their efficiency. The assumption of linear decay in α(t) with slope σ is tantamount to an effective linearization of the actual infection rates. Clearly, other, more complex forms may be assumed. We find that this simple form can efficiently capture the nature of the phenomenon and give a simple quantitative estimate of the imposed measures. The values of the slope σ are obtained directly through ML and thus, in a sense, are directly derived from the infection data. We may therefore link each infection curve with an effective decay slope σ that denotes the overall control that the measures exercise on the infection phenomenon. Since the approach is fundamentally data-driven, the knowledge of a particular slope gives a handle on the possible measures exercised. Furthermore, once the PINN we develop works well, we may use it to make predictions. Specifically, we use data from the second phase of the spreading, which we assume starts after the initial decline in the infection, train the network with this data and make short-term predictions for the current period.

The structure of the paper is thus the following. In section II, we map the SIR system of two first-order ODEs into a unique second-order one that, in the case of constant α(t), may be solved exactly. Subsequently, in section III, we derive a first-order equation for α(t) and write its general solution for arbitrary time dependence of the infected population. We verify explicitly that when the population dynamics is Gaussian, the dominant decay in α(t) is linear, as discussed in a more restricted form in [1]. Subsequently, in section IV, we apply our ML arsenal to derive α(t) from each country's data and determine the spectrum of σ's for eight countries.
We feed this information back into the SIR model to investigate the actual as well as other hypothetical scenarios for the evolution of the epidemic in each country. This step gives a clear picture of the effectiveness of measures in each country. In section V, we use the data from the second phase of the pandemic for the eight countries, train the PINN with this data except for the last week of available data, subsequently make predictions for the evolution during that week, and compare with the available data. Finally, in section VI, we summarize our findings and conclusions. In Appendix A, we present the solution of the SIR model obtained through the exponential ansatz as well as approximate solutions in extreme limits to demonstrate the infection's behavior. In Appendix B, we present a flowchart that details the ML approach used in this work.

The simple Susceptible-Infected-Removed (SIR) infection model is very powerful in determining qualitative but also quantitative aspects of the COVID-19 pandemic [2]. The basic equations are

$$\dot{S} = -\alpha S I \qquad (1)$$

$$\dot{I} = \alpha S I - \mu I \qquad (2)$$

where S ≡ S(t), I ≡ I(t) are the percentages of susceptible and infected individuals, respectively, and the infection and removal rates α ≡ α(t), µ ≡ µ(t), respectively, are functions of time in general. We introduce the variable q(t) through the ansatz: Upon substitution into the set of Eqs. (1, 2) we obtain: Using Eqs. (4, 5) we obtain a closed equation for q, i.e.: Equation (7) is a unique second-order equation that fully captures the dynamics of the SIR infection model. While it is highly nonlinear, it is nevertheless quite useful in determining the infection dynamics since it is general and contains the arbitrary time dependence of both the infection and removal rates. It will be used subsequently in the application of ML techniques to the COVID-19 infection data. In the case of constant infection and removal rates it can be solved exactly; this solution is given in Appendix A.

We start with the general Eq. (7) with time-dependent infection rate α(t) and, for the sake of simplicity, we assume that the recovery rate µ(t) ≡ µ is a constant; in this case the expression in Eq. (7) simplifies with ν = µt. Although Eq. (7) is rather involved, in several cases we might be interested in the inverse problem where, although we know the infection data, we are not able to assess directly the applied measures α(t) that generate it. We know, for instance, that a monotonic linear drop in the infection rate, as introduced for instance by gradual social distancing measures, results in an approximately Gaussian evolution [1]. We may thus write Eq. (7) as: Eq. (8) is a Bernoulli equation [3] that can be turned into a linear first-order equation by using the transformation: The general solution of Eq. (8), obtained through the solution of Eq. (11), is thus: where C is an arbitrary constant.

Let us now consider the case where the infected population behaves similarly to a Gaussian function [1], keeping however also a linear time term in the exponent that provides some time asymmetry, i.e. take: Simple algebra leads to: and thus the solution of Eq. (12) becomes: Finally: We note that the dominant term is that of linear decay, since at longer times and for β < 0 the Gaussian term in Eq. (21) essentially disappears, while the exponential term also decays when µ > γ. In general, of course, the functional dependence of α(t) is more complex, and in cases with strong asymmetry introduced by γ we have distinctly nonlinear decay. We thus observe how significant the precise functional form of the time-dependent infection rate is for the general evolution of the SIR modeling of the infection phenomenon. The exact mathematical analysis of the SIR model is important for the analysis of the data through ML techniques.

The COVID-19 "first wave" started at different times in various countries and had a completely different evolution.
In countries where very restrictive measures similar to those of China were imposed, effective control of the spreading was swiftly accomplished. Other countries that either delayed or imposed partial or essentially no measures saw larger numbers in the infected population and slower decay in the infected numbers. Here we make no judgment as to whether measures were "good" or "bad"; we simply want to be able to extract the presence of the measures from the dynamics of the infected population. Specifically, we would like to see what the imprint of social distancing is in the distribution of the infected population across the eight model countries we follow. To accomplish this, we use a strategy that utilizes methods from Artificial Intelligence and in particular Machine Learning (ML). The basic assumption in our approach is that the SIR model can capture the essentials of the epidemic in each country. A direct consequence of this assumption is that we can use simulated data from the SIR model to pre-train the specific neural networks we use for each country. The specifics of the application of ML in this problem are detailed in Appendix B.

The application of ML techniques to data often suffers from the fact that data are considered "pure", with no connection to a specific phenomenon. A remedy towards introducing specificity is the use of physics-informed techniques, where the ML processes, typically those involving Artificial Neural Networks (ANN), are restricted by constraints imposed by physical laws in mathematical form [4]. In the specific problem, the SIR equations play this role and put strict bounds on the ANN used for simulating the phenomenon. Once we have a physics-informed network that is trained on the infected data of the specific country, we may use it to extract the presence and persistence of the social distancing measures typified through the function α(t).
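The SIR constraint that the network has to respect can be written as a single equation for x = ln I. The following is a reconstruction rather than a quotation of the equations of this work: it assumes that the exponential ansatz has the form I = exp(q − ν), with ν the time integral of µ, a form chosen here because it reproduces the initial conditions q(0) = ln I(0) and q̇(0) = αS(0) quoted in Appendix A.

```latex
% Sketch under the assumed ansatz I = exp(q - nu); requires amsmath.
\begin{gather*}
I(t) = e^{\,q(t)-\nu(t)}, \qquad \nu(t) = \int_0^t \mu(t')\,dt' \\
\dot I = (\dot q - \mu)\,I = (\alpha S - \mu)\,I
  \;\Longrightarrow\; \dot q = \alpha S \\
\dot S = -\alpha S I = -\dot q\,e^{\,q-\nu}, \quad S = \dot q/\alpha
  \;\Longrightarrow\;
  \ddot q - \frac{\dot\alpha}{\alpha}\,\dot q + \alpha\,\dot q\,e^{\,q-\nu} = 0 \\
\mu = \text{const},\;\; x = q - \mu t = \ln I
  \;\Longrightarrow\;
  \ddot x - \frac{\dot\alpha}{\alpha}\,(\dot x + \mu)
          + \alpha\,(\dot x + \mu)\,e^{x} = 0
\end{gather*}
```

As a consistency check, for constant α the last line reduces to ẍ + αe^x ẋ + αµe^x = 0, which agrees with Eq. (26) of Appendix A.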
Since the ANN finds a general decay, and also given the discussion in the previous section, we posit a linear dependence of the form α(t) = σ_0 + σt, where the intercept σ_0 and the slope σ are determined through the ANN. The slope σ, in particular, is a parameter with an important physical significance, since it estimates, within the linearized model, the degree of efficiency of social distancing. In other words, a large value of σ describes, in an average way, a country that followed strict social distancing measures through the first wave while, on the contrary, small values of the slope denote much looser adherence to measures. We note that these measures are not necessarily only the externally imposed ones but also include the self-imposed measures.

Specifically, in this work, we approximate each country's daily reported cases using a deep neural network containing one input node, five hidden layers of 100 nodes with a "sigmoid" activation function, and one output node. Initially, we use simulated SIR data and an arbitrary linear function for α(t) with a constant value for µ to train the model. The model is trained, using a custom training loop, by minimizing the mean squared error loss on the data, MSE_D, between the reported cases, x_i = ln(I_i), and the corresponding predicted cases, x̂_i = model(t_i), together with the mean squared error loss defined by Eq. (7) with α = α(t) = σ_0 + σt and ν = µt (i.e. µ = constant), MSE_SIR.

Then, for each of the countries under consideration, we load the real data and smooth it using a seven-time-step moving average. We then scale the data using Min-Max normalization. We load the pre-trained model and allow all its weights to be tuned by minimizing again both the MSE_D and MSE_SIR loss functions on the country's data, obtaining at the same time the optimal values for α(t) and µ for the given country. The pre-trained model is used to accelerate each country's training process.
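The preprocessing and the two loss terms described above can be sketched in plain Python. This is a minimal illustration and not the authors' TensorFlow code: the function names are hypothetical, the moving average is assumed to be a trailing window, derivatives are taken by central finite differences instead of automatic differentiation, and the SIR residual assumes that Eq. (7), written for x = ln I with constant µ and α(t) = σ_0 + σt, reads ẍ − (σ/α)(ẋ + µ) + α(ẋ + µ)e^x = 0.

```python
import math


def moving_average(values, window=7):
    """Trailing moving average (assumption: the source does not say trailing or centered).

    The first window-1 points are averaged over the samples available so far.
    """
    out = []
    for k in range(len(values)):
        chunk = values[max(0, k - window + 1): k + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def min_max(values):
    """Min-Max normalization of a sequence to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant series cannot be min-max scaled")
    return [(v - lo) / (hi - lo) for v in values]


def mse_data(x_true, x_pred):
    """Mean squared error between reported and predicted x = ln(I)."""
    return sum((a - b) ** 2 for a, b in zip(x_true, x_pred)) / len(x_true)


def sir_residual(t, x, sigma0, sigma, mu):
    """Residual of the assumed closed SIR equation at the interior grid points.

    Uses central differences for the first and second derivative of x(t)
    on a uniformly spaced time grid t.
    """
    residual = []
    for k in range(1, len(t) - 1):
        h = t[k + 1] - t[k]
        dx = (x[k + 1] - x[k - 1]) / (2.0 * h)
        ddx = (x[k + 1] - 2.0 * x[k] + x[k - 1]) / h ** 2
        alpha = sigma0 + sigma * t[k]  # linear infection rate
        residual.append(
            ddx - (sigma / alpha) * (dx + mu) + alpha * (dx + mu) * math.exp(x[k])
        )
    return residual


def mse_sir(t, x, sigma0, sigma, mu):
    """Mean squared SIR residual, the physics-informed part of the loss."""
    r = sir_residual(t, x, sigma0, sigma, mu)
    return sum(v * v for v in r) / len(r)
```

In a PINN the quantity minimized would be a combination of `mse_data` and `mse_sir`, with σ_0, σ and µ treated as trainable parameters alongside the network weights; how the two terms are weighted is not specified here.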
The training process of each country is terminated by early stopping with a horizon of 200 epochs. Having extracted the optimal α(t) as well as µ for each country, we use them to solve the SIR model, Eqs. (1, 2). The solution is then fitted to the country's real data using the initial conditions (I_0, S_0) as fitting parameters. The total number of predicted cases during the "first wave" period of each country, including the relative error with respect to the corresponding total number of reported cases and the total number of cases obtained by varying α(t) by ±10%, is presented in Table (IV). A plot of the results for each country is shown in Fig. (1). In the map of Fig. (2) we portray the results of Table (IV) in a more graphic way. The machine learning algorithms were implemented in Python using TensorFlow/Keras [5] and the ADAM [6] optimizer. The data used in this study were published online at OurWorldInData.org [7].

The arsenal of physics-informed machine learning was used in the previous section in order to extract dynamical parameters, such as the time-dependent infection as well as the removal rates, from the documented infection data. The procedure through SIR pre-training proved to be quite efficient and gave a hierarchy of α(t) for different countries for the initial period of the infection. It is both tempting and challenging to apply this procedure to the present phase of the COVID-19 pandemic and attempt to make future predictions. In the process of future point evaluations, our procedure needs to satisfy two constraints; one is the overall mean square minimization that reduces the overall error. The second is the one imposed by physics, i.e. it must follow the SIR model. In order to accomplish the latter, the procedure needs to know the functional form of α(t) as well as the value of µ at the future points. We provide this information through the extrapolation of the values from the previous times.
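The step of solving the SIR model with the extracted linear α(t) and constant µ, and of repeating the solution with α(t) varied by ±10%, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the integrator is a hand-written fourth-order Runge-Kutta scheme, all parameter values are hypothetical and chosen only so that the example runs, and the rate is clipped at zero so that the linear form stays non-negative when evaluated at later times.

```python
def alpha_linear(t, sigma0, sigma, scale=1.0):
    """Linear infection rate scale * (sigma0 + sigma * t).

    Assumption: clipped at zero so the extrapolated line never turns negative.
    """
    return max(0.0, scale * (sigma0 + sigma * t))


def sir_rhs(t, s, i, sigma0, sigma, mu, scale):
    """Right-hand side of the SIR equations for the fractions S and I."""
    a = alpha_linear(t, sigma0, sigma, scale)
    return -a * s * i, a * s * i - mu * i


def solve_sir(s0, i0, sigma0, sigma, mu, days, steps_per_day=10, scale=1.0):
    """Integrate the SIR model with RK4 and return daily samples of S and I."""
    dt = 1.0 / steps_per_day
    p = (sigma0, sigma, mu, scale)
    s, i = s0, i0
    s_daily, i_daily = [s], [i]
    for n in range(days * steps_per_day):
        t = n * dt
        k1s, k1i = sir_rhs(t, s, i, *p)
        k2s, k2i = sir_rhs(t + dt / 2, s + dt / 2 * k1s, i + dt / 2 * k1i, *p)
        k3s, k3i = sir_rhs(t + dt / 2, s + dt / 2 * k2s, i + dt / 2 * k2i, *p)
        k4s, k4i = sir_rhs(t + dt, s + dt * k3s, i + dt * k3i, *p)
        s += dt / 6 * (k1s + 2 * k2s + 2 * k3s + k4s)
        i += dt / 6 * (k1i + 2 * k2i + 2 * k3i + k4i)
        if (n + 1) % steps_per_day == 0:  # keep one sample per day
            s_daily.append(s)
            i_daily.append(i)
    return s_daily, i_daily


def total_cases(s_daily):
    """Cumulative infected fraction over the period: S(0) - S(end)."""
    return s_daily[0] - s_daily[-1]


# Hypothetical values for illustration only; they are not fitted to any country.
PARAMS = dict(s0=0.999, i0=0.001, sigma0=0.4, sigma=-0.02, mu=0.1, days=60)

# Total cases for alpha(t) reduced by 10%, unchanged, and increased by 10%.
totals = {
    scale: total_cases(solve_sir(scale=scale, **PARAMS)[0])
    for scale in (0.9, 1.0, 1.1)
}
```

Because α(t) is an explicit function of time, the same `solve_sir` call can be run past the last data point, which is one simple way to realize the extrapolation of α(t) and µ to future times mentioned above.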
Once these values are known, the SIR dynamics is guaranteed and provides the second, physics-derived constraint. In Fig. (3) we show the evolution of the current phase of the pandemic as well as the prediction obtained for a horizon of one week. In preparing these results, we used the available COVID-19 data, starting precisely where the first phase ended, and used it up to one week before the end dates for all eight countries for PINN training. Subsequently, we used the network for prediction and compared the results with the existing data. We note that the network's short-term predictive power is quite good on average in most countries.

FIG. 3: PINN short-term predictions and comparison with existing data. We use data of the second phase of COVID-19 spreading except for the last week; we train the network and predict the evolution during the last week. In most cases the short-term prediction is reasonably good.

The spreading of COVID-19 has generated a wave of illness and death worldwide, accompanied by a severe disruption in financial, educational, and commercial activities, global travel, etc. [8-10]. During the first phase of the spreading, there were different approaches to the measures to be taken to slow it down. Different countries reacted in different ways and, as a result, the epidemic dynamics proceeded differently. The infection curves were different and depended strongly both on the imposed social distancing measures and on the adoption of responsible practices by individuals. One important aspect of the pandemic is to find ways to assess the degree to which the social distancing measures were followed. It is not trivial to extract this information from the data, since the infection dynamics are directly related to the imposed social distancing measures. Furthermore, this cannot be done in a completely model-free way, and thus assumptions about both the model and the way measures are imposed are important.
In the present work, we followed an early attempt [1] and used the publicly available infection data to assess the effectiveness of, and adherence to, social distancing in different countries; in doing this analysis, we assumed that the mathematical model underlying the infection dynamics is the simplest SIR model. Before using the arsenal of ML, we tackled the model analytically and produced two basic results; the first is a general analytical solution for the model obtained through a specific exponential ansatz. The second, also dependent on this ansatz, is a differential equation for the function α(t) that describes the time-dependent nature of the infection rate. The latter depends strictly on the imposed social distancing measures as well as the practices of the individuals. We pointed out, as also in Ref. [1], that a linear drop in the infection rate leads to an approximately Gaussian functional dependence in the infected population. The specifics of this functional form depend both on the form and the values of α(t) and on the removal rate µ.

In order to extract the time-dependent infection rate from the data, we used physics-informed neural networks, i.e. a machine learning method that uses input from the actual model assumed, viz. SIR. This input, together with the real infection data from each country we considered, led to a prediction of the assumed linear-in-time infection rate. The data-derived slope σ signifies the adherence of each country to social distancing. In Greece, for instance, the slope is large in absolute value, indicating strong application of the imposed measures by the individuals. At the other extreme, we find the USA with a practically zero slope, demonstrating that the measures taken had low efficiency. The other six countries we analyzed fall in intermediate locations between these two extremes.
Applying to the SIR model of each country an alternative infection rate that differs by a few percent (±10%) in total from the one obtained through ML gives an estimate of how dependent the infection is on the applied measures. We find that this variation, while it affects the early SIR fast rise strongly, results in quite a different infection decay and horizon in countries like the UK.

Once we know how the PINN behaves with the data for the initial period of the infection, we may use it for the second phase. We consider that the latter starts from the end of the initial period and reaches the present day. Thus, we use country infection data during this period, except for the last week, to train the network and subsequently make predictions for the last week and compare them with real data. We find that while the short-term predictive power of the PINN is good, it has large deviations in countries where the data appear to have a rather stochastic character.

The basic conclusion of this work is that the use of physics-informed ML may enable the extraction of COVID-19 infection information in different countries, show how different measures and practices are directly reflected in the data, and ultimately make predictions. The use of physics in machine learning gives specificity to the data but, on the other hand, is restricted, and sometimes limited, by the inserted physics knowledge. The present approach assumes a well-mixed, essentially uniform country, an assumption that is introduced through the use of the SIR model. However, countries have regions, and each region may behave differently for geographical, environmental, cultural, as well as population reasons. If regional data is available, one can go one step further and introduce a spatial in addition to the temporal distribution in the infection and from this obtain more accurate results and predictions.
We believe the methodology used in this work may be extended to this more realistic case and provide a more direct approach to local dynamics and the effectiveness of imposed measures at a local level.

For the simpler case of constant α and µ, Eq. (7) becomes: Introducing the transformation x = q − µt we turn Eq. (25) into the following form:

$$\ddot{x} + \alpha e^{x}\dot{x} + \alpha\mu e^{x} = 0 \qquad (26)$$

The new initial conditions are x(0) = ln I(0) and ẋ(0) = αS(0) − µ. Eq. (26) is a Lienard equation that can be turned into an Abel equation through the introduction of the transformation: We obtain the following Abel equation of the second kind: We introduce further the variable ξ as follows: Since y_x = y_ξ f_1(x), Eq. (28) becomes: Eq. (31) has the implicit solution: where C is an arbitrary constant, or −αe^x = y − µ ln|y + µ| + C. Once the solution y = y(x) is substituted into Eq. (31) in the form: we have the implicit solution t = t(x) for Eq. (30). Upon inversion of this solution we may obtain q(t) and thus have a solution for the original SIR equation.

Eq. (25) is a second-order equation, while the SIR system of Eqs. (1, 2) constitutes a system of two first-order equations with initial conditions S(0) and I(0). It is easy to see that q(0) = ln I(0) while q̇(0) = αS(0). Thus for Eq. (30) we have the following initial conditions: x(0) = q(0) = ln I(0) and ẋ(0) = q̇(0) − µ = αS(0) − µ. Since both susceptible and infected variables are percentages over the total population, the range of the q = q(t) variable is (−∞, 0], while x(t) similarly takes values in the same range. Let: where λ = α/µ and the prime denotes the derivative with respect to the new time variable τ = √(αµ) t. A particular solution of Eq. (39) is given by x(τ) = −τ/√λ, or x(t) = −µt. This is, of course, a trivial solution since it corresponds to the solution q(t) = 0 of Eq. (25) and thus to a constant infection population. To find approximate solutions we may look into the limits λ >> 1 and λ << 1. In the former case, we may ignore the last term of Eq. (39) and solve the equation: Its solution in the original variables is: In the second case we ignore in turn the first-derivative term of Eq. (39), i.e.

$$x'' + e^{x} = 0 \qquad (43)$$

and obtain the solution in the original variables:

In this Appendix we describe the methods used in this work through a flowchart. We note that after the extraction of the infection rate from the data we use the SIR model for further investigation of the infection.

Extract the country's x(t), α(t) and µ.
Solve the SIR model using the extracted α(t) and µ.
Fit to the country's real data using the initial conditions as fitting parameters.
Other countries under consideration? No

[1] Estimating the infection horizon of COVID-19 in eight countries with a data-driven approach
[2] A contribution to the mathematical theory of epidemics
[3] Handbook of exact solutions of ordinary differential equations, Chapman and Hall/CRC, second edition
[4] Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations
[6] Adam: a method for stochastic optimization
[7] Coronavirus Pandemic (COVID-19)
[8] Rethinking case fatality ratios for COVID-19 from a data-driven viewpoint
[9] The first 100 days: Modeling the evolution of the COVID-19 pandemic
[10] Covid-19 predictions using a Gauss model based on data from