key: cord-279112-ajdkasah
authors: Rojas, S.
title: Comment on “Estimation of COVID-19 dynamics “on a back-of-envelope”: Does the simplest SIR model provide quantitative parameters and predictions?”
date: 2020-09-13
DOI: 10.1016/j.csfx.2020.100047
doc_id: 279112
cord_uid: ajdkasah

This comment shows that data on cumulative confirmed COVID-19 cases for some countries, reported by the European Centre for Disease Prevention and Control for the period December 31, 2019–June 29, 2020, can be fitted by the exact solution of the Kermack–McKendrick approximation of the SIR epidemiological model.

Departamento de Física
Sartenejas, August 27, 2020
Chief Editor, Chaos, Solitons and Fractals

Dear Editor,

The submitted article:
• Provides the correct numerical solution of the Kermack and McKendrick approximation, which was missed in the article that inspired our comment [1].
• Shows that data from nine countries are described by the full SIR epidemiological model.

With kindest regards,
Professor Sergio Rojas
Departamento de Física

In a recent article published in this journal [1], after some (unnecessary) considerations, the author presents the logistic function (equation (8) in [1]) as an alternative solution of the differential equation known as the Kermack and McKendrick 1927 approximation [2] of the SIR epidemiological model [3, 4], in order to fit data on the cumulative confirmed COVID-19 cases from some countries. Clearly, the proposed logistic function (equation (8) in [1]) does not satisfy the initial condition R(0) = 0. In this note we show that the data of the countries discussed in reference [1], and of a few other countries (see Figure 1), can also be fitted using the R(t) solution obtained from the Kermack and McKendrick 1927 approximation of the SIR model, which makes the use of the Verhulst (logistic) equation (8) in [1] unnecessary.
The SIR model considers a population of size N in which, at time t, S(t) individuals are susceptible to infection because I(t) individuals are already infected and can transmit the disease to the susceptible population. The number R(t) represents those individuals who have recovered from the disease (which, if the disease is lethal, also includes deceased individuals) and cannot be reinfected. Thus, the dynamics of the disease, introduced in 1927 by Kermack and McKendrick [2], is modeled by the set of differential equations:

$$\frac{dS}{dt} = -\beta S I \tag{1}$$

$$\frac{dI}{dt} = \beta S I - \gamma I \tag{2}$$

$$\frac{dR}{dt} = \gamma I \tag{3}$$

In these equations, the parameters β (the infection rate) and γ (the recovery or removal rate of infectives) are constants: β controls the transition between S and I, equation (1), while γ controls the transition between I and R, equation (3). For an epidemic to occur [2, 3, 4, 5], the number of infected individuals needs to increase from its initial value I0. This happens if, at time zero, S0 > Sc = ρ = γ/β. That is, ρ represents a critical value for an epidemic to occur, and the SIR model reveals a threshold phenomenon [6].

From a dimensional point of view, assigning no units to S, I, R, and N, the parameters β and γ have units of inverse time (typically measured in days, weeks or months in epidemiological records). Quantitatively, while the interaction in the form of the product SI makes it difficult to determine the parameter β from observed epidemiological data, equation (3) shows that the inverse of the parameter γ gives a measure of the time spent by individuals in the infectious stage. Consequently, by carefully observing the development of an infectious disease, epidemiologists can estimate the parameter γ (as the inverse of the recovery or infectious period) from epidemiological records. One should be aware that neither β nor γ remains constant as the infection evolves [2, 3, 4, 5].
Moreover, the assumptions on which the model is built are no longer valid as soon as sanitary interventions are applied to control the infection.

As discussed in the epidemiological literature [2, 3, 4], a straightforward combination of the SIR model equations (1)-(3) leads to a non-linear differential equation for dR/dt, interpreted as the rate at which properly counted individuals are removed from medical units (because they have either recovered or died):

$$\frac{dR}{dt} = \gamma \left( N - R - S_0 e^{-R/\rho} \right)$$

For epidemics that are not severe, Kermack and McKendrick (1927) [2] considered R(t)/ρ < 1 and proposed that dR/dt could be approximated by:

$$\frac{dR}{dt} = \gamma \left[ N - S_0 + \left( \frac{S_0}{\rho} - 1 \right) R - \frac{S_0}{2\rho^2} R^2 \right]$$

Considering that S0 […] (6) can be written in the form [2, 4]

$$R(t) = \frac{\rho^2}{S_0} \left[ \left( \frac{S_0}{\rho} - 1 \right) + \alpha \tanh\left( \frac{\alpha \gamma t}{2} - \phi \right) \right] \tag{7}$$

where tanh(x) is the hyperbolic tangent of x, and

$$\alpha = \left[ \left( \frac{S_0}{\rho} - 1 \right)^2 + \frac{2 S_0 (N - S_0)}{\rho^2} \right]^{1/2} \tag{8}$$

$$\phi = \tanh^{-1}\left[ \frac{1}{\alpha} \left( \frac{S_0}{\rho} - 1 \right) \right] \tag{9}$$

Here tanh⁻¹(x) is the inverse of the hyperbolic tangent of x. From equation (7), we also obtain the Kermack and McKendrick (1927) approximated solution (or the KM approximation) of the SIR model [2]

$$\frac{dR}{dt} = \frac{\gamma \alpha^2 \rho^2}{2 S_0} \, \mathrm{sech}^2\left( \frac{\alpha \gamma t}{2} - \phi \right) \tag{10}$$

where sech(x) is the hyperbolic secant of x. Kermack and McKendrick were able to study the 1905-1906 Bombay plague epidemic using equation (10).

Using cumulative confirmed cases data reported by the European Centre for Disease Prevention and Control [7] for the coronavirus COVID-19 pandemic outbreak, we used computing routines to fit the data with equation (7) written in the form:

$$R(t) = C_0 + C_1 \tanh(C_2 t - C_3) \tag{11}$$

setting C0 = C1 tanh(C3) to meet the initial condition R(t = 0) = 0.

To find a numerical solution of the SIR model, equations (1)-(3), in addition to the parameters β and γ we also need to know the initial conditions S0 = S(t = 0), I0 = I(t = 0), and R0 = R(t = 0). As required by the SIR model, we set R0 = 0. As already mentioned, an estimate of γ can be obtained from epidemiological records, since its inverse (1/γ) determines the average infectious period of the disease [4]. According to the European Centre for Disease Prevention and Control [8], the infectious period of the coronavirus COVID-19 disease is "... estimated to last for 7-12 days in moderate cases and up to two weeks on average in severe cases."
Accordingly, the γ values used for computation in this comment were set to yield an infectious period in that range. For β, S0, and I0, estimates based on observations are not easy to obtain. To find reasonable starting values for them we applied a heuristic approach, which turned out to be helpful for finding a numerical solution of the full SIR model that matches the data fitted by the Kermack and McKendrick solution in equation (10). Then, by a standard trial and error approach, we were able to find suitable parameters for solving the full SIR model so that it matches the COVID-19 cases reported in this comment.

Figure 1: The KM approximation R(t) given in equation (7) (with fitting parameters shown in Table 1, according to equation (11)) and the full numerical solution of the SIR epidemiological model defined by equations (1)-(3) (with integration parameters compiled in Table 2). Both, with reasonable estimated absolute and relative rmse values, are observed to fit the reported cumulative confirmed COVID-19 cases for a number of countries. The data (available in [7]) cover the range December 31, 2019–June 29, 2020.

To give an idea of how well each fit matches the data, we use the Root Mean Square Error (rmse) and the Relative Root Mean Square Error (rmseRel), defined as follows:

$$\mathrm{rmse} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} (O_i - F_i)^2 }$$

$$\mathrm{rmseRel} = \frac{\mathrm{rmse}}{\max(O)}$$

Here O_i is the i-th observation in the considered data set O; F_i is the corresponding value obtained by the fitting method; n is the number of observations; and max(O) is the maximum value in the considered data set O. As the uncertainty in the observed values O_i is unknown [9], it is unrealistic to emphasize any further statistical measure characterizing the estimated parameters used in the analysis of the COVID-19 pandemic data set.

At this point it should be mentioned that the numerical computational work in this comment was carried out with the Python programming language and the NumPy/SciPy/Matplotlib libraries, described elsewhere [10, 11].
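The workflow described above can be sketched in Python with NumPy and SciPy. This is a minimal illustration under stated assumptions, not the author's script: the function names (`km_constants`, `tanh_form`, `sir_rhs`, `rmse`, `rmse_rel`) and all numerical values are hypothetical, and the "data" are synthetic, generated by integrating the SIR equations rather than taken from the ECDC records. The sketch assumes the textbook SIR form dS/dt = −βSI, dI/dt = βSI − γI, dR/dt = γI, the fitting form R(t) = C1[tanh(C3) + tanh(C2 t − C3)] (so that R(0) = 0), and rmseRel = rmse/max(O).

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import curve_fit


def km_constants(beta, gamma, s0, n):
    """Map SIR parameters to (C1, C2, C3) of the KM tanh solution."""
    rho = gamma / beta
    a = s0 / rho - 1.0
    alpha = np.sqrt(a**2 + 2.0 * s0 * (n - s0) / rho**2)
    phi = np.arctanh(a / alpha)
    return rho**2 * alpha / s0, 0.5 * alpha * gamma, phi


def tanh_form(t, c1, c2, c3):
    """Tanh fitting form with C0 = C1*tanh(C3), which enforces R(0) = 0."""
    return c1 * (np.tanh(c3) + np.tanh(c2 * t - c3))


def sir_rhs(t, y, beta, gamma):
    """Right-hand side of the SIR equations for y = (S, I, R)."""
    s, i, r = y
    return [-beta * s * i, beta * s * i - gamma * i, gamma * i]


def rmse(obs, fit):
    """Root mean square error between observations and fitted values."""
    obs, fit = np.asarray(obs, float), np.asarray(fit, float)
    return np.sqrt(np.mean((obs - fit) ** 2))


def rmse_rel(obs, fit):
    """rmse divided by the largest observed value."""
    return rmse(obs, fit) / np.max(obs)


# Hypothetical illustration values (NOT the entries of Table 1 or Table 2).
N, I0 = 1000.0, 1.0
S0 = N - I0
gamma = 0.1                   # 1/gamma = 10 days, inside the 7-12 day range
beta = 1.2 * gamma / S0       # chosen so that S0/rho = 1.2 > 1
t = np.linspace(0.0, 500.0, 501)

# Synthetic "observations": removed individuals R(t) from the full SIR model.
sol = solve_ivp(sir_rhs, (t[0], t[-1]), [S0, I0, 0.0], t_eval=t,
                args=(beta, gamma), rtol=1e-8, atol=1e-8)
r_sir = sol.y[2]

# Fit the tanh form, starting from the constants implied by the KM solution.
p0 = km_constants(beta, gamma, S0, N)
popt, _ = curve_fit(tanh_form, t, r_sir, p0=p0)
r_fit = tanh_form(t, *popt)

print("fitted C1, C2, C3:", popt)
print("rmse:", rmse(r_sir, r_fit), "rmseRel:", rmse_rel(r_sir, r_fit))
```

Starting the fit from the constants implied by the KM solution is only one possible choice of initial guess; with real data, starting values would have to come from a heuristic or from trial and error, as described in the text.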
The data analyzed come from the European Centre for Disease Prevention and Control [7], and the period covered at the time this note was started was (for most countries) December 31, 2019–June 29, 2020.

Table 1 compiles the parameters that best fit the data (from each studied country) to the function R(t) of equation (7), expressed in the form of equation (11). The results are shown in Figure 1. The reported rmse and relative rmse values indicate that a reasonable fit has been attained. The corresponding fit also indicates that it is unnecessary to use the logistic function (i.e. equation (8) in [1]).

Table 1: The values above were used to fit the COVID-19 confirmed cases data for each country shown in Figure 1. The fit corresponds to R(t) defined via equation (7), written in the form of equation (11).

Similarly, in Table 2 we compiled the values of the quantities required to find a numerical solution of the full SIR epidemiological model defined by equations (1)-(3) for each country whose results are given in Figure 1. Here too, the reported rmse and relative rmse values indicate that a reasonable match has been attained. The results indicate that the SIR model is a good choice for gaining a better understanding of COVID-19 data.

[…] (7) of the SIR epidemiological model. We were also able to show that the full SIR model can be solved numerically so that it matches the analyzed data. Since other, more complex, alternative approaches to the problem have been proposed [12], at this point it is hard to establish with certainty which model better describes the evolution of the coronavirus COVID-19 pandemic [9]. Consequently, given that the SIR model captures some of the behavior of the COVID-19 data, it could provide guidance for gaining better insight into the evolution of the pandemic, as the only two parameters entering the model (β and γ) are more or less well understood by epidemiologists and can be guessed from the data.
Consequently, before considering more complex models (requiring many more parameters than the SIR model), it is clear that a better qualitative understanding of the parameters β and γ, in addition to the initial conditions I0 and S0 (restricted to N = I0 + S0), is necessary to give an appropriate quantitative account of an epidemic. We are confident that the methodology applied in the development of this comment could also be extended to analyze other sets of data.

The author has no competing interest to declare.

Estimation of COVID-19 dynamics "on a back-of-envelope": Does the simplest SIR model provide quantitative parameters and predictions?
A contribution to the mathematical theory of epidemics
Mathematical Biology. I. An Introduction
Modeling Infectious Diseases in Humans and Animals
An Introduction to Mathematical Modeling of Infectious Diseases
Introduction to Phase Transitions and Critical Phenomena
Geographic distribution of COVID-19 cases worldwide
COVID-19 pandemic modeling is fraught with uncertainties
Learning SciPy for Numerical and Scientific Computing
Numerical and Scientific Computing with SciPy (Book-Video)
Analysis and forecast of COVID-19 spreading in China, Italy and France

The author is grateful to an anonymous referee who kindly provided useful comments to improve this article.