key: cord-1039437-hmem8se3
authors: Nesteruk, Igor
title: Long-term predictions for COVID-19 pandemic dynamics in Ukraine, Austria and Italy
date: 2020-04-11
journal: nan
DOI: 10.1101/2020.04.08.20058123
sha: ce3fa97e6a6f00bae587f6225273f189392d2cad
doc_id: 1039437
cord_uid: hmem8se3

The SIR (susceptible-infected-removed) model, a statistical approach to parameter identification and the official WHO daily data on the cumulative number of confirmed cases were used to estimate the dynamics of the coronavirus pandemic in Ukraine, Italy and Austria. The volume of the data sets and the influence of the information about the initial stages of the epidemics were discussed in order to obtain reliable long-term predictions. The final sizes and durations of the pandemic in these countries are estimated.

Here we consider the development of the epidemic outbreaks in Italy, Ukraine and Austria caused by the coronavirus COVID-19 (2019-nCoV) (see, e.g., [1]). Some estimations of the epidemic dynamics in these countries can be found in [2] [3] [4] [5] [6] [7]. In particular, the final size of the epidemic in these tables. The data sets presented in Table 1 were used only for comparison with the corresponding SIR curves.

The SIR model for an infectious disease [7] [8] [9] [10] [11] relates the number of susceptible persons S (persons who are sensitive to the pathogen and not protected), the number of infected persons I (persons who are sick and spread the infection; this should not be confused with the number of persons who are still ill, the so-called active cases) and the number of removed persons R (persons who no longer spread the infection; this number is the sum of isolated, recovered, dead, and infected people who left the region); the rate parameters α and ρ are constants.
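For reference, assuming the classical Kermack-McKendrick form of the model (the symbols α for the infection rate and ρ for the removal rate are assumptions of this sketch), equations (1-3) can be written as:

```latex
\begin{align}
\frac{dS}{dt} &= -\alpha S I, \tag{1}\\
\frac{dI}{dt} &= \alpha S I - \rho I, \tag{2}\\
\frac{dR}{dt} &= \rho I. \tag{3}
\end{align}
```

Summing the three equations gives d(S+I+R)/dt = 0, so in this form the total S+I+R stays constant in time.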

Cumulative numbers of confirmed cases:

Date (2020)   t_j   Italy    Austria   Ukraine
March 5       12    3858     47        1
March 6       13    4636     66        1
March 7       14    5883     104       1
March 8       15    7375     112       1
March 9       16    9172     131       1
March 10      17    10149    182       1
March 11      18    12462    302       1
March 12      19    15113    361       3
March 13      20    17660    504       3
March 14      21    21157    800       3
March 15      22    24747    959       3
March 16      23    27980    1132      7
March 17      24    31506    1332      14
March 18      25    35713    1646      16
March 19      26    41035    1843      16
March 20      27    47021    2649      26
March 21      28    53578    3024      47
March 22      29    59138    3631      47
March 23      30    63927    4486      84
March 24      31    69176    5282      113
March 25      32    74386    5888      156
March 26      33    80539    7029      218
March 27      34    86498    7697      311
March 28      35    92472    8291      418
March 29      36    97689    8813      480
March 30      37    101739   9618      549
March 31      38    105792   10182     669
April 1       39    110574   10711     804
April 2       40    115242   11129     987
April 3       41    119827   11525     1096
April 4       42    124632   11766     1251
April 5       43    128948   11983     1319
April 6       44    132547   12297     1462

To determine the initial conditions for the set of equations (1-3), let us suppose that at the moment of the epidemic outbreak t_0, [10, 11]:
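A sketch of the initial conditions and of the relations that follow from equations (1-3), assuming the classical SIR form with infection rate α and removal rate ρ, one infected person at the moment t_0 and a total population N. The numbering (4)-(6) and the notation ν = ρ/α are assumptions chosen to agree with the references to (5) and (6) in the text:

```latex
\begin{align}
I(t_0) &= 1, \quad R(t_0) = 0, \quad S(t_0) = N - 1, \tag{4}\\
\frac{dI}{dS} &= \frac{\nu}{S} - 1, \qquad \nu = \frac{\rho}{\alpha},\\
I &= \nu \ln S - S + N - \nu \ln (N-1),\\
F(V, N, \nu) &= \alpha (t - t_0), \tag{5}\\
F(V, N, \nu) &= \int_{N-V}^{N-1} \frac{dU}{U\left[N - U + \nu \ln U - \nu \ln (N-1)\right]}. \tag{6}
\end{align}
```

Here V = I + R = N - S is the number of victims. The second line is the ratio of (2) to (1), the third is its integral with the constant fixed by (4), and (6) follows from substituting this I(S) into (1) and integrating from S = N - V to S = N - 1. Under these assumptions, relation (5) is a straight line in t with slope α and intercept -α t_0.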

medRxiv preprint, doi: https://doi.org/10.1101/2020.04.08.20058123. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.

Thus, for every set of parameters N, ν, α, t_0 and a fixed value of V, the integral (6) can be calculated and the corresponding moment of time can be determined from (5). Then the functions I(t) and R(t) can be easily calculated with the use of formulas, [11, 12].
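A minimal Python sketch of this calculation, assuming the classical SIR relations I = ν ln S - S + N - ν ln(N-1), R = N - S - I and S = N - V, with ν = ρ/α. These formulas, all function names and the parameter values in the usage note are assumptions of the sketch, and the integral is approximated with Simpson's rule, which the source does not prescribe:

```python
import math

def infected(S, N, nu):
    """I as a function of S (assumed first integral of the SIR system)."""
    return nu * math.log(S) - S + N - nu * math.log(N - 1)

def removed(S, N, nu):
    """R = N - S - I, because S + I + R = N is conserved."""
    return N - S - infected(S, N, nu)

def F_integral(V, N, nu, steps=2000):
    """Assumed integral (6) from N - V to N - 1, composite Simpson's rule.

    `steps` must be even; V must stay below the final size so that I > 0.
    """
    a, b = N - V, N - 1
    if b <= a:
        return 0.0
    f = lambda U: 1.0 / (U * infected(U, N, nu))
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def time_of(V, N, nu, alpha, t0):
    """Moment of time at which the number of victims reaches V, from (5)."""
    return t0 + F_integral(V, N, nu) / alpha
```

For example, with the purely illustrative values N = 1000, ν = 300, α = 0.001 and t_0 = 0, `time_of(500, 1000, 300, 0.001, 0.0)` gives the moment at which half of the population has been affected, and `infected(500, 1000, 300)` and `removed(500, 1000, 300)` give the corresponding I and R.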

Function I has a maximum at S = ν and tends to zero at infinity, see [8, 9]. In comparison, the number of susceptible persons at infinity S_∞ > 0, and can be calculated from the non-linear equation, [11, 12]:

The final number of victims (final accumulated number of cases) can be calculated from:

To estimate the duration of an epidemic outbreak, we can use the condition

which means that at t > t_final fewer than one person still spreads the infection.
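The final size can be sketched numerically. Assuming the classical SIR relations, the condition I = 0 at infinity leads to S_∞ = (N - 1) exp((S_∞ - N)/ν) and V_∞ = N - S_∞. Both equations, the notation ν = ρ/α, the function names and the illustrative values are assumptions of this sketch. The equation is solved by fixed-point iteration started from zero, which converges to the smaller root because the iteration map has slope S_∞/ν < 1 there:

```python
import math

def final_susceptible(N, nu, tol=1e-12, max_iter=100000):
    """Solve S = (N - 1) * exp((S - N) / nu) by fixed-point iteration from S = 0."""
    S = 0.0
    for _ in range(max_iter):
        S_next = (N - 1) * math.exp((S - N) / nu)
        if abs(S_next - S) <= tol * max(1.0, S_next):
            return S_next
        S = S_next
    raise RuntimeError("fixed-point iteration did not converge")

def final_size(N, nu):
    """Final accumulated number of cases, V_inf = N - S_inf."""
    return N - final_susceptible(N, nu)

if __name__ == "__main__":
    # Hypothetical parameters, not values identified in the paper.
    N, nu = 1000.0, 300.0
    print("S_inf =", final_susceptible(N, nu))
    print("V_inf =", final_size(N, nu))
```

The duration condition is not included, since it would additionally require the time integral and the exact form of the condition on t_final.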

In the case of a new epidemic, the values of these four independent parameters are unknown and must be identified with the use of limited data sets. A statistical approach was developed in [11] and used in [5, 11, 12, 15] to estimate the values of the unknown parameters. The registered points for the number of victims V_j corresponding to the moments of time t_j can be used in order to calculate

for every fixed pair of values N and ν with the use of (6), and then to check how the registered points fit the straight line (5). For this purpose linear regression can be used, e.g., [13], and the optimal straight line, minimizing the sum of squared distances between registered and theoretical points, can be defined. Thus we can find the optimal values of α and t_0 and calculate the correlation coefficient r.
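The regression step can be sketched in Python as follows. The sketch assumes that relation (5) has the form F = α(t - t_0), so that the slope of the fitted line is α and the intercept is -α t_0. The function names and the synthetic data are hypothetical, and ordinary least squares is used as the fitting method:

```python
import math

def fit_line(t, F):
    """Ordinary least-squares line F = slope * t + intercept and correlation r."""
    n = len(t)
    t_mean = sum(t) / n
    F_mean = sum(F) / n
    s_tt = sum((x - t_mean) ** 2 for x in t)
    s_FF = sum((y - F_mean) ** 2 for y in F)
    s_tF = sum((x - t_mean) * (y - F_mean) for x, y in zip(t, F))
    slope = s_tF / s_tt
    intercept = F_mean - slope * t_mean
    r = s_tF / math.sqrt(s_tt * s_FF)
    return slope, intercept, r

def identify_alpha_t0(t, F):
    """Recover alpha and t0 from the fitted line, assuming F = alpha * (t - t0)."""
    slope, intercept, r = fit_line(t, F)
    alpha = slope
    t0 = -intercept / slope
    return alpha, t0, r

if __name__ == "__main__":
    # Synthetic check: points that lie exactly on F = 0.25 * (t - 3).
    t = list(range(12, 21))
    F = [0.25 * (x - 3.0) for x in t]
    print(identify_alpha_t0(t, F))
```

In practice the values F_j would come from evaluating the integral (6) at the registered V_j for a trial pair N and ν, and the fit would be repeated for different pairs.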


Then the F-test may be applied to check the null hypothesis that the proposed linear relationship (5) fits the data set. The experimental value of the Fisher function can be calculated with the use of the formula:

where n is the number of observations and m = 2 is the number of parameters in the regression equation, [13]. The corresponding experimental value F has to be compared with the critical value F_C(1, n-2).
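A minimal Python sketch of this test statistic, assuming the standard expression for a linear regression, F = r^2 (n - m) / ((1 - r^2)(m - 1)). This is a textbook form and is an assumption here; the function name and the sample numbers are hypothetical:

```python
def fisher_statistic(r, n, m=2):
    """Experimental F value from the correlation coefficient r.

    n is the number of observations, m the number of regression parameters.
    """
    if n <= m:
        raise ValueError("need more observations than parameters")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * r * (n - m) / ((1.0 - r * r) * (m - 1))

if __name__ == "__main__":
    # Hypothetical example: r = 0.9 from n = 12 observations.
    print(fisher_statistic(0.9, 12))
```

The result would then be compared with a tabulated critical value of the F-distribution with m - 1 and n - m degrees of freedom at the chosen significance level.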

The first preliminary prediction for Italy was published in [5] on March 27, 2020. Its results are presented in the first column of Table 3. ... Table 3). These moments correspond to May 25-31, 2020.

The average time of spreading infection 1/ρ could be estimated as 0.7-0.75 days (according to the last two predictions, see Table 3). By comparison, in South Korea it was approximately 4.3 hours, [16].

Numbers of infected I (green), removed R (black) and the number of victims V=I+R (blue line).

Numbers of infected I (green), removed R (black) and the number of victims V=I+R (blue line); "circles" show the cases taken for calculations; "triangles" correspond to the cases during the initial stage of the epidemic; "star" is the last data point, used only for a verification of the prediction.

For Austria the results of calculations are shown in Table 4. Fig. 2 represents the results for prediction No. 2, corresponding to the highest value of F / F_C(1, n-2). It can be seen that the first cases of COVID-19 infection were probably identified in time and sick people were isolated. This may be a reason for the much lower saturation level of the epidemic in Austria (approximately 12,000-13,000 according to the last predictions 2 and 3). New cases could stop appearing at the moments of time 62-65 (see the last two values in the last row of Table 4). These moments correspond to April 24-27, 2020. The average time of spreading infection 1/ρ could be estimated as 0.44-0.47 days and is much lower than in Italy, but higher than in South Korea, [16].
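The correspondence between moments of time and calendar dates can be sketched in Python. The only assumption is the reference point taken from the last row of the case data, where the moment of time 44 is read as April 6, 2020; the function and constant names are hypothetical:

```python
from datetime import date, timedelta

# Assumption: moment of time t = 44 corresponds to April 6, 2020.
REFERENCE_T = 44
REFERENCE_DATE = date(2020, 4, 6)

def moment_to_date(t):
    """Convert an integer moment of time t (in days) to a calendar date."""
    return REFERENCE_DATE + timedelta(days=t - REFERENCE_T)

if __name__ == "__main__":
    for t in (62, 65):
        print(t, moment_to_date(t).isoformat())
```

With this reference point the moments 62 and 65 map to April 24 and April 27, 2020, in agreement with the dates quoted above.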

For Ukraine the results of calculations are shown in Table 5. ... Table 5). These moments correspond to April 19-23, 2020. The average time of spreading infection 1/ρ could be estimated as 0.36-0.4 days.

Numbers of infected I (green), removed R (black) and the number of victims V=I+R (blue line); "circles" show the cases taken for calculations; "triangles" correspond to the cases during the initial stage of the epidemic; "star" is the last data point, used only for a verification of the prediction.

The accuracy of any mathematical model is limited, and the SIR model used here is no exception. Real processes are much more complicated. In particular, all the parameters in the SIR model are supposed to be constant. If the quarantine measures and the speed of isolation change, or newly infected persons enter the country, the accuracy of the prediction decreases. The accuracy of predictions increases with the number of observations. On the other hand, the need for forecasts is reduced once an epidemic has stabilized.

The SIR (susceptible-infected-removed) model and the statistical approach to parameter identification can provide some reliable estimations for epidemic outbreaks. The accuracy of long-term predictions is limited by uncertain information, especially at the beginning of an epidemic. Sufficiently long observations may eliminate the influence of the initial stage data and increase the accuracy of predictions. Even with a limited amount of data, the SIR model can be used to estimate the final size of an epidemic and its duration. The further course of the COVID-19 pandemic in Ukraine, Austria and Italy will show the real accuracy of the proposed method.

Coronavirus disease (COVID-2019) situation reports

Coronavirus epidemic outbreak in Europe. Comparison with the dynamics in mainland China

Comparison of the coronavirus epidemic dynamics in Italy and mainland China

Comparison of the coronavirus pandemic dynamics in Europe, USA and South Korea

Stabilization of the coronavirus pandemic in Italy and global prospects

Coronavirus pandemic dynamics in March

Comparison of the coronavirus pandemic dynamics in Ukraine and neighboring countries

A contribution to the mathematical theory of epidemics

Mathematical biology

Comparison of mathematical models for the dynamics of the Chernivtsi children disease

Statistics-based predictions of coronavirus epidemic spreading in mainland China

Applied Regression Analysis

Maximal speed of underwater locomotion

Estimations of the coronavirus epidemic dynamics in South Korea with the use of SIR model

I would like to express my sincere thanks to Gerhard Demelmair and Ihor Kudybyn for their help in collecting and processing data.