key: cord-0582183-k170fwoh
authors: Lin, Bo-Cyuan; Chen, Yen-Jia; Hung, Yi-Cheng; Chen, Chun-sheng; Wang, Han-Chun; Chern, Jann-Long
title: The Data Forecast in COVID-19 Model with Applications to US, South Korea, Brazil, India, Russia and Italy
date: 2020-11-05
journal: nan
DOI: nan
sha: 80e79ee3137309a0270dec12941ec717c210ec11
doc_id: 582183
cord_uid: k170fwoh

In this paper, we first propose the SQIARD and SIARD models to investigate the transmission of COVID-19 with quarantine, infected and asymptomatic infected individuals, and discuss the relation between the respective basic reproduction numbers $R_0$, $R_Q$ and the stability of the equilibrium points of the models. Second, after training the model parameters on the related data, we forecast the data of the US, South Korea, Brazil, India, Russia and Italy in our numerical simulations and assess the quality of the prediction of the epidemic situation in each country. Furthermore, we use US data to compare SQIARD with SIARD and display the quality of their predictions.

According to the WHO report (WHO Coronavirus (COVID-19) Dashboard, as of 6:46pm CEST, 20 May 2021), there have been 164,523,894 confirmed cases of COVID-19 globally, including 3,412,032 deaths. COVID-19 has become one of the most serious infectious diseases and has had a dramatic impact on human health, while also damaging the worldwide economy and development. In practice, a high percentage of COVID-19 patients have mild symptoms or no symptoms, so the actual number of symptomatic cases is much lower than earlier estimates. In [1], the authors estimated, for 11 European countries, that many people had been infected but that the number of detected patients was much smaller than the number of actual infections. Two major reasons cause this problem: one is limited testing capacity and the other is the high percentage of cases with mild symptoms or no symptoms. Lavezzo et al. [2] stated that, at Vò, Italy, the asymptomatic cases were 43.2% of the total. In [3], based on government estimates, the authors assumed that the number of asymptomatic infected patients is nine times higher than the number of symptomatic patients. Gudbjartsson et al. [4] also reported SARS-CoV-2 screening data from Iceland, which showed that 43% of the infected participants were asymptomatic. In [5], the percentage of asymptomatic patients increased very quickly, from 16.1% (35 asymptomatic infections/218 confirmed cases) to 50.6% (314/621), within a week. This shows that the proportion of asymptomatic infections for COVID-19 can change dramatically. The WHO has also pointed out that asymptomatic patients are not non-infectious. Therefore, the proportion of asymptomatic infections is critical in epidemic research. This paper uses mathematical models and data analysis methods to analyze the proportion of asymptomatic infections in various countries and to predict the future trend of the epidemic. In Table 1 of Section 4, we list the proportion $\alpha$ of symptomatic infections in six countries; $1 - \alpha$ is the proportion of asymptomatic people. In [2], the authors also stated that, at Vò, Italy, the proportion of asymptomatic cases was 43.2%, which is consistent with the 45% obtained by our numerical simulation for Italy in Table 1. This suggests that our model can be applied to estimate the proportion of asymptomatic people in each country.

In the study of the COVID-19 epidemic, one important topic is the estimation of the basic reproduction number $R_0$. In [6] and [7], the relation between local asymptotic stability of the model and $R_0$ was established. In [8], the authors obtained the relation between global asymptotic stability of the model and $R_0$. In [9], Chen et al. recast their model as discrete-time difference equations to find the relation between $R_0$ and the model parameters. For the relation between reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission, we refer to [10] and the references therein. In this article, we likewise rewrite the differential systems as discrete-time difference equations in order to train the model parameters for epidemic prediction.

During the earlier SARS epidemic, many researchers discussed modifications of SARS epidemic models in terms of "quarantine" and "asymptomatic" infection; see, e.g., [6], [7] and the references therein. However, because SARS and COVID-19 have different epidemic patterns, the disease patterns of asymptomatic infections are also different. Therefore, we establish two mathematical models, the SQIARD and SIARD models, to simulate the COVID-19 epidemic. The SIARD model is a simplified form of the SQIARD model. The difference between the two models is that one includes the quantity "number of people screened for the epidemic", $Q(t)$, and the other excludes it. Data for this quantity are hard to obtain, but fortunately the US government has complete data on the number of people in quarantine, which can be used to generate the model parameters. We therefore first remove $Q(t)$ and use the SIARD model to forecast the data of the US, South Korea, Brazil, India, Russia and Italy; we then use the US data, which contain $Q(t)$, to show the prediction quality of the SQIARD model.

This article is organized as follows. In Section 2, we use the SQIARD model to investigate the transmission of COVID-19 with quarantine, infected and asymptomatic infected individuals, and introduce the corresponding numerical simulation algorithm. In Section 3, we construct the SIARD model and use the FIR algorithm to generate the respective model parameters. Finally, in Section 4, we present the forecast data for the US, South Korea, Brazil, India, Russia and Italy, review the prediction quality for the epidemic situation in each country, and use the US data to compare SQIARD with SIARD.

In this section, we apply the SQIARD model to investigate the transmission of COVID-19 with quarantine, infected and asymptomatic infected individuals, and introduce the corresponding numerical simulation algorithm.

In the SQIARD model, healthy people can be infected by symptomatic patients and by asymptomatic patients. People who enter quarantine are screened, and screening yields three outcomes: symptomatic, asymptomatic, and negative. In the end, infected people gradually move to the recovered class or the death class. The model makes the following assumptions:

1. $Q(t)$ is the number of people quarantined (screened) per day. Those with positive screening results are further divided into "symptomatic patients" ($I(t)$) and "asymptomatic patients" ($A(t)$). Those with negative screening results return to $S(t)$ at a rate of $\mu_2$.

2. $A(t)$ is assumed to be less infectious than $I(t)$: $0 < \delta < 1$, where $\delta$ is the relative infectiousness of asymptomatic infectives.

3. All members of the total population in our model are treated as the same.

4. We simplify the total population to a fixed value, without births and without deaths from causes other than the epidemic.

5. There is no transmission during the daily quarantine (screening) process.

The variables are as follows: [$S$: susceptible population; $Q$: quarantine population; $I$: infective population; $A$: asymptomatic infective population; $R$: recovered population; $D$: deaths]. The model parameters are as follows: [$\beta$: the progression rate from the susceptible class to the quarantine class; $\delta$: the relative infectiousness of asymptomatic infectives, where $0 < \delta < 1$; $\mu_1$: the progression rate from the quarantine class to the infective classes; $\mu_2$: the progression rate from the quarantine class to the susceptible class; $a_i$: with $0 < a_i < 1$, the proportions of the quarantine class $Q$ progressing to the positive class $I$, the positive class $A$, or the negative class that returns to the susceptible class $S$, with $a_1 + a_2 + a_3 = 1$; $\gamma_1$ and $\gamma_2$: the recovery rates of the infective classes $I$ and $A$; $\eta_1$ and $\eta_2$: the disease death rates of the infective classes $I$ and $A$]. With these definitions, $S(t)$, $Q(t)$, $I(t)$, $A(t)$, $R(t)$ and $D(t)$ are governed by the following differential equations (1):

Then the corresponding basic reproduction number of (1) is:

Because COVID-19 data are updated daily, we rewrite (1) in the discrete forms (3) and (4):

When the disease first spreads, the number of people infected is much smaller than the total population, so the number of susceptible individuals can be assumed to equal the total population. Letting $q = a_1\mu_1 + a_2\mu_1 + a_3\mu_2$, the system further simplifies as follows:

The COVID-19 data from the WHO form a discrete time series. To carry out the prediction, we make the following assumptions, which differ from the original model.

where $\alpha = 0.6$ (taken from [11], [12]) and $\delta = 0.5$ (taken from [6]).

2. We rewrite the differential equations (1) of Section 2.1 in the discrete form (5), following the approach of [9]. In (5), the data substituted into the compartments $I$, $A$, $R$, $D$ are the data reported by the WHO (that is, $S$ also contains the infectious, asymptomatic infectious, recovered and dead individuals who were not confirmed and therefore do not appear in the data).

3. The component $\beta(t)S(t)(I(t)+\delta A(t))$ of the equation for $S(t+1)$ is driven by $I(b)$, $A(b)$ with $b < t$; that is, the susceptible population had already been in contact with the infectious population before time $t$.

Based on the above assumptions, we implement the prediction algorithm and present our prediction results in Section 4.

If asymptomatic infected people die with very low probability, we may take $\eta_2 = 0$. In this case, using the relation $a_1 + a_2 + a_3 = 1$, the quantities $\beta(t)$, $\gamma_1(t)$, $\gamma_2(t)$, $\eta_1(t)$, $a_3(t)$ and the time-dependent basic reproduction number of SQIARD can be expressed as in (6) and (7):

We use Finite Impulse Response (FIR) filters (8) to predict $\hat\beta(t)$, $\hat\gamma_1(t)$, $\hat\gamma_2(t)$, $\hat\eta_1(t)$, $\hat a_3(t)$. We also note that $\hat a_1(t)$, $\hat a_2(t)$ can be obtained from $\hat a_3(t)$.

where $a_{j_1}$, $j_1 = 0, 1, \ldots, J_1$; …; $e_{j_5}$, $j_5 = 0, 1, \ldots, J_5$ are the coefficients (weights) of the five FIR filters given above. We adopt the following ridge regularization method (9), which is often used in machine learning, for each FIR model, and use Theorem 2.1, implemented by Algorithm 1, to optimize the respective weights.

where $a = (a_0, a_1, \ldots, a_{J_1})$, …, $e = (e_0, e_1, \ldots, e_{J_5})$.

Before presenting our numerical algorithm, we need the following theorem.

Hence at the minimal point $x^0$ we have:

$\ldots \cdot f(t - j) + 2 m x_j, \quad j = 1, 2, \ldots, J.$

Thus we easily obtain the following results:

This completes the proof.

Remark 2.1 We note that, for a given regression parameter $m > 0$, the cost function $F$ is a positive polynomial of degree 2 with respect to each $x_j$, and $F(x; m) \to \infty$ as $x_j \to \pm\infty$, where $x = (x_0, x_1, \cdots, x_J)$. It follows that $F(x^0; m) = \min$
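To make the ridge-regularized FIR training concrete, the following Python sketch fits the weights of a single filter by solving the normal equations of the penalized least-squares problem. It is only an illustration under stated assumptions: the predictor is taken to be $\hat f(t) = x_0 + \sum_{j=1}^{J} x_j f(t-j)$, every weight (including the intercept $x_0$) is penalized by $m x_j^2$, and the names `fit_fir_ridge` and `predict_next` are hypothetical rather than taken from the paper's Algorithm 1.

```python
import numpy as np

def fit_fir_ridge(f, J, m):
    """Return x = (x_0, ..., x_J) minimizing
    sum_t (f(t) - x_0 - sum_j x_j f(t-j))^2 + m * sum_j x_j^2.
    Assumption: the intercept x_0 is penalized like the other weights."""
    f = np.asarray(f, dtype=float)
    if len(f) <= J:
        raise ValueError("need more than J samples")
    # Row for time t: [1, f(t-1), ..., f(t-J)]
    X = np.array([np.concatenate(([1.0], f[t - J:t][::-1]))
                  for t in range(J, len(f))])
    y = f[J:]
    # Setting the gradient to zero gives (X^T X + m I) x = X^T y
    return np.linalg.solve(X.T @ X + m * np.eye(J + 1), X.T @ y)

def predict_next(f, x):
    """One-step-ahead prediction from the last J samples of f."""
    J = len(x) - 1
    f = np.asarray(f, dtype=float)
    lags = f[-J:][::-1] if J > 0 else np.array([])
    return x[0] + lags @ x[1:]
```

For a series that follows an exact first-order recursion, a very small $m$ recovers the recursion coefficients; larger values of $m$ shrink the weights toward zero.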

To find appropriate orders for training each FIR model ($\hat\beta(t)$, $\hat\gamma_1(t)$, $\hat\gamma_2(t)$, $\hat\eta_1(t)$, $\hat a_3(t)$), we divide the parameter data $\{\beta(t), \gamma_1(t), \gamma_2(t), \eta_1(t), a_3(t),\ 0 \le t \le T - 2\}$ into two parts, the training set (size: $T$) and the validation set (size: …). In Figure 2, we illustrate the forecast produced by the three algorithms. While running Algorithm 2 to find the best-fitting order, we also compute the reference effective intervals for 5%, 10% and 20%, which are the intervals of days in the validation set satisfying $\left|\frac{\text{prediction} - \text{real data}}{\text{real data}}\right| < 5\%, 10\%, 20\%$, respectively.
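The reference effective interval can be illustrated with a short Python sketch. It rests on two assumptions that the text does not settle: the relative error is taken in absolute value, and the interval is counted as the number of consecutive days from the start of the validation period before the tolerance is first exceeded. The function name `effective_interval` is hypothetical.

```python
def effective_interval(predicted, actual, tol):
    """Count leading days whose relative error |p - a| / |a| stays below tol.
    Assumption: counting stops at the first day that violates the tolerance."""
    days = 0
    for p, a in zip(predicted, actual):
        if a == 0 or abs(p - a) / abs(a) >= tol:
            break
        days += 1
    return days

# Illustrative numbers only, not data from the paper
actual = [100, 100, 100, 100]
predicted = [101, 104, 108, 130]
intervals = {tol: effective_interval(predicted, actual, tol)
             for tol in (0.05, 0.10, 0.20)}
```

If the intended definition instead counts every day within tolerance, whether consecutive or not, the `break` would be replaced by `continue`.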

Before running the tracking Algorithm 3, we have already obtained the respective orders of $\hat\beta$, $\hat\gamma_1$, $\hat\gamma_2$, $\hat\eta_1$, $\hat a_3$ from Algorithm 2, together with the reference effective intervals for the forecast. To increase the variety of the training data, we append the predictions $\hat\beta(t)$, $\hat\gamma_1(t)$, $\hat\gamma_2(t)$, $\hat\eta_1(t)$, $\hat a_3(t)$, $t \ge T - 1$, to the training set as follows.

Let $f(t)$, $0 \le t \le T - 2$, be the training set, let $\hat f(t)$, $t \ge T - 1$, be the prediction produced by the trained model, and let $P$ be the stopping criterion of the forecast. Then the Appended Training Data are defined as:

First, we apply Algorithm 1 to train the model (8) to obtain $\hat\beta(t)$, $\hat\gamma_1(t)$, $\hat\gamma_2(t)$, $\hat\eta_1(t)$ and $\hat a_3(t)$. Second, using the following, we can estimate $\hat Q(t)$, $\hat I(t)$, $\hat A(t)$, $\hat R(t)$, $\hat D(t)$ for $t \ge T - 1$:

Calculate $\{\beta(t), \gamma_1(t), \gamma_2(t), \eta_1(t), a_3(t),\ 0 \le t \le T - 2\}$ by (6) and append to $B$, $\Gamma_1$, $\Gamma_2$, $H$, $A$, respectively.
Train models with (9); $J_i$, $1 \le i \le 5$; $m_i$, $1 \le i \le 5$ and $B$, $\Gamma_1$, $\Gamma_2$, $H$, $A$, respectively, by Algorithm 1.
Estimate $\hat\beta(T - 1)$, $\hat\gamma_1(T - 1)$, $\hat\gamma_2(T - 1)$, $\hat\eta_1(T - 1)$, $\hat a_3(T - 1)$ by (8), and append to $B$, $\Gamma_1$, $\Gamma_2$, $H$, $A$, respectively.
Estimate $\hat Q(T)$, $\hat I(T)$, $\hat A(T)$, $\hat R(T)$, $\hat D(T)$ by (10).
while $T \le t \le T + P$ do
    Train models with (9); $J_i$, $1 \le i \le 5$; $m_i$, $1 \le i \le 5$ and $B$, $\Gamma_1$, $\Gamma_2$, $H$, $A$, respectively, by Algorithm 1.
    Estimate $\hat\beta(t)$, $\hat\gamma_1(t)$, $\hat\gamma_2(t)$, $\hat\eta_1(t)$, $\hat a_3(t)$ by (8), and append to $B$, $\Gamma_1$, $\Gamma_2$, $H$, $A$, respectively.
    Estimate $\hat Q(t + 1)$, $\hat I(t + 1)$, $\hat A(t + 1)$, $\hat R(t + 1)$, $\hat D(t + 1)$ by (10).
end while
return Appended training data set: $B$, $\Gamma_1$, $\Gamma_2$, $H$, $A$; Predictions of $Q$, $I$, $A$, $R$, $D$:
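The retrain, predict, append loop of the tracking procedure can be sketched in Python for a single parameter series. This is a simplified illustration, not the paper's Algorithm 3: it tracks one series rather than five, omits the compartment update (10), and assumes a ridge-regularized FIR predictor with intercept and order $J \ge 1$. The names `ridge_fir` and `track` are hypothetical.

```python
import numpy as np

def ridge_fir(series, J, m):
    """Ridge-regularized FIR weights (intercept first); assumed model form."""
    s = np.asarray(series, dtype=float)
    X = np.array([np.concatenate(([1.0], s[t - J:t][::-1]))
                  for t in range(J, len(s))])
    return np.linalg.solve(X.T @ X + m * np.eye(J + 1), X.T @ s[J:])

def track(series, J, m, horizon):
    """Retrain on the appended data, predict one step, append; repeat."""
    appended = list(series)
    for _ in range(horizon):
        w = ridge_fir(appended, J, m)            # retrain on appended data
        lags = np.array(appended[-J:][::-1])     # most recent J values
        appended.append(float(w[0] + lags @ w[1:]))
    return appended
```

Because each prediction is fed back into the training set, errors can compound over the horizon, which is one reason to report how many days the forecast remains within a given relative error.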

In the previous section, we established and discussed the SQIARD model. Most countries do not provide daily data on the quarantined population, so in this section we construct a new model in order to forecast for those countries.

To verify the predictive performance of the SIARD model for the epidemic, we take the following two steps:

1. Remove the quantity $Q(t)$ and simplify the SQIARD infectious disease model, leaving the other assumptions unchanged. Use the same training method to train the SIARD model and observe the quality of its predictions.

2. Use data from countries that have "data on daily quarantine population" to compare the predictions of the two models for the epidemic.

The variables are as follows: [$S$: susceptible population; $I$: infective population; $A$: asymptomatic infective population; $R$: recovered population; $D$: deaths]. The model parameters are as follows: [$\beta$: the progression rate from the susceptible class to the infective classes; $\delta$: the reduction in infectiousness of asymptomatic infectives, where $0 < \delta < 1$; $\alpha$: the fraction of newly infected susceptibles who progress to $I$ rather than $A$, where $0 < \alpha < 1$; $\gamma_1$ and $\gamma_2$: the recovery rates of the infective classes $I$ and $A$; $\eta_1$ and $\eta_2$: the disease death rates of the infective classes $I$ and $A$].

Note that $R_0$ is simply the basic reproduction number of this system. To further examine the stability condition of such a system, we let

Because the COVID-19 data are reported daily, we rewrite the differential equations as discrete-time difference equations.

When the disease first spreads, the number of people infected is much smaller than the total population, so the number of susceptible individuals is approximately equal to the total population. The above equations can then be simplified as follows:
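Since the simplified difference equations are not reproduced in this text, the following Python sketch shows one possible form of a single daily SIARD update under the early-outbreak approximation $S \approx N$. The update rule is an assumption, chosen to be consistent with the parameter definitions given above ($\alpha$ splits new infections between $I$ and $A$, $\delta$ scales the infectiousness of $A$); it should not be read as the paper's exact system. The function name `siard_step` is hypothetical.

```python
def siard_step(I, A, R, D, beta, alpha, delta, g1, g2, e1, e2=0.0):
    """One daily update of an assumed discrete SIARD system with S/N ~ 1.
    g1, g2 are recovery rates and e1, e2 death rates of classes I and A."""
    new_inf = beta * (I + delta * A)              # new infections today
    I_next = I + alpha * new_inf - (g1 + e1) * I
    A_next = A + (1.0 - alpha) * new_inf - (g2 + e2) * A
    R_next = R + g1 * I + g2 * A
    D_next = D + e1 * I + e2 * A
    return I_next, A_next, R_next, D_next

# Illustrative parameter values only, not fitted values from the paper
state = siard_step(100.0, 50.0, 0.0, 0.0,
                   beta=0.2, alpha=0.6, delta=0.5, g1=0.1, g2=0.1, e1=0.01)
```

In this form, the daily increase of $I + A + R + D$ equals the number of new infections, which gives a simple consistency check on any implementation.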

For the SIARD model, $\beta(t)$, $\gamma_1(t)$, $\gamma_2(t)$, $\eta_1(t)$ can be derived as in (14) from the discrete form of the SIARD differential equations.
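As a hedged illustration of how time-dependent rates can be read off daily data, the Python sketch below recovers $\beta(t)$ and $\eta_1(t)$ from series of $I$, $A$, $R$, $D$. It assumes a discrete SIARD system with $S \approx N$ and $\eta_2 = 0$, in which new infections equal the daily increase of $I + A + R + D$ and all deaths come from class $I$. These formulas are assumptions for illustration and are not the paper's equation (14); the split of recoveries into $\gamma_1(t)$ and $\gamma_2(t)$ is omitted because it needs information not available here. The function name `estimate_beta_eta1` is hypothetical.

```python
def estimate_beta_eta1(I, A, R, D, delta):
    """Estimate beta(t) and eta_1(t) for t = 0, ..., len(I) - 2.
    Assumptions: S/N ~ 1, eta_2 = 0, and I(t) > 0 for every t used."""
    betas, etas = [], []
    for t in range(len(I) - 1):
        # New infections = total daily increase over all four compartments
        new_inf = ((I[t + 1] - I[t]) + (A[t + 1] - A[t])
                   + (R[t + 1] - R[t]) + (D[t + 1] - D[t]))
        betas.append(new_inf / (I[t] + delta * A[t]))
        etas.append((D[t + 1] - D[t]) / I[t])
    return betas, etas
```

The resulting series can then be treated as training data for a time-series predictor of the rates.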

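Similarly, to estimate $\hat I(t)$, $\hat A(t)$, $\hat R(t)$, $\hat D(t)$ with SIARD for $t > T$, we use Algorithm 1 to train the models (8) without $\hat a_3$ and obtain $\hat\beta(t)$, $\hat\gamma_1(t)$, $\hat\gamma_2(t)$, $\hat\eta_1(t)$. We then use them to compute $\hat I(t)$, $\hat A(t)$, $\hat R(t)$, $\hat D(t)$ as in (15), and also append the predictions $\hat\beta(t)$, $\hat\gamma_1(t)$, $\hat\gamma_2(t)$, $\hat\eta_1(t)$, $t \ge T - 1$, to $B$, $\Gamma_1$, $\Gamma_2$, $H$, respectively.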

where $t \ge T - 1$. Also, before running the tracking algorithm, we must first determine the orders by Algorithm 2; the Tracking SIARD Algorithm is then obtained from Algorithm 3 by removing $Q(t)$, $\hat Q(t)$, $a_3(t)$, $\hat a_3(t)$. Section 4 presents our forecast results.

In this section, we apply the SIARD model to the US, Brazil, South Korea, India, Russia and Italy, and the SQIARD model to the US, using the data sets [13], [14] and [15], and then show the results for each country. Table 1 lists the parameters used for the forecasts with SQIARD and SIARD.

For the SQIARD model, we consider the case $\mu_1 = 1$ and $\mu_2 = 0.14$, where $\mu_1$ and $\mu_2$ are the progression rates out of the quarantine class. In Figure 4, we first use 100 data points to train the model and 20 validation data points to find the best-fitting orders of the respective FIR models; we then obtain, from the SQIARD model, reference effective intervals of 9 days within 5% relative error, 18 days within 10% relative error and 20 days within 20% relative error. In Figure 5, we use the trained model to predict 20 days, compare the predictions with the validation data, and count the number of days that fall within each of the three relative error levels, 5%, 10% and 20%. We take these day counts as our reference effective intervals; the intervals of 9, 18 and 20 days in Figure 4 were obtained in this way. Finally, we expand our training data to 120 data points and predict 20 data points into the future.

On the other hand, the SIARD model gives a reference effective interval of 11 days within 5% relative error and 20 days within 10% relative error. We can also see that the growth of $I$ and $A$ is slowing down, so the value of $R_0$ is getting smaller. In Brazil (Fig. 6b), we obtain the reference effective intervals: 1 day within 5% relative error, 10 days within 10% relative error and 14 days within 20% relative error.

In South Korea (Fig. 7a), we obtain the reference effective intervals: 1 day within 5% relative error, 2 days within 10% relative error and 7 days within 20% relative error. We can see that $I$ and $A$ rise in the middle part of the period, as does $R_0$. Conversely, $I$ and $A$ decrease in the last part, where $R_0$ becomes lower and is mostly below 1, so the epidemic in South Korea may be under control.

In India (Fig. 7b), we obtain the reference effective intervals: 5 days within 5% relative error and 20 days within 10% relative error. In Russia (Fig. 8a), we obtain the reference effective intervals: 15 days within 5% relative error and 20 days within 10% relative error, which is the largest number of days within 5% relative error among the six countries. In other words, the trained model for Russia has captured the trend of the data.

In Italy (Fig. 8b), we obtain the reference effective intervals: 4 days within 5% relative error and 20 days within 10% relative error. The trends of $I$ and $A$ are decreasing, so COVID-19 may be under control in Italy. Since the US quarantine data were discontinued on Dec. 13, 2020, we compare with real US data from Nov. 23 to Dec. 13, 2020. The forecast results for SQIARD and SIARD are shown in Figure 9. For the US, quarantine data are available, so we can apply the SQIARD model to predict future data. Since this model includes extra parameters, in particular the progression rate from quarantine to the infected or asymptomatic infected classes, the data can be predicted more precisely than with the SIARD model. In Fig. 9, the data predicted by the SQIARD model are closer to the real data.

When we use the SIARD model for the same forecast, the reference effective interval is 6 days within 5% relative error (see Fig. 9(b)). When we use the SQIARD model, the reference effective interval is 9 days within 5% relative error (see Fig. 9(a)), so the SQIARD model performs better than the SIARD model during this period.

Figure 10 shows the SIARD forecast results from Apr. 22 to May 10, 2021 for Brazil, South Korea, India, Russia and Italy.

Our model estimates the proportion $\alpha$ of people with symptoms in six countries. For example, in Table 1, the proportion of symptomatic infections in Italy is about 55%, which means that the proportion of asymptomatic infections in the country is about 45%. This is in line with the 43.2% of asymptomatic infections obtained by the authors of [2] at Vò, Italy, and suggests that our model is able to estimate the proportion of asymptomatic infections. From the figures in Section 4, the trends of symptomatic and asymptomatic infections are clearly related to $R_0$: when $R_0$ increases, $I$ and $A$ also increase, and when $R_0$ decreases, they decrease, so our prediction results accord with the definition of $R_0$. Hence, our data prediction also shows that $R_0$ can be viewed as an important indicator of whether the COVID-19 epidemic will break out.

Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe

Suppression of a SARS-CoV-2 outbreak in the Italian municipality of Vo'

Avila-Vales, Eric; An SEIARD epidemic model for COVID-19 in Mexico: Mathematical analysis and state-level forecast

Spread of SARS-CoV-2 in the Icelandic population

Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship

On the Role of Asymptomatic Infection in Transmission Dynamics of Infectious Diseases

Modeling International Measures and Severity-Dependent Public Response During Severe Acute Respiratory Syndrome Outbreak

A mathematical model for COVID-19 transmission dynamics with a case study of India

A Time-dependent SIR model for COVID-19 with Undetectable Infected Persons

Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission

Spread of SARS-CoV-2 in the Icelandic Population