key: cord-351580-129608e4 authors: Shen, J. title: A Recursive Bifurcation Model for Predicting the Peak of COVID-19 Virus Spread in United States and Germany date: 2020-04-14 journal: nan DOI: 10.1101/2020.04.09.20059329 sha: doc_id: 351580 cord_uid: 129608e4
Predicting the peak time of COVID-19 virus spread is crucial to decision making on lockdown or closure of cities and states. In this paper we design a recursive bifurcation model for analyzing COVID-19 virus spread in different countries. The bifurcation facilitates a recursive processing of the infected population through linear least-squares fitting. In addition, a nonlinear least-squares fitting is utilized to predict the future values of infected populations. Numerical results on the data from three countries (South Korea, United States and Germany) indicate the effectiveness of our approach.
A model with time delays was also developed for studying the period of incubation and recovery [20, 21]. Although there have been many recent studies of the COVID-19 virus spread, an accurate model to pinpoint the peak time of the virus spread remains elusive. Such a model is crucial to decision making on strategic plans that balance reducing loss of life against avoiding an economic crisis due to lockdown.
The rest of this paper is organized as follows. In Section 3, a recursive bifurcation model is introduced to model the COVID-19 spread. A bifurcation analysis is given in Section 4 on infected data from South Korea. Section 5 describes the prediction of COVID-19 virus spread based on our model, followed by some concluding remarks in Section 6.
In this paper, we focus on the number of infected population, which is an important metric to measure the extent of the COVID-19 spread in different countries.
Although the infected population in most countries follows the pattern of an exponential or sigmoid function, the logarithm of the infected population may provide more information, as shown in Fig. 1.
Figure 1: The number of infected population in South Korea as of April 5, 2020. (a) Number of infected population; (b) logarithm of infected population.
The countries that exhibit a bifurcation pattern include South Korea, United States, France, Canada, Germany, Australia, Malaysia, and Ecuador. By utilizing the bifurcation, we can find the intrinsic parameters in cycle 1 and apply those parameters as a set of starting values in the prediction for cycle 2 or beyond.
Following the above idea, we introduce a recursive Tanh function to describe the number of infected population within each cycle of an entire virus spread process (Equation (1)), where i refers to the i-th cycle, P is the number of infected population in the i-th cycle, D represents the number of days since the initiation of virus spread, $P_i$ stands for the number of infected population at the end of the i-th cycle, $r_i$ is the spread rate in the i-th cycle, and $D_i$ refers to the number of days at the end of the i-th cycle. The purpose of adding 1 in the logarithm calculation is to avoid an infinity in the case where P = 0. Note that Equation (1) is not strictly a recursive formula in the conventional sense. We call it recursive because Equation (1) must be solved cycle by cycle, starting from cycle 1 and proceeding to cycle n, where n is the last cycle of the virus spread. When n = 1, the equation degenerates to a regular Tanh function.
In order to validate Equation (1) for the analysis of COVID-19 virus spread, we have to select a complete virus spread process. Among all the countries, South Korea seems to be the best choice for this validation because the country provides reasonably reliable data and the virus spread there has stabilized.
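To make the cycle-wise Tanh description concrete, the following Python sketch evaluates one possible form of such a model in log space. The functional form, the use of the natural logarithm, the function name `log_infected`, and the numeric values in the usage note are assumptions chosen only to match the symbol definitions above (spread rate r_i, end-of-cycle population P_i, end-of-cycle day D_i). It should not be read as the exact Equation (1).

```python
import math


def log_infected(day, cycles):
    """Evaluate an ASSUMED cycle-wise tanh model for log(P + 1).

    cycles is a list of (r_i, P_i, D_i) tuples ordered by cycle, where
    r_i is the spread rate, P_i the infected population at the end of
    cycle i, and D_i the day on which cycle i ends.
    """
    if not cycles:
        raise ValueError("at least one cycle is required")
    prev_level, prev_day = 0.0, 0.0  # log(0 + 1) = 0 at day 0
    last = len(cycles) - 1
    for idx, (rate, p_end, d_end) in enumerate(cycles):
        level = math.log(p_end + 1.0)  # +1 keeps the logarithm finite when P = 0
        # Use this cycle if the day falls inside it, or if it is the last cycle.
        if day <= d_end or idx == last:
            growth = math.tanh(rate * (day - prev_day))
            return prev_level + (level - prev_level) * growth
        # Otherwise move on: the next cycle starts where this one ended.
        prev_level, prev_day = level, d_end
```

With a single cycle, for example `log_infected(30.0, [(0.106, 10000.0, 50.0)])` with illustrative values, the function reduces to a plain scaled Tanh curve, which mirrors the n = 1 case described above.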
The parameter $r_i$ in Equation (1) represents an intrinsic attribute of the virus spread rate. It can be estimated by a linear least-squares fitting of a linear equation in a parameter space (Equation (2)). Figure 2(a) shows the result of determining the virus spread rate, $r_1$. By using this r value, we predict the infected population, $\hat{y}$, which is very close to the true data, y, as shown in Figure 2.
The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. https://doi.org/10.1101/2020.04.09.20059329
Furthermore, by using $r_1$ in cycle 2 of the Korea data, we also achieve an accurate prediction of the infected population, and a fictitious variable that should have a value of unity (Equation (3)) is validated to be close to unity (Figure 3).
The bifurcation in Figure 1(a) is easy to identify visually. An automatic algorithm could be built on the discontinuity of the tangential direction when traversing the curve. Since this is not the main focus of this paper, we do not explore it further.
Based on the model in Section 4, we design an algorithm to predict the upcoming peak time of COVID-19 virus spread in the United States and Germany, as given in Table 1. Since the infected population has not stabilized in these two countries, it is important to estimate the ultimate infected population at the end of the last cycle, n. We first estimate $\hat{P}_n$ through a linear least-squares fitting of Equation (4), where $y = \log(P + 1) - \log(P_{n-1} + 1)$. Then, Equation (1) is utilized to estimate $\hat{r}_n$ for cycle n through a linear least-squares fitting.
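The linear least-squares estimate of a spread rate can be sketched as follows. This is a minimal illustration under an assumed cycle-1 relation, log(P + 1) = log(P_1 + 1) tanh(r_1 D), which linearizes to atanh(log(P + 1) / log(P_1 + 1)) = r_1 D. The actual linear equation used in the paper may differ, and the function name `fit_spread_rate` and the natural logarithm are likewise assumptions.

```python
import math


def fit_spread_rate(days, infected, p_end):
    """Least-squares slope through the origin for an ASSUMED linearization.

    days:     day counts D since the start of the spread
    infected: infected population P observed on those days
    p_end:    infected population at the end of the cycle (P_1)
    """
    level = math.log(p_end + 1.0)
    xs, ys = [], []
    for d, p in zip(days, infected):
        ratio = math.log(p + 1.0) / level
        # atanh is finite only strictly inside (-1, 1); skip saturated points.
        if 0.0 < ratio < 1.0:
            xs.append(d)
            ys.append(math.atanh(ratio))
    sxx = sum(x * x for x in xs)
    if sxx == 0.0:
        raise ValueError("no usable data points for the fit")
    # Closed-form minimizer of sum((y - r * x)^2) over r.
    return sum(x * y for x, y in zip(xs, ys)) / sxx
```

Fitting a line through the origin keeps the sketch to a single unknown, the slope, which plays the role of the spread rate in this assumed linearization.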
With $\hat{r}_n$ and $\hat{P}_n$ available as a pair of starting values, a nonlinear Levenberg-Marquardt least-squares fitting [22] is computed to determine the two unknown parameters ($r_n$ and $P_n$) simultaneously in Equation (5). Once $r_n$ and $P_n$ are determined, Equation (5) can be used to predict the future values of the infected population. To define the peak time of virus spread, a termination condition is proposed as Equation (6), where j refers to the j-th day in cycle n. Equation (6) means that the virus spread approaches stability when the difference in the logarithm of infected population between two consecutive days is less than 0.01.
Table 1: Algorithm for predicting the peak time of COVID-19 virus spread.
Step 1: Determine the virus spread rate in cycle 1, $r_1$, based on a least-squares fitting of Equation (2).
Step 2: Recursively analyze the infected population in cycles 2 through n-1.
Step 3: Estimate the virus spread rate in cycle n, $\hat{r}_n$, based on a linear least-squares fitting of Equation (2).
Step 4: Estimate the logarithm of infected population in cycle n by a linear least-squares fitting of $\hat{P}_n$ in Equation (4).
Step 5: Determine $r_n$ and $P_n$ by using a nonlinear Levenberg-Marquardt least-squares fitting, with [$\hat{r}_n$, $\hat{P}_n$] as a pair of starting values, through Equation (5).
Step 6: Predict the future infected population based on Equation (5).
Step 7: Use a termination condition (Equation (6)) to estimate the peak time of virus spread.
Figure 4 shows the prediction result of infected population in the United States. The bifurcation pattern of infected population is given in Figure 4(a) and the determination of the virus spread rate is presented in Figure 4(b). The virus spread rate in the United States ($r_1$ = 0.072) is smaller than that in South Korea ($r_1$ = 0.106) because the population density in South Korea is much higher. This may also mean that the virus spread will take longer to peak in the United States than in South Korea. Figures 4(c) and 4(d) are the predicted data for cycles 1 and 2, respectively. According to Figure 4(d), the COVID-19 virus spread in the United States will roughly peak on April 26, 2020.
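Steps 5 to 7 can be sketched in Python as follows. The last-cycle curve in `model` is an assumed Tanh form in log space and not necessarily Equation (5); `fit_lm` is a deliberately small, hand-written Levenberg-Marquardt loop for two unknowns rather than the routine cited as [22]; and `peak_day` applies the 0.01 threshold on consecutive-day differences described above. All function names, the natural logarithm, and the parameters `level0` and `day0` (log population and day at the start of the last cycle) are hypothetical.

```python
import math


def model(day, rate, level, level0, day0):
    """ASSUMED last-cycle curve for log(P + 1); level is the final plateau."""
    return level0 + (level - level0) * math.tanh(rate * (day - day0))


def fit_lm(days, logs, rate, level, level0, day0, iters=100):
    """Minimal Levenberg-Marquardt fit of (rate, level) from starting values."""
    def cost(r, l):
        return sum((y - model(d, r, l, level0, day0)) ** 2
                   for d, y in zip(days, logs))

    lam = 1e-3                      # damping factor
    best = cost(rate, level)
    for _ in range(iters):
        # Accumulate the 2x2 normal equations J^T J and J^T residual.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for d, y in zip(days, logs):
            t = math.tanh(rate * (d - day0))
            j_r = (level - level0) * (d - day0) * (1.0 - t * t)  # d model / d rate
            j_l = t                                              # d model / d level
            res = y - (level0 + (level - level0) * t)
            a11 += j_r * j_r
            a12 += j_r * j_l
            a22 += j_l * j_l
            g1 += j_r * res
            g2 += j_l * res
        # Damp the diagonal (Marquardt scaling) and solve by Cramer's rule.
        m11, m22 = a11 * (1.0 + lam), a22 * (1.0 + lam)
        det = m11 * m22 - a12 * a12
        if det == 0.0:
            break
        d_rate = (g1 * m22 - a12 * g2) / det
        d_level = (m11 * g2 - a12 * g1) / det
        if abs(d_rate) + abs(d_level) < 1e-12:
            break                   # step is negligible: treat as converged
        trial = cost(rate + d_rate, level + d_level)
        if trial < best:            # accept the step and trust the model more
            rate, level, best = rate + d_rate, level + d_level, trial
            lam /= 10.0
        else:                       # reject the step and increase damping
            lam *= 10.0
    return rate, level


def peak_day(rate, level, level0, day0, tol=0.01, horizon=365):
    """First day whose increase in log(P + 1) over the previous day is below tol."""
    start = int(day0) + 1
    for j in range(start, start + horizon):
        step = (model(j, rate, level, level0, day0)
                - model(j - 1, rate, level, level0, day0))
        if step < tol:
            return j
    return None  # no day within the horizon met the condition
```

In use, the two linear least-squares estimates would be passed as the starting `rate` and `level`, and the fitted pair would then be handed to `peak_day`. Starting the damping factor small makes the first iterations behave like Gauss-Newton, which is reasonable when the starting values are already close.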
COVID-19 data in Germany can be analyzed in a similar way. Figure 5(b) indicates that the virus spread will approximately peak on May 1, 2020. The virus spread rate, $r_1$, in Germany is 0.108, which is close to that in South Korea. These two countries have a higher virus spread rate than the United States because of the higher population density in Germany and South Korea.
In this paper, we propose a recursive bifurcation approach to estimate the peak time of COVID-19 virus spread. The infected population data in South Korea is analyzed as an example of a stabilized virus spread. An algorithm is developed to predict the future infected population based on the data available as of April 6, 2020. In terms of infected population, our model predicts that the COVID-19 virus spread will approximately peak on April 26, 2020 in the United States and on May 1, 2020 in Germany.
[1] What you need to know about coronavirus disease 2019 (COVID-19)
[2] The novel coronavirus 2019-ncov is highly contagious and more infectious than initially estimated
[3] Assessing spread risk of Wuhan novel coronavirus within and beyond China
[4] Preliminary estimation of the basic reproduction number of novel coronavirus (2019-ncov) in China, from 2019 to 2020: A data-driven analysis in the early phase of the outbreak
[5] Epidemic doubling time of the 2019 novel coronavirus outbreak by province in mainland China
[6] Serial interval of novel coronavirus (2019-ncov) infections
[7] The extent of transmission of novel coronavirus in Wuhan, China, 2020
[8] Using predicted imports of 2019-ncov cases to determine locations that may not be identifying all imported cases
[9] Epidemic size of novel coronavirus-infected pneumonia in the epicenter Wuhan: using data of five-countries' evacuation action
[10] The effect of travel restrictions on the spread of the 2019 novel coronavirus (2019-ncov) outbreak. medRxiv
[11] The impact of traffic isolation in Wuhan on the spread of 2019-nov
[12] Feasibility of controlling 2019-ncov outbreaks by isolation of cases and contacts. The Lancet Global Health
[13] Effectiveness of airport screening at detecting travellers infected with 2019-ncov
[14] An updated estimation of the risk of transmission of the novel coronavirus (2019-ncov)
[15] Lockdown may partially halt the spread of 2019 novel coronavirus in Hubei province
[16] Interventions targeting air travellers early in the pandemic may delay local outbreaks of sars-cov-2. medRxiv
[17] Simulating the infected population and spread trend of 2019-ncov under different policy by EIR model
[18] The lockdown of Hubei province causing different transmission dynamics of the novel coronavirus (2019-ncov) in Wuhan and Beijing
[19] A mathematical model for simulating the transmission of Wuhan novel coronavirus
[20] Modeling and prediction for the trend of outbreak of NCP based on a time-delay dynamic system
[21] A time delay dynamical model for outbreak of 2019-ncov and the parameter identification
[22] Numerical Recipes in C: The Art of Scientific Computing
All the true data of infected populations is obtained from the Coronavirus Resource Center of Johns Hopkins University.
The authors declare no conflict of interests.