key: cord-0997946-2v4tw0ut
authors: Huang, C.-J.; Shen, Y.; Kuo, P.-H.; Chen, Y.-H.
title: Novel Spatiotemporal Feature Extraction Parallel Deep Neural Network for Forecasting Confirmed Cases of Coronavirus Disease 2019
date: 2020-05-05
journal: nan
DOI: 10.1101/2020.04.30.20086538
sha: 035a9b1d0696fe18a8514a49a1c6b353c44f56ce
doc_id: 997946
cord_uid: 2v4tw0ut

The coronavirus disease 2019 pandemic continues as of March 26 and spread to Europe on approximately February 24. A report from April 29 revealed 1.26 million confirmed cases and 125 928 deaths in Europe. This study proposed a novel deep neural network framework, COVID-19Net, which combines a convolutional neural network (CNN) and bidirectional gated recurrent units (GRUs) in parallel. Three European countries with severe outbreaks (Germany, Italy, and Spain) were studied to extract spatiotemporal features and predict the number of confirmed cases. The prediction results acquired from COVID-19Net were compared to those obtained using a CNN, GRU, and CNN-GRU. The mean absolute error, mean absolute percentage error, and root mean square error, which are commonly used model assessment indices, were used to compare the accuracy of the models. The results verified that COVID-19Net was notably more accurate than the other models. The mean absolute percentage error generated by COVID-19Net was 1.447 for Germany, 1.801 for Italy, and 2.828 for Spain, which were considerably lower than those of the other models. This indicated that the proposed framework can accurately predict the accumulated number of confirmed cases in the three countries and serve as a crucial reference for devising public health strategies.


In December 2019, a pneumonia case of unknown cause was discovered. Shortly afterward, patients with similar symptoms appeared across China and substantially burdened the healthcare system. On January 12, 2020, the World Health Organization (WHO) named the pneumonia 2019-nCoV. The disease was then confirmed to be contagious between people by Zhong Nanshan, an academician of the Chinese Academy of Engineering, on January 20, 2020. Ten days later (January 30), the WHO designated 2019-nCoV a public health emergency of international concern. On February 11, 2020, WHO Director-General Tedros Adhanom Ghebreyesus officially renamed the disease COVID-19 during an international meeting in Geneva. On March 11, 2020, Tedros announced a global outbreak of COVID-19. Mahase [1] reported in the BMJ that UK Prime Minister Boris Johnson warned citizens to avoid contact with people. Imperial College London indicated that COVID-19 is highly contagious to older adults aged over 70 years and pregnant women.

However, the number of confirmed cases in Italy is three times that of South Korea. The number of deaths presents the greatest difference; that of Italy (1809) is 24 times that of Korea (75). Overall, the COVID-19 mortality rate in Italy exceeds 7%, which is substantially higher than the global average mortality rate (3.7%) and is the highest rate worldwide. Italy implemented measures against the pandemic earlier than did other countries. The day after COVID-19 was designated a public health emergency of international concern by the WHO

medRxiv preprint, this version posted May 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

the first time, Merkel publicly addressed her opinion of COVID-19 and stated that the public is currently not immune to this disease. Merkel warned that because no vaccine or medication has been developed for COVID-19, 60%-70% of the German population is expected to be infected at the current rate according to expert analysis.

A few hours after the press conference, the WHO characterized COVID-19 as a pandemic, when the accumulated number of confirmed cases in Germany reached 2000, with four deaths. This marked the end of the zero-death record held by Germany while the pandemic spread across Europe.

In the past week, the number of confirmed cases in Germany has increased considerably, with daily increases of over 1000 cases. To contain the pandemic, the German government closed its borders with France, Switzerland, and Austria at 08:00 on March 16. The German railway also progressively reduced the number of domestic trains in service. Social activities in each German state are being gradually reduced. Events with more than 50 participants must be canceled; schools, bars, and nightclubs have been closed; some companies and governmental units allow their employees to work from home; and restrictions were imposed on foreign travel.

In response to COVID-19, the German government has conducted research in the Robert Koch Institute since January 6, and the scale of the research continues to increase. In addition to its comprehensive laboratory system and highly trained personnel, Germany began conducting screening tests early to identify patients in advance.

After the contagion rate and number of patients are reduced, the national healthcare system can allocate more capacity to provide treatment for all patients instead of passively conducting screening tests after patients experience severe symptoms and require hospitalization. Germany currently has the lowest COVID-19 mortality rate among European countries with populations of more than 10 million people. According to 2017 Organization for Economic Cooperation and Development statistics, the number of ward beds per person in Germany was the highest among European countries, with an average of 8 beds available for every 1000 people. A crucial factor in containing pandemics is the density of intensive care units (ICUs). Patients with severe COVID-19 symptoms must receive treatment in ICU wards and generally require artificial ventilation.

According to the German Federal Ministry of Health, Germany currently has 28 000 intensive care beds, 25 000 of which are equipped with artificial ventilation systems. We assume that Germany's healthcare system possesses enough facilities to contain the pandemic for the total population. However, many of these beds are already occupied. Consequently, this assumption is valid only if the national healthcare system does not collapse.

A key factor in preventing this pandemic is reducing the viral contagion rate and not overwhelming the healthcare system, which can be extremely difficult. The number of intensive care beds in Italy is roughly 5000.

If the pandemic continues or becomes more severe, Germany's large number of available beds will be a notable advantage in containing the pandemic.

In response to COVID-19, which is considered a "never before threat" to Germany, the German government clearly stated that additional restrictions may be imposed on the daily life of citizens. Merkel also emphasized that the government is ready to take any necessary measures to help the country overcome the

The remaining sections of this paper are structured as follows. Section 2 presents a literature review on articles recently published by relevant scholars. Section 3 details the model proposed in this study and the principles behind the models. Section 4 discusses the analysis results, and Section 5 presents the conclusions.

Commonly used COVID-19 prediction models are divided into two types: those using mathematics or artificial intelligence (AI) black box algorithms to construct models and analyze the progression and contagion

LSTM comprises an input gate, output gate, and forget gate.

The input gate is expressed using (1), in which the current input $x_t$ and the previous information and memory (i.e., $h_{t-1}$ and $c_{t-1}$) are selectively retained according to their corresponding weights and are sent to the forget gate.

The corresponding weight matrices are $W_{xi}$, $W_{hi}$, and $W_{ci}$, and $b_i$ denotes the bias matrix of the input gate.

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i) \quad (1)$$

The forget gate is expressed using (2) 

The output gate is acquired by passing the current input $x_t$, previous information $h_{t-1}$, and currently retained memory $c_t$ through the sigmoid layer. Then, $c_t$ is standardized using tanh and multiplied by the output of the sigmoid layer to derive the output information $h_t$. In (4) and (5)
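For reference, a commonly used peephole LSTM formulation that matches the form of (1) is sketched below. Only the input gate is given in the text above; the forget gate, cell state, output gate, and hidden state equations, together with the symbols $f_t$, $o_t$, and the additional weight and bias names, are the standard textbook versions and are an assumption here. They may differ in detail from the authors' (2) to (5).

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh\left(W_{xc} x_t + W_{hc} h_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o\right) \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

Here $\sigma$ is the logistic sigmoid and $\odot$ denotes element-wise multiplication.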

A BiLSTM model is an LSTM model that learns from the input sequence in both the forward and backward directions, as presented in Fig. 4.
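The bidirectional idea can be illustrated with a minimal sketch, assuming a generic recurrent step function. The names `run_direction`, `bidirectional`, and `toy_step` are hypothetical, and `toy_step` is only a stand-in for a real LSTM or GRU cell: one pass reads the sequence forward, a second pass reads it backward, and the two states for each time step are paired.

```python
from typing import Callable, List, Sequence, Tuple

Step = Callable[[float, float], float]  # (input, previous state) -> new state


def run_direction(step: Step, inputs: Sequence[float], h0: float) -> List[float]:
    """Apply a recurrent step over the inputs and collect every state."""
    h = h0
    states = []
    for x in inputs:
        h = step(x, h)
        states.append(h)
    return states


def bidirectional(step_fwd: Step, step_bwd: Step,
                  inputs: Sequence[float], h0: float = 0.0) -> List[Tuple[float, float]]:
    """Pair the forward state and the backward state for each time step."""
    fwd = run_direction(step_fwd, inputs, h0)
    bwd = run_direction(step_bwd, list(reversed(inputs)), h0)
    bwd.reverse()  # realign the backward states with the original time order
    return list(zip(fwd, bwd))


def toy_step(x: float, h: float) -> float:
    """Stand-in for an LSTM/GRU cell: a decaying running sum (illustration only)."""
    return 0.5 * h + x


states = bidirectional(toy_step, toy_step, [1.0, 2.0, 3.0])
```

In a real BiLSTM or BiGRU, the forward and backward cells have separate trained weights, and the paired states are typically concatenated before being passed to the next layer.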

network and uses an update gate and reset gate, as displayed in Fig. 5(a). These two gates control output information, retain previous information, and require fewer parameters than does an LSTM model. Accordingly, a GRU-based neural network is more efficient than an LSTM model.
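The parameter saving can be checked with simple arithmetic. The sketch below assumes the basic formulations without peephole connections, in which an LSTM layer has four weight blocks (three gates plus the candidate state) and a GRU layer has three; some framework variants add a few extra bias terms. The input size of 6 follows the six factors used in this study, whereas the 32 hidden units are a hypothetical value chosen only for illustration.

```python
def lstm_params(input_dim: int, units: int) -> int:
    """Four blocks, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def gru_params(input_dim: int, units: int) -> int:
    """Three blocks (update gate, reset gate, candidate state)."""
    return 3 * (units * (input_dim + units) + units)


n_lstm = lstm_params(input_dim=6, units=32)  # 32 units is an assumed size
n_gru = gru_params(input_dim=6, units=32)
print(n_lstm, n_gru, n_gru / n_lstm)
```

Under these assumptions, the GRU layer needs three quarters of the parameters of an LSTM layer of the same size.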

The update gate controls information sent previously. In (6) 
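A single GRU step can be sketched as follows for a scalar input and a scalar hidden state. This is a common textbook formulation and not necessarily the exact form of the authors' equations; the parameter names and weight values are hypothetical. Some references swap the roles of `z` and `1 - z` in the final interpolation.

```python
import math


def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))


def gru_step(x: float, h_prev: float, p: dict) -> float:
    """One GRU step for a scalar input and a scalar hidden state."""
    z = sigmoid(p["w_z"] * x + p["u_z"] * h_prev + p["b_z"])  # update gate
    r = sigmoid(p["w_r"] * x + p["u_r"] * h_prev + p["b_r"])  # reset gate
    # Candidate state: the reset gate scales how much of the past state is used.
    h_cand = math.tanh(p["w_h"] * x + p["u_h"] * (r * h_prev) + p["b_h"])
    # The update gate interpolates between the old state and the candidate.
    return (1.0 - z) * h_prev + z * h_cand


# Hypothetical weights, chosen only for illustration.
params = {"w_z": 0.5, "u_z": 0.1, "b_z": 0.0,
          "w_r": 0.3, "u_r": 0.2, "b_r": 0.0,
          "w_h": 0.8, "u_h": 0.4, "b_h": 0.0}

h = 0.0
for x in [0.1, 0.2, 0.4, 0.8, 1.6]:  # five time steps
    h = gru_step(x, h, params)
```

Because the new state is a convex combination of the previous state and a tanh output, a state that starts at zero stays strictly between -1 and 1.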

Step 1: Required data are selected from the aforementioned data sources to perform correlation analysis and identify highly correlated data.

Step 2: All data are divided into testing and training sets according to the data collection time.

Step 3: Data are refactored into a matrix with six factors and five time steps named Input 1. Data are refactored into a matrix with three regions and five time steps named Input 2.

Step 4: Because Input 1 is mainly used to extract temporal features, a 1D-CNN and BiGRU-based parallel deep learning network is employed. Because Input 2 is mainly used to extract spatiotemporal features and each country has a specific geographical location and order, a two-dimensional (2D) CNN is employed. The training set is used to train the models.

Step 5: The testing set is used to test the models and calculate the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE).
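Steps 2 and 3 can be sketched as follows for Input 1, using synthetic numbers. The function names, the 80% training fraction, and the data values are assumptions made for illustration; the text specifies only that the split follows data collection time and that each sample covers six factors over five time steps.

```python
def make_windows(rows, targets, n_steps=5):
    """Pair each block of n_steps consecutive days with the next day's target."""
    X, y = [], []
    for end in range(n_steps, len(rows)):
        X.append(rows[end - n_steps:end])  # the past n_steps days, oldest first
        y.append(targets[end])             # accumulated cases on the following day
    return X, y


def chronological_split(X, y, train_fraction=0.8):
    """Earlier samples go to training, later samples to testing (no shuffling)."""
    cut = int(len(X) * train_fraction)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])


# Synthetic data: 12 days with six factors per day (values are placeholders).
days = 12
rows = [[float(d * 10 + f) for f in range(6)] for d in range(days)]
cumulative = [float(100 + 5 * d) for d in range(days)]

X, y = make_windows(rows, cumulative)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
```

Each element of `X` is a 5 x 6 block (time steps by factors). Input 2 could be built the same way from rows that hold one value per country instead of six factors.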

The COVID-19Net algorithm proposed in this study connects a 1D-CNN [32], a 2D-CNN [33], and BiGRUs in parallel to form a hybrid deep learning network. Fig. 7 illustrates this framework.


COVID-19Net was used to process spatiotemporal and temporal data separately. Because Input 1 contained data from each country regarding the daily number of newly confirmed cases, deaths, and recovered cases and the accumulated number of these cases in the past 5 days, temporal features related to the accumulated number of confirmed cases were extracted from these factors. Given the outstanding performance of BiGRUs in learning from time series data, we serially combined a 1D-CNN with BiGRU to extract temporal features from Input 1. The highly contagious nature of COVID-19 and frequent population flows among European countries may result in a strong spatial correlation between their pandemic trends. Therefore, extracting spatial features considerably increased the accuracy of the prediction model. This indicated that features related to changes in the accumulated number of confirmed cases in each country are crucial. In addition to the 1D-CNN and BiGRUs combined in parallel for extracting temporal features from Input 1, a 2D-CNN was constructed to extract the spatiotemporal features of the three countries from Input 2 (Fig. 6). The convolutional layers of the 1D-CNN had 16, 32, 32, and 64 convolutional kernels, respectively, each with a kernel size of 3. The size of the corresponding maximum pooling layer was 2. In the 2D-CNN model, the two convolutional layers had 64 and 96 convolution kernels, respectively, and the sizes of the convolution kernels were 3×3 and 4×1, respectively. Because the COVID-19 data employed in this study were limited, a dropout layer was used to prevent overfitting.
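To give a sense of scale, the convolutional weight counts implied by these hyperparameters can be estimated with a short sketch. Several points are assumptions rather than details stated in the text: the 1D convolutional layers are stacked in sequence, the 1D-CNN receives six input channels (one per factor), the 2D-CNN receives a single input channel, and every layer has a bias term.

```python
def conv_params(kernel_elems: int, in_channels: int, out_channels: int) -> int:
    """Weights plus one bias per output channel for a convolutional layer."""
    return kernel_elems * in_channels * out_channels + out_channels


# 1D-CNN: kernel size 3, with 16, 32, 32, and 64 kernels.
in_ch = 6  # assumption: one channel per factor in Input 1
params_1d = []
for out_ch in [16, 32, 32, 64]:
    params_1d.append(conv_params(3, in_ch, out_ch))
    in_ch = out_ch

# 2D-CNN: 64 kernels of size 3x3, then 96 kernels of size 4x1.
in_ch = 1  # assumption: Input 2 is treated as a single-channel map
params_2d = []
for (kh, kw), out_ch in [((3, 3), 64), ((4, 1), 96)]:
    params_2d.append(conv_params(kh * kw, in_ch, out_ch))
    in_ch = out_ch

print(params_1d, params_2d)
```

Under these assumptions the convolutional layers hold only a few tens of thousands of weights in total, which is consistent with the use of dropout on a small data set.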


This study used a CNN, a GRU, a CNN-GRU, and COVID-19Net to predict the accumulated number of confirmed cases in Germany, Italy, and Spain (Fig. 10). The results indicated that the predictions produced by the CNN-GRU were the least accurate and could not reflect any features or patterns. The GRU revealed a highly unstable increasing trend; hence, it could not be used as a reliable reference. The predictions produced by the CNN were superior to those of the CNN-GRU and GRU. The three countries have obvious spatial characteristics in terms of geographic location; therefore, considering only the time factor is unreasonable. Overall, the predictions produced by the proposed COVID-19Net model were more accurate than those of the other models. Therefore, COVID-19Net can serve as a reliable reference.


Tables I, II, and III list the assessment results of the models. The results indicated that the CNN-GRU model was the least accurate, and the GRU and CNN exhibited less favorable performance than did COVID-19Net. The models in descending order of prediction accuracy were COVID-19Net, CNN, GRU, and CNN-GRU, which verified that the proposed algorithm was the most accurate.
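The three assessment indices can be computed as in the following sketch, which uses their standard definitions. The sample values are synthetic and are not results from this study, and MAPE is expressed here as a percentage, which is an assumption about how the reported values are scaled.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Synthetic example values, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(mae(actual, predicted), mape(actual, predicted), rmse(actual, predicted))
```

MAE and RMSE are in units of cases, so they grow with the size of the outbreak, whereas MAPE is scale-free and is therefore easier to compare across countries.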

COVID-19 poses considerable challenges to healthcare systems worldwide. By using the proposed algorithm, we predicted that the demand for ward beds and ICU beds in Germany, Italy, and Spain would increase substantially, particularly before the pandemic peaks. If problems arising from insufficient healthcare resources and social distancing cannot be resolved, demand will increase. COVID-19 may overwhelm the capacity of hospitals, particularly that of ICU nurses. The predicted values produced in this study can help countries develop and implement disease prevention measures and reduce gaps between the strategies employed by countries, including reducing services unrelated to COVID-19 prevention and temporarily increasing the capacity of the healthcare system. Based on the estimation results of Zhang et al. [17], it can be concluded that COVID-19 will end after the beginning of June, during which healthcare resources will be in heavy demand. However, this demand will also depend on the social distancing measures implemented and other measures already imposed by each country. During this pandemic, relevant disease prevention measures must be maintained, and the importance of these measures must be highlighted to reduce the deaths of civilians and healthcare personnel.

The likelihood of flattening the epidemic curve, as discussed in Western media, is overly optimistic because this entails no increase in COVID-19 cases. Currently, only China claims to have achieved this after


Covid-19: UK starts social distancing after new model points to 260 000 potential deaths

Variational Mode Decomposition

Trend and forecasting of the COVID-19 outbreak in China

Serial interval of novel coronavirus (COVID-19) infections

An updated estimation of the risk of transmission of the novel coronavirus (2019-nCov)

Epidemic Spread of the 2019 Novel Coronavirus Driven by Spring Festival Transportation in China: A Population-Based Study

Real-time forecasts of the COVID-19 epidemic in China from

Early Prediction of the

Outbreak in the Mainland China Based on Simple Mathematical Model

Backcalculating the Incidence of Infection with COVID-19 on the Diamond Princess

On the Coronavirus (COVID-19) Outbreak and the Smart City Network

Universal Data Sharing Standards Coupled with Artificial Intelligence (AI) to Benefit Urban Health Monitoring and Management

Reverse Logistics Network Design for Effective Management of Medical Waste in Epidemic Outbreaks: Insights from the Coronavirus Disease

Outbreak in Wuhan (China)

Prevention Is Better Than the Cure: Risk Management of COVID-19

Host and infectivity prediction of Wuhan 2019 novel coronavirus using deep learning algorithm

Modified SEIR and AI prediction of the epidemics trend of

China under public health interventions

Data-Based Analysis, Modelling and Forecasting of the COVID-19 outbreak

Predicting turning point, duration and attack rate of COVID-19 outbreaks in major Western countries

Early dynamics of transmission and control of COVID-19: a mathematical modelling study. The Lancet Infectious Diseases 2020

Multiple-Input Deep Convolutional Neural Network Model for COVID-19 Forecasting in China

An Electricity Price Forecasting Model by Hybrid Structured Deep Neural

High Precision Dimensional Measurement with Convolutional

Sensors (Basel)

Comparison of CNN Algorithms on Hyperspectral Image Classification in Agricultural Lands

Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language

Research on a Real-Time Monitoring Method for the Wear State of a Tool Based on a

Reducing Exchange Rate Risks in International Trade: A Hybrid

An LSTM-Based Autonomous Driving Model Using a Waymo Open

An Optimal Feature Parameter Set Based on Gated Recurrent Unit

A Gated Recurrent Unit Approach to Bitcoin Price Prediction

A Partially Amended Hybrid Bi-GRU-ARIMA Model (PAHM) for Predicting Solar Irradiance in Short and Very-Short Terms

Divide and Conquer-Based 1D CNN Human Activity Recognition Using Test

implementing 2 months of lockdowns and strict measures. In this study, we proposed a novel model, COVID-19Net, to predict the accumulated number of confirmed cases in Germany, Italy, and Spain, which are heavily affected by the pandemic. The accumulated numbers of confirmed cases, deaths, and recovered cases and the daily numbers of newly confirmed cases, deaths, and recovered cases in the past 5 days were used to predict the accumulated confirmed cases on the next day. Comparing the prediction results and assessment indices of COVID-19Net, a CNN, a GRU, and a CNN-GRU verified that COVID-19Net was the most accurate, generating MAPE values of 1.447, 1.801, and 2.828 for Germany, Italy, and Spain, respectively, whereas the hybrid CNN-GRU was the least accurate and could not extract useful features from the data of these countries.

Although the CNN was slightly more accurate than the GRU and CNN-GRU, it remained unviable for predicting accumulated confirmed cases in the three countries. COVID-19Net was verified to be more accurate than the other three models because COVID-19 trend data contain spatiotemporal features that can be extracted using the deep neural network of COVID-19Net. The results of this study can serve as a crucial reference for devising public health strategies against COVID-19, and the proposed algorithm can serve as an effective tool for improving the allocation of hospital resources.