key: cord-0227478-yh3nef2t
authors: Olumoyin, K. D.; Khaliq, A.Q.M.; Furati, K. M.
title: Multi-variant COVID-19 model with heterogeneous transmission rates using deep neural networks
date: 2022-05-13
journal: nan
DOI: nan
sha: ca088b9500ef5005d23b3582a14376c546845361
doc_id: 227478
cord_uid: yh3nef2t

Mutating variants of COVID-19 have been reported across many US states since 2021. In the fight against COVID-19, it has become imperative to study the heterogeneity in the time-varying transmission rates for each variant in the presence of pharmaceutical and non-pharmaceutical mitigation measures. We develop a Susceptible-Exposed-Infected-Recovered (SEIR) mathematical model to highlight the differences in the transmission of the B.1.617.2 delta variant and the original SARS-CoV-2. Theoretical results for the well-posedness of the model are discussed. A deep neural network is used, and a deep learning algorithm is developed, to learn the time-varying heterogeneous transmission rates for each variant. The accuracy of the algorithm for the model is shown using error metrics in the data-driven simulation for COVID-19 variants in the US states of Florida, Alabama, Tennessee, and Missouri. Short-term forecasting of daily cases is demonstrated using a long short-term memory neural network and an adaptive neuro-fuzzy inference system.

COVID-19 was first reported in China in 2019 [1] and has since become a global pandemic. In recent months, there have been reports of mutating variants of the virus [2]. In 2021, the dominant mutant variant of COVID-19 was the B.1.617.2 delta variant [3]. Efforts to combat the spread of COVID-19 have included combinations of pharmaceutical (vaccination and hospitalization) and non-pharmaceutical (social distancing, contact tracing, and face masks) measures.

Prior to the onset of COVID-19 mutating variants in the US, the progress seen in the data from several states prompted the easing of the various non-pharmaceutical measures. Amid the news that several states had vaccinated over 70% of their population and a few states had vaccinated between 60% and 70% of their population, vaccination efforts began to slow down in many US states. As a result, the existence of mutating variants led to a resurgence in cases of infection. The Centers for Disease Control and Prevention (CDC) reported that the dominant variant in the [...]

The paper is organized as follows. In Section 2, we introduce and discuss the multi-variant SEIR model and the time-varying transmission rate of each variant. The well-posedness of the model is discussed in Section 2.2. The neural network structure of the Epidemiology Informed Neural Network (EINN) is presented in Section 3. Data-driven simulation of COVID-19 data is shown in Section 4. A comparison of a recurrent neural network based forecast and an adaptive neuro-fuzzy inference system based forecast is presented in Section 6. The performance error metrics of EINN are discussed in Section 7. The paper is summarized in Section 8.

We assume that the total population $N(t) = N$ at any given time is distributed among the following compartments: susceptible ($S$), exposed ($E$), infectious ($I_i$), $i = 1, \dots, M$, and recovered ($R$), where $M$ is the number of different variants. The interaction between the compartments is shown in Figure 1. In Figure 1, the susceptible individuals enter the exposed compartment at the rate $\frac{1}{N}\sum_{i=1}^{M} \beta_i(t) I_i$, where $\beta_i(t)$ is the transmission rate of variant $i$. The exposed individuals progress to the $i$th infected compartment at the rate $\eta_i$. Individuals leave the $i$th infected compartment through recovery at the rate $\gamma_i I_i$.

We assume a natural death rate $\mu$, given by $\mu = \frac{1}{\mathrm{LFE} \times 365}$, where LFE denotes the life expectancy. For simplicity, it is assumed that the birth rate of the population is equal to the death rate. The parameter $\eta$ is the total rate at which exposed individuals progress to the various infectious sub-compartments, and $\gamma_i^{-1}$ is the mean symptomatic infectious period for the $i$th variant. The parameter $\nu(t)$ represents the time-dependent removal rate of the vaccinated individuals from the susceptible compartment. We assume that no variant super-infects another variant, so there are no interactions between the infectious sub-compartments.

Based on the transfer diagram depicted in Figure 1, the mathematical model for a multi-variant COVID-19 pandemic with heterogeneous transmission rates is given by:

subject to non-negative initial conditions

The parameter $\eta$ is defined as $\eta = \sum_{i=1}^{M} \eta_i$, and the total population is $N = S + E + \sum_{i=1}^{M} I_i + R$.

The differential equation satisfied by the total population size is obtained by adding all the equations in (1), that is, $\frac{dN}{dt} = 0$, and thus $N$ is constant. The model parameters are summarized in Table 1.
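The compartment flows described above can be explored numerically. The following is a minimal sketch, not the authors' implementation: the right-hand side is reconstructed from the verbal description, and it assumes that vaccinated individuals move from $S$ directly to $R$. All parameter values, the form of `beta`, the step function `nu`, and the helper names `seir_rhs` and `rk4` are hypothetical choices made only for illustration.

```python
import numpy as np

def seir_rhs(t, z, beta, eta_i, gamma, mu, nu, N):
    """Right-hand side of an assumed multi-variant SEIR system.

    z = (S, E, I_1, ..., I_M, R); beta and nu are callables of t.
    """
    M = len(eta_i)
    S, E, I, R = z[0], z[1], z[2:2 + M], z[2 + M]
    force = np.dot(beta(t), I) * S / N           # new exposures per day
    dS = mu * N - force - (nu(t) + mu) * S
    dE = force - (eta_i.sum() + mu) * E
    dI = eta_i * E - (gamma + mu) * I
    dR = np.dot(gamma, I) + nu(t) * S - mu * R   # assumption: vaccinated go to R
    return np.concatenate(([dS, dE], dI, [dR]))

def rk4(rhs, z0, t_grid, *args):
    """Classical fourth-order Runge-Kutta integration on a fixed grid."""
    z = np.empty((len(t_grid), len(z0)))
    z[0] = z0
    for k in range(len(t_grid) - 1):
        t, h = t_grid[k], t_grid[k + 1] - t_grid[k]
        k1 = rhs(t, z[k], *args)
        k2 = rhs(t + h / 2, z[k] + h / 2 * k1, *args)
        k3 = rhs(t + h / 2, z[k] + h / 2 * k2, *args)
        k4 = rhs(t + h, z[k] + h * k3, *args)
        z[k + 1] = z[k] + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return z

# Hypothetical parameters for M = 2 variants (illustration only).
N = 1.0e6
mu = 1.0 / (78.0 * 365.0)                 # life expectancy of 78 years assumed
eta_i = np.array([0.1, 0.1])
gamma = np.array([0.1, 0.1])
beta = lambda t: np.array([0.25, 0.40]) * (1.0 + 0.2 * np.cos(2 * np.pi * t / 180.0))
nu = lambda t: 0.002 if t >= 100 else 0.0  # vaccination starts on day 100

z0 = np.array([N - 20.0, 10.0, 8.0, 2.0, 0.0])
t_grid = np.linspace(0.0, 300.0, 3001)
traj = rk4(seir_rhs, z0, t_grid, beta, eta_i, gamma, mu, nu, N)
total = traj.sum(axis=1)                  # should stay equal to N
```

Summing the columns of `traj` gives a direct numerical check that the total population stays constant, in line with $\frac{dN}{dt} = 0$.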

Time-varying transmission rates have been shown to efficiently model the spread of COVID-19 [4, 11]. Next, we discuss the form of the time-varying transmission rates for each variant.

Time-varying transmission rates in (1) incorporate the impact of governmental actions and the public response [12]. We consider transmission rates of the form

where $\kappa_i$ in (2) is the infectiousness factor for the $i$th variant. We define $(1 + \tau_i)$ to be the factor by which a particular variant is more infectious than the original SARS-CoV-2 variant, so the following relationship (3) exists between each mutating variant and the original SARS-CoV-2 variant.

In (3) 

Let $U$ be an open subset of $\mathbb{R}^{d_1}$, and let $F : U \to \mathbb{R}^{d_2}$. We shall call $F$ locally Lipschitz continuous if for every

We consider a more general framework of model (1)

where $z(t) = (x_1(t), x_2(t), \dots, x_n(t))^T$ and $G(z(t)) = (g_1(z(t)), g_2(z(t)), \dots, g_n(z(t)))^T$, with the initial condition $z_0 \in \mathbb{R}^n$.

We state the following theorem. 

then the solution of the initial value problem (5) exists for $t > 0$ and is Lipschitz continuous and continuously differentiable.

There exist $\beta_{\min}, \beta_{\max} > 0$ and $\nu_{\min}, \nu_{\max} > 0$ such that $\beta_{\min} \le$

Theorem 2.3. The nonlinear first order system of differential equations (1) has at least one solution which exists for t ∈ [0, ∞).

Proof. Let $z(t) = (S(t), E(t), I_1(t), \dots, I_M(t), R(t))^T$; we can set

G is locally Lipschitz continuous, using supremum norm || f (t)|| := sup

So by Theorem 2.1 and the boundedness of the time-varying nonlinear functions from Lemma 2.2, the nonlinear initial value problem (1) has a solution for all time.

The basic reproduction number R 0 is the expected number of secondary infections that a single infectious individual will generate on average within a susceptible population.

Definition 2. The disease-free equilibrium of (1) is given by

The basic reproduction number R 0 is calculated for the case when

Applying the next-generation operator approach [16] , the reproduction number R 0 is obtained as the spectral radius of the next

The basic reproduction number R 0 is computed as follows in (9)

Next, we analyze the local asymptotic stability of the disease-free equilibrium in Definition 2.

Theorem 2.4. The disease-free equilibrium $(S^*, E^*, I_1^*, \dots, I_M^*, R^*)$ of (1) is locally asymptotically stable if $R_0 < 1$.

Proof. The Jacobian of the right hand side of (1) at the equilibrium point is given by

If M = 1, the eigenvalues of the Jacobian matrix are given as follows:

Clearly, $A, B > 0$ and $A < B$, so that $\lambda_1, \lambda_2, \lambda_3, \lambda_4 < 0$. Similarly, we can show that the eigenvalues are negative for $M \ge 2$. So the disease-free equilibrium is locally asymptotically stable.
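The link between $R_0 < 1$ and negative eigenvalues can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's derivation: it takes constant transmission rates, no vaccination, the disease-free state $(S, E, I_i, R) = (N, 0, 0, 0)$, and a right-hand side reconstructed from the verbal model description. The function names and all parameter values are hypothetical. Under these assumptions the next-generation matrix has a single nonzero eigenvalue, which gives the closed form $\sum_i \beta_i \eta_i / ((\eta + \mu)(\gamma_i + \mu))$ used for comparison.

```python
import numpy as np

def next_generation_r0(beta, eta_i, gamma, mu):
    """Spectral radius of F V^{-1} for the infected states (E, I_1, ..., I_M)."""
    M = len(beta)
    F = np.zeros((M + 1, M + 1))
    F[0, 1:] = beta                       # new infections enter E only (S* = N)
    V = np.diag(np.concatenate(([eta_i.sum() + mu], gamma + mu)))
    V[1:, 0] = -eta_i                     # progression from E into each I_i
    return np.max(np.abs(np.linalg.eigvals(F @ np.linalg.inv(V))))

def closed_form_r0(beta, eta_i, gamma, mu):
    """Closed form implied by the assumed system (only one nonzero eigenvalue)."""
    return np.sum(beta * eta_i / ((eta_i.sum() + mu) * (gamma + mu)))

def dfe_jacobian(beta, eta_i, gamma, mu):
    """Jacobian of the assumed system at the disease-free state,
    with state ordering (S, E, I_1, ..., I_M, R)."""
    M = len(beta)
    J = np.zeros((M + 3, M + 3))
    J[0, 0] = -mu
    J[0, 2:2 + M] = -beta
    J[1, 1] = -(eta_i.sum() + mu)
    J[1, 2:2 + M] = beta
    for i in range(M):
        J[2 + i, 1] = eta_i[i]
        J[2 + i, 2 + i] = -(gamma[i] + mu)
    J[M + 2, 2:2 + M] = gamma
    J[M + 2, M + 2] = -mu
    return J

# Hypothetical parameter values for two variants (illustration only).
mu = 1.0 / (78.0 * 365.0)
beta = np.array([0.05, 0.08])
eta_i = np.array([0.1, 0.1])
gamma = np.array([0.2, 0.2])

r0 = next_generation_r0(beta, eta_i, gamma, mu)
max_real_part = np.linalg.eigvals(dfe_jacobian(beta, eta_i, gamma, mu)).real.max()
print(f"R0 = {r0:.4f}, largest real part of Jacobian eigenvalues = {max_real_part:.2e}")
```

With these values $R_0$ is below one and every eigenvalue of the Jacobian has a negative real part, which is the behaviour Theorem 2.4 describes.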

A feedforward neural network (FNN) composed of $L$ layers, with input $t$ and output $\mathcal{N}$, can be represented as the following function

Given a target function $u$, the goal is to find $\Sigma^*$ by solving the following optimization problem

The function $\frac{1}{K}\sum_{k=1}^{K} \|\mathcal{N}(t_k; \Sigma) - u_k\|_2^2$ on the right-hand side of (11) is called the mean squared error (MSE) loss function.
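As an illustration of this loss, the sketch below evaluates the MSE of a small feedforward network on sample points $t_k$. It is a minimal example under assumptions: the `tanh` activation, the layer sizes, the random initialization, and the function names `init_params`, `fnn`, and `mse_loss` are hypothetical and are not taken from the paper.

```python
import numpy as np

def init_params(layer_sizes, rng):
    """Random weights and zero biases for each layer (hypothetical initialization)."""
    return [
        (rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
        for m, n in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def fnn(t, params):
    """Feedforward pass: tanh hidden layers followed by a linear output layer."""
    a = np.asarray(t, dtype=float).reshape(-1, 1)
    for W, b in params[:-1]:
        a = np.tanh(a @ W + b)
    W, b = params[-1]
    return a @ W + b

def mse_loss(params, t, u):
    """(1/K) * sum_k || N(t_k; Sigma) - u_k ||_2^2."""
    pred = fnn(t, params)
    return np.mean(np.sum((pred - u) ** 2, axis=1))

rng = np.random.default_rng(0)
params = init_params([1, 16, 16, 1], rng)        # one input t, one output
t = np.linspace(0.0, 1.0, 50)
u = np.sin(2.0 * np.pi * t).reshape(-1, 1)       # an arbitrary target function
loss = mse_loss(params, t, u)
```

Minimizing `mse_loss` over the weights and biases in `params` corresponds to the optimization problem stated above; any gradient-based optimizer could be used for that step.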

Here, $\psi$ represents the neural network weights and biases, while $\rho$ represents the epidemiology parameters, and $T$ is the number of days in our dataset. Next, we set up time-varying transmission rate networks whose outputs are

Each $\pi_i$ represents the weights and biases of the $i$th network, and $\kappa_i$ is the infectiousness factor for the $i$th variant. The training data is generated using cubic spline interpolation and denoted by $\tilde{I}(t_j)$ and $\tilde{R}(t_j)$, $j = 1, \dots, T$.

Here $T_\nu$ is an integer that corresponds to the vaccination start date in the dataset. The B.1.617.2 delta variant was first reported in the USA in May 2021, and $T_\delta$ is an integer that corresponds to May 4th, 2021. We observe that training data is not available for all the compartments in the SEIR model; however, EINN is able to capture the epidemiology interactions between the compartments because the residual of equation (1) is included in the MSE loss function. The MSE loss function for EINN is given by

where the residual $L_l$, $l = 1, \dots, 5$, is as follows

where $\eta = \sum_{i=1}^{M} \eta_i$. The daily infected cases, the vaccinated cases, the known COVID-19 variant facts, and the transmission rates are enforced in the mean squared error (MSE) loss (12); see Figure 2. For instance, $p_{\delta_1}$ and $p_{\delta_2}$ correspond to the proportions of daily cases that were due to the mutating variants, as reported by the CDC [13].

   Output layer: $S(t_j; \psi; \rho)$, $E(t_j; \psi; \rho)$, $I(t_j; \psi; \rho)$, $R(t_j; \psi; \rho)$, $j = 1, \dots, T$
2: Construct neural networks: $\beta_i$, $i = 1, \dots, M$
   Specify the input: $t_j$, $j = 1, \dots, T$
   Initialize the neural network parameter: $\phi$
   Specify $\beta_i^0$ obtained by fitting daily cases
   Output layers:
3: Specify EINN training set
   Training data: using cubic spline interpolation, generate $\tilde{I}(t_j)$ and $\tilde{R}(t_j)$, $j = 1, \dots, T$
   Specify an MSE loss function:
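The idea of including the ODE residual in the loss can be sketched as follows. This is an illustration under assumptions and not the authors' loss: the right-hand side is reconstructed from the verbal model description (with vaccinated individuals assumed to move from $S$ to $R$), time derivatives are approximated with finite differences via `np.gradient` instead of automatic differentiation, all terms are weighted equally, and only a data term for the infected compartments is included. The function name `einn_style_loss` and the demo values are hypothetical.

```python
import numpy as np

def einn_style_loss(t, S, E, I, R, I_data, beta, eta_i, gamma, mu, nu, N):
    """Data misfit plus mean squared ODE residuals on a time grid.

    Shapes: t, S, E, R, nu are (T,); I, I_data, beta are (T, M);
    eta_i and gamma are (M,).
    """
    ddt = lambda y: np.gradient(y, t, axis=0)      # finite-difference derivative
    force = (beta * I).sum(axis=1) * S / N         # new exposures per day
    res_S = ddt(S) - (mu * N - force - (nu + mu) * S)
    res_E = ddt(E) - (force - (eta_i.sum() + mu) * E)
    res_I = ddt(I) - (eta_i * E[:, None] - (gamma + mu) * I)
    res_R = ddt(R) - ((gamma * I).sum(axis=1) + nu * S - mu * R)
    data_term = np.mean((I - I_data) ** 2)
    residual_term = sum(np.mean(r ** 2) for r in (res_S, res_E, res_I, res_R))
    return data_term + residual_term

# Demo on the disease-free state, where every residual should vanish.
T, M, N = 50, 2, 1000.0
t = np.linspace(0.0, 49.0, T)
mu = 1.0 / (78.0 * 365.0)
eta_i = np.array([0.1, 0.1])
gamma = np.array([0.1, 0.1])
beta = np.full((T, M), 0.3)
nu = np.zeros(T)
S, E, R = np.full(T, N), np.zeros(T), np.zeros(T)
I = np.zeros((T, M))

loss_dfe = einn_style_loss(t, S, E, I, R, I, beta, eta_i, gamma, mu, nu, N)
loss_off = einn_style_loss(t, S, E, I + 1.0, R, I, beta, eta_i, gamma, mu, nu, N)
```

Because the residual terms involve $S$, $E$, and $R$ as well as $I$, a loss of this kind constrains compartments for which no training data is available.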

We present results of the implementation of the EINN algorithm in Figure 2 for COVID-19 data from Alabama, Missouri, Tennessee, and Florida. We consider data from March 2020 to September 2021, during which there were two dominant variants: the original variant SARS-CoV-2 and the delta variant (B.1.617.2). CDC reports indicate that 1.3% of the total infected cases were due to the delta variant on May 4th, 2021 [13]. The EINN algorithm learns the infected cases and the time-varying transmission rates due to each variant. In Tables (2)-(5), pre-$\gamma_1$, post-$\gamma_1$, and post-$\gamma_2$ denote, respectively, the recovery rate of people infected with the original variant SARS-CoV-2 before the onset of the delta variant, the recovery rate of people infected with the original variant SARS-CoV-2 after the onset of the delta variant, and the recovery rate of people infected with the delta variant after the onset of the delta variant.

The CDC reports that by July 31st, 2021, the proportion of infected cases due to the B.1.617.2 delta variant was 82.6% in Alabama, 67.4% in Tennessee, 53.9% in Missouri, and 86.4% in Florida [3]. The CDC also reported that in the USA, the delta variant accounted for about 1.3% of the infected cases.

We seek to learn $\tau_i$ for the $i$th mutating variant. For the simulations in this section, we observed that the delta variant is a dominant mutating variant; therefore, we included only two variants, the SARS-CoV-2 and the delta variant.

Figure 9: Learned Florida susceptible, exposed, and recovered daily population

[...] [18, 19, 20]. These statistical methods are not optimal for nonlinear predictive tasks. This has motivated a shift towards techniques that rely on neural networks and neuro-fuzzy models [21]. In this section, we present a hybrid neural network that combines the simplicity and nonlinear learning capabilities of the Epidemiology Informed Neural Network (EINN) with those of the adaptive neuro-fuzzy inference system (ANFIS).

The adaptive neuro-fuzzy inference system (ANFIS), a hybrid neural network itself, is a combination of fuzzy logic and a feedforward neural network. It incorporates the advantages of both methods, including learning capabilities, interpretability, quick convergence, adaptability, and high accuracy. ANFIS displays excellent performance in approximation and prediction of nonlinear relationships in various fields [22].

The Adaptive Neuro-Fuzzy Inference System (ANFIS) was introduced in [23]. It combines a neural network with a fuzzy inference system (FIS) based on "IF-THEN" rules. One major advantage of FIS is that it does not require knowledge of the main physical process as a pre-condition. ANFIS combines FIS with a backpropagation algorithm. These techniques provide a method for the fuzzy modeling procedure to learn from the available dataset, in order to compute the membership function parameters that best allow the fuzzy inference system to track the given input/output data.
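To make the role of the membership function parameters concrete, the sketch below evaluates the forward pass of a small first-order Sugeno-type fuzzy system, the structure commonly used in ANFIS. It is a minimal illustration under assumptions: Gaussian membership functions, three rules, four inputs, and all numerical values are hypothetical, and no training step is shown.

```python
import numpy as np

def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x for a fuzzy set (center, sigma)."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

def sugeno_forward(x, centers, sigmas, consequents):
    """Forward pass of a first-order Sugeno fuzzy system.

    x: (n_inputs,); centers, sigmas: (n_rules, n_inputs);
    consequents: (n_rules, n_inputs + 1), linear coefficients plus a bias.
    """
    membership = gaussian_mf(x, centers, sigmas)             # layer 1: fuzzification
    firing = membership.prod(axis=1)                         # layer 2: rule firing strength
    weights = firing / firing.sum()                          # layer 3: normalization
    rule_out = consequents[:, :-1] @ x + consequents[:, -1]  # layer 4: IF-THEN consequents
    return float(weights @ rule_out)                         # layer 5: weighted sum

# Hypothetical setup: 4 regressors (e.g. four consecutive daily values), 3 rules.
x = np.array([1.0, 1.2, 1.1, 1.3])
centers = np.array([[0.5] * 4, [1.0] * 4, [1.5] * 4])
sigmas = np.ones((3, 4))
consequents = np.array([
    [0.1, 0.2, 0.3, 0.4, 0.0],
    [0.2, 0.2, 0.2, 0.2, 0.1],
    [0.0, 0.0, 0.5, 0.5, 0.2],
])
y_hat = sugeno_forward(x, centers, sigmas, consequents)
```

In a trained ANFIS, `centers`, `sigmas`, and `consequents` would be the quantities adjusted by the learning algorithm so that the output tracks the given input/output data.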

To forecast the transmission of a multi-variant COVID-19, we present an efficient deep learning forecast model that combines two neural networks: we solve the ODE system using an Epidemiology Informed Neural Network (EINN), and we forecast using an adaptive neuro-fuzzy inference system (ANFIS). We call the combined approach the EINN-ANFIS model.

We present results of the implementation of ANFIS, EINN-ANFIS, LSTM, and EINN-LSTM for COVID-19 data from Alabama, Missouri, Tennessee, and Florida from March 2020 to September 2021. In the ANFIS approach, we used 4 regressors, 12 membership rules, and a learning rate of 0.002. Training was done using 300 epochs, with the Adam optimizer and the mean squared error as the loss function. The EINN-ANFIS is a hybrid neural network, where EINN is first used to train on the daily cases dataset and a second round of training is done using ANFIS. In the LSTM approach, we used 4 inputs, which correspond to the daily cases at times $t$, $t + 1$, $t + 2$, and $t + 3$. The Adam optimizer is also used in training the LSTM with 20 epochs, and the loss function also uses the [...]

As can be observed from Tables (6)-(9), EINN-ANFIS is an improvement over ANFIS and, similarly, EINN-LSTM is an improvement over LSTM.

The following error metrics are used in our data-driven simulation:

• Root Mean Square Error (RMSE):

where $Y$ and $\tilde{Y}$ are the predicted and original values, respectively.

• Mean Absolute Error (MAE):

• Mean Absolute Percentage Error (MAPE):

• Root Mean Squared Relative Error (RMSRE):

where $N_s$ represents the sample size of the data.
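The four metrics listed above can be computed as in the sketch below. It uses the commonly used definitions, which are an assumption here: relative errors are taken with respect to the original values, and MAPE is returned as a fraction rather than multiplied by 100. The function names and the sample values are hypothetical.

```python
import numpy as np

def rmse(y_pred, y_true):
    """Root mean square error."""
    return np.sqrt(np.mean((y_pred - y_true) ** 2))

def mae(y_pred, y_true):
    """Mean absolute error."""
    return np.mean(np.abs(y_pred - y_true))

def mape(y_pred, y_true):
    """Mean absolute percentage error, as a fraction (y_true must be nonzero)."""
    return np.mean(np.abs((y_pred - y_true) / y_true))

def rmsre(y_pred, y_true):
    """Root mean squared relative error (y_true must be nonzero)."""
    return np.sqrt(np.mean(((y_pred - y_true) / y_true) ** 2))

# Hypothetical daily-case values, used only to exercise the functions.
y_true = np.array([100.0, 200.0, 400.0])
y_pred = np.array([110.0, 190.0, 400.0])
scores = {
    "RMSE": rmse(y_pred, y_true),
    "MAE": mae(y_pred, y_true),
    "MAPE": mape(y_pred, y_true),
    "RMSRE": rmsre(y_pred, y_true),
}
```

The relative metrics (MAPE and RMSRE) are scale-free, which makes them convenient when comparing states with very different case counts, but they are undefined on days where the original value is zero.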

In Table 10, we provide a comparison of error metrics for EINN using random splits for the training and test data.

We have presented a data-driven deep learning algorithm that learns the time-varying transmission rates of multiple variants of an infectious disease such as COVID-19. The algorithm learns the nonlinear time-varying transmission rates without a pre-assumed pattern and also predicts the daily cases and daily recovered populations.

We learn these population groups using only daily cases data. This approach is useful when the dynamics of an epidemiological model such as an SEIR model are impacted by various mitigation measures. The algorithm presented in this paper can be adapted to most epidemiology models. Using US daily cases data, we demonstrate that the algorithm can be combined with recurrent neural networks and ANFIS for an improved short-term forecast. This study is useful in the event of a pandemic such as COVID-19, where public health interventions and public response and perceptions interfere with the interaction of the compartments in an epidemiology model.

The computer codes will be available at https://github.com/okayode/EINN-COVID.

[1] Archived: WHO Timeline - COVID-19

[2] Making sense of coronavirus mutations

[3] Delta variant: What we know about the science

[4] Data-driven deep-learning algorithm for asymptomatic COVID-19 model with varying mitigation measures and transmission rate

[5] Data-driven deep learning algorithms for time-varying infection rates of COVID-19 and mitigation measures

[6] Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals and Systems

[7] Approximation capabilities of multilayer feedforward networks

[8] Physics informed deep learning: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations

[9] On parameter estimation approaches for predicting disease transmission through optimization, deep learning and statistical inference methods

[10] Identification and prediction of time-varying parameters of COVID-19 model: a data-driven deep learning approach

[11] Fast estimation of time-varying infectious disease transmission rates

[12] Inferring the causes of the three waves of the 1918 influenza pandemic in England and Wales

[13] Variant proportions

[14] Fractional model for the spread of COVID-19 subject to governmental intervention and public perception

[15] Ordinary Differential Equations: Basics and Beyond

[16] Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission

[17] Deep Learning

[18] Development of an artificial neural network as a tool for predicting the targeted phenolic profile of grapevine Vitis vinifera foliar wastes

[19] Genomic selection using principal component regression

[20] Time series forecasting of COVID-19 transmission in Canada using LSTM networks

[21] Application of the hybrid ANFIS models for long term wind power density prediction with extrapolation capability

[22] Adaptive neuro-fuzzy inference system in modelling fatigue life of multidirectional composite laminates

[23] ANFIS: adaptive-network-based fuzzy inference system