key: cord-0529452-6a2xe76y
authors: Wang, Ting; Li, Gui-Yun; Li, Xin-Hui; Zhou, Chi-Chun; Wang, Yuan-Yuan; Li, Li-Juan; Yang, Yan-Ting
title: To Simulate the Spread of Infectious Diseases by the Random Matrix
date: 2022-04-11
journal: nan
DOI: nan
sha: 46509bead34b0ef744fbd9b4515dd86eb95d2ce8
doc_id: 529452
cord_uid: 6a2xe76y

The main aim of building models capable of simulating the spreading of infectious diseases is to control them. Along this way, the key to finding the optimal strategy for disease control is to obtain a large number of simulations of disease transmission under different scenarios. Therefore, models that can efficiently simulate the spreading of diseases under scenarios close to reality are preferred. In realistic social networks, random contact, including contact between people in public places and on public transit, has become an important route for the spreading of infectious diseases. In this paper, a model that can efficiently simulate the spreading of infectious diseases under random contact is proposed. In this approach, the random contact between people is characterized by a random matrix with randomly generated elements, and the spread of the disease is simulated by a Markov process. We report an interesting property of the proposed model: the main indicators of the spreading of the disease, such as the death rate, are invariant of the size of the population. Therefore, representative simulations can be conducted on models consisting of a small number of individuals. The main advantage of this model is that it can easily simulate the spreading of diseases under more realistic scenarios and is thus able to provide the large number of simulations needed for the search for the optimal control strategy. Based on this work, reinforcement learning will be introduced to give the optimal control strategy in following work.

Infectious diseases, especially malignant infectious diseases that can spread on a large scale and cause large numbers of deaths, seriously threaten the safety of individual lives and the social order [1], and are major problems in modern public health. For example, the Black Death in the mid-14th century swept across Europe, killing about 1/3 of the European population at that time [2]. The SARS outbreak in 2003 is estimated to have cost the global economy more than 50 billion dollars [3]. The 2019 novel coronavirus pneumonia has killed more than 6 million people; the epidemic is still occurring locally, and the death toll will continue to rise. Moreover, such diseases cause large economic losses, pushing the world economy into a downturn.

Establishing a model capable of simulating the spreading of infectious diseases is important for predicting the trend of infectious diseases and for providing theoretical guidance on disease prevention measures. Along this way, the key to finding the optimal strategy for disease control is to obtain a large number of simulations of disease transmission under different scenarios close to reality. Therefore, we need models that can efficiently simulate the spreading of diseases under realistic scenarios.

Research on infectious disease models can be traced back to 1760, when the Swiss mathematician and physicist Daniel Bernoulli used a mathematical model to study the effect of inoculation on the spreading of smallpox [4]. In 1906, Hamer used a discrete model to study the recurrent epidemics of measles [5]. In 1911, Ross used differential equations to study the transmission of malaria between mosquitoes and people, proving that when the number of mosquitoes is reduced below a critical value, the outbreak of malaria can be controlled [6].

Current infectious disease models can be divided into three types. (1) Methods based on differential equations. In this approach, the dynamic process of disease transmission is simulated by setting up different compartments to represent people in different disease states [7, 8]. For example, the SIR model, with S, I, and R the susceptible, infectious, and removed populations, is a typical compartment model. The disease transmission is simulated by the ordinary differential equations dS/dt = −βSI, dI/dt = βSI − γI, and dR/dt = γI, where the parameters β and γ stand for the infection rate and the recovery rate respectively [9]. In order to simulate the spreading of diseases under more realistic scenarios, various compartment models have been proposed. For example, the SEIR model [10] (with S, E, I, and R the susceptible, exposed, infected, and recovered respectively), the SIRS model [10] (with S, I, R, and S the susceptible, infectious, recovered, and susceptible respectively), and models that consider asymptomatic carriers [11] introduce more classes or compartments according to the epidemiological status. Beyond the deterministic compartment models, stochastic compartment models have been introduced to account for stochastic factors [9, 12-14], and other types of equations, such as delay differential equations [10, 15], nonlinear Volterra integral equations [16], and fractional differential equations [14], have been introduced into the compartment models. The differential equation based compartment models are indeed able to simulate various kinds of infectious diseases [17-20] and to evaluate the effectiveness of control measures [21, 22]. However, they ignore individual differences [14], are unable to account for a disease's true infectious period distribution [16], and are inflexible when simulating realistic scenarios such as random contact in public places.
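To make the SIR equations quoted above concrete, the following is a minimal sketch that integrates them with the forward Euler method. It is an illustration only and is not taken from the cited works: the function name `simulate_sir`, the step size, and the parameter values β = 0.5 and γ = 0.1 are assumptions chosen for the example, and S, I, and R are treated as population fractions.

```python
def simulate_sir(beta, gamma, s0, i0, r0, dt=0.01, steps=10000):
    """Forward-Euler integration of
    dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I.
    S, I and R are population fractions (an assumption of this sketch)."""
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt   # flow from S to I in one step
        new_removals = gamma * i * dt        # flow from I to R in one step
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


# Illustrative parameter values only; they are not estimated from data.
history = simulate_sir(beta=0.5, gamma=0.1, s0=0.99, i0=0.01, r0=0.0)
peak_infected = max(i for _, i, _ in history)
```

Because every flow leaves one compartment and enters another, S + I + R stays constant in this scheme, which is a quick sanity check on the integration.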

(2) Methods based on cellular automata. Unlike the compartment models, the cellular automaton is an individual-based model, where each individual of the population is represented by a cell of the automaton, and the transmission dynamics are simulated by a set of simple rules that encode the individual behaviour [23]. Models based on cellular automata can simulate more realistic scenarios. For example, they can simulate the impact of lockdown, migration, and vaccination on COVID-19 dynamics [24], consider a social community with varying sex ratio, age structure, population movement, incubation and treatment period, and immunity [25], and address population heterogeneity and distribution in epidemic models [23, 26]. Beyond the conventional cellular automata models, probabilistic cellular automata models [27], models that consider stochastic modeling [28], and models that combine differential equations with cellular automata [29] have been proposed. Models based on cellular automata consider individual differences and can simulate more realistic scenarios. However, they are limited by their definitions of cell morphology and neighbor rules.

(3) Methods based on complex networks. In this approach, the contact between people in real social networks is considered [30, 31], and the models are thus able to simulate epidemic spreading under scenarios closest to the real world. In a complex network, nodes represent individuals and the connections between individuals are characterized by links between nodes [30, 31]. To simulate epidemic spreading, one has, for example, to solve ordinary differential equations on that network [32]. However, solving epidemic models on the underlying topology of a complex network is difficult. For example, Monte Carlo simulations become impractical for networks with a large system size because of their memory requirements [33]. Therefore, unlike the compartment models and the cellular automata models, models based on complex networks focus on the profound impact of the complex properties of real-world networks on the behavior of equilibrium and nonequilibrium phenomena [31], or on the thresholds [34, 35] of epidemic spreading. For example, an epidemic threshold that is inversely proportional to the largest eigenvalue of the connectivity matrix has been proposed [36], and the immunization and epidemic threshold of an SIS model in complex networks has been investigated [37, 38]. Moreover, various kinds of complex networks have been investigated, such as correlated complex networks [33] and scale-free complex networks [32, 38]. Models based on complex networks are indeed the closest to reality. However, because of their complexity, it is difficult to obtain a large number of simulations on those models in a given time.

In realistic social networks, on the one hand, random contact, including contact between people in public places and on public transit, has become an important route for the spreading of infectious diseases. On the other hand, the situations are complicated. For example, individuals differ from one another: those in good physical condition may become asymptomatic carriers, and those with a strong sense of prevention may get vaccinated or wear masks. Therefore, in order to obtain the large number of simulations of disease spreading needed to search for the optimal strategy of disease control, a model that can simulate disease spreading under various conditions and quickly return a simulation result is needed.

In this paper, a model that can efficiently simulate the spreading of infectious diseases under random contact is proposed. In this approach, the random contact between people is characterized by a random matrix with randomly generated elements, and the spreading of the disease is simulated by a Markov-like process, a process that differs slightly from a Markov process. We report an interesting property of the proposed model: the main indicators of the spreading of the disease, such as the death rate, are invariant of the size of the population. Therefore, representative simulations can be conducted on models consisting of a small number of individuals. The main advantage of this model is that it can easily simulate the spreading of diseases under more realistic scenarios and is thus able to provide the large number of simulations needed for the search for the optimal control strategy. For example, we simulate over 7 scenarios in which different infectious diseases, characterized by their infections and deaths, spread under different control measures. This work is the first in a series of studies aimed at giving the optimal disease control strategy. Based on this work, reinforcement learning will be introduced to give the optimal control strategy in following work.

The paper is organized as follows: In Sec. 2, we introduce the random matrix and establish the connection between the random matrix and random contact. In Sec. 3, the main method is introduced. In Sec. 4, simulations under different scenarios are conducted. Conclusions and discussions are given in Sec. 5.

In this section, we introduce the main method, including the approach of simulating random contact by the random matrix and the rule by which a disease spreads on the random matrix.

In this section, we introduce the approach of simulating random contact by the random matrix.

In this work, the random matrix plays the role of what is called the probability transition matrix in mathematics, which is a basic tool for characterizing Markov processes [39]. Random matrices can be used to describe complex systems with random interactions, for example, physical systems in non-equilibrium states [40], complex communication systems [41], and the complex social network of a population affected by an infectious disease.

In this approach, we consider a group of individuals indexed from 1 to $N$. The social connections of these individuals are characterized by a random matrix $A$ of order $N$. The proposed model rests on two basic assumptions. (1) The element $A_{i,j}$ at the $i$th row and $j$th column represents the connection between individuals $i$ and $j$. If there is a connection, either a fixed connection or a random contact, between individuals $i$ and $j$, the matrix element at position $(i, j)$ is assigned a non-negative value. Otherwise the element is 0. The larger the value, the closer the contact. For example, a random contact with a stranger on a bus is characterized by 0.01 and a contact with a family member by 0.5. In order to simulate both fixed connections, say the contact between family members and colleagues, and random contact, say the contact between strangers in public places, we assign randomly generated values, under certain constraints, to the elements of the matrix. As a result, a symmetric random matrix is generated. (2) The element at the $i$th row and $j$th column also represents the exposure associated with spreading the disease from the $i$th individual to the $j$th individual and vice versa. Accordingly, the diagonal elements represent the recover coefficient of the disease for each individual. For the sake of clarity, we give an example where 100 individuals are considered, as shown in Fig. (1). In this example, each individual has connections with 4 other individuals on average. The number of contacts for each individual is rounded from a value sampled from a normal distribution with mean 4 and variance 4. The matrix elements are either 0 or sampled from a normal distribution with mean 0.4 and variance 0.2. For the sake of convenience, the matrix elements are called the contact coefficients, or equivalently the exposure coefficients. To summarize, the average number of contacts and the average exposure coefficient are the two constraints applied to the generation of the random matrix, and they reflect the level of social activity.

Figure 1. An example of simulating the fixed connections and random contact between 100 individuals by the random matrix: the matrix and the corresponding network. Individual 2 has a closer connection with individual 100 than with individual 1. The recover coefficient for each individual is 0.9.
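The construction described above can be sketched in Python as follows. This is a hedged illustration, not the authors' code: the function name `build_contact_matrix`, the greedy way links are paired so that the matrix stays symmetric, and the use of the absolute value to keep sampled coefficients non-negative are all assumptions, since the text does not specify these details. Only the distribution parameters (mean 4 and variance 4 for the number of contacts, mean 0.4 and variance 0.2 for the coefficients, recover coefficient 0.9) come from the example in the text.

```python
import numpy as np


def build_contact_matrix(n=100, mean_contacts=4.0, var_contacts=4.0,
                         mean_coef=0.4, var_coef=0.2, recover_coef=0.9, seed=0):
    """Symmetric random contact matrix with the recover coefficient on the diagonal."""
    rng = np.random.default_rng(seed)
    A = np.zeros((n, n))
    # Target number of contacts: rounded normal sample, kept within [0, n - 1].
    target = np.rint(rng.normal(mean_contacts, np.sqrt(var_contacts), size=n))
    target = np.clip(target, 0, n - 1).astype(int)
    deg = np.zeros(n, dtype=int)  # contacts realised so far
    for i in range(n):
        missing = target[i] - deg[i]
        if missing <= 0:
            continue
        # Assumption: partners are drawn among individuals that are not yet
        # linked to i and have not reached their own target.
        cand = np.flatnonzero((A[i] == 0) & (deg < target))
        cand = cand[cand != i]
        if cand.size == 0:
            continue
        partners = rng.choice(cand, size=min(missing, cand.size), replace=False)
        # Assumption: the absolute value keeps the coefficients non-negative.
        coefs = np.abs(rng.normal(mean_coef, np.sqrt(var_coef), size=partners.size))
        A[i, partners] = coefs
        A[partners, i] = coefs  # same value both ways, so A is symmetric
        deg[i] += partners.size
        deg[partners] += 1
    np.fill_diagonal(A, recover_coef)
    return A
```

With this pairing rule the realised number of contacts of an individual never exceeds its sampled target, so the realised average is close to, and at most equal to, the targeted average.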

In this section, we introduce the rule by which a disease spreads on the random matrix.

After the construction of the random matrix characterizing the social connections within a given group of individuals, we set up the following rules to simulate the spreading of the disease. (1) A vector $x$ of size $N \times 1$, named the criterion vector, characterizes the disease infection status of each individual. The $i$th component of the criterion vector measures the exposure of individual $i$ to the disease. (2) If the exposure to the disease is below a certain value, the pathogenic threshold, the individual is healthy. Otherwise, the individual becomes infected. If the exposure goes beyond another given value, the lethal threshold, the individual dies. If the exposure returns to a value below the pathogenic threshold, the individual has recovered. The diseases are mainly characterized by the recover coefficient and the pathogenic and lethal thresholds. (3) The spreading of the disease is simulated by treating the random matrix $A$ as the transition probability matrix and the criterion vector $x$ as the state vector in a Markov process. In a transition probability matrix, each row sums to 1, so that the total probability is conserved. Here, however, the elements of the matrix are not regarded as probabilities and the rows of the random matrix do not sum to 1. Therefore, we call it a Markov-like process. The spreading is described by the following equation,

$$x_{n+1} = A x_{n},$$

where $x_{n}$ is the criterion vector of the current round and $x_{n+1}$ is the criterion vector of the next round. For the sake of clarity, we give an example where 10 individuals are considered, as shown in Fig. (2). To summarize, in this approach, the diseases are characterized by the recover coefficient and the pathogenic and lethal thresholds, and the social connections are described by the random matrix generated under the constraints given by the average number of contacts and the average exposure coefficient. By applying these rules, we can simulate the spreading of different diseases under various realistic situations, as shown in Fig. (3).
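One round of the rule described above can be sketched as a matrix-vector product followed by a threshold classification. The sketch below is an assumption-laden illustration rather than the authors' implementation: the function name `step`, the status codes, the 3 × 3 matrix, the threshold values 0.3 and 1.0, and the treatment of the dead (their exposure and their matrix rows and columns are set to 0) are all choices made for this example.

```python
import numpy as np

HEALTHY, INFECTED, DEAD = 0, 1, 2


def step(A, x, dead, path_thre, leth_thre):
    """One round of the Markov-like process: multiply, then classify by thresholds."""
    x_next = A @ x
    dead = dead | (x_next > leth_thre)
    # Assumption: the dead neither carry nor transmit exposure any more.
    x_next = np.where(dead, 0.0, x_next)
    A = A.copy()
    A[dead, :] = 0.0
    A[:, dead] = 0.0
    status = np.where(dead, DEAD,
                      np.where(x_next >= path_thre, INFECTED, HEALTHY))
    return A, x_next, dead, status


# Hypothetical 3-individual example; diagonal = recover coefficient 0.9.
A = np.array([[0.9, 0.5, 0.0],
              [0.5, 0.9, 0.1],
              [0.0, 0.1, 0.9]])
x = np.array([1.0, 0.0, 0.0])      # assumption: individual 0 is the initial case
dead = np.zeros(3, dtype=bool)
for _ in range(2):
    A, x, dead, status = step(A, x, dead, path_thre=0.3, leth_thre=1.0)
```

In this toy run the first round gives exposures (0.9, 0.5, 0.0), and in the second round the exposure of individual 0 reaches 1.06, which exceeds the assumed lethal threshold, while individual 1 stays infected at 0.9 and individual 2 remains healthy at 0.05.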

In this section, before conducting simulations with the proposed model, we explain the model's parameters, since the estimation of a model's parameters is always important. Besides, we report an interesting property of the proposed model and give a simple discussion of the outbreak threshold of epidemic spreading.

Usually, a model's parameters should first be estimated and verified on real data, and only then can the model be used to simulate the spreading. In this work, the main aim is to propose a model that can give a large number of simulations in a given time to support the search for the optimal disease control strategy. Therefore, we focus on testing the flexibility of the proposed model in simulating various realistic situations by setting the model's parameters manually. The estimation of the parameters will be considered in further research.

In this section, we consider three types of diseases with low, medium, and high pathogenic and lethal thresholds. These three types of diseases represent the general and malignant infectious diseases. The social connections between individuals are described by the random matrix generated under the constraints given by the average number of contacts and the average exposure coefficient, and the matrix is altered during the simulation according to the control measures adopted. We set the average number of contacts to 4 and the average exposure coefficient to 0.13. An overview of the parameter settings is given in Table (3.1).

Table 1. The setting of the model's parameters. Av-exp-coe, Av-cont-num, Path-thre, Leth-thre, and Rec-coe stand for the average exposure coefficient, the average number of contacts, the pathogenic threshold, the lethal threshold, and the recover coefficient respectively.

Av-exp-coe Av-cont-num Path-thre Leth-thre Rec-coe
Malignant 0.

Fig. (4) shows the spreading of these three types of diseases. The curve trends and the infected and dead populations differ between the three types.

Usually, to simulate the spreading of diseases in situations close to reality, one needs to simulate a system with a large number of individuals, for example, a small town of 100 thousand people. In this section, we report an interesting property of the proposed model: the model is invariant of the size. That is, the main indicators of disease transmission, such as the death rate and the infection rate of the simulations, remain stable and are invariant of the population size, as shown in Table (3.2).

To investigate this interesting property, we also repeat the experiments on community networks, a kind of network describing the social connections of individuals in the real world. The random matrix describing a community network can be easily constructed. For the network proposed in this work, the elements of the matrix are sampled from a Gaussian distribution under the given constraints, see section 2.1. For the community networks, the random matrix is generated in the same way between individuals within a community, whereas between individuals in different communities only a few selected individuals, say 5, have connections. The results of this second set of experiments are shown in Table (3.2).

In both sets of experiments, the main indicators remain invariant of the size of the population. It means that, without loss of generality, the following simulations can be conducted on a model with a small number of individuals, say 100, which saves considerable computing resources. Moreover, Fig. (5) shows the difference between simulations of the spreading of diseases on the network proposed in this work and on the community networks. Although the two random matrices are constructed differently, constructing the random matrix for either network is easy. Fig. (5) shows that the spreading on the community network can be divided into different stages, while that on the network proposed in this work has only one stage. This is because the spreading of the disease in different communities is not synchronous. In the following simulations, we consider 100 individuals. Here we only report this interesting property; further research on this phenomenon will be carried out in later work.
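A community network of the kind discussed above can be sketched as a block matrix: dense random blocks on the diagonal for contacts inside each community and a handful of links between communities. The code below is only a sketch under several assumptions. The names `random_block` and `build_community_matrix` are hypothetical, the within-community block uses a simplified stand-in (each pair is linked independently) instead of the sampled contact numbers of section 2.1, and "say 5" is read as at most 5 links between each pair of communities.

```python
import numpy as np


def random_block(m, mean_contacts, mean_coef, std_coef, rng):
    """Simplified within-community contacts (assumption: independent links)."""
    p = min(1.0, mean_contacts / (m - 1))          # link probability
    upper = np.triu(rng.random((m, m)) < p, k=1)   # links above the diagonal
    coef = np.abs(rng.normal(mean_coef, std_coef, size=(m, m)))
    block = np.where(upper, coef, 0.0)
    return block + block.T                         # symmetric, zero diagonal


def build_community_matrix(n_communities=4, m=25, bridges=5, mean_contacts=4.0,
                           mean_coef=0.13, std_coef=0.05, recover_coef=0.9, seed=0):
    rng = np.random.default_rng(seed)
    n = n_communities * m
    A = np.zeros((n, n))
    for c in range(n_communities):
        s = slice(c * m, (c + 1) * m)
        A[s, s] = random_block(m, mean_contacts, mean_coef, std_coef, rng)
    # Assumption: at most `bridges` links between each pair of communities.
    for a in range(n_communities):
        for b in range(a + 1, n_communities):
            rows = rng.integers(a * m, (a + 1) * m, size=bridges)
            cols = rng.integers(b * m, (b + 1) * m, size=bridges)
            coefs = np.abs(rng.normal(mean_coef, std_coef, size=bridges))
            A[rows, cols] = coefs
            A[cols, rows] = coefs
    np.fill_diagonal(A, recover_coef)
    return A
```

The sparse off-diagonal blocks are what delay the arrival of the disease in other communities, which is consistent with the staged spreading described for Fig. (5).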

In this section, we investigate the outbreak threshold of epidemic spreading. Instead of a rigorous analysis, we give a simple discussion of whether or not an infectious disease will break out. Further investigation of the equilibrium and nonequilibrium phenomena and the thresholds of epidemic spreading will be given in later work.

To explore the outbreak threshold of epidemic spreading, we take advantage of the efficiency of the proposed method and obtain a large number of simulations in which different diseases spread in various situations. Here, an outbreak is defined as the case where over 70% of the population is infected. Fig. (6) shows that there is a clear boundary between the outbreak and no-outbreak cases. For example, diseases with a low pathogenic threshold do not break out regardless of the average number of contacts in the social network.

In this section, we simulate the spreading of different diseases under different realistic scenarios. The diseases are mainly characterized by the pathogenic and lethal thresholds, and the realistic scenarios are simulated by the random matrix. For instance, to simulate the spreading of a disease under the scenario where the infected are isolated, the random matrix is altered after each round: the elements corresponding to the infected individuals are all set to 0.

In this section, we consider the effectiveness of passive quarantine. The quarantine can be simulated simply by setting the matrix elements corresponding to an individual to 0 once we find that this individual needs to be quarantined. In this simulation, only those who go to hospital seeking treatment themselves are considered for quarantine. Usually, not all infected individuals go to hospital seeking medical treatment, and not all of them are diagnosed, because of misdiagnosis. Therefore, in this simulation, we set the probability of being diagnosed to 0.3. That is, of those who are infected, only 30% will be diagnosed and quarantined. Fig. (7) and Table (4.1) show that the quarantine contributes greatly to reducing the infected and death cases in the spreading of the diseases. Further simulations, where the probability of being diagnosed is 0.4, 0.2, and 0.1 respectively, are shown in Fig. (8). The simulations suggest that the probability of an infected individual being diagnosed is crucial for the number of deaths. For example, when the probability is below 0.1, there are deaths.
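The passive quarantine step can be sketched as a masked update of the contact matrix. This is a minimal illustration and not the authors' code: the function name `passive_quarantine` is hypothetical, and keeping the diagonal (the recover coefficient) of a quarantined individual unchanged is an assumption, made so that the isolated individual can still recover on their own; the text only says that the corresponding elements are set to 0.

```python
import numpy as np


def passive_quarantine(A, infected, quarantined, p_diagnosed, rng):
    """Diagnose each infected, not yet quarantined individual with probability
    p_diagnosed and cut all of their contacts."""
    candidates = infected & ~quarantined
    diagnosed = candidates & (rng.random(A.shape[0]) < p_diagnosed)
    A = A.copy()
    diag = np.diag(A).copy()
    A[diagnosed, :] = 0.0          # no exposure received from others
    A[:, diagnosed] = 0.0          # no exposure passed on to others
    np.fill_diagonal(A, diag)      # assumption: recover coefficient is kept
    return A, quarantined | diagnosed
```

Calling this function once per round with `p_diagnosed=0.3` would correspond to the setting described above, where only 30% of the infected are diagnosed and quarantined.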

In this section, we consider the effectiveness of active quarantine. Unlike in passive quarantine, in active quarantine we conduct disease detection for all individuals, and those found to be infected are quarantined. Active quarantine therefore also covers those who do not go to hospital for treatment themselves. The method used for disease detection is imperfect, that is, it cannot find all infected individuals, so we simulate the cases where the detection recall is 99%, 90%, and 80% respectively. In real life, conducting an overall disease detection requires considerable human and financial resources. Therefore, in the simulation, the overall disease detection is conducted every 5 and 10 rounds. Figs. (9) and (10) show the effectiveness of active quarantine. The simulations of the proposed model show that the overall disease detection helps little with controlling the spreading of the diseases. Increasing the frequency and accuracy of the overall disease detection also contributes little to controlling the spreading.
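The two ingredients that distinguish active from passive quarantine, the detection period and the imperfect recall, can be sketched as follows. The function name `overall_detection` and its signature are hypothetical, and modelling recall as an independent detection probability for each infected individual is an assumption of this sketch.

```python
import numpy as np


def overall_detection(infected, round_index, period, recall, rng):
    """Return a boolean mask of the infected individuals found in this round.
    Detection only runs every `period` rounds; each infected individual is
    found independently with probability `recall` (assumption)."""
    if round_index % period != 0:
        return np.zeros_like(infected)   # no overall detection in this round
    return infected & (rng.random(infected.size) < recall)
```

The returned mask can then be used to set the contacts of the detected individuals to 0, in the same way as for passive quarantine.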

In this section, we consider the effectiveness of the green code, a measure used in China to control COVID-19. Under this control measure, not only those who are diagnosed but also those who have had direct contact with the infected, namely the close contacts, are quarantined. In these simulations, those who go to hospital seeking medical treatment themselves may be diagnosed as infected. The results of the simulations are shown in Figs. (11) and (12). They show that the green code measure clearly contributes to controlling the spreading of the diseases. In particular, the curve trends in Fig. (12) show that the infected population is clearly reduced.
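In terms of the contact matrix, the close contacts of a diagnosed individual are the individuals with a non-zero off-diagonal element in that individual's row. The sketch below illustrates this reading; the function name `green_code_quarantine` is hypothetical, and keeping the diagonal recover coefficient of quarantined individuals is an assumption of the sketch rather than something stated in the text.

```python
import numpy as np


def green_code_quarantine(A, diagnosed):
    """Quarantine the diagnosed individuals together with their direct contacts."""
    off_diag = A.copy()
    np.fill_diagonal(off_diag, 0.0)
    # Anyone with a non-zero link to at least one diagnosed individual.
    close_contacts = (off_diag[diagnosed] > 0).any(axis=0)
    to_quarantine = diagnosed | close_contacts
    A_new = A.copy()
    diag = np.diag(A).copy()
    A_new[to_quarantine, :] = 0.0
    A_new[:, to_quarantine] = 0.0
    np.fill_diagonal(A_new, diag)  # assumption: recover coefficient is kept
    return A_new, to_quarantine
```

For a chain of contacts 0-1-2-3 in which individual 0 is diagnosed, this quarantines individuals 0 and 1 and leaves the link between 2 and 3 untouched.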

In this section, we consider the effectiveness of the green code combined with overall detection. That is, overall detection is conducted to find the infected population, including those who never go to hospital. Moreover, both the infected and their close contacts are quarantined.

The results of the simulations are shown in Fig. (13). They show that the green code measure and overall detection together contribute to controlling the spreading of the diseases to some degree. The control measure is more effective for the malignant infectious disease. The simulation suggests that finding the infected and their close contacts and quarantining them is the key to effectively controlling the spreading of the disease.

In this section, we consider asymptomatic carriers. Asymptomatic carriers are individuals who are infected and contagious but show no signs or symptoms. Usually, asymptomatic carriers are hard to find. In the following simulations, we consider the spreading of diseases with asymptomatic carriers under the strictest control measures, the green code combined with overall detection. Here, we assume that the asymptomatic carriers cannot be found even by the overall detection.

The results of the simulations are shown in Fig. (14). They show that the asymptomatic carriers are the main group that breaks down the strict control measures, the green code and overall detection. In particular, when over 10% of the population are asymptomatic carriers, there are deaths.

In this section, we consider the spreading of diseases with vaccination. Vaccination is a useful method for controlling the spreading of diseases. In this simulation, we assume that vaccination increases resistance to the pathogen. That is, the pathogenic and lethal thresholds stay the same, but the recover coefficient is lowered. The lower the recover coefficient, the quicker the infected recover. Here, after vaccination, the recover coefficient is multiplied by 0.7.
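The effect of the factor 0.7 can be illustrated with a short sketch. The function names `vaccinate` and `rounds_to_recover` are hypothetical, and the second function makes a simplifying assumption that is not in the text: the individual is isolated, so the exposure simply decays by the recover coefficient each round. The pathogenic threshold 0.3 and the initial exposure 1.0 are illustrative values.

```python
import numpy as np


def vaccinate(A, vaccinated, factor=0.7):
    """Multiply the recover coefficient (diagonal) of vaccinated individuals."""
    A = A.copy()
    idx = np.flatnonzero(vaccinated)
    A[idx, idx] *= factor
    return A


def rounds_to_recover(x0, recover_coef, path_thre):
    """Rounds needed for an isolated individual (assumption: no contacts)
    to fall below the pathogenic threshold."""
    rounds = 0
    x = x0
    while x >= path_thre:
        x *= recover_coef
        rounds += 1
    return rounds


unvaccinated_rounds = rounds_to_recover(1.0, 0.9, 0.3)        # coefficient 0.9
vaccinated_rounds = rounds_to_recover(1.0, 0.9 * 0.7, 0.3)    # coefficient 0.63
```

Under these assumptions, an unvaccinated individual needs 12 rounds to fall below the threshold (0.9^11 ≈ 0.314, 0.9^12 ≈ 0.282), whereas a vaccinated individual needs 3 (0.63^2 ≈ 0.397, 0.63^3 ≈ 0.250).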

The results of the simulations are shown in Fig. (15). They show that vaccination can effectively control the spreading of diseases, especially for the general infectious diseases. However, vaccination alone is not enough to avoid deaths. That is, control measures such as quarantine and the green code are still needed.

In this section, we consider airborne diseases, for which wearing a mask can efficiently prevent the spreading. Here, we consider two kinds of mask: the medical mask and the general mask. In this simulation, the medical mask reduces the exposure coefficient by a factor of 0.9 and the general mask by a factor of 0.95. Without loss of generality, we assume that all individuals wear masks.
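Since all individuals are assumed to wear the same kind of mask, the measure amounts to rescaling every off-diagonal element of the contact matrix. The sketch below shows one way to do this; the function name `apply_masks` is hypothetical, and applying the factor once per link (rather than once per mask wearer) as well as leaving the diagonal recover coefficient unchanged are assumptions of this illustration.

```python
import numpy as np


def apply_masks(A, factor):
    """Scale all exposure coefficients (off-diagonal elements) by `factor`,
    e.g. 0.9 for the medical mask and 0.95 for the general mask."""
    diag = np.diag(A).copy()
    A_masked = A * factor
    np.fill_diagonal(A_masked, diag)  # the recover coefficient is not affected
    return A_masked
```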

The results of the simulations are shown in Fig. (16). They show that wearing a mask can control the spreading of diseases. In particular, for the general infectious diseases, the medical mask reduces the number of deaths to 0.

Establishing a model that can simulate the spread of infectious diseases is the key to studying the laws of infectious disease transmission and predicting the trend of transmission, and it is of great significance for controlling the spread of infectious diseases. To find the optimal control strategy, we need a large number of simulations to support the search. Therefore, one needs a model that can, on the one hand, simulate the spreading of diseases under realistic conditions and, on the other hand, quickly return the results of the simulations. In realistic social networks, on the one hand, random contact, including contact with strangers in public places, is becoming an increasingly important route for the spreading of diseases. On the other hand, individuals differ from one another, and these individual differences affect the spreading of diseases. For example, asymptomatic carriers cause a wider spread of the disease.

In this work, a model that can simulate the spreading of diseases under realistic situations with random contact is proposed. Unlike the conventional approaches, such as the differential equation based compartment models, the cellular automata, and the complex network based models, this approach describes the social connections by a random matrix generated under given constraints, namely the average number of contacts and the average exposure coefficient. The spreading of the disease is simulated by a Markov-like process, where the random matrix is treated as a probability transition matrix, with the exposure coefficients (the elements of the random matrix) playing the role of the transition probabilities. The diseases are characterized by the recover coefficient and the pathogenic and lethal thresholds. Instead of estimating the model's parameters from real data, we first choose the parameters manually, in order to simulate the spreading of diseases under various realistic scenarios. We report an interesting property of the proposed method: the major indicators, such as the infection and death rates, are almost invariant of the size of the model.

A set of experiments shows that the proposed model can efficiently simulate the spreading of various diseases under various realistic situations. Therefore, the proposed model can provide the large number of simulations needed to support the search for the optimal disease control strategy in following research. In the next step, the reinforcement learning method, a method that is good at finding optimal strategies, will be introduced. Together with the proposed model, it will allow a further investigation of the optimal control strategy for infectious diseases.

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work. There is no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the present manuscript.

[1] Massively parallel simulations of spread of infectious diseases over realistic social networks
[2] Selectivity of Black Death mortality with respect to preexisting health
[3] Economic effects of coronavirus outbreak (COVID-19) on the world economy
[4] Daniel Bernoulli's epidemiological model revisited
[5] Epidemic disease in England - the evidence of variability and of persistency of type
[6] The prevention of malaria
[7] The legacy of Kermack and McKendrick
[8] On the formulation of epidemic models (an appraisal of Kermack and McKendrick)
[9] Infectious disease spread analysis using stochastic differential equations for SIR model
[10] Modelling and analysis of SEIR with delay differential equation
[11] Implications of asymptomatic carriers for infectious disease transmission and control
[12] A stochastic differential equation SIS epidemic model
[13] A primer on stochastic epidemic models: Formulation, numerical simulation, and analysis
[14] Mathematical modeling of infectious diseases using ordinary and fractional differential equations
[15] Lyapunov functionals for delay differential equations model of viral infections
[16] A generalized differential equation compartmental model of infectious disease transmission
[17] Direct transmission via households informs models of disease and intervention dynamics in cholera
[18] Modeling the epidemiology of cholera to prevent disease transmission in developing countries
[19] Spatial heterogeneity, host movement and mosquito-borne disease transmission
[20] Modeling Wolbachia spread in mosquitoes through delay differential equations
[21] Infectious disease transmission models to predict, evaluate, and improve understanding of COVID-19 trajectory and interventions
[22] Occurrence of backward bifurcation and prediction of disease transmission with imperfect lockdown: A case study on COVID-19
[23] Addressing population heterogeneity and distribution in epidemics models using a cellular automata approach
[24] A model based on cellular automata for investigating the impact of lockdown, migration and vaccination on COVID-19 dynamics
[25] Modeling the spread of epidemics based on cellular automata
[26] Epidemiological modeling with a population density map-based cellular automata simulation system
[27] A data-driven understanding of COVID-19 dynamics using sequential genetic algorithm based probabilistic cellular automata
[28] Stochastic modeling for dynamics of HIV-1 infection using cellular automata: A review
[29] Spread of infectious disease modeling and analysis of different factors on spread of infectious disease based on cellular automata
[30] Epidemic processes on complex networks
[31] Epidemic processes in complex networks
[32] Epidemic spreading in scale-free networks
[33] Epidemic incidence in correlated complex networks
[34] Thresholds for epidemic spreading in networks
[35] Push- and pull-based epidemic spreading in networks: Thresholds and deeper insights
[36] Epidemic spreading in correlated complex networks
[37] Immunization and epidemic threshold of an SIS model in complex networks
[38] Epidemic spreading on a complex network with partial immunization
[39] Random matrix theory
[40] Random-matrix theories in quantum physics: common concepts
[41] Random matrix theory and wireless communications

We are very indebted to Prof. Wu-Sheng Dai for his enlightenment and encouragement.