key: cord-0581646-43vqes9w
authors: Wang, Yuan; Ishii, Hideaki; Bonnet, François; Défago, Xavier
title: Resilient Consensus for Multi-Agent Systems under Adversarial Spreading Processes
date: 2020-12-26
journal: nan
DOI: nan
sha: d551e59602a40ff38768dd0a157e41f896beabc4
doc_id: 581646
cord_uid: 43vqes9w

This paper addresses novel consensus problems for multi-agent systems operating in an unreliable environment where adversaries are spreading. The dynamics of the adversarial spreading processes follow the susceptible-infected-recovered (SIR) model, where the infection induces faulty behaviors in the agents and affects their state values. Such a problem setting serves as a model of opinion dynamics in social networks where consensus is to be formed at the time of a pandemic and infected individuals may deviate from their true opinions. The difficulty in ensuring resilient consensus among the noninfectious agents is that the number of infectious agents changes over time. We assume that a local policy maker announces the local level of infection in real time, which each agent can adopt for its preventative measures. It is demonstrated that this problem can be formulated as resilient consensus in the presence of the so-called mobile malicious models, where the mean subsequence reduced (MSR) algorithms are known to be effective. We characterize sufficient conditions on the network structures for different policies regarding the announced infection levels and the strength of the epidemic. Numerical simulations are carried out for random graphs to verify the effectiveness of our approach.

Recently, the COVID-19 pandemic has highlighted the necessity and effectiveness of peak control for unknown diseases. Since it may take up to several years to develop and supply vaccines sufficiently broadly [12], a large fraction of the population may become infected simultaneously, which would put huge pressure on hospitals and medical sectors. Initial measures such as keeping social distance and avoiding unnecessary gatherings are introduced to slow down the spread of the disease. By temporarily reducing contacts with others at proper periods, peaks can be reduced both in frequency and in the number of infected patients. Recent theoretical studies on peak control can be found in, e.g., [18], [21].

In this paper, we are interested in studying the impact of adversarial spreading processes in the context of multi-agent systems. In such systems, agents locally interact with each other to carry out certain global tasks, but infected agents may deviate from their normal behaviors, which may even be harmful to the system. Hence, the healthy agents must pay attention to the level of epidemics in the environment and accordingly regulate their interactions with others.

As the global task of the agents, we limit our study to the problem of consensus, which is also important in applications such as those in opinion dynamics from social networks [2], [9], [13], [31]. Under conventional consensus algorithms, it is well known that agents can become influential by simply being stubborn, keeping their state values unchanged. This means that infected agents may mislead the state values of others. In the worst case, they may split the agents into several clusters in their states and prevent them from forming consensus.

In particular, we formulate a novel problem where the agents may become infectious depending on the susceptible-infected-recovered (SIR) epidemic dynamics in the environment. Infected agents will behave unexpectedly in their values, and agents free from the infection should avoid using such state values when updating their own. Here, we employ resilient versions of consensus algorithms, which are designed to be used in the presence of adversaries, as proposed in works such as [19], [33], [35]. While we follow the line of research on fault-tolerant consensus algorithms (e.g., [19]), the main difference is that the number of faulty agents is time varying. This information may be partially known from a local policy maker who attempts to estimate the epidemic condition and makes announcements about it for the local area in real time. For example, medical experts may announce the current statistics of the number of patients through the media; we point out, however, that a policy maker will not assist the consensus forming that takes place locally among agents. We believe that the studied problem is helpful for enhancing the resilience of consensus-based applications under adversarial spreading processes, such as opinion dynamics at the time of a pandemic. To the best of our knowledge, this paper is the first to deal with such a problem.

It is emphasized that to deal with this problem, the notion of mobile malicious agents studied in [35] becomes critical. Most existing works on resilient consensus focus on adversaries that remain fixed at certain agents. Inspired by epidemic peak control [18] , [21] , we extend the mobile malicious model to the case with pandemics. Here, the malicious agents no longer move, but may infect other regular agents. For the infected agents, their behaviors as well as their values can be changed. Meanwhile, the infected agents will also recover from the infection at a certain rate. Once cured, they should follow the designed update rules even if their own values may be corrupted.

Related works: Epidemic spreading models have long been studied in many different fields including mathematical biology [16], computer science [22], [25], [29], social science [2], [13], [15], [30], [31] and so on. Various epidemic models have been studied, but the most common ones include the susceptible-infected-susceptible (SIS) model and the SIR model. In the traditional SIS model, the agents are divided into two groups taking the susceptible and infected states; the ratios of the two groups may exhibit dynamic behaviors over time. More recently, the SIS model has been combined with networks representing interactions of agents, where the pandemic process evolves over time-varying networks [26], [27], [34]. Similar trends can be found in the SIR model, where the agents may be in the susceptible, infected, or recovered state. Once agents have recovered from their infected states, they will not be infected again. Recent works on SIR-type models focus on improving the model based on real data [6], considering time delay issues [3], and applying the SIR model to other problems such as information source detection [7] and information epidemics in social networks [20]. In [24], the development, analysis and control problems for epidemic models are reviewed from the viewpoint of systems control. More recently, several works attempt to combine opinion dynamics models with epidemic models, e.g., [30], [37], where the two models are assumed to have the same network structure and the coupling in their dynamics is highlighted.

In contrast, our work focuses on consensus forming and follows the line of fault-tolerant algorithms for multi-agent systems. We assume that the infected agents follow the so-called malicious adversary model. The infected agents may lose their original state values and broadcast their corrupted values. The goal for other non-infected agents is to reach consensus at a safe value, within the range of the original values of the non-infectious agents. Moreover, our results are motivated by the class of Mean Subsequence Reduced (MSR) algorithms, which has been studied in [19], [33], [35]. In such algorithms, the non-infected agents ignore suspicious values from their neighbors. Such algorithms do not need to detect the adversarial neighbors, but require a certain level of connectivity in the network. In the area of opinion dynamics, MSR-like algorithms can be found, such as the Hegselmann-Krause (HK) model [2], [13], [31]. There, at each time step, each agent removes all values that are sufficiently different from its own opinion before taking the average of the rest in its update. Different from the MSR algorithm, the number of removed values is not restricted.

Other applications: There are other interpretations of the proposed two-layer multi-agent consensus model under adversarial spreading processes studied in this paper. One is to consider the epidemic and the behaviors of the malicious agents as outcomes of rumors spreading in the community. In fact rumors can be viewed as an "infection of the mind." Studies on transmission of rumors over the Internet and the behaviors of social networks can be found in, e.g., [14] , [23] , [38] . In this setting, one may obtain the information about the level of rumors spreading from SNS and other sources.

On the other hand, in the context of computer networks and wireless sensor networks, propagation of viruses may follow epidemic models as discussed in [8], [22], [29]. Indeed, distributed algorithms for consensus have a range of applications such as load balancing in multiprocessor networks [1], averaging [5], clock synchronization in wireless sensor networks [17], and rendezvous in robotic networks [28]. Cautions about viruses may be issued by security software companies, which individual agents can adopt.

Contributions: The contribution of this work is threefold: First, we extend the mobile adversary model of one-to-one mobile behavior to an epidemic case. In our previous work [35], a malicious agent moves to another agent and leaves a corrupted value. The agent that just recovered is considered as a cured agent. Note however that such an agent may have a corrupted value and hence should be treated as a malicious one for another round so as to apply a protocol to adjust its value. Second, we introduce an intervention measure to the epidemic model to enhance the resiliency of the consensus protocols employing the modified MSR algorithms. In conventional epidemic works such as [21], the intervention takes the form of the ratio at which the general public should avoid contacts with others. In this work, we relate such an intervention ratio to the parameters employed in MSR algorithms for pruning the neighbors' values; we formally analyze their role in resilient consensus. Third, based on the modified MSR algorithms used for mobile malicious models, we propose two protocols with static and adaptive policies for the intervention ratio globally announced and hence the pruning parameters locally at the agents. For both protocols, we characterize the graph conditions and tolerable pandemic level under which resilient consensus can be guaranteed. Compared with the conventional mobile malicious model, the epidemic malicious model is more powerful since it has dynamic malicious agents. The pruning number is no longer given, but has to be chosen properly according to the infection level at the time. The existence of pruning numbers that can guarantee resilient consensus in a given graph is also discussed.

Outline: This paper is organized as follows. In Section 2, preliminaries and the general problem setting are introduced. In Section 3, two protocols for resilient consensus are proposed based on static and real-time information regarding the infection. Our results for complete and noncomplete graphs are presented in Sections 4 and 5, respectively. We demonstrate the efficacy of the algorithms by numerical examples in Section 6. Section 7 gives concluding remarks. A preliminary version of this paper will appear as a conference paper [36]. The current paper contains all proofs for the theoretical results, and extensive simulations are carried out as well.

Denote by G = (V, E) the graph consisting of n nodes, where V = {1, 2, . . . , n} is the set of nodes and E ⊂ V × V is the set of edges. The edge (j, i) ∈ E indicates that node j can send a message to node i and is called an incoming edge of node i. Directed graphs are considered, in which (j, i) ∈ E does not necessarily imply (i, j) ∈ E. Let N i = {j : (j, i) ∈ E} be the set of in-neighbors of node i. The in-degree d i of node i indicates the number of its in-neighbors, i.e., d i = |N i |.

The path from node i_1 to node i_p is denoted as the sequence (i_1, i_2, ..., i_p), where (i_j, i_{j+1}) ∈ E for j = 1, ..., p − 1. The graph G is said to contain a spanning tree if from some node, there are paths to all other nodes in the graph. The graph G is partitioned into m disjoint subgraphs and we denote them by G_s = (V_s, E_s), s ∈ {1, ..., m}, where m ≤ n, V_s ∩ V_r = ∅ for all s, r ∈ {1, ..., m} with s ≠ r, V_1 ∪ · · · ∪ V_m = V and E_s ⊂ V_s × V_s.
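The graph notions above can be illustrated with a short Python sketch. The function names in_neighbors and has_spanning_tree and the four-node edge list are hypothetical and chosen only for illustration; the spanning-tree test simply searches for a node from which all other nodes are reachable, following the definition above.

```python
from collections import deque


def in_neighbors(n, edges):
    """Return {i: set of j such that (j, i) is an edge} for nodes 1..n."""
    nbrs = {i: set() for i in range(1, n + 1)}
    for j, i in edges:  # edge (j, i): node j sends a message to node i
        nbrs[i].add(j)
    return nbrs


def has_spanning_tree(n, edges):
    """True if some node has directed paths to all other nodes."""
    out = {i: set() for i in range(1, n + 1)}
    for j, i in edges:
        out[j].add(i)
    for root in range(1, n + 1):
        seen = {root}
        queue = deque([root])
        while queue:  # breadth-first search from the candidate root
            u = queue.popleft()
            for v in out[u] - seen:
                seen.add(v)
                queue.append(v)
        if len(seen) == n:
            return True
    return False


# Hypothetical directed graph on four nodes.
edges = [(1, 2), (2, 3), (3, 2), (1, 4)]
N = in_neighbors(4, edges)
d = {i: len(N[i]) for i in N}  # in-degrees d_i = |N_i|
print(N[2], d[2], has_spanning_tree(4, edges))
```

Here node 1 is a root of a spanning tree, while node 1 itself has no in-neighbors, which shows that (j, i) ∈ E does not imply (i, j) ∈ E.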

In this subsection, we provide an overview of the problem setting of resilient consensus for multi-agent systems in an environment where adversaries are spreading.

The overall system considered here consists of two layers, representing different models related to (i) the environment and (ii) multi-agent systems. This is shown in Fig. 1, where the two layers interact with each other and there is an overall feedback structure. The layer of the environment is determined by the SIR epidemic model. In the agent network layer, there are multiple local policy makers placed in the subgroups. They can estimate the ratio of the infection in the subgroup and then announce how every agent in the subgroup should reduce its interactions with others. The announcement is made by adjusting the so-called reduction parameter.

In the first layer, to describe the adversarial spreading, we consider a variant of the standard SIR spreading model with policy makers who attempt to regulate spreading of the adversaries by making announcements so that the possible malicious contacts within the environment decrease [21] . In the SIR model, the average fractions of the agents susceptible, infectious, and recovered at continuous time are denoted, respectively, by S(t), I(t), and R(t), where S(t) + I(t) + R(t) = 1 at all t. The adversaries spread at the transmitting rate β > 0 while the infection recovers at the recovering rate γ > 0. It is common to denote the basic reproduction number by R 0 = β/γ. It represents the reproduction ability, which may correspond to the strength of the general adversarial spreading behavior.

The role of the policy makers is to regulate the transmitting rates by forcing all the local members to reduce their contacts with others. The overall transmission reduction rate is represented by the global parameter b(t) ∈ [0, 1]. It can be regarded as the overall feedback control input from the agent network layer. This results in a smaller transmitting rate at b(t)β. In the conventional SIR model where no regulation is made, this parameter remains at b(t) ≡ 1.

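With the transmission reduction based on b(t) at time t, the pandemic process is described by the SIR model as

\[
\begin{aligned}
\dot{S}(t) &= -b(t)\beta S(t) I(t),\\
\dot{I}(t) &= b(t)\beta S(t) I(t) - \gamma I(t),\\
\dot{R}(t) &= \gamma I(t).
\end{aligned}
\tag{1}
\]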

In this paper, we deal with the system in the discrete-time domain by discretizing the SIR model above. Let ∆T be the sampling period. Denote the variables at time k∆T by S(k), I(k), and R(k) and so on. For simplicity, we take the sampling period for the epidemics to be the same as that of the agents. Based on the Euler method, when ∆T is sufficiently small, the continuous-time dynamics in (1) can be approximately described by the following discrete-time dynamics:

\[
\begin{aligned}
S(k+1) &= S(k) - b(k)\beta S(k) I(k)\Delta T,\\
I(k+1) &= I(k) + b(k)\beta S(k) I(k)\Delta T - \gamma I(k)\Delta T,\\
R(k+1) &= R(k) + \gamma I(k)\Delta T.
\end{aligned}
\tag{2}
\]

As mentioned above, b(k) is the key parameter to control the pandemic peak. It is determined by the policy maker who performs the analysis on I(k).

Different policies can be considered for the choice of the transmission reduction. We illustrate this point by a numerical example with the basic parameters taken as β = 0.4, γ = 0.1, and ∆T = 0.01. Notice that in the base case (i), the transmitting rate β is high, resulting in a peak of I(k) greater than 0.4. In the other cases, the peaks are about 0.3 or lower. The simple policy (ii) with b(k) ≡ 0.7 is the most demanding in terms of transmission reduction over time; the peak is smaller than 0.3, but occurs late, delaying the recovery. The adaptive policy (iii) is also demanding, but the reduction increases slowly in response to I(k); the peak is less than 0.25 and occurs early. The time-limited method (iv) is also effective in making the peak appear early in time.
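To make the comparison concrete, the following minimal sketch iterates the discrete-time model (2) for the two constant policies (i) b(k) ≡ 1 and (ii) b(k) ≡ 0.7 with β = 0.4, γ = 0.1, and ∆T = 0.01. The initial condition I(0) = 0.001, the time horizon, and the function names are assumptions made for illustration and are not taken from the example above.

```python
def sir_step(S, I, R, b, beta, gamma, dT):
    """One Euler step of the discrete-time SIR model (2)."""
    new_infections = b * beta * S * I * dT
    recoveries = gamma * I * dT
    return S - new_infections, I + new_infections - recoveries, R + recoveries


def peak_infection(b, beta=0.4, gamma=0.1, dT=0.01, I0=1e-3, horizon=300.0):
    """Largest I(k) over the horizon for a constant reduction parameter b."""
    S, I, R = 1.0 - I0, I0, 0.0  # assumed initial condition
    peak = I
    for _ in range(int(horizon / dT)):
        S, I, R = sir_step(S, I, R, b, beta, gamma, dT)
        peak = max(peak, I)
    return peak


peak_base = peak_infection(b=1.0)    # policy (i): no reduction
peak_static = peak_infection(b=0.7)  # policy (ii): constant reduction
print(f"peak (i): {peak_base:.3f}, peak (ii): {peak_static:.3f}")
```

Under these assumptions, the computed peaks agree with the values quoted above: greater than 0.4 in the base case and smaller than 0.3 for b(k) ≡ 0.7.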

The second layer is at the lower level, containing a number of agents represented by a network. As described above, the adversarial dynamics is represented by the SIR epidemic model. This determines the ratios of agents that are susceptible, infectious, and recovered. Now, we partition the graph into m subgraphs G_1, ..., G_m. The agents in each subgroup are subject to infections, though the infection ratios may differ among the subgroups. Moreover, each subgroup is equipped with a local policy maker who makes the local announcement for real-time transmission reduction in the subgroup. Since the agents in the subgroups follow the announcements, the overall infection dynamics in the environment is regulated, as we shall see later. The agent set V is partitioned into three disjoint sets: S(k) of susceptible agents, I(k) of infectious agents, and R(k) of recovered agents, where S(k) ∪ I(k) ∪ R(k) = V. Denote the local ratio of the infection I_s(k) by

We consider the heterogeneous case where I_s(k) differs among the subgroups. The global ratio of infection I(k) is

Then for all s ∈ {1, ..., m}, the local infectious ratio dynamics can be described by

where |w_s(k)| ≤ w ∈ [0, 1] is the real-time heterogeneity for subgraph G_s. Clearly, if w = 0, then it is a homogeneous spreading process for all subgroups.

In the agent network layer, there is a local policy maker for each subgroup s in charge of estimating the local ratio of the infection I_s(k) and then announcing how the agents in subgraph G_s should reduce their interactions with others. The announcement is made by adjusting the so-called local reduction parameter b_s(k). The global reduction parameter b(k) is the key parameter that provides a feedback control mechanism to suppress the infection peak in the SIR spreading process. It is represented by the interval containing all the local reduction parameters b_s(k) and thus given by

Furthermore, in the second layer, the regular agents in the susceptible and recovered states employ a resilient variant of consensus protocols called the MSR algorithms. In the application example of opinion dynamics under a pandemic, the states of the agents represent the opinions of the individuals on certain issues (social, political, and so on). Each agent will follow the level of the reduction parameter announced by its corresponding local policy maker. Specifically, the agents reduce their contacts by ignoring some of the values received from neighbors. Notice that due to the SIR epidemic model, the number of neighbors who may be infectious is time varying. In particular, the recovered agents have to be careful not to restart their consensus protocols using the values left from the infected periods. Under such circumstances, conventional resilient consensus algorithms are not capable of eliminating the adversarial effects. Instead, one must resort to the notion of mobile adversarial models, where the identities of the malicious agents switch, and also to algorithms robust to such models (e.g., [4], [11], [35]).

The objective of the multi-agent system in the lower layer is to reach consensus on their state values, and this must be achieved without being influenced by the infected agents. The susceptible agents may not become infected by solely interacting with infectious agents. However, the states of the infectious agents may be affected by the adversaries. Thus, the susceptible and recovered agents must take preventative measures at the time of updating their own state values.

In our problem setting, three issues are present, creating difficulties for the decision making of the agents to reach consensus. First, the infection may spread quickly depending on the parameters that determine the strength of the adversarial spreading, and large peaks may appear. Second, the ratio of infectious agents is unknown in general. In the second layer, this means that the identities of the infectious agents are unknown to the susceptible and recovered ones. As a third hurdle, we impose that the agents must continue with their interactions during the high peaks. Note that if sufficiently many agents become infected over time, the original values of their states may become lost from the system, which will make it difficult for the agents to arrive at a safe, reasonable value for consensus. We mention this point since in the SIR epidemic model, the infections are bound to cease in the long run. Hence, the most critical incident that must be avoided is to lose safe values from the entire multi-agent system.

In this section, we explain the resilient consensus in the lower layer of the overall system, where the multi-agent system is placed.

The network of agents is represented by the directed graph G. In our setting, the status of the agents is determined by the condition of the adversaries in the environment as follows. At each time k, the agents are divided into the sets S(k), I(k), and R(k) in accordance with the fractions S(k), I(k), and R(k) in the SIR epidemic model. Each agent i ∈ V has a state value denoted by x_i(k) at time k. At nominal times when no outbreak of the adversaries is present, all agents would follow the protocol below for seeking consensus in their values:

\[
x_i(k+1) = a_{ii}(k)\, x_i(k) + \sum_{j \in N_i} a_{ij}(k)\, x_j(k),
\tag{5}
\]

where the weights a_ij(k) ∈ [0, 1) satisfy a_ii(k) + Σ_{j∈N_i} a_ij(k) = 1. It is well known that the agents will arrive at consensus, i.e.,

\[
\lim_{k \to \infty} \bigl( x_i(k) - x_j(k) \bigr) = 0 \quad \text{for all } i, j \in V.
\]
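The nominal protocol can be sketched in a few lines. The example below assumes equal weights a_ij(k) = 1/(d_i + 1) over each agent's own value and its in-neighbors' values, which satisfy the weight condition whenever d_i ≥ 1; the three-node network, the initial values, and the function name consensus_step are hypothetical.

```python
def consensus_step(x, in_nbrs):
    """One synchronous update of all agents with equal weights 1/(d_i + 1)."""
    new_x = {}
    for i, xi in x.items():
        values = [xi] + [x[j] for j in in_nbrs[i]]  # own value plus in-neighbors
        new_x[i] = sum(values) / len(values)
    return new_x


# Hypothetical three-agent line network: 1 <-> 2 <-> 3.
in_nbrs = {1: [2], 2: [1, 3], 3: [2]}
x = {1: 0.0, 2: 0.5, 3: 1.0}
for _ in range(200):
    x = consensus_step(x, in_nbrs)

spread = max(x.values()) - min(x.values())
print(x, spread)
```

Since every update is a convex combination, the values stay within the range of the initial states while the spread between agents shrinks toward zero.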

We call agents regular if they are in the susceptible status S(k) or the recovered status R(k). These agents are capable of executing the given algorithm and maintaining their values accordingly. On the other hand, the infected agents may have corrupted values. In particular, for each agent in the infectious status I(k), its value is updated as

where the input u i (k) can be set arbitrarily due to the adversaries. Now, we introduce the notion of resilient consensus for the multi-agent system under the epidemic model.

Definition 3.1. If, for any possible sets and behaviors of the infectious agents and any state values of the regular agents, the following two conditions are satisfied, then we say that the regular agents reach resilient consensus:

1) Safety condition: There exists a bounded interval B ⊂ R determined by the initial states of the regular (susceptible and recovered) agents such that x_i(k) ∈ B for all i ∈ S(k) ∪ R(k), k ∈ Z_+.
2) Consensus condition: The regular agents eventually take the same value, i.e., x_i(k) − x_j(k) → 0 as k → ∞ for all regular agents i, j ∈ S(k) ∪ R(k).

Under adversarial spreading, it is possible that resilient consensus is achieved before adversaries die out, but as long as spreading adversaries remain in the system, the values of the agents in the normal status, i.e. those in the set S(k) ∪ R(k), may change over time. This is because agents that just recovered can have corrupted values, and when they rejoin the system, it may take time for the normal agents to reach consensus again. However, in Definition 3.1 above, resilient consensus means that such transient behaviours will eventually stop. The definition is consistent with that for the case of adversaries at fixed agents as in [10] , [19] , where once the regular agents achieve resilient consensus, the agents will remain in that status.

In our approach, the regular agents follow the so-called MSR algorithm to protect their values from being corrupted by those received from infected agents. Their states are updated as in (5), but with a restricted number of neighbor values using the pruning number

The algorithm is outlined below. In particular, we present a modified version of the MSR algorithm from [35].

Algorithm 3.1. At each round k, each regular agent i ∈ S(k) ∪ R(k) executes the following three steps:

1) Agent i sorts the values x_j(k), j ∈ N_i, received from its neighbors and its own value x_i(k) in descending order.
2) After sorting, agent i deletes the F_i(k) largest and the F_i(k) smallest values. The deleted data will not be used in the update. The set of indices of agents whose values remain is written as

3) Finally, agent i updates its value by
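The three steps of Algorithm 3.1 can be sketched as follows. This is a minimal illustration under two assumptions that are not fixed by the description above: the remaining values are averaged with equal weights, and the function name msr_update is hypothetical. Note that the agent's own value is sorted together with the received values and may therefore be deleted as well.

```python
def msr_update(own_value, neighbor_values, F):
    """One MSR update: sort, drop the F largest and F smallest, average the rest."""
    # Step 1: sort the received values together with the agent's own value.
    values = sorted([own_value] + list(neighbor_values), reverse=True)
    if 2 * F >= len(values):
        raise ValueError("pruning number too large: no value would remain")
    # Step 2: delete the F largest and the F smallest values.
    kept = values[F:len(values) - F]
    # Step 3: update with the remaining values (equal weights assumed here).
    return sum(kept) / len(kept)


# One neighbor reports an extreme value 100.0, e.g. because it is infected.
pruned = msr_update(0.5, [0.1, 0.4, 0.6, 100.0], F=1)
unpruned = msr_update(0.5, [0.1, 0.4, 0.6, 100.0], F=0)
print(pruned, unpruned)
```

With F = 1 the extreme value is discarded and the update stays within the range of the other values, whereas with F = 0 the single outlier dominates the average.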

We would like to introduce the following locally homogeneous assumption as the first step of this study. The local infection ratio I_s(k) represents the infected ratio around each agent in subgroup s:

Note that this assumption indicates that the infection ratio around each agent in the subgraph G_s is homogeneous. However, as a special case, if |V_s| = 1, ∀s ∈ {1, ..., m} (i.e., each agent is equipped with a policy maker), then based on (3), we always have (8). This locally homogeneous assumption clearly simplifies the problem, and it is left for future studies to use more sophisticated models such as [26], [32].

To design a successful MSR algorithm to prevent the corrupted values from affecting the agents, it is critical that the regular agents have sufficient information regarding the ratio of local infections. In particular, the designer must guarantee that every local pruning number F_i(k) is at least the number of infected neighbors and smaller than half of the local degree at all k as

Otherwise it is impossible to remove all the corrupted values. Note however that the exact information of I(k) is not available to anyone in the system. In our approach, we connect the pruning number F_i(k) used in the MSR algorithm and the local transmission reduction parameter b_s(k), where i ∈ V_s. This point is discussed in the next subsection.

The local policy maker has an estimate of the ratio I_s(k) of local infectious agents and decides the local transmission reduction parameter b_s(k). Note that 1 − b_s(k) represents the ratio of contacts that node i should cut down on. The nominal case without infected agents means b_s(k) = 1 and all neighbor values can be used. On the other hand, there are at most d_i I_s(k) corrupted values based on (8). Under the modified MSR algorithm, node i ∈ V_s should remove

Considering the bound for subgroup heterogeneity w in (4) in addition, we assume that the local policy maker chooses the local transmission reduction parameter b s (k) by

Hence, upon receiving the announced value of b s (k), to follow the policy maker, the regular agents need to choose their

Moreover, F i (k) < d i /2 is necessary for agents using the MSR algorithm to have at least one neighbor at each update after the removal of extreme values. Thus, F i (k) must be chosen as

In fact, from (8), (10) and (11), we can obtain the bound (9) . Moreover, by local transmission reduction parameter b s (k) in (10), based on (4) 

It is important that each local policy maker of node i is sufficiently knowledgeable that he always selects the transmission reduction parameter b i (k) satisfying (10) . As long as this condition is met, the frequency of the updates in b i (k) need not be the same as that of the agent states.
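As an illustration of how an announced reduction parameter could be turned into a pruning number, the sketch below assumes that an agent removes the fraction 1 − b_s(k) of its d_i received values in total, split evenly between the largest and the smallest ones, i.e., F_i(k) = ⌈(1 − b_s(k)) d_i / 2⌉. This rounding rule and the function names are assumptions made for illustration only. The admissibility check encodes the requirement stated above that F_i(k) be at least the number d_i I_s(k) of possibly corrupted values and smaller than d_i/2. Exact rational arithmetic is used to avoid rounding artifacts.

```python
from fractions import Fraction
from math import ceil


def pruning_number(b_s, d_i):
    """Assumed rule: cut the ratio 1 - b_s of d_i values, half from each end."""
    return ceil((1 - Fraction(b_s)) * d_i / 2)


def is_admissible(F, d_i, I_s):
    """Check d_i * I_s <= F < d_i / 2 for a candidate pruning number F."""
    return Fraction(I_s) * d_i <= F < Fraction(d_i, 2)


# Hypothetical agent with 10 in-neighbors and a local infection ratio of 1/5.
d_i = 10
I_s = Fraction(1, 5)
b_s = 1 - 2 * I_s  # announced parameter chosen so that b_s <= 1 - 2 I_s
F = pruning_number(b_s, d_i)
print(F, is_admissible(F, d_i, I_s))
```

In this example two values are removed from each end, which is just enough to cover the two possibly corrupted values; announcing b_s(k) = 1 instead would give F_i(k) = 0, which is not admissible.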

Here, we characterise the model for the infectious agents in terms of their behaviors when they are infected and then recovered. The infectious agents are considered to be adversarial in this work. In particular, we follow the malicious model of [19] , where the classification is based on their number, locations, and behaviors.

An infectious agent is said to be malicious if it can arbitrarily modify its local variables as in (6) and send the same value to all of its neighbors each time a transmission is made.

The motivation for considering malicious agents as defined above comes, for example, from the applications of social networks and computer networks where agents communicate by broadcasting their data. In particular, the infected agents might loudly announce their extreme opinions/data to their neighbors.

It is important to notice that under our pandemic model, the identities of the infected, malicious agents change over time. This is in contrast to the conventional models in, e.g., [19] , where the malicious agents remain the same. To this end, we must incorporate the more general model known as the mobile malicious agents studied in, e.g., [4] , [11] .

Under such mobile malicious models, the infectious agents have two properties different from the conventional static model (see Fig. 3). First, a malicious agent may infect regular agents so that their statuses change. While infected, agent i ∈ I(k) broadcasts its corrupted state x_i(k) (controlled as in (6)) to its neighbors, but then becomes recovered at the recovering rate in the SIR epidemic model. Second, once recovered, agent i ∈ R(k) collects its neighbors' values and updates its own value as a regular agent. However, in the first round after the recovery, the agent may still possess a corrupted value left from its infected period. Hence, such an agent should be considered still infected and will be said to be in the cured status. Moreover, the agent should refrain from using its own value in the cured status. These aspects will be taken into account in the proposed algorithm.

In this paper, we address the resilient consensus problem where the agents' statuses change based on the epidemic model. Regarding the local transmission reduction parameter b_i(k), two policies are studied: static and dynamic. The analyses for the two policies will follow in Section 4 for the complete graph case and Section 5 for the noncomplete graph case.

We establish conditions under which the agents equipped with Algorithm 3.1 can reach resilient consensus in the epidemic malicious model. In this section, we first present the results for networks in a complete graph, where all nodes interact with each other. Hence, from (3), there is no subgroup heterogeneity and ∀s ∈ {1, . . . , m}, b s (k) = b(k), I s (k) = I(k) in this section.

The static policy is a special case, where the local transmission reduction parameter is fixed as b(k) ≡ b 0 ∈ [0, 1] for the entire time horizon. This constraint may limit the feasible strategies for the overall system since the condition in (10) reduces to b 0 ≤ 1 − 2I(k) for all k. As a consequence, from (11), each regular agent i can use a constant for the

In fact, for this case, an analytic bound for the pandemic peak can be obtained. Suppose that the local transmission reduction parameter b_0 is large enough that

\[
b_0 R_0 > 1.
\tag{13}
\]

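Note that this relation requires R_0 > 1. From the work [16], it is known that under these conditions, the maximum of I(t) in (1) can be obtained as

\[
\max_{t > 0} I(t) = I(0) + S(0) - \frac{1}{b_0 R_0}\Bigl(1 + \ln\bigl(b_0 R_0 S(0)\bigr)\Bigr).
\]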

Since I(k) is the sampled value from (1) and the sampling period ∆T is small enough, the maximum of I(k) can be approximated as max_{k∈Z_+} I(k) ≈ max_{t>0} I(t). Here, for simplicity, we assume that S(0) ≈ 1 (and thus I(0) ≈ 0) and use the upper bound I_max(b_0) given by

\[
I_{\max}(b_0) = 1 - \frac{1}{b_0 R_0}\bigl(1 + \ln(b_0 R_0)\bigr).
\tag{14}
\]

Note that if (13) does not hold, i.e., b 0 R 0 ≤ 1, then the announced transmission reduction parameter is so small that I(k) becomes a nonincreasing function of time and thus max k∈Z+ I(k) = I(0) ≈ 0. This is a trivial case with no infection and hence not of interest in this paper.
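The effect of b_0 on the peak can be checked numerically. The sketch below assumes the standard closed-form expression for the SIR peak with S(0) ≈ 1, namely I_max(b_0) = 1 − (1 + ln(b_0 R_0))/(b_0 R_0) when b_0 R_0 > 1, and returns 0 in the trivial case b_0 R_0 ≤ 1. The function name peak_bound is hypothetical.

```python
from math import log


def peak_bound(b0, R0):
    """Assumed closed-form peak of I for S(0) close to 1 under a constant b0."""
    r = b0 * R0  # effective reproduction number under the reduction b0
    if r <= 1.0:
        return 0.0  # trivial case: the infection does not grow
    return 1.0 - (1.0 + log(r)) / r


R0 = 0.4 / 0.1  # beta / gamma, using the parameters of the numerical example
for b0 in (1.0, 0.7, 0.2):
    print(b0, round(peak_bound(b0, R0), 4))
```

For R_0 = 4, the bound is slightly above 0.4 without reduction and about 0.275 for b_0 = 0.7, in line with the peaks reported in the earlier numerical example.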

We next introduce a lemma from our previous mobile malicious work [35] . This lemma will be used in the proofs of the results for the complete graph case studied in this section. We slightly modified the statement for the problem setting in this paper. (14), whose network G forms a complete graph. Then, under (13) the regular agents using Algorithm 3.1 reach resilient consensus if and only if n > 2 max i∈V F i (k) + 1 ≥ 2I(k)n + 1.

When the policy for the transmission reduction parameter is static, we obtain the following result. Let b * be the solution to the equation

Proposition 4.1. Consider the multi-agent system under the SIR epidemic malicious model whose network G forms a complete graph. If R 0 > 1, then the solution b * ∈ (1/R 0 , 1] to (15) always exists. By

the regular agents using Algorithm 3.1 with the static policy reach resilient consensus.

For the complete graph, it holds d i = n − 1 for all i ∈ V. Thus, from (11), the choice of F i0 becomes that in (16) . For later use, let

Proof : We first show that b * ∈ (1/R 0 , 1] satisfying (15) always exists. Substituting (14) 

We discuss the property of f

Then, at b = 1, it holds

In fact, f (b) is an increasing function in (1/R 0 , 1] since

Thus, it follows that there always exists b * ∈ (1/R 0 , 1] such that f (b * ) = 0 and, moreover, for each b ∈ (1/R 0 , b * ], it holds f (b) ≤ 0. This indicates that the chosen b 0 satisfies

Hence, we have that F i0 can be taken as in (16) . From (16) , it is immediate that the conditions in Lemma 4.1 are satisfied. As a result, resilient consensus can be achieved.
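The existence argument for b * suggests a simple numerical procedure. The sketch below is only an illustration under stated assumptions: it takes f (b) = I max (b) − (1 − b)/2 with the classical SIR peak I max (b) = 1 − (1 + ln(bR 0 ))/(bR 0 ), which is consistent with the monotonicity and sign properties used in the proof but is not confirmed by the text available here. The pruning number is picked from the interval ((1 − b 0 )(n − 1)/2, (n − 1)/2) quoted later in the simulations. All function names are hypothetical.

```python
import math


def peak_bound(b, r0):
    """Assumed approximate SIR peak for S(0) ~ 1, I(0) ~ 0."""
    x = b * r0
    return 0.0 if x <= 1.0 else 1.0 - (1.0 + math.log(x)) / x


def f(b, r0):
    """Assumed form: f(b) <= 0 exactly when I_max(b) <= (1 - b) / 2."""
    return peak_bound(b, r0) - (1.0 - b) / 2.0


def solve_b_star(r0, tol=1e-12):
    """Bisection on (1/R0, 1]; relies on f being increasing on this interval."""
    lo, hi = 1.0 / r0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid, r0) <= 0.0:
            lo = mid
        else:
            hi = mid
    return lo  # f(lo) <= 0 by construction


def smallest_static_pruning_number(b0, n):
    """Smallest integer F with (1 - b0)(n - 1)/2 < F < (n - 1)/2."""
    f_min = math.floor((1.0 - b0) * (n - 1) / 2.0) + 1
    if f_min >= (n - 1) / 2.0:
        raise ValueError("no admissible pruning number for this b0 and n")
    return f_min


if __name__ == "__main__":
    r0, n = 5.0, 1000
    b_star = solve_b_star(r0)
    print("b* =", round(b_star, 4))
    print("F  =", smallest_static_pruning_number(b_star, n))
```

Bisection is used here only because f is monotone on the interval; any scalar root finder would serve equally well.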

Using the dynamic policy for the transmission reduction b(k), we can adapt it as the epidemic level changes. The following lemma will be instrumental. It gives an upper bound on the infectious ratio I(k) under the dynamic policy. 

Proof : By (10), we have b(k) ≤ 1 − 2I(k). It is clear that the infectious ratio I(k) is larger if b(k) is larger. Hence, to obtain an upper bound, we take the largest value b(k) = 1 − 2I(k). Substituting this into the epidemic model (2), we obtain the dynamics for I(k) as

By the definition of R 0 , this can be written as

Since I(k), ∆T , and β are nonnegative, clearly, the sign of the increment I(k + 1) − I(k) is determined by

By the assumption S(0) ≈ 1, we have I(0) ≈ 0 and thus,

Hence, in the initial period, I(k) is nondecreasing while g(k) ≥ 0. The value of g(k) decreases until it becomes negative at some time k 1 > 0. That is, it holds that g(k) ≥ 0 for k ∈ [0, k 1 − 1] and g(k 1 ) < 0. Then, by (19) ,

Now, we look at I(k) for k > k 1 . Suppose that at some time k 2 > k 1 , we have I(k 2 ) ≥ I(k 1 ). From (19), we have

Hence, it holds g(k 2 ) < 0 and thus I(k 2 ) < 1/2 − 1/(2R 0 ). Therefore, for all k, we obtain I(k) < 1/2 − 1/(2R 0 ).

We finally characterize conditions for resilient consensus.

Proposition 4.2. Consider the multi-agent system under the complete network G where the malicious agents follow the SIR epidemic model with R 0 > 1. If the pruning number satisfies I(k) · n ≤ F i (k) < n/2 for i ∈ S(k) ∪ R(k), then Algorithm 3.1 with the dynamic policy can guarantee resilient consensus.

Proof : Lemma 4.2 shows in particular that under the dynamic policy (10) for b(k), it holds I(k) < 1/2 at all times. This is in fact critical for the policy to maintain b(k) ≥ 0. Moreover, the pruning number F i (k) can always be selected as in (11) . It is then straightforward to show that the conditions in Lemma 4.1 hold. As a result, we conclude that resilient consensus is established.

To compare the two proposed policies, we test them over the range R 0 ∈ [1, 19], where the initial states and parameters are set as S(0) = 0.9, I(0) = 0.1, ∆T = 0.01, and γ = 0.1. We vary the infection rate β ∈ [0.1, 1.9] and observe how the maximum infectious ratio changes. The result can be found in Fig. 4. Both proposed policies keep the maximum infectious ratio below 0.5 for all R 0 ∈ [1, 19]. The other policies, such as the fixed reduction b = 0.5 or the relaxed dynamic policy b(k) = 1 − I(k), cannot keep the maximum infectious ratio below 0.5 when R 0 becomes large. This indicates that under the proposed policies the infectious agents remain a minority during the whole pandemic process.

Next, to check the approximate lengths of the pandemic periods under different policies, we show in Fig. 5 the time when I(k) < 0.1 is reached. From the plots, we can see that for the proposed static and dynamic policies, the pandemic period increases significantly over R 0 ∈ [1, 2]. This indicates that an epidemic with a small R 0 may not infect a major part of the agent network. The proposed policies can suppress such weak pandemic processes in a short time with small infectious peaks. When R 0 > 3, the pandemic period does not change much as R 0 increases. The reason is that the pandemic is so powerful that the number of infectious agents increases rapidly, and the number of susceptible agents then decreases rapidly, so that the pandemic cannot continue for long. With the same recovery rate γ, the pandemic periods are similar since the infectious peaks decay at the same speed. Note that with the fixed reduction b = 0.5, the increase in the pandemic period is delayed. In particular, a significant change happens when bR 0 becomes greater than 1.
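The comparison of policies described above can be reproduced in spirit with a short simulation. The sketch below assumes that the sampled model (2) is the forward-Euler discretization of the SIR dynamics with the infection term scaled by b(k); this form, the clamp on b(k), and the function name `max_infectious_ratio` are assumptions made for illustration only.

```python
def max_infectious_ratio(policy, r0, gamma=0.1, dt=0.01,
                         s0=0.9, i0=0.1, steps=50_000):
    """Simulate an assumed Euler-discretized SIR model and return max_k I(k).

    policy maps the current infectious ratio I(k) to the transmission
    reduction parameter b(k).
    """
    beta = r0 * gamma
    s, i = s0, i0
    peak = i
    for _ in range(steps):
        b = max(0.0, policy(i))          # keep b(k) nonnegative as a safeguard
        new_infections = dt * beta * b * s * i
        s, i = s - new_infections, i + new_infections - dt * gamma * i
        peak = max(peak, i)
    return peak


POLICIES = {
    "dynamic b(k) = 1 - 2I(k)": lambda i: 1.0 - 2.0 * i,
    "relaxed b(k) = 1 - I(k)": lambda i: 1.0 - i,
    "fixed b = 0.5": lambda i: 0.5,
}

if __name__ == "__main__":
    for r0 in (2.0, 5.0, 19.0):
        for name, policy in POLICIES.items():
            peak = max_infectious_ratio(policy, r0)
            print(f"R0={r0:>4}: {name:<26} peak={peak:.3f}")
```

In this sketch the dynamic policy b(k) = 1 − 2I(k) keeps the peak below 1/2 − 1/(2R 0 ), whereas the fixed and relaxed policies exceed 0.5 for large R 0 , in line with the qualitative behavior reported for Fig. 4.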

In this section, we demonstrate that for agents operating over a noncomplete graph in the epidemic environment, resilient consensus can be attained by the proposed algorithm. Again, both static and dynamic policies are considered, and we derive conditions on network structures.

In the noncomplete graph case, the pruning number F i (k) for i ∈ V should be taken slightly differently from (11) as

where d min = min i∈V d i . For the noncomplete graph case, a sufficient condition on graph structures is provided in [35] , which we state as a lemma. 

We demonstrate the effectiveness of the static policy for the noncomplete graph case and provide a condition on the graph structure for achieving resilient consensus under the epidemic malicious model. Considering the subgroup heterogeneity, we know that the local ratio of infection I s (k) exceeds the global ratio of infection I(k) by at most w. Let b * ∈ (1/R 0 , 1] be the solution to the equation

Denote W s as the heterogeneity upper bound for the proposed static policy. We must limit the heterogeneity by assuming w < W s = (1/2)(1 − 1/R 0 ). (1 − b) . Substituting (14) into this, we have

We discuss the property of f (b) for b ∈ (1/R 0 , 1]. Since R 0 > 1 and w < 1

Then, at b = 1, it holds

Thus, it follows that there always exists

The design procedure is as follows. First, each agent i ∈ V must have sufficiently many neighbors that

Note that by the assumption that 1 < R 0 < 2, such d i always exists. Here, we further assume that

Then, take the local transmission reduction b i0 satisfying

The pruning number F i0 for agent i should be taken as in (20) so that

Then the following result ensures that the parameters appearing in the procedure above can always be found and they will enable the agents to form consensus.

Finally, by (24) , it holds that I max (b 0 ) + w < (1 − b 0 )/2, indicating that max i∈V I i (k) < (1 − b 0 )/2, and thus F i0 from (26) satisfies the condition of Lemma 5.1. Therefore, resilient consensus can be achieved.

Now, we give the result when the dynamic policy is used for the noncomplete graph case. Denote by W d the heterogeneity upper bound for the proposed dynamic policy. We must limit the level of adversarial spreading and heterogeneity by assuming R 0 ∈ (1, 2) and w < W d = 1/(2R 0 ) − 1/4. Then, each agent i ∈ V takes enough neighbors so that

Here, due to the condition on R 0 and w, it holds 3/2 − 1/R 0 + 2w ∈ (1/2, 1) and hence, by (27) ∈ (1, 2) , where the malicious agents follow the SIR epidemic model with heterogeneity bounded from above by w < W d = 1/(2R 0 ) − 1/4. Then the regular agents using Algorithm 3.1 with the dynamic policy (10) using parameters in (20) and (27) reach resilient consensus.

Proof : To establish resilient consensus, we must show that the condition in Lemma 5.1 holds. Under the dynamic policy, by Lemma 4.2, it holds 2I(k) < 1 − 1/R 0 at all times. Hence, by (27),

Thus, the part of the condition (21) in Lemma 5.1 holds. By the choice of F i (k) in (20) and the dynamic policy (10), it follows that (21) is satisfied.

Note that this result is quite conservative since Lemmas 4.2 and 5.1 are themselves conservative. The actual bound for R 0 may be much less restrictive. We will discuss this point by simulations in the next section.

We illustrate the performance of our proposed protocols under epidemic adversary models by a numerical example.

Networks with 1000 nodes were generated by randomly placing nodes with the communication radius r in an area of 100 × 100. The connectivity requirements are in general difficult to check. There are two local policy makers placed in the network, and thus s = 2. Each subgroup contains 500 nodes, and the local ratio of infection I s (k) is available to the policy maker. The initial number of infected agents is set to 10 so that I(0) = 0.01, S(0) = 0.99, and R(0) = 0. To ensure that the cardinalities of the sets S(k), I(k), and R(k) are integers, we took |S(k)| = ⌈S(k) · n⌉, |R(k)| = ⌈R(k) · n⌉, and |I(k)| = n − |S(k)| − |R(k)|. For simplicity, all agents use the same pruning number, denoted by F . To check the success rate of resilient consensus under different conditions, Monte Carlo simulations were performed. Initial states of the noninfectious agents were randomly taken from the interval [0, 1]. On the other hand, the infected agents were forced to take the negative state value −1.

Fig. 6. Static policy: Success rates for resilient consensus versus reproduction number R 0 and pruning number F i0 with parameter r = 90.

Fig. 7. Static policy: Success rates for resilient consensus versus reproduction number R 0 and pruning number F i0 with parameter r = 50.

To build complete graphs, we set the communication radius to be large (r = 150). Based on Proposition 4.1, we know that for the reproduction number satisfying R 0 > 1, we can always find b 0 ∈ (1/R 0 , b * ] so that we can choose F ∈ ((1 − b 0 )(n − 1)/2, (n − 1)/2) to guarantee resilient consensus. To verify this result, we chose R 0 = 200 and F = 499 and ran 50 Monte Carlo simulations. In all cases, resilient consensus was achieved. For the dynamic policy, we took F (k) = ⌈I(k) · n⌉; the results also confirmed successful resilient consensus in all 50 simulations.
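To illustrate the complete-graph experiment, the following minimal sketch runs a simplified MSR-style update in which every regular agent discards the F smallest and F largest neighbor values and averages the rest with its own value. This is not Algorithm 3.1 itself (which is not reproduced in this section) but a common simplified variant; the set of infected agents is kept fixed instead of following the SIR dynamics, and the small network size is chosen only for readability.

```python
import random


def msr_step(values, infected, f):
    """One synchronous MSR-style update on a complete graph (simplified)."""
    n = len(values)
    new_values = list(values)
    for i in range(n):
        if i in infected:
            continue  # infected agents keep the adversarial value
        neighbors = sorted(values[j] for j in range(n) if j != i)
        kept = neighbors[f:len(neighbors) - f]  # drop f smallest, f largest
        new_values[i] = (values[i] + sum(kept)) / (len(kept) + 1)
    return new_values


def simulate(n=21, num_infected=5, f=5, steps=50, seed=0):
    """Return initial and final values of the regular agents."""
    rng = random.Random(seed)
    infected = set(range(num_infected))
    # Regular agents start in the safety interval [0, 1]; infected hold -1.
    values = [-1.0 if i in infected else rng.random() for i in range(n)]
    initial_regular = [values[i] for i in range(n) if i not in infected]
    for _ in range(steps):
        values = msr_step(values, infected, f)
    final_regular = [values[i] for i in range(n) if i not in infected]
    return initial_regular, final_regular


if __name__ == "__main__":
    init, final = simulate()
    print("initial range:", min(init), max(init))
    print("final range:  ", min(final), max(final))
```

With the number of infected agents not exceeding F and n > 2F + 1, the regular agents in this sketch agree on a value inside the range of their initial states, which is the behavior that resilient consensus requires.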

For non-complete graphs, we examined how the parameters R 0 , r, and F affect resilient consensus under the proposed algorithm. In this subsection, we test the homogeneous adversarial spreading and thus w = 0. Hence, the heterogeneity requirements w < W s and w < W d in Theorems 5.1 and 5.2 are both satisfied. We demonstrate the effects of infectious heterogeneity in the next subsection. First, we focus on the resilience resulting from the network structure. To this end, the communication radii were chosen as r = 90 and r = 50, which represent dense and sparse networks, respectively. For each pair (R 0 , F ) ∈ [1, 3] × [10, 350] of the reproduction number and the pruning number, we performed 50 Monte Carlo simulations to find the success rates for resilient consensus.
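The role of the communication radius can be illustrated with a small generator for such random networks. The sketch below places nodes uniformly at random in a square and connects nodes whose distance is at most r; the uniform placement, the function name `random_geometric_graph`, and the reduced number of nodes are assumptions for illustration.

```python
import math
import random


def random_geometric_graph(n, radius, side=100.0, seed=0):
    """Return adjacency lists of a random geometric graph in a side x side area."""
    rng = random.Random(seed)
    points = [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n)]
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= radius:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


if __name__ == "__main__":
    n = 300
    for r in (50.0, 90.0, 150.0):
        graph = random_geometric_graph(n, r, seed=1)
        d_min = min(len(nb) for nb in graph)
        print(f"r={r}: minimum degree d_min={d_min} (n - 1 = {n - 1})")
```

Since the diagonal of the 100 × 100 area is about 141.4, any radius r = 150 yields a complete graph, while r = 90 and r = 50 give progressively smaller minimum degrees for the same node placement.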

The results are shown in Figs. 6 and 7 in the form of heatmaps, where colors varying from yellow to blue indicate the success rates from 1 to 0. Fig. 6 shows that in dense networks (with r = 90), it is possible to achieve resilient consensus even when the reproduction number R 0 is as large as 3. As the epidemic becomes more powerful with larger R 0 , the yellow area shrinks; the lower bound on F increases while its upper bound remains about the same. The upper bound of F is determined by the graph connectivity; when a large pruning number (such as F > 320) is used, the network will become too sparse for the MSR to perform properly, leading to failure in reaching consensus. From Fig. 7 , we notice that in sparse networks (with r = 50), the yellow area becomes limited in size compared with dense networks (with r = 90). (Note however that the scales in y-axis of the plots are slightly different, and the former is not a subset of the latter.) In particular, resilient consensus is almost impossible for R 0 > 2.9. The upper bound of pruning number F for this graph is also much smaller, around 100. Therefore, to summarize, with more connectivity in the network, the multi-agent system can tolerate more powerful epidemics. For sparse networks, resilient consensus may be hard to guarantee and a feasible transmission reduction parameter may not exist. These observations are in alignment with the discussions related to Theorem 5.1.

Next, we slightly change our viewpoint and look at how the strength of the adversarial spreading, expressed through the reproduction number R 0 , affects the performance of resilient consensus algorithms. Here, we set R 0 = 1.5 for the weaker case and R 0 = 3 for the stronger case. The success rates for resilient consensus are computed for pairs (r, F ) of the communication radius and the pruning number. For the results of R 0 = 1.5 in Fig. 8, the minimum of the pruning number F is about 50, while the maximum increases for denser networks with larger radii r ≥ 40. This indicates that in the simulations, the number of infectious agents grew from the initial number of 10 to about 50. The shaded area in the plot indicates the bound obtained from (27), that is, the radius r necessary to meet the formula using d min .

From the simulation results, if the communication radius reaches r = 40, we can almost guarantee the success rate to be 1 with F = 50. However, the minimum degree of the agents in this graph is smaller than n/2, whereas the theoretical bound of this paper requires the minimum degree of each agent to reach 600. We confirmed that for r ≥ 90, this condition was satisfied in the 50 runs. This illustrates the gap between the actual bound and the theoretical bound.

Under the stronger epidemics with R 0 = 3, we can make similar observations from Fig. 9 . However, clearly more connectivity is required so that larger values of F can be used to guarantee resilient consensus. The minimum requirements are roughly r ≥ 70 and F ≥ 150.

Here, we discuss the performance of the proposed algorithms with dynamic F i (k). Comparisons are made with the static approach using F i0 = 100. The heatmaps representing the success rates of resilient consensus are displayed in Figs. 10 and 11 for (R 0 , r) ∈ [1, 3] × [10, 150] .

In comparison, the dynamic policy works well under a range of conditions. This is because the regular agents can adapt their pruning number to the current infectious level as F i (k) ≥ I(k) · n, guaranteeing that F i (k) is never smaller than the actual number of infectious agents in real time. In Fig. 10, observe that as R 0 increases, the adaptive rule still attains resilient consensus if denser networks (with larger r) are chosen. Note that our theoretical result in Theorem 5.2 for the dynamic policy guarantees resilient consensus in dense networks with limited R 0 , located in the shaded area at the top left of Fig. 10. Fig. 11 exhibits the results for the static policy without policy maker, where the pruning number is set as F i0 = 100. It is clear by comparing this plot with Fig. 10 that this protocol is much less capable because of the fixed pruning number. The minimum requirement on the radius is r = 60 since each agent must remove 200 values from its neighbors. The advantage of the static policy is that the agents do not need the real-time information of b(k). Under mild epidemics with R 0 < 2, the agents may choose the communication radius r ≥ 60. Then, with F i0 = 100, resilient consensus can be guaranteed.

In this section, we highlight the effects of the subgroup heterogeneity used in the analysis regarding the spreading of adversaries in the network. We test both static and dynamic policies in the homogeneous infected environment and in the gathered infected environment. Note that in the gathered infected environment, the infection spreads first in subgroup 2. Only after all the 500 agents in subgroup 2 are infected or recovered will newly infected nodes appear in subgroup 1. Intuitively, the local infection ratio in subgroup 2 will be much higher than the global infection ratio I(k). Thus, if the network is sparse (i.e., when the communication radius r is small), then the heterogeneity, which may be measured by the size of w, could be large, and in the simulations, it is possible that w > W s or w > W d .
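The effect of a gathered infection on the heterogeneity can be quantified with a small calculation. The sketch below is a simplified illustration that only counts currently infected agents and ignores recovered ones; it takes the heterogeneity as the largest excess of a local infection ratio over the global one and uses the bounds W s = (1/2)(1 − 1/R 0 ) and W d = 1/(2R 0 ) − 1/4 as read from the analysis. All function names are hypothetical.

```python
def gathered_local_ratios(num_infected, group_sizes, order):
    """Fill subgroups with infected agents one after another in the given order."""
    ratios = [0.0] * len(group_sizes)
    remaining = num_infected
    for s in order:
        placed = min(remaining, group_sizes[s])
        ratios[s] = placed / group_sizes[s]
        remaining -= placed
    return ratios


def heterogeneity(local_ratios, global_ratio):
    """Largest excess of a local infection ratio over the global ratio."""
    return max(r - global_ratio for r in local_ratios)


def w_static_bound(r0):
    return 0.5 * (1.0 - 1.0 / r0)


def w_dynamic_bound(r0):
    return 1.0 / (2.0 * r0) - 0.25


if __name__ == "__main__":
    sizes = [500, 500]
    infected = 100
    # Subgroup 2 (index 1) is infected first, subgroup 1 (index 0) afterwards.
    local = gathered_local_ratios(infected, sizes, order=[1, 0])
    w = heterogeneity(local, infected / sum(sizes))
    print("local ratios:", local, "w =", w)
    print("W_s(1.5) =", w_static_bound(1.5), "W_d(1.5) =", w_dynamic_bound(1.5))
```

For 100 infected agents gathered in one subgroup of 500, the local ratio is 0.2 against a global ratio of 0.1, so w = 0.1; for R 0 = 1.5 this already exceeds W d while staying below W s .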

We test the proposed static policy without the theoretical connectivity conditions given in (23)-(26). Instead, in each subgroup s ∈ {1, 2}, the local policy makers choose b 0 = b * and the regular agents choose F i0 = ⌈(1 − b 0 )n/2⌉. On the other hand, for the dynamic policy, the regular agents in each subgroup s follow the local transmission reduction parameter b s (k) as in (10) and set their pruning number as F i (k) = ⌈(1 − b s (k))n/2⌉. The condition in (27) may not be met as w may be too large. We run these two policies in both homogeneous infected environment and gathered infected environment. The heatmaps representing the success rates of resilient consensus for four different cases are displayed in Fig. 12 for (R 0 , r) ∈ [1, 5] × [0, 150].

We first discuss the results for the static policy. As shown by the yellow regions in Figs. 12(a) and 12(b), for dense networks (r > 100), the static policy works for almost any R 0 ∈ [1, 5]. This is in alignment with what we have shown in Section 4. For sparse networks (r < 50), we can see that the static policy works only for limited R 0 . The results also show that the proposed static policy works similarly in both homogeneous and gathered infected environments. The reason is that we chose F i0 = ⌈(1 − b 0 )n/2⌉, which is based on the worst-case analysis. When the network connectivity is sufficient so that all agents have enough neighbors, the infected agents remain a minority even under the gathered distribution, and hence all infected values can be removed in the MSR algorithm since F i0 ≥ (I max + w)n > |N i ∩ I(k)|.

Similar explanations also apply to the dynamic policy, whose results are shown in Figs. 12(c) and 12(d). However, note that under small R 0 , this policy performs better than the static one, especially when the network is sparse (r < 60). Since the policy can dynamically adjust the pruning number in the MSR to the situation in real time, even with fewer neighbors, it can reach resilient consensus under a wider range of conditions, as expected.

We choose another local static policy for comparison. The policy maker chooses b 0 = b * while the regular agents in this case set their pruning numbers individually according to the numbers of their neighbors as F i0 = ⌈(1 − b 0 )d i /2⌉. Compared to the policy discussed above, this local version may be less resilient since the pruning number is smaller in general, which matters when the number of infected neighbors is large. The heatmaps for homogeneous and gathered environments are displayed in Fig. 13. From Fig. 13(a), we observe that in a homogeneous environment, the policy works well over a wide range of R 0 and communication radii (r ≥ 80). In sparse networks, the parts with few neighbors also have few infected nodes. The regular agents in such areas may remove fewer neighbor values since F i0 = ⌈(1 − b 0 )d i /2⌉ and d i is small. However, as shown in Fig. 13(b), in the gathered infection environment, the proposed local static policy performs much worse. Since regular agents in the gathered areas use pruning numbers smaller than necessary, they fail to reach resilient consensus.

The results for the local dynamic policy obtained by choosing (10) and F i (k) = ⌈(1 − b s (k))d i /2⌉, which satisfies (11) but does not satisfy (20), have similar profiles to those of the local static policy, and the related arguments also hold here.
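The difference between the network-wide and the local pruning numbers can be made concrete with a few lines of code. In this sketch, the helper `removes_all_infected` and its requirement 2F < d i (so that some neighbor values remain after pruning) are illustrative assumptions rather than conditions quoted from the paper, and the numerical values are arbitrary.

```python
import math


def dynamic_reduction(local_ratio):
    """Largest b_s(k) allowed by the dynamic policy: b_s(k) = 1 - 2 I_s(k)."""
    return 1.0 - 2.0 * local_ratio


def pruning_global(b, n):
    """Pruning number based on the network size: ceil((1 - b) n / 2)."""
    return math.ceil((1.0 - b) * n / 2.0)


def pruning_local(b, d_i):
    """Pruning number based on the agent's own degree: ceil((1 - b) d_i / 2)."""
    return math.ceil((1.0 - b) * d_i / 2.0)


def removes_all_infected(f, infected_neighbors, d_i):
    """Illustrative check: infected neighbors are covered and values remain."""
    return infected_neighbors <= f and 2 * f < d_i


if __name__ == "__main__":
    n, d_i, b0 = 1000, 600, 0.5
    infected_neighbors = 200  # gathered infection around this agent
    f_global = pruning_global(b0, n)
    f_local = pruning_local(b0, d_i)
    print("global F =", f_global,
          removes_all_infected(f_global, infected_neighbors, d_i))
    print("local  F =", f_local,
          removes_all_infected(f_local, infected_neighbors, d_i))
    print("dynamic F for I_s = 0.125:",
          pruning_global(dynamic_reduction(0.125), n))
```

In this example the degree-based pruning number is too small for an agent surrounded by a gathered infection, whereas the network-wide choice still covers all infected neighbors, provided the agent has enough neighbors to prune that many values.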

In this part of the simulations, we check the time responses for the proposed static and dynamic policies. For this part, a small random network with 100 nodes is chosen in the area of 100 × 100. Two local policy makers are placed in the two subgroups, where each subgroup contains 50 nodes. The communication radius is 100. The initial SIR ratio is S(0) = 0.99, I(0) = 0.01, and R(0) = 0. The sampling period is ∆T = 0.01. Susceptible and recovered agents randomly take initial values from [0, 1], which is considered as the safety interval. Once an agent is infected, its value is changed to −1 by the adversaries. The infected agents are randomly chosen, and hence the homogeneous condition is not guaranteed. The time responses for the two cases with R 0 = 5, r = 70 and R 0 = 5, r = 100 are shown in Fig. 14. We place the curves of S(k), I(k), R(k) in the plots so that the real-time infectious situation is clear. Here, we also plot the ratio of the agents taking negative numbers in the system. Under normal situations, this ratio should match that of the infected agents since the recovered agents will attain states within the safety interval shortly after their recovery. However, under severe conditions when the recovered agents have too many infected neighbors, this ratio may grow over time. We will see that a phase transition where this ratio becomes 1 can happen.

We first look at the proposed static and dynamic policies, shown in Figs. 14(a), 14(b), 14(c), and 14(d). We can see that both policies guarantee resilient consensus and that the negative values come almost entirely from the infected agents. Comparing the static and dynamic policies with the same R 0 , we see that the dynamic policy has an earlier infectious peak. At the beginning, when I(k) is low, the dynamic policy usually uses smaller pruning numbers so that the peak may appear earlier. However, I max under the two policies is almost the same, which is also indicated in Fig. 4.

Then, we look at the results for the local static and dynamic policies, which are shown in Figs. 14(e), 14(f), 14(g), and 14(h). Within the sparse network (r = 70), both local static and dynamic policies fail to reach resilient consensus. The regular agents fail to keep healthy states when the infected agents reach a certain level. Within the denser network (r = 100), the local static policy reaches resilient consensus, but the local dynamic policy fails. These results indicate that for both local static and dynamic policies, the homogeneous condition is important for reaching resilient consensus. Moreover, compared with the local static policy, the local dynamic policy is more fragile since it removes fewer neighbor values. Once there is any agent who has more than I(k)d i /2 infected neighbors, unsafe values dominate the agents, indicating that consensus is reached but it is not resilient.

Next, we show how the initial SIR ratios affect the performance of the two proposed policies. We changed the SIR ratios to S(0) = 0.9, I(0) = 0.1, R(0) = 0 and generated the time responses for all policies. The results are shown in Fig. 15. From the results in Figs. 15(a) and 15(c), we can see that both the static policy and the local static policy fail to reach resilient consensus even within the dense network (r = 100). The failure under the static policy comes from the approximate calculation of I max in (14), where S(0) ≈ 1 and I(0) ≈ 0 are assumed. The initial ratios used here, S(0) = 0.9, I(0) = 0.1, R(0) = 0, do not satisfy this assumption, so the calculated b = b * does not match the real I max . When I(k) increases beyond a bound (in this simulation, I(k) ≥ 0.2), there are more infected agents than the MSR algorithm is prepared for, preventing resilient consensus. On the other hand, the dynamic policy does not have such an issue, and the ratio of negative values is reduced to zero together with the infectious ratio I(k), as shown in Fig. 15(b). We also found that if S(0) ≥ 0.97, I(0) ≤ 0.03, R(0) = 0, then the static policy can guarantee resilient consensus with R 0 = 5.

In this paper, we have considered resilient consensus problems in the presence of misbehaving agents, whose number changes according to the level of the pandemic. Resilient protocols have been proposed to mitigate their influence on regular agents. Analyses have been made for both static and dynamic policies for the transmission reduction parameter. Moreover, we characterized the relation between graph conditions and the pandemic reproduction number. Numerical simulations further examined the gap between the theoretical and practical bounds, the homogeneity conditions, and the time responses of the proposed protocols in random graphs. Future directions include extending this study to other distributed epidemic models and to multi-rate sampling settings, and developing other measures for reduction of contacts and social distancing among agents. We would also like to extend our work to include more sophisticated models for the epidemics such as those incorporating network structures among subcommunities [26] , [32] .

Approximate consensus in stochastic networks with application to load balancing

On Krause's multi-agent consensus model with state-dependent connectivity

A new delay-SIR model for pulse vaccination

Optimal resiliency against mobile faults

Design and analysis of distributed averaging with quantized communication

Time-dependent SIR model for COVID-19 with undetectable infected persons

Detecting multiple information sources in networks under the SIR model

An epidemic theoretic framework for vulnerability analysis of broadcast protocols in wireless sensor networks

Reaching a consensus

Resilient randomized quantized consensus

Reaching (and maintaining) agreement in the presence of mobile faults

Novel vaccine technologies: Essential components of an adequate response to emerging viral diseases

Opinion dynamics and bounded confidence models, analysis, and simulation

Epidemiological modeling of news and rumors on twitter

Maximizing the spread of influence through a social network

A contribution to the mathematical theory of epidemics

Fault-tolerant clock synchronization over unreliable channels in wireless sensor networks

The timing of oneshot interventions for epidemic control, medRxiv

Resilient asymptotic consensus in robust networks

Optimal control for heterogeneous node-based information epidemics over social networks

Optimal, near-optimal, and robust epidemic control

Global stability of a delayed SIRS computer virus propagation model

Theory of rumour spreading in complex social networks

Analysis and control of epidemics: A survey of spreading processes on complex networks

How to withstand mobile virus attacks

Epidemic processes over time-varying networks

Discrete time virus spread processes: Analysis, identification, and validation

An efficient algorithm for fault-tolerant rendezvous of multi-robot systems with controllable sensing range

Figure caption: Time responses for different policies with S(0) = 0.9, I(0) = 0.1, R(0) = 0 (plotted ratios: S(k), I(k), R(k), and ratio of negative values).

Computer virus propagation models

On a network SIS epidemic model with cooperative and antagonistic opinion dynamics

Noise leads to quasi-consensus of Hegselmann-Krause opinion dynamics

Decentralized protection strategies against SIS epidemics in networks

Iterative approximate Byzantine consensus in arbitrary directed graphs

Overcoming challenges for estimating virus spread dynamics from data

Resilient real-valued consensus in spite of mobile malicious agents on directed graphs

Resilient consensus against epidemic malicious agents

On a network SIS model with opinion dynamics

SIR rumor spreading model in the new media age

Xavier has been working on various aspects of dependable computing such as distributed agreement, state machine replication, failure detection, and fault-tolerant group communication in general. His interests also include robotics, embedded systems, and programming languages.