key: cord-0981375-czep9fwf
authors: Péni, T.; Szederkényi, G.
title: Convex output feedback model predictive control for mitigation of COVID-19 pandemic()
date: 2021-10-27
journal: Annu Rev Control
DOI: 10.1016/j.arcontrol.2021.10.003
sha: ffbe045bb50cddc34112711aef172a4b3917ea59
doc_id: 981375
cord_uid: czep9fwf

In this paper, a model predictive control approach is proposed for epidemic mitigation. The disease spreading dynamics is described by an 8-compartment smooth nonlinear model of the COVID-19 pandemic in Hungary known from the literature, where the manipulable control input is the stringency of the introduced non-pharmaceutical measures. It is assumed that only the number of hospitalized people is measured on-line, and the other state variables are computed using a state observer which is based on the dynamic inversion of a linear sub-system of the model. The objective function contains a measure of the direct harmful consequences of the restrictions, and the constraints refer to input bounds and to the capacity of the healthcare system. By exploiting the special properties of the model, the nonlinear optimization problem required by the control design is reformulated to convex tasks, allowing a computationally efficient solution. Two approaches are proposed: the first finds a suboptimal solution by geometric programming, while the second one further simplifies the problem and transforms it to a linear programming task. Simulations show that both suboptimal solutions fulfill the design specifications even in the presence of parameter uncertainties.

The COVID-19 pandemic is one of the biggest challenges the world is currently facing with. Until the vaccine and effective treatment are widely available, carefully planned measures have to be introduced in every country to avoid the serious effects of the spreading disease. Choosing a right management policy 5 is a sensitive task where several potentially contradicting objectives have to be taken into consideration. The most important limiting constraint is the capacity of the healthcare system, which can be easily overwhelmed if the spread of the disease is not controlled. It is clear that the transmission of the virus can be efficiently slowed down by appropriate restrictions (social distancing, lockdown), 10 but these measures have negative impacts on the society and the economy that cannot be neglected. At the moment, governments are continuously evaluating the control measures, trying to find balance between public health concerns and the costs of social distancing measures.

The population-level dynamical modeling of disease spread is an intensively 15 studied area with a large number of applicable model structures corresponding to different assumptions [1] . There is also a wide literature on epidemic control design where the input can comprise vaccination and/or non-pharmaceutical measures [2, 3]. Among the possible techniques, model predictive control (MPC) is a straightforward approach for handling the often contradicting goals and strict 20 constraints in this context, therefore, the systematic design of non-pharmaceutical interventions against COVID-19 using MPC has been addressed in several recent publications. In [4] it was shown that the number of fatalities can be efficiently reduced using an adaptive feedback strategy to compute the stringency of social distancing even if there are significant parametric and input uncertainties 25 in the model. At the same time, it is also clearly illustrated in the paper that neglecting the feedback principle might be dangerous in handling the pandemic. In [5] a binary input representing full or no isolation is computed using MPC and a nonlinear epidemic model to reduce the peak of infected individuals while minimizing the economic loss due to the restrictions. This solution is refined 30 and made more realistic in [6] , where weekly piecewise constant input with actuation dynamics are assumed. In [7] , a nonlinear MPC-based solution was proposed for epidemic management, where the main contribution is the application of temporal logic for the incorporation of interdependent and possibly time-varying constraints into the design. 35 As the plenty of successful solutions show, nonlinear MPC is a suitable tool to transform the various control goals and objectives related to the epidemic management into a transparent optimization problem. However, the numerical solution of this problem requires significant computational power even in the case of a simple 4-8 state compartment model. This makes it difficult to extend 40 the algorithms for large dimensional (e.g. multi-scale or multi-region) epidemic 2 J o u r n a l P r e -p r o o f models, to compute control sequence for long horizons and to apply advanced robust control design algorithms (e.g. tube or scenario-based methods). Therefore, an important aim of this work is to reformulate the MPC design in such a framework, where the numerical computations can be performed more effi-45 ciently. From this respect, a recent related work is [8] , where a data-driven model is proposed to describe the spread of the COVID-19 pandemic, and an optimal controller is designed in a convex framework using geometric programming to minimize the final number of deaths while taking into consideration cost and hospital capacity constraints. Moreover, a detailed and realistic actuation 50 model identified from mobility data is also presented in [8] .

A well-known characteristic of predictive control is that it generates a full state feedback policy, which can be applied only if all of the states are measured or estimated. In the work of [7] the possibility of output feedback control was demonstrated by designing a state observer for the epidemic model. In this 55 paper an improved state estimator is proposed, which by exploiting the specific structure of the compartmental model can be constructed in an elegant way and provides reliable state estimates even in the presence of modeling uncertainty. The observer and the convexified nonlinear model predictive controller give an output feedback structure that is proposed as a possible solution to the control 60 problem related to the mitigation of the pandemic.

The transmission dynamics of the epidemic is usually described by compartmental models where each state variable represents one group of the population. The disjoint groups collect individuals of the same infectious status. Depending 65 on the modeling assumptions and goals, different compartmental models can be constructed [9, 10] . In this paper we use the 8 state model introduced in [7] . This specific model structure has been developed to capture the dynamic characteristics of the epidemic that are relevant to our control design objectives. In the model construction it has been important that all parameters have clear 70 physical interpretation and they can be determined with sufficient precision by using the available medical literature and epidemiological data.

In our model the total population is divided into 8 classes: S denotes the susceptibles, i.e. those who can be infected by the disease, L (latent) collects individuals, who have already contracted the disease but do not show symptoms 75 and are not infectious yet; P is the pre-symptomatic compartment for those who are infected, but they need a few days to develop any symptoms. After the incubation period elapses the infected individuals are transferred to the asymptomatic (A) and symptomatic (I) compartments. Those in A will always recover, while the more severe cases in I may require hospitalization and move The differential equations of the compartmental model are written aṡ

The state variables represent the number of individuals in the compartments 85 and are assumed to be normalized to 1, i.e. i∈{S,L,P,I,A,H,R,D} x i = 1, where 1 represents the total population. By construction, i∈{S,L,P,I,A,H,R,D}ẋ i = 0 so the size of the population remains constant. The parameters of the model are summarized in table 1. The nominal values are based on the estimates calculated from the epidemiological data collected in Hungary and also worldwide. For 90 more details, see [7] , [11] .

Until vaccine or appropriate treatment is available, the spread of the virus can only be controlled by limiting the contacts and reducing the infection probability between individuals. The effect of different interventions (from mandatory mask wearing to total lockdown) can be quantified by different scalings of the 95 transmission rate β, therefore it is straightforward to choose the scaling factor v as control input. Assuming that the number of hospitalized and deceased individuals can be reliably documented, state variables x H and x D are chosen for measured state variables. The official number of daily new infections and active cases cannot be used directly, since it is known that only a fraction of the actual cases are detected [12] .

The observability of the model was briefly analyzed in an LPV framework in [7] where it was found that the model is observable from x H , although the observability matrices are numerically badly conditioned. Following the methodology of [13] and using the STRIKE-GOLDD software tool [14] , the identifiability 105 of model parameters was also checked with the assumption that x H and x D are observed. Computation results showed that the model itself is not structurally identifiable, since the parameters δ and q cannot be uniquely determined by measuring only x H and x D . This is an important fact from the point of view of model analysis and parameter estimation, but does not affect the solvability of 110 the control task.

Discretization. Adapting to the predictive control framework, the epidemic model will be used later in discrete time. To discretize the dynamics, we make the following observation: (1) can be considered as a linear time invariant (LTI) system driven by a nonlinear input function: The discretized model can be given formally as follows:

We will use two different sampling times in our study. To obtain acceptable predictions for the considered long horizon, the predictive controller (section 4.) uses T s = 0.5 day for updating the states, while T s = 1 day is used for the observer design (section 3). We further assume that the update of the input can 115 be performed weekly.

The goal of the state observer is to reconstruct the states of model (1) from the available output measurements. The proposed observer is designed in 4 steps, these steps are presented in detail in this section. The observer is 120 designed in discrete time and as we have mentioned above, the sampling time is T s = 1 day. This means that the output measurements can be updated only on a daily basis, which is feasible in practice. It is important to emphasize that the the observer is tested on uncertain models to analyse its properties on realistic scenarios.

We start from the LTI model defined by equations (1c)-(1f), where state x L can be considered as an unmeasured external input and x H is the measured 130 output. Let this dynamical system be converted to discrete time domain and let the resulting transfer function be denoted by G LH (z). One possible approach to reconstruct x L is to invert this dynamics and drive the inverse with input x H . If the model is accurate enough and a causal, stable inverse exists, then a reliable estimatex L can be obtained. Note also that the other 3 state variables 135 are not needed for inversion because the system is linear, so the inverse can be started from any (e.g. zero) initial state.

By examining G LH (z) it turns out that the system is strictly proper and contains unstable zeros. Therefore, the construction of the inverse requires the numerical differentiation of the output and the inverse obtained will not be stable. Fortunately, the first problem is easy to handle. By computing the Hankel singular values of the x L → x H dynamics we obtain the following result: [0.1175, 0.0289, 0.0026, 0]. A zero singular value is present because x A does not affect the x L → x H transfer. This state can therefore be trivially removed from the model. The third singular value is significantly smaller than the other two, so we expect that the removal of the corresponding state does not generate significant error. Doing so, two states are removed from the model. By performing a balanced reduction (function 'balred' in MATLAB) we obtain the following 2-state, proper transfer function:

where p 1 = 0.92787, p 2 = 0.94446 and z 1,2 = 1.27205±0.37353i. (The numerical values of the poles and zeros have been obtained by using the nominal model parameters in Table 1 .) Note that G r LH (z) can be easily inverted, without

The second problem (instability of the inverse) still holds, because G r LH (z) has a complex, unstable zero pair. To make the inverse stable, the zero is transformed back to continuous time byz 1,2 = log(z 1,2 )/T s , then reflected across the imaginary axis and finally it is transformed back to discrete time 145 byz 1,2 = exp(z 1,2 T s ). Then G r,i LH (z) = tf(p 1 , p 2 ,z 1 ,z 2 ) is considered as the stable, approximate inverse of G r LH (z). Due to the reduction and the transformation of the zeros, G r,i LH (z) does not properly cancel G LH (z). The product of them generates a phase shift corresponding to t d = 3.5 days time delay. This can be seen on the bode diagram in Fig 2, where

with the 2nd order Pade approximation of the t d = 3.5 days time delay. It can also be seen that the magnitude of G LH (z)G r,i LH (z) significantly differs from 1 at higher frequencies. Since the epidemic model is controlled by piecewise constant input that changes relatively rarely compared to the sampling time, this approximate inverse will be suitable for computing an estimate for x L . 3.2. Reconstruction of x P , x I , x A By using G r,i LH (z), we can obtain at time t = k·T s , the delayed estimatex L (t− t d ). Using this input, together with the measured, delayed output x H (t − t d ) a state observer can be designed for the state-space counterpart of G LH (z). In this 7 J o u r n a l P r e -p r o o f paper a standard Kalman filter is proposed for this purpose. The Kalman filter 160 also has the practical advantage of being able to handle state and measurement noise, which are typically present in a real system. By using the time shifted inputs, the Kalman filter can produce estimate for the delayed states, that is,

In this step the first 5 equations of the epidemic model are numerically in-

and is regularly updated as we describe later. At the beginning it is initialized to 1 (assuming that the state estimation starts at the beginning of the epidemic). The integra-170 tion of the model requires also the past control input sequence that has been applied on time interval [t − t d , t]. This control sequence is assumed to have been previously stored and is available. The integration of the model is performed in two steps. First the state trajecto-

can be obtained. The compartment of recovered patients will not be used later in the control design, but if it is necessary to estimate x R , it can be easily done by using the estimated states and measured output x D : as the sum of the states

The complete algorithm of state reconstruction is summarized in Algorithm 1.

by updating the states and the output of

We remark that another possible approach is to design an unknown input observer to estimate x P , x I , x A and x L using the theory in [15] . However, for 185 8 J o u r n a l P r e -p r o o f this the first and some higher derivatives of x H are also needed as auxiliary outputs to fulfill the computability conditions.

The epidemic model has one manipulable input: the scalar factor v that 190 scales the transmission rate β (see, Eq. (1)). This variable represents the effect of the measures that are applied on the society in order to control the contact rate between individuals. Regarding the control goals, several scenarios can be reasonable. We are focusing on the mitigation of the epidemic, where the goal is to avoid the overwhelming of the healthcare system in a finite time window by 195 applying less stringent interventions whenever possible. Strict measures have serious adversarial effects on society and economy so avoiding them is desirable [16] , [17] . The time window is typically chosen to be long enough to make acceptable long-term prediction for the future behavior of the epidemic. It is also an option to align it to the estimated date when a suitable vaccine or treatment 200 becomes available.

Following a similar concept as in our earlier paper [7] , we formulate a nonlinear model predictive controller for the mitigation problem above. The control input is computed periodically based on the most recent measurements and the predicted future behavior of the model. Since the control window is fixed (not shifted in each time step), a shrinking horizon MPC is designed. The controller is implemented in discrete domain by using T s = 0.5 sampling time. At a time instant t = k · T s the control input is computed by solving the following optimization problem:

are the predicted state and control input at the k+i-th time step and Since the control input determines rules and restrictions that have to be performed by the society, it takes time to have impact on the dynamics. Therefore it is no use changing the control input at each sampling instant. We therefore 215 assume that v is updated only at every L-th time step (6g).

Assume the control window is [0, H], where H is a priori fixed. The prediction horizon is N (k) + K, where N (k) = H − k is shrinking as k moves towards H. K defines a short time interval after the control window, where the control input is not updated, but the constraints have to be satisfied. This prevents 220 the optimization to simply improve the cost function by generating less effective (in our case higher) control actions at the end of the horizon. Turning off the controller in the last few steps leads to high increase in the constrained state variable, which can therefore easily exceed the limit right after leaving the control window. This safety constraint can be used together with a small weight on 225 the last control action (e.g. w M (k) < min i =M (k) w i ) allowing strict intervention at the end of the control window. This procedure implements the following concept: the system is left at time H in a state, where there exist (at least a strict) intervention that is able to keep the states below the prescribed limits at least for K · T s time period after the end of the control window.

The last parameter of the procedure is M (k) = N (k)/L + 1, which denotes the number of free control moves to be designed. 

The optimization problem (6) is nonlinear and nonconvex. In addition, the control horizon is long, considering that the control window is at least about six months long, but might be even longer. These factors make the control design computationally challenging. Our goal is therefore to recast the original problem to a convex optimization task. For this, the geometric programming (GP) framework [18] , [19] will be used. Geometric programming is a powerful tool to solve nonlinear and nonconvex optimization problems if the decision variables are positive and the cost and the constraints are polynomial functions with positive coefficients. More precisely, a geometric program in standard form 10 J o u r n a l P r e -p r o o f Journal Pre-proof can be given as follows: , c k > 0 and a k,j ∈ R. The main advantage of GP is that (7) can be systematically transformed to a 240 nonlinear convex optimization problem by taking the logarithm of the variables. It is clear that this transformation converts (7c) to linear constraints, but it can also be shown that the logarithm of the posynomials are also convex in the transformed variables. The standard GP can be extended by allowing further functions of posynomials, e.g. max, positive (fractional) power, etc. in the 245 construction of the optimization problem. It can be shown that these operators preserve the convexity. The convex counterpart of a standard (or extended) GP can be constructed manually or by using a suitable software tool, e.g. the CVX toolbox [20] . More details on GP and its applications can be found in the paper and book [18, 19] .

In the case of epidemic control, the GP method is motivated by the following properties of (6):

• the model is polynomial in the state and input To eliminate the problem caused by the dynamics of x S , we can make the following observations: first, x S continuously decreasing; second, a successful mitigation control keeps the number of infections low, so only a limited portion of the society is expected to get infected by the end of the control window. S . By using this upper bound, stricter control actions are generated that may degrade the performance (results in larger control cost). However, the simulations will show that this is an acceptable price to get a significantly simpler optimization task.

Note also that the original cost function cannot be transformed in the geometric programming framework. Therefore, it is replaced by 

wherex,Ã d andB d are obtained from x, A d and B d by removing the first state variable x S . Variable ε is introduced to make constraint (8e) soft. A common solution of adding ε to the right hand side of x (i)

H ≤x H does not work here, because this formulation cannot be transformed to a geometric programming constraint. Therefore ε is used as a scaling ofx H and ε ≥ 1 is prescribed.

The value of ε is minimized by completing the cost with an additive term w ε ε penalizing the constraint violation. By using the soft constraint, the numerical infeasibility caused by x (i) H >x H can be avoided. After the optimization is completed, the value of ε gives information on the magnitude of the constraint violation that has been occurred during the synthesis.

This problem can be systematically transformed to a convex geometric program, which can be solved efficiently by numerical optimization. To perform the simulations presented in this paper we use CVX optimization package [20] to formulate the geometric program. CVX also calls an external solver, which in our case is MOSEK [21] , to solve the convex optimization problem.

Remark 2. Further ingredients can be added to (8) to decrease the gap between the convex and the original optimization problems. Since the upper boundx S is constant over the entire horizon, the simplified and the original model have similar behavior, if the trajectory of x S does not significantly differ fromx S . To keep x S −x S under control, the dynamics of x Z := 1 − x S can be added to the 285 model. Since the dynamics of x Z in discrete time is polynomial with positive coefficients, thus the optimization problem can still be converted to a geometric program. By prescribing upper bound for x Z , we can limit the changes of x S . The drawback of this modification is that constraining x S narrows the set of feasible control policies. To overcome this limitation an iterative design can 290 be applied. After each iteration, the trajectory of x S (computed from x Z ) is stored and used in the next iteration in the place of the constantx S . (The simulations for this paper (sec. 5) have been performed without this fine tuning 12 J o u r n a l P r e -p r o o f step, so the control input has been obtained in one step by solving the geometric optimization problem (8) only once. 295 Remark 3. Though in most cases the convex program above produces an acceptable result, it is possible to further improve the control policy by nonlinear optimization. For this, the control sequence generated by (8) can be used as a warm start for the nonlinear optimization (6).

We have seen that geometric programming provides an efficient way to simplify the optimization task required by the predictive control design. Now we go further. In this section we show that by applying additional relaxations it is possible to convert the original problem to a linear program (LP). LP is the simplest convex optimization task, it can be solved very efficiently even for 305 thousands of free variables [22] .

To generate a linear program from (6) the key step is choosing

A )x S to be the new optimization variable. Herex S is the constant initial value of x S and it substitutes x (i) S , because we can assume that x S does not change significantly 310 during the control period. Clearly, the dynamics are linear in ν (i) . The input

A )x S , which is again linear in all variables. After computing the sequence ν (0) , ν (1) , . . . ν (N (k)−1) the control input can be obtained by v

This reformulation greatly simplifies the optimization, but generates also some new problems. One of the problems is that, we can no longer prescribe piecewise constant input by simple linear constraints. This requirement should therefore be dropped from the optimization procedure and has to be handled by post processing the result. If the control input is allowed to change only at 320 every L · T s time step, the simplest way is to take the mean of the L v (i) values in the same L · T s window and to consider the result as the constant input over this time interval. This 'approximation' gives acceptable result only if v does not change significantly inside the L · T s intervals. Unfortunately this does not hold in our case: the LP tends to generate highly oscillating input sequence at 325 certain sections of the control window. The piecewise constant approximation of this signal results in significant performance loss and even infeasibility of the optimization in the later time steps. Since the epidemic dynamics is relatively slow, we have found that not the term x (i)

A is responsible for the oscillation. Consequently, if ν is smoothed out, so can be expected from v. We linear in the parameters. Simulation results will show that this strategy works in practice: the spline formulation smoothes out the solution of the optimization and results in input sequences that can be suitably approximated by a piecewise constant function thereafter.

The second problem with LP relaxation is that the original cost is nonlinear in the optimization variable ν. Therefore it is replaced by maximizing J LP = N (k) i=0 w i ν (i) . This cost is of course, not equivalent to the original one, but again, it expresses similar intention: we seek the less stringent interventions that do not result in violation of state constraints. J LP can be maximized by maximizing 345 v (i) -s (which is the original task) and the terms x

A . Note that, these two requirements are not contradictory, because applying less stringent interventions induces the increase of the number of infected individuals, i.e. the values of x I , x A and x P get larger. The increase is controlled by the upper bound prescribed for x H .

To guarantee numerical feasibility the constraint x H ≤x H is again transformed to a soft constraint x H ≤x H + ε, where ε ≥ 0 is an additional free variable. Its value is minimized by adding −w ε ε to the linear cost above.

The control design by LP optimization can be summarized as follows:

where Φ(·) denotes the row vector containing the values of B-spline basis functions evaluated at iT s and ϑ is the vector of the coefficients. The spline is 355 defined over the [0, (N (k) − 1)T s ] interval.

In this section an output feedback controller is formed by using the state observer of section 3 and the two state feedback MPC controllers of section 4. The properties of the control structure are analysed via numerical simulations. 360 

In these simulations, the sensitivity of the controller to the uncertainty of the model parameters is also examined.

To analyse the proposed observer structure, numerical simulations have been performed. The observer is tested in open loop without controller. The control input is chosen to be constant 1, which corresponds to applying no intervention. The initial state is selected to represent the beginning of the epidemic, that is the number of infected persons is very low, there are no hospitalised patients yet and nobody has been recovered or died. Numerically x 0 = [x 0,S ; 200; 50; 2; 1; 0; 0]/X, where x 0,S = X − 200 − 50 − 2 − 1 and X is the total population, which is X = 9800000 in Hungary. The Kalman filter required to reconstruct x P , x I , x A (subsection 3.2) has been designed by assuming only a Gaussian output noise (i.e. a noise on x H ) with covariance R cov = 0.1. In the simulations we assumed that the model parameters are not known exactly: the observer is designed for a nominal model and then applied to different "real" models generated by perturbing the nominal parameters. The parameters have been perturbed according to the following uncertainty bounds:

These bounds are based on the latest uncertainty estimates calculated by using the epidemiological data collected worldwide. More information on the identi-365 fication and uncertainty bounds of the model parameters can be found e.g. in [24] , [25] .

Taking the uncertainty intervals above, 100 different parameter sets (i.e. 100 different models) have been generated randomly by using uniform distribution over the uncertainty intervals. The relative absolute estimation error (i.e. the 370 error divided by the maximal value of the state) are depicted for all scenarios in Fig. 3 . It is not surprising that in the uncontrolled case the epidemic reaches a peak where the number of infected individuals are extremely high. We have found that the estimation error is maximal in the neighborhood of the extremal values of the states. Since the goal of the control is to avoid (delay) the epidemic 375 peak, therefore smaller estimation errors are expected in closed loop. This will be justified in the next section.

By analysing Fig. 3 , a small estimation error can be seen even in the nominal case. This can be explained by the effects of model reduction and the numerical integrations needed to compensate the time delay. Considering the uncertain 380 cases, the relative error is the highest in the estimation of x A , while at the other states it is less than 20%. This is not surprising as the model is nonlinear, several parameters are allowed to deviate from the nominal value and the observer is designed without explicitly taking the uncertainty into consideration. Nevertheless the observer well tracks the trend of the states and the accuracy 385 it provides is sufficient to use the estimated states for feedback control design. 

In this section the MPC controller introduced in section 4 is applied to our particular epidemic model. The controller uses the state estimation provided by the observer (section 3), they thus form an output feedback structure. The 390 parameters of the controller have been chosen as follows. The control window is 190 days, so H = 190/T s . K is set to 3 weeks, so K = 21/T s . The upper bound for the hospitalized patients (state x H ) is chosen tox H = 0.001. This bound is based on the available capacity of the healthcare system in Hungary. The system is assumed to start from the initial state (10), but the controller the total lockdown of the cities. This value was determined using [26] studying the situation in Wuhan in 2019-2020.

In the simulations we have used 6 models: beside the nominal one, 5 others with different parameter sets have been chosen to represent uncertain scenarios. Actually, these 5 models have been obtained by selecting the 5 worst case scenarios from the 100 random examples used in the observer analysis in subsection 5.1. The selection was based on the maximal estimation error. The controllers 405 have been tested on the continuous time model, i.e. in each sampling instant the actual control input is applied on the continuous model.

We have performed two simulation scenarios. In the first case (simulation scenario I.) all of the weights w i in the cost function are set to 1, i.e. no weighting is applied. In the second case (simulation scenario II.), we have chosen the 410 weights to reflect the tolerance of the society for the interventions. In the first four weeks the weight is 1, which means that strict measures are tolerated. In the next four weeks the control input weights are 4 to force less stringent interventions. From the 9th week the weight is 12, that means strict interventions are definitely undesired. In compact form, the initial weight sequence at k = 0 415 is w 0,...,3 = 1, w 4,...,15 = 4, w 16,...,M (0)−2 = 12, w M (0)−1 = 1. As time index k gets closer to H, M (k) decreases (shrinking horizon), the actual weights are obtained by taking the last M (k) elements of the initial weight sequence above. The small weight of the last control action is part of the terminal constraints (see section 4 for details). The weight w ε penalizing the soft constraint violation 420 has been set to 10.

Figures 4, 5, 6 and 7, 8, 9 present the state trajectories and control inputs obtained in the two simulation scenarios. The nominal case is distinguished from the uncertain ones. It can be seen that in the nominal case, the safety ingredients guarantee the constraint satisfaction over the entire horizon, as well 425 as beyond the control window. On the other hand if the model parameters significantly differ from the nominal ones, x H may slightly exceed the limit after the end of the control window. This motivates the design of a robust controller, e.g. by a simple scenario method introduced in the next subsection.

It can be seen that the two different weightings result in two different con-430 trol strategies. In the first case (when w i = 1, ∀i), an almost constant, mild intervention is proposed, which is gradually eased towards the end of the control window. In the second case, the control input follows the same pattern as the weighting: in the first 4 weeks strict intervention is applied, that is significantly eased in the next month. From the 8-th week, the control input is pushed as 435 close as possible to 1. Finally, we analyse the computation time needed to solve the optimization task (8) . The computation consists of two main steps: first, formulating the GP problem (CVX) and second, finding the optimal solution (MOSEK). 

This section presents the results obtained by using the LP based control design presented in Subsection 4.3. To formulate and solve the LP optimization problem we used again the CVX environment. As we have mentioned, the main 455 drawback of CVX is that the optimization problem has to be built up every time its solution is needed. In case of LP, CVX proposes a tool, called CVXGEN [27] which can compile the code and allows to generate parameterized optimizer objects. In this paper, we do not use this tool as we have found standard CVX suitably fast. However, by exploiting the benefit of code optimization and By performing a control synthesis with the same parameters as in the GP example above (i.e. N = 190/T s , K = 21/T s , L = 7/T s ), the computation time with LP is less than 10 seconds. From this, the construction of the optimization task takes approximately 8-8.5 seconds, the solve-time is always below 2 seconds. Since the solution of an LP requires significantly less time than GP, therefore we implemented a simple scenario method to improve the robustness of the controller. This approach can be applied to the GP-based control synthesis as well.

The basic concept of scenario based predictive control is generating a pool 470 of possible system models and designing a control input that is feasible (and suboptimal) for all. If the size of the pool is suitably large, the probability of the constraint violation (infeasibility) on an arbitrary new model is close to zero. The details of scenario methods and scenario based control design can be found e.g. in [28, 29] . The main drawback of this approach is that the 475 computation of the control input generally requires to include the entire pool in the optimization process, which highly increases the dimension of the problem. Due to the specific structure of the epidemic model, this difficulty is not to be reckoned with. Since smaller control inputs mean stricter interventions, if feasible control inputs are computed separately for the models of the pool, then This result suggests that the scenario method is a potential approach to design robust controller for our epidemic model. This motivates us to continue the analysis of this method and to make further efforts to derive mathematical guarantees for feasibility and optimality. This will be 495 part of the future research. After constructing the pool of models, we have tested the scenario controller in closed loop on the same 6 models that have been used in the earlier simulations. As before, we have not applied any weighting in the cost, i.e. w i = 1 has been chosen in (9) for all i. The weight w ε penalizing the soft constraint 500 violation has been set to 100. The simulation results can be seen in Figs 10, 11 and 12. Similar to the results obtained by the GP-based design (simulation scenario I.), a mild intervention is applied in the most part of the control window, which is gradually relaxed towards the end of the horizon. The main difference between the two control strategies is at the beginning of the control window: the 505 LP strategy starts with very mild measures, the control is practically activated only after the 4th week. In contrast, the GP-based approach applies significant interventions right from the beginning. By comparing the cost of the control we have computed the L 1 norm of the control signals in both cases: by using the GP-based synthesis this norm is between 105 and 125, while it is between 115 510 and 135 in case of the LP. The LP is slightly better, but the difference is not significant, so we can say that practically the performance of the two control strategies is very similar. Note also that the characteristic and performance of either control strategy can be further tuned by modifying the weights w i of the cost function.

By analysing the trajectories of x H we can see that with the LP controller there is no constraint violation. Due to the scenario design, a slightly conservative control action is designed, which prevents x H from reaching the prescribed limit.

An output feedback control approach was proposed in this paper to design the stringency of the non-pharmaceutical measures for mitigating the COVID-19 pandemic. The linearity of a subsystem of the used compartmental model allowed us to use dynamic inversion to give an estimation for the number of people in the latent compartment. The nonnegativity of the terms in the discrete-time 525 model was exploited to formulate the control problem in a convex geometric programming framework. Additionally, by relaxing the optimization task further, the input can be computed by solving a simple linear programming problem. A state observer was designed and analyzed for the tracking of non-measured variables. The maximal estimation errors occur around the peak of the epi- the effects of the studied model uncertainties, and the control goals and constraints can be reached within an acceptable tolerance even in the linear programming case. The computations can be done in real-time and significantly faster than in the case of a previous nonlinear MPC solution [7] . The convexity 535 of the re-formulated optimization problems guarantees a unique solution which is suboptimal with respect to the original optimal control problem, and makes possible to extend the control design to larger dimensional (e.g. multi-scale or multi-region) compartmental models [9] , as well as to apply more advanced uncertainty analysis techniques, e.g. scenario methods, in the future. Another 540 subject of further work will be the application of the obtained suboptimal inputs as initial solutions in the original nonlinear MPC design. 

Mathematical models in population biology and epidemiology

An Optimal Predictive Control Strategy for COVID-19 (SARS-CoV-2) Social Distancing Policies in Brazil

A 560 parametrized nonlinear predictive control strategy for relaxing COVID-19 social distancing measures in brazil

Nonlinear model predictive control with logic constraints for COVID-19 management

Data-Driven Control of the COVID-19 Outbreak via Non-Pharmaceutical Interventions: A Geometric Programming Approach

Model predic-570 tive control to mitigate the COVID-19 outbreak in a multi-region scenario

Modelling a pandemic with asymptomatic patients, impact of lockdown and 575 herd immunity, with applications to SARS-CoV-2

Early phase of the COVID-19 outbreak in Hungary and post-lockdown scenarios

Robust estimates of the true (population) infection rate for COVID-19: A backcasting approach

Structural identifiability and 585 observability of compartmental models of the COVID-19 pandemic

Structural identifiability of dynamic systems biology models

Observer design for linear systems with unknown inputs

The psychological and social impact of COVID-19: new perspectives of well-being

The economic cost of COVID lockdowns: An outof-equilibrium analysis

A tutorial on geometric programming

Convex Optimization

CVX: Matlab Software for Disciplined Convex Pro-605 gramming

The MOSEK optimization toolbox for MATLAB 9

Linear programming: methods and applications, Courier Corporation

Spline Toolbox for use with MATLAB, Release 2020a

Identification and Estimation of the SEIRD Epidemic Model for COVID-19

Estimating and simulating a SIRD model of COVID-19 for many countries, states, and cities

Phase-adjusted estimation of the number 620 of coronavirus disease

CVXGEN: a code generator for embedded convex optimization

Robust Model Predictive Control via Scenario Optimization

MIFM) based on the charter of bolster issued by the NRDI Office under the auspices of the Ministry for Innovation and Technology. The research was also supported by the Ministry of Innovation and Technology NRDI Office within the framework of the Autonomous Systems National Laboratory Program.The authors thank Balázs Csutak and Gergely Horváth for their help in the model and control loop analysis. The authors are grateful to the anonymous reviewers for their constructive comments.