key: cord-0713489-2hxpq20i
authors: Ehsan Hesam Sadati, Mir; Aksen, Deniz; Aras, Necati
title: A Trilevel r-Interdiction Selective Multi-Depot Vehicle Routing Problem With Depot Protection
date: 2020-05-25
journal: Comput Oper Res
DOI: 10.1016/j.cor.2020.104996
sha: b315e51fa9f1a9f4601bcd3b1c9235942ca4096a
doc_id: 713489
cord_uid: 2hxpq20i

The determination of critical facilities in supply chain networks has been attracting the interest of the Operations Research community. Critical facilities refer to structures including bridges, railways, train/metro stations, medical facilities, roads, warehouses, and power stations among others, which are vital to the functioning of the network. In this study we address a trilevel optimization problem for the protection of depots of utmost importance in a routing network against an intelligent adversary. We formulate the problem as a defender-attacker-defender game and refer to it as the trilevel r-interdiction selective multi-depot vehicle routing problem (3LRI-SMDVRP). The defender is the decision maker in the upper level problem (ULP) who picks u depots to protect among m existing ones. In the middle level problem (MLP), the attacker destroys r depots among the (m–u) unprotected ones to bring about the biggest disruption. Finally, in the lower level problem (LLP), the decision maker is again the defender who optimizes the vehicle routes and thereby selects which customers to visit and serve in the wake of the attack. All three levels have an identical objective function which is comprised of three components. (i) Operating or acquisition cost of the vehicles. (ii) Traveling cost incurred by the vehicles. (iii) Outsourcing cost due to unvisited customers. The defender aspires to minimize this objective function while the attacker tries to maximize it. As a solution approach to this trilevel discrete optimization problem, we resort to a smart exhaustive enumeration in the ULP and MLP. For the LLP we design a metaheuristic algorithm that hybridizes Variable Neighborhood Descent and Tabu Search techniques adapted to the Selective MDVRP (SMDVRP). The performance of this algorithm is demonstrated on 33 MDVRP benchmark instances existing in the literature and 41 SMDVRP instances generated from them. Numerical experiments on a large number of 3LRI-SMDVRP instances attest that our comprehensive method is effective in dealing with the defender-attacker-defender game on multi-depot routing networks.

There has been an increasing interest in the protection and interdiction of critical structures such as bridges, railways, medical facilities, roads, warehouses, train/metro stations, police and power stations among others. This has given rise to the so-called interdiction models. The first interdiction problem was studied by Wollmer (1964) . In this seminal study the author considered the interdiction of a fixed number of capacitated arcs in a network so as to reduce the capacity of the network as much as possible. In another study Israeli and Wood (2002) addressed the shortest path problem and determined the critical arcs in it. Other network interdiction models developed in the literature are due to McMasters and Mustin (1970) , Salmeron et al. (2004) , and Arroyo and Galiana (2005) . The study of Church et al. (2004) is the first published work about interdicting nodes in supply chain networks. The authors considered two problems called the r-interdiction median problem (RIM) and the r-interdiction covering problem (RIC). The aim in these problems is finding a subset of r facilities among p existing ones the interdiction of which yields the maximum disruption in a median and coverage type network, respectively.

Interdiction models are typically nested in a bilevel optimization structure, which is known as the Stackelberg or Leader-Follower Game (Von Stackelberg, 1952) . The decision maker of the upper level problem (ULP) is the attacker, while that of the lower level problem (LLP) is the defender. The interdiction model as such becomes an attacker-defender game. In other interdiction models such as the one proposed in Church and Scaparra (2007) , the goal is to establish the best protection policy of the defender against the disruptive attacks of its opponent. This corresponds to a bilevel defender-attacker game where the roles of the attacker and the defender are switched. In Church and Scaparra (2007) the protection decision is embedded in the RIM model of Church et al. (2004) , and the resulting model in named as the r-interdiction median problem with fortification (RIMF). The objective of this problem is to obtain a subset of facilities that have to be fortified so that the total cost of satisfying the customer demands from operational facilities after the attack is minimized. Aksen et al. (2013) analyzed another bilevel p-median problem where the defender in the ULP identifies the facilities to be opened and those to be protected simultaneously. In response the attacker decides in the LLP the facilities to destroy. Sadati et al. (2019) proposed the rinterdiction selective multi-depot vehicle routing problem (RI-SMDVRP) which is modelled as an attackerdefender game from the viewpoint of an intelligent adversary whose aim is to cause the highest damage on a routing network. The adversary is the decision maker in the ULP. He determines r out of m existing depots to attack, while the defender in the LLP finds the best vehicle tours after the attack. Other bilevel interdiction models can be found in Brown et al. (2005) , Scaparra and Church (2008) , Aksen et al. (2010) , Aksen and Aras (2012) , Aksen et al. (2014) and Aliakbarian et al. (2015) .

The study of interdiction in routing networks is a comparatively unexplored venue in the literature confined to the interdiction of arcs within the framework of a Stackelberg game. We are aware of only four papers. Lozano et al. (2017) solve a problem in the context of the traveling salesman problem (TSP) using a backward sampling framework-based exact approach tailored to the bilevel TSP. In their problem the defender decides on a subset of arcs to protect in the first level, the attacker attacks a subset of unprotected arcs (thus increasing their costs) in the second level, and the defender eventually determines the best TSP tour in the network. The other three papers are due to Kheirkhah et al. (2016a Kheirkhah et al. ( , 2016b and Bidgoli and Kheirkhah (2018) . They all model the phenomenon of budget-constrained arc interdiction as an attackerdefender game where the attacker interdicts the arcs of a routing network to maximize the defender's routing cost on the surviving network. The success probability of interdiction of a targeted arc is one. Each arc has its own interdiction cost. The attacker has a limited budget which prevents him from interdicting all arcs. The defender must deliver all customer demands after the attack. The network graph under consideration is complete, and there are arcs between all pairs of customer nodes. There exists only one depot with impunity against destruction from where vehicles are dispatched to visit customers.

We are aware of only five published works in the context of trilevel interdiction problems. Yao et al. (2007) address a trilevel optimization problem to allocate resources in defending an electric power network. The aim is to determine the most critical components to protect against potential terrorist attacks. The problem is modeled as a defender-attacker-defender game where the defender's objective is to minimize the economic cost. The authors develop an exact solution technique based on decomposition which consists of iteratively solving smaller nested bilevel problems.

San Martin (2007) proposes and solves in his Master's thesis a trilevel programming model to generate an optimal plan to defend an infrastructure from a malicious man-made attack. In his model, the defender uses restricted defensive resources for the protection of system components. Then, the attacker deploys finite offensive resources to attack components that are not protected. Finally, the defender guides system operations with an optimization model. The author suggests four decomposition algorithms: direct, nested, reformulation-based, and reordering-based to solve the model.

The RIMF model in Church and Scaparra (2007) is extended in Scaparra and Church (2012) by taking into account capacitated facilities that are vulnerable to attack. The new problem is investigated by means of a trilevel optimization model, where the defender (system planner) considers the protection of certain facilities in the ULP with the goal of minimizing the total cost caused by an attack. In the middle level problem (MLP), r facilities are interdicted by the attacker to inflict the highest damage on the network. Lastly, the defender minimizes in the LLP the demand-weighted total traveling cost by rerouting demands to the operational facilities. They proposed a solution approach consisting of a tree search strategy and showed how the MLP and LLP can be reduced to a single-level problem. Mahmoodjanloo et al. (2016) address a trilevel facility defense location model for the full coverage in RIM. Their trilevel model is based on the defender-attacker-defender framework. The aim of the proposed model is to construct an appropriate service system so that its full capacity is utilized in order to provide the services during the post-attack period. As a solution approach they proposed three methods which consist of explicit enumeration, genetic and biogeography-based algorithms. Akbari-Jafarabadi et al. (2017) investigate a trilevel RIM with the goal of minimizing the sum of the pre-interdiction and post-interdiction costs. The attacker disrupts the system under uncertainty where facility opening costs and customer demands are several of the uncertain parameters. The authors develop four hybrid meta-heuristics based on Tabu Search, Rainfall Optimization, and Random Greedy Search.

We introduce in this paper a novel trilevel optimization problem to identify the depots of utmost importance in a multi-depot vehicle routing problem which is modelled as a defender-attacker-defender game from the viewpoint of a defender who intends to protect a fixed number of depots to minimize the total cost after the interdiction of several unprotected depots by the attacker. We call this problem the trilevel r-interdiction selective multi-depot vehicle routing problem (3LRI-SMDVRP). The defender in the ULP makes a decision on the depots to be protected. In the MLP, the attacker identifies which r depots to strike among the unprotected ones so that the defender's total post-attack cost is maximized. Finally, in the LLP, the defender aims to minimize the same total cost which covers also demand outsourcing. To this end, he optimizes the vehicle routes using the non-interdicted (active) depots.

In our trilevel problem, each customer's demand should be satisfied by the defender using one of the following two options: dispatching vehicles from active depots and outsourcing the demand to a third-party logistics service provider (3PL). This implies that the defender must decide on the following: which customers should be visited by the fleet of vehicles of the defender's company and which ones should be outsourced to a 3PL. We account for the outsourcing option of the defender by incorporating the cost of outsourcing into his objective function, and refer to the LLP as the Selective Multi-Depot Vehicle Routing Problem (SMDVRP). SMDVRP is an extension of MDVRP which has shown to be -hard (Lenstra and Rinnooy Kan, 1981) . Therefore, finding an optimal solution is difficult with exact approaches even for moderate size instances. As a matter of fact, heuristic techniques have turned out to be a viable method for the solution of SMDVRP. The most relevant papers about SMDVRP are due to Aras et al. (2011 ), Stenger et al. (2013a and Stenger et al. (2013b) . In this paper, we propose a hybrid metaheuristic algorithm that combines Variable Neighborhood Descent and Tabu Search heuristics to solve SMDVRP. We then encapsulate this hybrid algorithm in a smart exhaustive enumeration which is complemented by two reduction techniques. This comprehensive methodology constitutes a viable solution approach to the trilevel optimization problem 3LRI-SMDVRP.

A possible application of 3LRI-SMDVRP arises in healthcare management, particularly in drug distribution, where drugs are carried every day from depots to pharmacies and hospitals so as to ensure quick and reliable service. Actually, these depots take the role of transshipment nodes between drug producers and their customers such as pharmacies and hospitals. The disruption in the operations of one or more depots in a metropolitan area would impair the distribution of drugs, which put patients in a cumbersome situation. In the wake of an attack, clients that were previously served by a vehicle dispatched from a destroyed depot should be visited by a vehicle to be departed from an undamaged, thus still operational depot. Another context in which the considered problem can be useful is the military where large firearms such as cannon or rockets have to be shipped from the armories to the combat or training zones. Obviously, if an armory is destroyed by the enemy, the combat capability of the defender decreases severely.

The remainder of this paper is structured as follows. In section 2, we present a mathematical formulation for the 3LRI-SMDVRP. Section 3 elaborates on the solution methodology developed for 3LRI-SMDVRP. Section 4 consists of the results generated on the basis of a great number of experiments. The last section, Section 5, presents concluding remarks and directions for future study.

The 3LRI-SMDVRP has the same objective function in all levels of the problem, but in opposite directions. In the ULP the defender determines which depots to protect, whereas in the MLP the attacker selects which unprotected depots to destroy. Finally, in the LLP the defender identifies how the demand of each customer must be satisfied as efficiently as possible after the attack by optimizing the vehicle routes. This problem can be modeled by means of a trilevel programming framework in which the defender is the player making the decisions in the ULP and LLP, whereas the attacker is the other player who makes the decision in the MLP. A mixed-integer linear programming (MILP) formulation of the 3LRI-SMDVRP is given below.

:

The set of customers, = {1, …, }

The set of depots, = { + 1, …, + }

The set of all nodes. = ∪

:

Traveling cost from node to node , 

In this formulation, (2)-(4), (5)-(8), and (9)-(26) correspond to the ULP, the MLP, and the LLP, respectively. The defender's objective function is represented by Expression (2) which is comprised of three cost components. (i) Traveling cost incurred by the vehicles. (ii) Operating or acquisition cost of the vehicles. (iii) Outsourcing cost resulting from unsatisfied demand of customers. Constraint (3) guarantees that the defender can only protect u depots. Defender's binary protection variables are defined in (4). Constraint (6) ensures that the attacker can interdict only depots. Constraints (7) prevent the interdiction of the depots protected by the defender in the ULP. The binary restriction of the attacker's interdiction variables is ensured in constraints (8). The objective function of the defender in the LLP is given by expression (9) which is the same as the ones in (2) and (5). Constraints (10) put a lower limit on the necessary number of vehicles. The condition that no vehicle can travel between two depots is ensured by constraint (11). Constraints (12) guarantee that at most one depot can be assigned to a customer.

Constraints (13) and (14) are logical constraints that make sure that if customer is not assigned to depot , then no vehicle can travel from customer to depot or vice versa. Constraints (15) and (16) imply that two customers must be assigned to the same depot j if a vehicle visits them consecutively.

Constraints (17) represent the subtour elimination constraints (SECs) adapted from the singlecommodity flow formulation of Gavish and Graves (1978) conceived originally for the TSP. In constraints (18)(19) we enforce tight bounds on the continuous flow variables of that formulation. Note that these ij F bounds were also proposed in Gouveia (1995) . Constraints (20) are balance equations, namely the same number of vehicles must enter and leave each depot as well as each customer. Constraints (21) correspond to valid inequalities that help to eliminate subtours of size two. Although they are redundant under the existence of constraints (17)(19), they help strengthen the SMDVRP formulation in the LLP. Aksen et al. (2018) empirically show that these inequalities provide significant advantage in solving the singlecommodity flow formulation of the capacitated vehicle routing problem (CVRP). Constraints (22) are also valid equalities, and make sure that a customer must be visited by a vehicle if it is assigned to a depot. Constraints (12), (20) and (22) together imply that a customer is visited at most once. Constraints (23) link the LLP with the MLP and guarantee that the assignment of customers to an interdicted depot is prohibited. Finally, constraints (24) and (25) 

(1 )

(1 ) (27) and (28) Moore and Bard (1990) showed that even linear bilevel programming is -hard. LLP in the 3LRI-SMDVRP is the selective version of the MDVRP, which is known to be -hard (Lenstra and Rinnooy Kan, 1981) . ULP and MLP are binary optimization problems; the bilevel problem comprised of the two is proven to be -hard (Sadati et al., 2019) . A generic instance of this bilevel problem is reducible to a special case of the 3LRI-SMDVRP in which the defender's protective resources are rescinded, i.e. is = 0 the case. From this simple polynomial-time reduction we deduce that , a heuristic approach may prove viable for the discovery of an effective solution for it in reasonable time.

The ULP in 3LRI-SMDVRP includes only the protection decision variables that identify the depots to be protected. MLP consists of interdiction decision variables that indicate the unprotected depots to be attacked. We solve ULP and MLP via a smart exhaustive enumeration method that explores a subset of all possible combinations of protecting out of existing depots and interdicting among ( )( -) (unprotected depots assuming that . As mentioned above, the LLP is an SMDVRP where the ) ≥ + defender simultaneously optimizes the vehicle routes and makes the outsourcing decisions. For its solution we develop and tailor a hybrid metaheuristic algorithm by combining Variable Neighborhood Descent and Tabu Search Heuristic (VNDTSH).

Note that from the routing problem's point of view there is no difference in the LLP between protected depots and unprotected, but non-interdicted depots; both types of depots remain operational and can dispatch vehicles in the wake of the attack. The defender optimizes the vehicle routes through operational depots only. Therefore, some of the protection/interdiction combinations are repeated which ( )( -) include exactly the same interdicted and operational depots, but differ in the identity of the protected depots.

To clarify, consider a simple example where the number of depots is equal to , and the cardinality of = 4 protected and interdicted depots is equal to 1 . All =12 protection/interdiction ( = = 1) ( 4 1 )( 3 1 )

combinations are presented in Table 1 . Note that hereafter we use the terms 'pattern' and 'combination' interchangeably. In this table, the indices of the protected and interdicted depots are given for each pattern; the last column shows the active (non-interdicted) depots in the LLP. By considering all 12 patterns it can be seen that the final solutions of patterns 1, 8, and 11 lead to the same result. In the first pattern the defender protects the first depot, and in response the attacker interdicts the second depot. Finally, the defender optimizes the problem with the operational depots 1, 3, and 4. In pattern #8, the defender protects the third depot, and in response the attacker interdicts the second depot. Again, in the same way as pattern #1, the defender optimizes the problem by using operational depots 1, 3, and 4. In pattern #11, the defender protects the fourth depot, and in response the attacker interdicts the second depot. The defender optimizes the problem again with the operational depots of 1, 3, and 4. In these three patterns (#1, #8, #11), the index of the interdicted depot is fixed to two, and the indices of the operational depots in LLP are fixed to 1, 3, and 4; they differ only in the indices of the protected depots. This observation holds true for the pattern subsets (#2, #5, #12), (#3, #6, #9), and (#4, #7, #10). So, we need to solve only 4 combinations in total ( ) = ( 4 1 ) = instead of all 12 combinations. This means that the total number of patterns to be considered depends on and , but not on . In order to avoid solving the same patterns over and over again, we can use a hashing function which will be discussed later in this section. In consequence, we need to solve the LLP in the 3LRI-SMDVRP for distinct patterns instead of all possible combinations. We refer to this Note that if the underlying MDVRP network of the problem has a symmetric topology comprised of equidistant depots and an equal number of identically clustered customer nodes dispersed around each depot, then the defender's protection problem ULP and the attacker's interdiction problem MLP can have multiple optimal solutions. Due to this perfect symmetry and clustering the optimal objective value of the defender in the LLP would then be the same for a number of distinct protection and * ( ) H W associated interdiction strategies ( , ).

To tackle the 3LRI-SMDVRP, we propose two reduction techniques which depend on the cardinality of protected and interdicted depots. The first method is used when , this is when the sum total of + < protected and interdicted depots is less than the number of all depots. The second one is employed when + . The aim of these two reduction techniques is to obtain the final solution of 3LRI-SMDVRP by = solving the associated LLP for a lesser number of patterns instead of for all patterns.

We refer to the overall solution approach which combines the two reduction techniques and encapsulates the VNDTSH algorithm therein as the Smart Reduced Enumeration (SRE). The reduction techniques in this comprehensive methodology tackle the ULP and MLP of the trilevel problem, while VNDTSH solves the LLP heuristically.

In this case, we solve the related MLP and LLP pair for each protection pattern that is produced during the execution of the SEE method. To this end, we resort to the well-known Parallel Savings Heuristic of Clarke and Wright (1964) and modify it by incorporating with the 1-Node and 2-Node iterative marginal cost analysis (iMCA) explained in the sequel of the paper so that it can be used for solving the SMDVR.

We designate the hybrid heuristic as CWHiMCA, and use it to find the attacker's best interdiction pattern, i.e. the pattern that provides the maximum LLP objective value. For that interdiction pattern, the associated LLP is solved once again with the VND+TSH algorithm by utilizing the CWHiMCA solution as an initial feasible solution. Then the objective value given by VND+TSH is compared with the one corresponding to the second best interdiction pattern (the pattern for which CWHiMCA yields the second highest LLP objective value). If the former is larger, we stop because it becomes clear that no interdiction pattern among the remaining ones is able to produce a better (higher) LLP objective value from the perspective of the attacker. Otherwise, for each remaining interdiction pattern for which CWHiMCA provides a higher LLP objective value than the best objective value of the VND+TSH solution at hand, we solve the associated LLP once again using VND+TSH hoping that a better interdiction pattern for the attacker can be found. At the end of this process, we find the best interdiction pattern for each protection pattern. The final solution of 3LRI-SMDVRP is that particular protection pattern which has the smallest objective function value, thus is the most favorable one from the defender's perspective.

The detailed description of the proposed reduction technique for CASE I (REDUCTION 1) is given in ALGORITHM 1. Note that the arrays and store composite records each of which As discussed above, thanks to the proposed SEE method, we do not need to explore all ( )( -) protection/interdiction patterns. In fact, we revisit several of the previously generated patterns during the implementation of REDUCTION 1 (Lines 3-5 and Lines 10-23). In order to avoid solving the same LLP more than once, we associate with each pattern a unique decimal integer number called hash value. This number represents a pattern as a unique string made of binary digits (bits) 0 and 1. The bits 0 and 1 correspond to a protected/non-interdicted and an interdicted depot, respectively. Let be the binary string ℎ representation of a pattern where the status of the depot is shown by for . The unique

hash value is obtained by the following hash function:

The hash function considerably reduces the computational effort needed to check whether a newly generated protection/interdiction pattern is already stored in the explicit memory. If a pattern is found to be already stored, so must be its associated LLP solution. Thus, the LLP associated with that pattern will not be solved again saving unnecessary runs of the CWHiMCA and VNDTSH algorithms.

In this case, the defender protects depots, and in response the attacker interdicts all remainingunprotected depots. In other words, the attacker does not have any decision to make and he/she interdicts all remaining depots and MLP will be voided, i.e., the trilevel problem becomes a one-level problem. When , we need to explore exactly combinations and the final

solution of the 3LRI-SMDVRP is the one which has the minimum objective value among all combinations. In lines 10 through 23 of REDUCTION 1 we try to obtain the best interdiction pattern for a given protection pattern with a lesser number of VNDTSH calls. In CASE II, however, for a given protection pattern there exists only a unique interdiction pattern; therefore, the reduction technique in REDUCTION 1 cannot be applied to CASE II. 

VNDTSH can be treated as a hybrid metaheuristic algorithm which combines the Variable Neighborhood Descent method (VND) with the Tabu Search heuristic (TSH) originally devised by Glover (1986) . The main constituent of the algorithm is VND which is a deterministic version of the well-known Variable Neighborhood Search (VNS) algorithm invented by Mladenović and Hansen (1997) . In VND, the shaking step of VNS that is responsible for the perturbation of the current solution is dismissed, and only the local search step is executed. This local search procedure is implemented efficiently in VND+TSH by applying TSH that makes use of different neighborhood structures which are changed systematically within the VND framework. 

Generating an initial solution is accomplished by using the method outlined in Sadati et al. (2020) . Here, we briefly mention the main steps of the method referred to as CWHiMCA. The first stage of CWHiMCA involves solving the given SMDVRP as a traditional MDVRP via the classical Clarke and Wright parallel savings heuristic (CWH) and improving the solution by a rich neighborhood search procedure consisting of the following move types: 1-0 Move, 2-0 Move, 1-1 Exchange, 2-2 Exchange, 2-Opt, 3-Opt, and Or-Opt. It is worth mentioning that because of their complexity 3-Opt and Or-Opt are applied only on single routes as an intra-route operator whereas the other move types are performed within the same route as well as on two different routes. The second stage of CWH+iMCA performs 1-Node and 2-Node iterative marginal cost analysis (iMCA) so as to solve the SMDVRP. The marginal cost of each customer on the basis of routes generated in the first stage is calculated, and subsequently 1-Node and 2-Node iMCA are applied in a row, where single customers and two consecutive customers are either kept or removed based on their marginal costs, respectively. When customers are removed, their demands must be outsourced. This process repeats itself until a negative marginal cost is associated with every customer remaining in the CWH solution. The details of 1-Node and 2-Node iMCA as well as the pseudo code of 1-Node iMCA can be seen in Appendix A of Sadati et al. (2020) .

There are 10 move types that are utilized in the local search step of VND+TSH given as follows: k N i. Routing move types: 1-0 Move, 1-1 Exchange, 1-2 Move, 2-2 Exchange, 2-Opt.

ii. Customer selection move types: 1-Add, 2-Add, 1-Drop, 2-Drop and 1-Swap.

We remark that the routing move types are applied both within the same route and on two different routes. The first four customer selection move types are explained in detail in Sadati et al. (2020) . It suffices to say here that 1-Add inserts a customer not included in a route yet in the best possible position by checking all possible insertion positions. 1-Drop discards a customer currently visited in a route, and connects its predecessor and successor. 2-Add and 2-Drop are the 2-node versions of 1-Add and 1-Drop, respectively. 1-Swap simultaneously adds one customer and drops another one.

From the implementation viewpoint, the move types mentioned above are explored in a cyclic sequential order as proposed by Mladenović and Hansen (1997) starting with 1-0 Move and ending with 1-

Swap. The exploration of all aforementioned neighborhood structures in VND+TSH consumes long computation times. In order to speed the neighborhood exploration, we apply granular neighborhood search proposed by Toth and Vigo (2003) . The main rationale is the exploration of only a fraction of promising moves by discarding a significant number of unpromising ones. An arc is considered in a move (thus admissible) if its cost (distance) is less than a granularity threshold value defined as where ( ) A crucial element of any tabu search heuristic is the definition of the tabu conditions that are utilized in order to forestall cycling. To this end, tabu conditions associated with each neighborhood structure (move type) are introduced. These conditions prevent each move type from being undone in the next iterations where denotes the tabu duration used in the TSH. For example, when 1-Add is applied, the customer that is inserted into a given route cannot be taken off from that route by other move types during the r subsequent iterations. The tabu condition is revoked for each move either at the end of iterations or if it provides a better (smaller) objective value than the incumbent. The latter is called the aspiration criterion.

In our TSH implementation, the tabu duration is set to (Cordeau et al., 1997) . We also 10 7.5 log n       implement a diversification procedure to explore a larger solution space. In this regard, we apply the diversification strategy that we successfully made use of in Sadati et al. (2020) . It was suggested by Gendreau et al. (2008) , which explores parts of the solution space unexplored previously. Let be the 

, then the penalty term is added to .

The VND+TSH algorithm is stopped when a predetermined number of successive non-improving iterations is performed. This number is denoted by . We set on the basis of the

number of customers in the problem. It also depends on the type of the problem at hand, i.e. whether an MDVRP or an SMDVRP is to be solved. Table 2 shows the specifications of for both problem MaxNonImp types. The detailed pseudo code of the proposed TSH is given in ALGORITHM 4 in Appendix B. We remark that the TSH subroutine which serves as a local search procedure of the hybrid metaheuristic VND+TSH proposed in this paper has several similarities with the TSH proposed in Sadati et al. (2020) . These are: (i) Implementation of a granular neighborhood search structure adopted from Toth and Vigo (2003) . (ii) Use of strategic oscillation and the definition of the aspiration criterion therein. (ii) Tabu conditions and the diversification procedure. However, the new TSH differs in the following features: These new features in the proposed TSH alongside its hybridization with VND lead to a significant performance improvement over the previous TSH in Sadati et al. (2020) . The results of our numerical experiments which we present in the next section corroborate the superiority of the new TSH.

In order to assess the performance of VND+TSH on MDVRP and SMDVRP instances, we solved 33 benchmark MDVRP instances and 41 randomly generated SMDVRP instances. Experiments were carried out on a Dell Precision T7810 model computer running on 64-bit Windows 7 with Intel Xeon ® E5-2690 v4 2.60 GHz processor and 32 GB ECC RAM. All algorithms were coded in C# of Microsoft ® Visual Studio 2017. First, we compared the VND+TSH solutions of 33 MDVRP test instances with the best known solutions in the literature (VRP Web, 2020) . We also included in our comparison the results of another tabu search heuristic TSH that was developed earlier in Sadati et al. (2020) to tackle the bilevel version of the rinterdiction SMDVRP which was modeled as an attacker-defender game. The employment of the same computing platform enables us to make a fair comparison between the two methods.

The results of this initial comparison can be seen in Table 3 . The first column of the table gives the instance name; the second and third columns and show the number of customers and the number of depots, respectively; the fourth column Type indicates the type of instance where C stands for capacity constraint and D for maximum tour duration constraint; and the fifth column BKS stands for the best known solution's objective value. These are followed by columns 6, 7 and 8 hosting the Obj. Val. (the objective value returned), the PD (percent deviation from BKS), and the CPU time in seconds for the TSH in Sadati et al. (2020) The average PD of 0.69% on 33 standard MDVRP instances is promising for the new hybrid algorithm VND+TSH, and better than the 1.13% deviation of the TSH in Sadati et al. (2020) . In 9 instances PD obtained by VND+TSH is equal to 0% (cells in bold), while this number is 8 for TSH. In two instances (p08 and pr07) VND+TSH beats the BKS, and attains negative PD values which are bolded as well. Also remarkable is the superiority of VND+TSH over the previous TSH in terms of solution speed. The average solution time of TSH in 33 MDVRP instances is about 535 seconds, whereas VND+TSH consumes 326 seconds in average. Note that bolded figures in the "Obj. Val." columns of Table 3 optimal objective values and the respective zero gaps are indicated in bold. For those instances which cannot be solved to proven optimality by CPLEX, the best objective value and its respective Gap (%) are also indicated in bold. Moreover, the faster of the solution times for each instance among the three solution methods is indicated in italic. Table 4 reveals that CPLEX could solve all 15 instances of pr01 to proven optimality within the allowed time limit. VND+TSH and the former TSH developed in Sadati et al. (2020) attained the optimal solution in five and three of these instances, respectively. The average deviation of VND+TSH from the optimal objective value is measured at 0.33%. This deviation amounts to 1.05% in TSH. When it comes to the solution speed, needless to say, both heuristics outpace CPLEX which spends 385.20 seconds in average until optimality is reached. Among the two, VND+TSH leads TSH with an average CPU time of 50.96 seconds against 54.24 seconds. CPLEX cannot report proven optimality in the remainder of our SMDVRP test bed which comprises 26 instances of pr03. It achieves the best solution in 10 of them. The same KPI for our proposed VND+TSH is 15, whereas that is just one for the TSH. VND+TSH produced an average deviation of 0.09% from CPLEX (UB) running for 103.38 seconds in average. On the other hand, the average deviation of TSH is as high as 2.46%, but its average CPU time requirement is relatively less, measuring 85.92 seconds.

Considering the entire SMDVRP test bed, we observe from Table 4 that VND+TSH beats the former TSH of Sadati et al. (2020) in solution accuracy at the expense of about 10 seconds in the average CPU time. In 21 out of 41 instances VND+TSH yields the same or a better solution than CPLEX. The former TSH can do so in only four instances. Being satisfied with the solution quality and efficiency of VND+TSH in MDVRP and SMDVRP test instances, we decided to employ it in our further experiments. 

In order to evaluate the performance of our comprehensive solution methodology SRE on the 3LRI-SMDVRP we built an initial test bed of 39 instances by taking subsets of customers found in three standard MDVRP instances, namely p01, p02 and pr01. The number of depots in these instances is equal to 4. The parameter is fixed to 1, and the parameter varies between 1 and 3. For each one of p01 and p02, we take 30, 25 and 20 customers into consideration with and . This way we obtain = 1 = 1, 2, 3 2 × (3 × 3)

distinct 3LRI-SMDVRP instances. For pr01, the numbers of customers considered are 48, 45, 40, = 18 35, 30, 25 and 20 where each case produces 3 instances with and . This gives rise to = 1 = 1, 2, 3 (7 × 3)

distinct 3LRI-SMDVRP instances generated from pr01. Hence, we experiment with a test bed of 39 = 21 instances in total. The unit outsourcing cost in each instance is assigned to the values shown in Table 6 . Since the best solutions of these instances are unknown, we compare the VND+TSH results with those obtained by CPLEX within a time limit of 3 hours.

The comparison results are provided in Table 5 where the columns under CPLEX and VND+TSH show IDs of the protected depots, IDs of the interdicted depots, the objective value, CPU time in seconds, and the number of times CPLEX or VND+TSH is called, respectively. The last column indicates the relative gap between CPLEX and VND+TSH. In all CPLEX results, the term "# Calls" is equal to .

( ) CPLEX solves the LLP in all 3LRI-SMDVRP instances to optimality within the allowed time of three hours. This implies that using our proposed smart exhaustive enumeration (SEE) we can obtain the optimal protection and interdiction patterns along with the optimal objective values for all 39 instances. We observe from Table 5 that SRE with VND+TSH can find optimal protection and interdiction patterns in 36 instances. In 17 of these 36 instances it also yields the optimal objective value (bolded figures). SRE with VND+TSH attains an average percent deviation of 0.30% from the optimal objective value found by SEE using CPLEX. The overall solution accuracy and speed of VND+TSH incorporated into the two reduction methods REDUCTION 1 and REDUCTION 2 is promising on all 39 3LRI-SMDVRP instances. Its average CPU time is 70.88 seconds and the average number of calls is 3.38, whilst CPLEX requires 527.84 seconds to converge to the same respective solution with 4.67 calls in average. Our second test bed contains 287 3LRI-SMDVRP instances which are constructed from 33 MDVRP benchmark instances existing in the literature. These 33 instances differ in the number of depots and customers. The parameters of protection and interdiction resources, namely and vary between 1 and 5 with increments of one. The value of the outsourcing cost per unit demand, namely is given in Table 6 for each instance. We tried to assign values in such a way that neither all customers are outsourced (which would be caused by a too low ) nor no customer is outsourced at all (which would be caused by a too high ) in the resulting solution. The results of the two proposed reduction techniques applied to 33 3LRI-SMDVRP instances all with and are given in Table 7 . We report the solution value ( ) and solution time for each

instance. The column shows which reduction technique is used ( : REDUCTION 1, : REDUCTION 2).

The ID columns indicate the IDs of the protected depots and interdicted depots . The performance of the proposed solution methods is summarized in the following four tables in terms of the average CPU time and average counts of VND+TSH calls for different and values. Table 8 and   Table 9 show the average CPU time for each method. We observe that for a given value of ( ) the average CPU time increases as the cardinality of interdiction (protection) resources, namely ( ) increases. Table 10 and Table 11 the average MNPP and average VND+TSH call counts are reported in the first and second columns, respectively, for a given pair of and . The obtained results prove the effectiveness of the methods for solving the 3LRI-SMDVRP with a lesser number of VND+TSH calls instead of solving all possible protection and interdiction patterns exhaustively. In all results, the average number of VND+TSH calls is smaller than the average MNPP. The largest MNPP is found in three instances that have been derived from p21, p22 and p23 where we have 9 and 5. Therefore, MNPP is given by 126. In order to determine a suitable value for the tolerance parameter , we select three 3LRI-SMDVRP  test instances that have been derived from the MDVRP benchmark instances p07 (  4), p03 ( 5) and  pr08 (  6). We solved all possible cases of with both REDUCTION 1 and REDUCTION 2 using + = different values of  0.02, 0.04, 0.06 and 0.08. The results are presented in Table 12, Table 13 and Table    14. For each reduction technique we report the best objective value of the trilevel problem, the number of VND+TSH calls, and CPU time in seconds in the first, second and third columns, respectively. Column "REDUCTION 1" in the tables shows the best solution which is obtained after solving all possible ( ) patterns. The bold values in Table 12, Table 13 and Table 14 

In this paper we introduce a trilevel optimization problem for the protection of the most critical depots in a multi-depot vehicle routing network. The problem is modelled as a defender-attacker-defender game from the perspective of the defender who needs to protect a limited number of depots on an existing network against interdiction by an adversary agent whom we designate as the attacker. The attacker's objective is to inflict the maximum disruption on this network by annihilating a certain number of unprotected depots beyond repair. We refer to this problem as the trilevel r-interdiction selective multi-depot vehicle routing problem (3LRI-SMDVRP). The defender is the decision maker in the upper level problem (ULP) who decides which depots to protect. In the middle level problem (MLP), the attacker chooses r depots to interdict among the unprotected ones. Finally, in the lower level problem (LLP), the decision maker is again the defender who optimizes the vehicle routes and thereby selects which customers are to be visited and which ones to be outsourced to a third-party logistics company. As a solution approach to this trilevel discrete optimization problem, we propose two reduction techniques (REDUCTION 1 and REDUCTION 2) which solve the ULP and MLP through smart exhaustive enumeration. For the LLP we implement a hybrid metaheuristic method combining Variable Neighborhood Descent and Tabu Search Heuristic (VND+TSH) techniques adapted to the selective multi-depot VRP (SMDVRP). Our preliminary numerical experiments validate the performance of VND+TSH on 33 standard MDVRP and 41 randomly generated SMDVRP instances. We then carry out experiments on two separate test beds of 39 and 287 3LRI-SMDVRP instances that were constructed from 33 standard MDVRP instances with different parameters for the outsourcing cost per unit demand, the number of depots to be protected by the defender, and the number of depots to be interdicted by the attacker.

The computational results attest to the effectiveness of the proposed reduction techniques for solving the 3LRI-SMDVRP. We accomplish to solve the trilevel problem with a smaller number of VND+TSH calls compared to the approach of exploring and solving all possible patterns. In our future work, we are planning to incorporate depot location decisions into the ULP of the 3LRI-SMDVRP, and treat it as location-protection-interdiction-routing problem. In this extended problem, the defender has to determine where to open depots and which of them to protect simultaneously. As a solution approach, a matheuristic method could be developed that would combine exact and heuristic techniques to obtain near-optimal solutions within acceptable times. Differentiation among depots with respect to their protection costs under a hard budget constraint could also be a realistic extension worthy of consideration.

( 4 1 ) = attacker's viewpoint. In each interdiction pattern, attacker will select the patterns that give rise to biggest disruption in the network (highest objective value) and the defender's best decision in response is to select the best protection pattern which results in the least possible disruption in the network (minimizing the maximum disruption). Table A1 shows all interdiction patterns for each protection pattern. For example, if the defender protects the first depot, the attacker can interdict one of the remaining depots (second, third, fourth or fifth depot). In this table the "Order" column indicates the order of calling VND+TSH for each interdiction pattern.

REDUCTION 1 starts form solving all possible interdiction patterns of the attacker using CWHiMCA when the first depot is protected. According to the solution of CWHiMCA interdicting 3 rd depot has the highest objective value (701.44) and this interdiction pattern is solved one more time with VND+TSH (improved to 686.70). Now this improved solution is compared against the second highest CWHiMCA solution which has an objective value of 695.42 and suggested the interdiction of the 5 th depot. Because it is less, all interdiction patterns whose CWHiMCA objective value is greater than 686.70 will be solved once again with VND+TSH for the purpose of finding a better interdiction pattern. Note that each ULP and MLP solution pair produced during the iterations of REDUCTION 1 along with the associated LLP objective value coming from VND+TSH will be stored in a hash table for future access. After solving the relevant interdiction patterns with VND+TSH, the best interdiction pattern will consist of the 3 rd depot with an objective value of 686.70.

This process is continued for each protection pattern. The best interdiction patterns are presented in the last column of Table A1 . The final solution consists of protecting the 3 rd depot and interdicting the 2 nd depot with a minimum objective value of 667.18. VND+TSH is called throughout the entire REDUCTION 1 procedure 4 times in total, which is less than = = 5. The best CWHiMCA and VND+TSH solution ( ) ( 5 1 )

for each interdiction pattern in Table A1 has been shown in italic and bolded italic, respectively. Fig. A1 presents a visualization of the final solution where customers are shown by circles and depots by squares. The protected and interdicted depots and the outsourced customers are designated in green, red and black color, respectively. As an example for REDUCTION 2 (CASE II), consider again the test instance p03 with , = 3 = 2 ( + and with the tolerance parameter . better (lower) objective value than 663.73 using VND+TSH. Therefore, the final solution consist of protecting depots 2, 3, and 4 and interdicting depots 1 and 5 with an objective value of 663.73. In order to find this solution VND+TSH is called only one time which is far less than 10. Fig. A2 presents ( ) = ( 5 2 ) = a visualization of the final solution for this example. Note that the validation of REDUCTION 2 for this instance has been provided in Table 13 . 

We execute TSH as a local search procedure embedded in our hybrid algorithm VND+TSH which solves the LLP of the 3LRI-SMDVRP heuristically. It is called in Step 13 of the VND+TSH (see ALGORITHM 3). Given the current solution and the subset of admissible arcs associated with the current S AdmissArc move type of the main VND loop, TSH returns a local optimal solution. ALGORITHM 4 presents the k pseudo code of TSH. 

A tri-level rinterdiction median model for a facility location problem under imminent attack

A bi-level programming model for protection of hierarchical facilities under imminent attacks

A bilevel partial interdiction problem with capacitated facilities and demand outsourcing

A bilevel fixed charge location model for facilities under imminent attack

A bilevel p-median model for the planning and protection of critical facilities

The budget constrained r-interdiction median problem with capacity expansion

An empirical investigation of four well-known polynomial-size VRP formulations

Selective multi-depot vehicle routing problem with pricing

On the solution of the bilevel programming formulation of the terrorist threat problem

Analyzing the vulnerability of critical infrastructure to attack and planning defenses

An arc interdiction vehicle routing problem with information asymmetry

Identifying critical infrastructure: the median and covering facility interdiction problems

Protecting critical assets: the r-interdiction median problem with fortification

Scheduling of vehicles from a central depot to a number of delivery points

A tabu search heuristic for periodic and multi-depot vehicle routing problems

The travelling salesman problem and related problems

A Tabu search heuristic for the vehicle routing problem with two-dimensional loading constraints

Future paths for integer programming and links to artificial intelligence

A result on projection for the vehicle routing problem

Shortest-path network interdiction

A bi-level network interdiction model for solving the hazmat routing problem

An improved benders decomposition algorithm for an arc interdiction vehicle routing problem

Complexity of vehicle routing and scheduling problems

Solving the traveling salesman problem with interdiction and fortification

A tri-level covering fortification model for facility protection against disturbance in r-interdiction median problem

Optimal interdiction of a supply network

Variable neighborhood search

The mixed integer linear bilevel programming problem

Analysis of electric grid security under terrorist threat

Tri-level optimization models to defend critical infrastructure

The r-interdiction selective multi-depot vehicle routing problem. Int T

A bilevel mixed-integer program for critical infrastructure protection planning

Protecting supply systems to mitigate potential disaster: a model to fortify capacitated facilities

An adaptive variable neighborhood search algorithm for a vehicle routing problem arising in small package shipping

The prize-collecting vehicle routing problem with single and multiple depots and non-linear cost

The granular tabu search and its application to the vehicle-routing problem

The Theory of The Market Economy

Capacitated VRP Instances

Removing arcs from a network

Trilevel optimization in power network defense

The authors would like to thank two anonymous reviewers for their constructive and instrumental feedback on the original draft of this paper. Their suggestions and comments helped us to amend the paper and improve its clarity by a great deal.

defender's perspective and for each protection pattern there exist 4 interdiction patterns from