key: cord-0117628-uml524k9
authors: Oliveira, George S.; Roning, Juha; Plentz, Patricia D. M.; Carvalho, Jonata T.
title: Efficient Task Allocation in Smart Warehouses with Multi-delivery Stations and Heterogeneous Robots
date: 2022-02-28
journal: nan
DOI: nan
sha: fa98739e67e63e8a1cf253c560dd8ec3834b9a20
doc_id: 117628
cord_uid: uml524k9

The task allocation problem in multi-robot systems (MRTA) is an NP-hard problem whose viable solutions are usually found by heuristic algorithms. Given the growing need to improve logistics, the use of robots to increase the efficiency of logistics warehouses is becoming a requirement. In a smart warehouse, the main tasks consist of employing a fleet of automated picking and mobile robots that coordinate to pick up the items of a set of orders from the shelves and drop them at the delivery stations. Two aspects generally justify multi-robot task allocation complexity: (i) environmental aspects, such as multi-delivery stations and dispersed robots (since they remain in constant motion), and (ii) fleet heterogeneity, where robots' traffic speeds and load capacities differ from each other. Although these properties have been widely researched in the literature, they are usually investigated separately. Also, many algorithms are not scalable for problems with thousands of tasks and hundreds of robots. This work proposes a scalable and efficient task allocation algorithm for smart warehouses with multi-delivery stations and heterogeneous fleets. Our strategy employs a novel cost estimator, which computes costs as a function of the robots' variable characteristics and capacity while they receive new tasks. To validate the strategy, we perform a series of experiments simulating the operation of smart warehouses with multiple delivery stations and heterogeneous fleets. The results show that our strategy generates routes costing up to 33% less than the routes generated by a state-of-the-art task allocation algorithm and is 96% faster in test instances representing our target scenario. Considering single-delivery stations and non-dispersed robots, we reduced the number of robots by up to 18%, allocating tasks 92% faster, and generating routes whose costs are statistically similar to the routes generated by the state-of-the-art algorithm.

Keywords: Mobile robots, Multi-Robot Systems, MRTA, Task Allocation, Smart Warehouse

1 Introduction

E-commerce is a type of commercial transaction carried out through online platforms where consumers purchase products and services. This type of transaction keeps growing and has become more and more critical over the years.

In 2020, online sales volume increased significantly due to the SARS-CoV-2 pandemic, forcing the logistics and transportation sectors to improve their product delivery methods.

In this process of buying and selling products online, a key component is the logistics warehouse. Online orders are automatically sent to these warehouses, which are responsible for grouping products, packing, and shipping orders. Due to the ever-increasing volume of online transactions, manually operating logistics warehouses becomes infeasible, which requires the development and use of automation technologies for warehouses. Multi-robot systems (MRS) offer several advantages when compared to single-robot systems (SRS), such as solving complex tasks, increasing overall performance, and improving system reliability [8].

Within the research on MRS, multi-robot task allocation (MRTA) is one of the most studied problems. The MRTA aims to ensure that the total cost of performing all tasks is minimized while minimizing the number of robots [9]. Another approach is to meet different constraints and complete one or more general objectives, which usually include performing all tasks [10] and minimizing the makespan [11, 12], traveled distance [9], or energy consumption [13, 14].

An efficient solution needs to consider constraints that can impact execution costs or increase problem complexity. Such constraints may include relocatable tasks [15, 16], deadlines [14, 12, 17, 18], environmental uncertainties [19, 17], and different robot properties such as load capacity [9, 12], traffic speed [11, 16], and energy consumption [13, 14, 12]. In smart warehouses, constraints also include the dispersion of robots, location and demand of the tasks, number of delivery stations, warehouse layout, and other environmental features such as obstacles, roadblocks, and other unpredictable situations.

The main tasks performed in smart warehouses consist of collecting the goods contained in a set of orders from the shelves and dropping them off at the delivery stations. The main objective is to fulfill all orders while upholding the constraints and minimizing the objective function's cost. The objective functions include, for example, (1) reducing the time to fulfill orders, (2) reducing the robots' energy consumption, and (3) reducing the number of robots used to perform all tasks.

Regarding smart warehouses, there is a trend toward increasing use of picking robots for automated gathering of goods [20, 21]. Furthermore, according to Parker [22], the scope of solvable tasks of MRS increases with the use of heterogeneous robots, allowing parallelism and robustness, which leads to better performance with complex and straightforward robot teams. Therefore, considering smart warehouse scenarios, heterogeneous fleets of robots are useful. This type of fleet offers economic benefits, as it becomes cheaper to distribute varied tasks among robots with different capabilities and costs than among many costly robots with the same properties [22].

By investigating state-of-the-art works from the perspective of constraints in smart warehouses, we observed a gap in exploring the possibility of using heterogeneous robot fleets in smart warehouses. For example, we found works that address the MRTA where robots have different load capacities from each other [19], but as far as we could find out, there are no works that explore robots with different load capacities and traffic speeds at the same time. Notably, such differences are common even between robots from the same manufacturer.

In addition to these constraints, environmental aspects such as the warehouse layout, the initial (standby) positions of the robots, and the number and positions of delivery stations directly influence warehouse performance. In general, warehouses have multiple product delivery stations and, due to the high demand for orders, robots are dispersed across the map and in constant motion. For the sake of optimization in logistics, we believe that robots should remain strategically dispersed, even when in standby, to optimize the picking and delivery of goods. Although works about MRTA consider such factors individually, we observe a lack of studies that address all the constraints mentioned above.

Therefore, assuming that heterogeneous fleets will be constant in smart warehouses with automated pickup, there is a need to develop task allocation strategies that consider multiple constraints and different objective function components, such as time, traveled distance, and energy consumption. We hypothesize that formulating the problem and the objective function, taking into account a cost estimator measured as a function of all such constraints, allows us to generate better cost estimates and, consequently, more efficient task assignments. We briefly present the problem below.

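In a seminal work developed by Gerkey and Matarić [23], an MRTA problem is classified according to a taxonomy with three axes:

• Single-task robots (ST) versus multi-task robots (MT): whether each robot executes at most one task at a time or can execute several tasks simultaneously;

• Single-robot tasks (SR) versus multi-robot tasks (MR): whether each task requires exactly one robot or may require several robots;

• Instantaneous assignment (IA) versus time-extended assignment (TA): whether tasks are allocated using only the currently available information, with no planning for future allocations, or using information that allows future allocations to be planned.

Overall, an MRTA problem of any kind is computationally intractable for large-scale applications [23, 24]. For this reason, most solution methods are approximate, i.e., they do not guarantee that solutions will have the minimum overall cost, but they are fast and provide solutions that are acceptable in practice. In this context, there are works that employ auction-based algorithms [25], heuristics [9, 12, 18], meta-heuristics [26, 27, 14], and hybrid algorithms [19, 13, 11], for example.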

In this work, we propose a mechanism for a Robotic Mobile Fulfillment System that considers many constraints. Robots can receive multiple tasks simultaneously; each task only needs one robot to be finished, and there is no need to reallocate tasks. The warehouse employs automated pickup, has several delivery stations, and keeps robots dispersed to promote optimization.

Robots can visit delivery stations as often as necessary to fulfill all tasks. In the taxonomy of Gerkey and Matarić [23], the proposed solution seeks to solve an MT-SR-IA problem.

Works on smart warehouses sometimes treat MT-SR-IA as one of the variations of the classic vehicle routing problem (VRP) [28]. VRP was introduced by Dantzig and Ramser [29] as a generalization of the traveling salesman problem (TSP) to deal with the problem of transporting items by multiple vehicles. Such a problem is well known in operations research, where one or more depots supply a group of customers with known demand and location. The objective is to determine one or more routes through which a fleet of vehicles must depart to serve customers within a minimum distance.

Over the years, the VRP has had several variations, with different objectives and constraints determined by the properties of the problem. For example, in the capacitated vehicle routing problem (CVRP), vehicles have limited load capacity, and the objective is to determine routes so that vehicles serve customers without the sum of demands exceeding their load capacity [9, 17, 30]. In the multi-depot vehicle routing problem (MDVRP), in addition to the previous features, vehicles depart from different depots [31]. The heterogeneous fleet vehicle routing problem (HFVRP) is similar to CVRP but considers a heterogeneous fleet of vehicles, generally differing in load capacity and individual cost [32]. Vehicles start and end in the same depots in all of these variations.

In this work, we approach MT-SR-IA from the perspective of the vehicle routing problem with heterogeneous fleet, many depots, and dispersed vehicles (HFMDVRP-DV). In other words, the robot fleet (vehicles) is heterogeneous and dispersed on the map, and our scenario has several stations for delivering products (multi-depot). Many other works approached VRP to address the multi-robot task allocation in warehouses [19, 9, 17, 18, 30, 33] and other indoor environments [34, 35]. However, to the best of our knowledge, this is the first work that addresses the HFMDVRP-DV and applies such a VRP variation indoors. The assumptions for this work are:

• Robots are initially dispersed on the map.

• Robots must collect goods with different demands (weight) spread across the warehouse;

• Robots collect goods according to their load capacities;

• There are several delivery stations where robots have to deliver goods. Delivery stations are visited when (i) robots need to restore their load capacity, or (ii) after collecting the last good, to ensure delivery of all of them;

• The robot fleet is heterogeneous, i.e., the robots are different in their load capacity and traffic speed;

• Robots deliver products to any delivery station;

• Robots visit delivery stations as many times as necessary to ensure that the capacity constraint is not violated.

In this work, we present a space decomposition-based heuristic to solve the HFMDVRP-DV called Domain zoNe based Capacity and Priority constrained Task Allocator (DoNe-CPTA). DoNe-CPTA employs a cost estimator that takes into account differences in load capacity to determine the probability of robots visiting a delivery station. Our proposal manages to postpone visits to stations, generating lower-cost routes and increasing the robots' efficiency.

DoNe-CPTA uses our cost estimator with an adaptation of the Voronoi diagram [36] to create the least costly task groups for each robot, namely domain zones. We compared DoNe-CPTA with an adaptation of a state-of-the-art algorithm for smart warehouses called nCAR [9], which was adjusted to meet the properties of the HFMDVRP-DV. Well-known benchmark datasets used for CVRP problems [37] were used in their classical and adapted forms. Results showed that DoNe-CPTA is efficient in execution time and execution cost, while minimizing the number of robots used. Thus, our main contributions are as follows:

1. Presenting the mathematical formulation of the HFMDVRP-DV;

2. Presenting a cost estimator that includes multiple factors and constraints;

3. Developing an efficient task allocation algorithm for building efficient routes quickly while minimizing the number of robots used;

4. Since no other work has covered HFMDVRP-DV before, we have developed and introduced new datasets adapted from well-known instances of operations research datasets [37]. We use real robot specifications so that our datasets are close to real-world scenarios;

5. Presenting results that indicate that DoNe-CPTA surpasses state-of-the-art solutions in terms of cost, robots, and execution time.

Results show that DoNe-CPTA can generate routes costing up to 33% less using up to 18% fewer robots than the state-of-the-art algorithm.

This article is organized as follows: In Section 2, we present the works related to smart warehouses and MRTA from the perspective of the previously presented constraints. Section 3 presents the mathematical formulation of HFMDVRP-DV. Section 4 presents our cost estimator model. Section 5 presents the design of our decomposition-based heuristic DoNe-CPTA. We present the HFMDVRP-DV datasets in Section 6. Section 7 presents the experimental setup, the description of the state-of-the-art algorithm, and the validation of execution cost, time, and robot usage. Finally, Section 8 brings the conclusions we draw based on the results obtained.

This section presents the related works from the perspective of all constraints that we enumerated in Section 1.3. Several works take into account the load capacity [13, 12, 19, 9, 17, 30] and some others deal with the traffic speed [11, 12, 16, 17]. These constraints are approached simultaneously in [12] and [17]. Pinkam et al. [13], Dou et al. [11], Li et al. [14], Shi et al. [16], and Kloetzer et al. [30] tackle the MRTA with robots starting dispersed on the map. Pinkam et al. [13], Dou et al. [11], Xue et al. [12], and Kloetzer et al. [30] address scenarios with multiple delivery stations. Table 1 summarizes these works. We point out the particularities of each of these works below.

We noticed that most of the works deal with robots' load capacity either in isolation or along with another constraint. Five works cover three constraints, and just one covers four constraints simultaneously. Xue et al. [12] and Claes et al. [19] study robots that differ either in load capacity or traffic speed, respectively. Also, although the theoretical approach of Edelkamp and Lee [17] assumes heterogeneous robots, experiments only employ homogeneous robots.

In Kloetzer et al. [30] , robots must deliver products to respective stations, but the warehouse contains more than one delivery station. We developed this work to deal with such constraints simultaneously, even though the results show that our strategy can also meet configurations without all these characteristics.

The common thread is that all the works described in this section approach smart warehouses. In Table 1, the last six works (including ours) deal with the MRTA as a reduction to one variation of VRP. We detail these works below in two groups: (i) MRTA strategies focused exclusively on smart warehouses, and (ii) MRTA strategies addressed as VRP problems (and their variations) that share characteristics of smart warehouse applications.

Pinkam et al. [13] assume a warehouse where each picking station receives an order and waits for robots to bring the corresponding goods. The proposed method assigns the task to the robot with the lowest estimated cost. Finally, robots compute their routes with the Floyd-Warshall algorithm [43, 44, 45]. Experiments evaluated the efficiency of the density prediction in a simulated warehouse with 100 shelves and up to 100 robots. New tasks appear every 2 seconds up to a total of 200 tasks. Results showed that the average time to complete all tasks is shorter when density prediction is active. The results also showed that increasing the number of robots increases the time to assign tasks but decreases the time for robots to fulfill them.

In the context of vehicle routing (VRP), Claes et al. [19] considers the probabil- show that the MCTS got the best rewards in all setups.

The article [9] presents nCAR (Nearest-neighbor based Clustering and Routing), a heuristic for task allocation in smart warehouses for solving the CVRP. First, nCAR creates feasible routes through a greedy search so that the tasks' demand does not exceed the vehicles' capacity. After that, each feasible route is transformed into an effective route using Christofides' algorithm [48]. Experiments use two reference datasets for VRP problems: P [49] and X [37]. First, the proposed solution was compared with the state-of-the-art OR-Tools [50] and with the optimal costs of the first dataset, showing that the method is able to generate sub-optimal routes quickly. After that, when comparing nCAR only with OR-Tools on instances of dataset X, nCAR proved to be more efficient than the state-of-the-art algorithm, generating routes with lower cost, in less time, and with fewer robots.

Edelkamp and Lee [17] introduced and studied the physical vehicle routing 

Although these works also focus on legitimate aspects of real environments, such as uncertainties, time, and energy consumption, we tried to synthesize them from the perspective of other constraints, such as the heterogeneous fleet, the robots' dispersion, and multi-delivery stations. We noticed that only [13], [12], [19], [9], [17], and [30] take into account the load capacity. Although some properties of a heterogeneous fleet are present in [12] and [19], the respective works did not take into account robots with different capacities and speeds simultaneously. We also noticed that some works [19, 17, 30] are not scalable for problems with thousands of tasks, making them infeasible for real warehouse scenarios, and that other works [13, 12, 16] have not been validated by comparison to state-of-the-art algorithms.

This section presents the formulation of HFMDTA-DR for smart warehouses.

The warehouse is mapped onto a Cartesian plane as shown in Figure 1 . There are two types of stations: the picking station, containing shelved goods and aisles where robots can travel; and delivery stations, containing delivery points for goods. Every good is a potential task, which means that any goods lying around at the picking station can be picked up and transported to one of the delivery stations. All tasks have a non-zero demand, i.e. weight, that must be satisfied by robots.

The robot fleet is heterogeneous, with each robot having a different traffic speed and load capacity. Robots are initially dispersed throughout the warehouse, and all of them can pick any product, i.e., can perform any task, as long as the sum of the tasks' demand does not exceed their load capacity. There are tasks that are more demanding than the load capacity of some robots, but there will be at least one robot with enough capacity to perform the most demanding task at a given time. Each edge $e$ in $E$ has a cost $k_e$ for any robot $r_i \in R$ to travel through $e$.

Each vertex $v_i \in V_T$ has a demand $d_i$ that must be satisfied. Let $C$ be the load capacity of all robots in $R$. The demand of any vertex cannot exceed the maximum capacity of the robots, i.e.:

$$d_i \le C, \quad \forall v_i \in V_T$$

The objective of MDVRP is to find a set of cycles with disjoint vertices for every $r_i$, each robot starting at $u_0^i$. Each robot can have zero to $t$ cycles associated with it, as long as the cost of all cycles is minimized and the total demand on the vertices in each cycle does not exceed the robots' capacity.

Each cycle is a route:

whose cost $K_i$ is given by:

where $u \to v$ is the edge in $E$ connecting the vertices $u$ and $v$, and $k_{u \to v}$ is the cost of the edge $u \to v$.
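As a minimal sketch of the cost definition above, the snippet below sums the edge costs $k_{u \to v}$ along one cycle and checks the capacity condition. The function names `route_cost` and `route_demand`, the dictionary-based edge and demand tables, and the example values are illustrative assumptions and are not part of the original formulation.

```python
from typing import Dict, Hashable, Sequence, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def route_cost(route: Sequence[Vertex], edge_cost: Dict[Edge, float]) -> float:
    """Sum the cost of every consecutive edge u -> v along the route."""
    return sum(edge_cost[(u, v)] for u, v in zip(route, route[1:]))


def route_demand(route: Sequence[Vertex], demand: Dict[Vertex, float]) -> float:
    """Total demand on the route (vertices without a demand count as 0)."""
    return sum(demand.get(v, 0.0) for v in route)


# Hypothetical cycle: start at station "h", serve tasks "a" and "b", return.
edge_cost = {("h", "a"): 2.0, ("a", "b"): 1.5, ("b", "h"): 3.0}
demand = {"a": 4.0, "b": 5.0}
cycle = ["h", "a", "b", "h"]
capacity = 10.0

total_cost = route_cost(cycle, edge_cost)
is_within_capacity = route_demand(cycle, demand) <= capacity
```

The overall objective would then be to choose the cycles so that the sum of `route_cost` over all robots is as small as possible while `is_within_capacity` holds for every cycle.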

CVRP is an instance of the MDVRP with only one delivery station, i.e., $H = \{h_1\}$ and $V_H = \{v_0\}$. All robots start from $v_0$ and must return to $v_0$ at the end of the route.

In Heterogeneous Fleet and Multi-Delivery Task Allocation (HFMDTA), the MDTA definitions need to be adjusted to include the individual properties of each robot. In the case of this work, such property refers to traffic speed. Thus, it is necessary to define a graph $G_i$ separately for each robot $r_i$. The cost of each edge $e \in E_i$ is defined as a function of one or more properties of $r_i$.

Furthermore, the robots' load capacities and traffic speed are distinct, as the fleet is heterogeneous. Let C be the smallest load capacity among all robots. Then, the demand of any vertex cannot exceed the smallest capacity

The objective of HFMDTA is to find a set of cycles with disjoint vertices for each robot $r_i$, with each robot starting at its respective $u_0^i$. Each robot $r_i$ can have zero to $t$ cycles associated with it, as long as the cost of all cycles is minimized and the total demand of the vertices in each cycle does not exceed $C_i$. Each cycle is a route:

whose cost $K_i$ is given by:

where:

1. $A_i^j$ is the $j$-th route of $r_i$;

2. $u \to v$ is the edge in $E_i$ between the vertices $u$ and $v$;

Mathematically, the HFMDTA's ideal solution is to find $t$ routes

Unlike previous variations, in HFMDVRP-DV the robots' starting point is not necessarily the location of any delivery station but any location in the map. so that $r_i$ can start from its starting point $v_0^i$, visit any picking task, and go back to any delivery station in $V_H$ to perform a delivery task. There are no edges between $v_0$ and any vertex in $V_H$.

Each edge $e \in E_i$ is costed as a function of the traffic speed of $r_i$.

As robots start from varying positions, the objective of the HFMDVRP-DV is to find a set of paths (not cycles) for each $r_i$, with each robot starting at its respective $v_0^i$. The last vertex of each path is a vertex in $V_H$. Mathematically, each path is a route:

whose cost $K_i$ is given by:

Therefore, the ideal solution of the HFMDVRP-DV is to find $t$ routes

In the previous section, we presented the HFMDTA-DR formulation without explaining how the costs are estimated. This section introduces a cost estimator that considers the distance between robots and tasks, traffic speed, and robots' maximum and variable load capacity. Costs are first defined by the distance between robot r and task t and the traffic speed of r. So, when comparing the variable load capacity of r and the demand of t, if r needs to visit a delivery station before or after executing t, then the previously estimated cost is incremented with the cost of performing a delivery task. The basic premise is to penalize robots with low variable load capacity, giving way to other robots and thus postponing visits to delivery stations as much as possible.

Let $\Gamma_i$ be the maximum capacity of the robot $r_i$ and $\gamma_i^j$ the capacity of $r_i$ after executing the picking task $t_j$. We say that a picking task $t_k$ with demand $d_k$ is absolutely feasible to $r_i$ if $d_k \le \Gamma_i$, and conditionally feasible to $r_i$ after $t_j$ if $d_k \le \gamma_i^j$. In other words, robots are absolutely able to perform all picking tasks whose demand is less than or equal to their maximum load capacity, and conditionally able to perform any picking task whose demand is less than or equal to their current load capacity. By default, all robots are absolutely able to perform any delivery task, but not all robots are able to perform all picking tasks. Let $f_{fact}(\zeta_k, r_i)$ be a Boolean function that determines whether any task $\zeta \in T \cup H$ is feasible for $r_i$ after $r_i$ executes $\zeta_{k-1}$, so mathematically:

Let $\theta_i^*$ be the set of all absolutely feasible picking tasks to $r_i$, $\theta_i^{t_j}$ the set of all conditionally feasible picking tasks to $r_i$ after $r_i$ executes $t_j$, and $\tau_i = (\tau_{i:1}, \ldots, \tau_{i:q})$ an ordered sequence of $q$ picking tasks assigned to $r_i$. So, a feasible sequence is every sequence $\tau_k$ such that:

In other words, $\tau_k$ is a feasible sequence if (i) the first task is absolutely feasible to $r_k$; (ii) the other tasks are conditionally feasible to $r_k$ after $r_k$ executes the previous task; and (iii) the sum of all tasks' demand in $\tau_k$ does not exceed the maximum capacity of $r_k$.
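The three conditions can be checked directly, as in the minimal sketch below. It assumes that the robot starts the sequence with its full load capacity (for instance, right after a delivery task) and that each picking task reduces the current capacity by its demand. The function name `is_feasible_sequence` and its signature are hypothetical.

```python
from typing import Sequence


def is_feasible_sequence(demands: Sequence[float], max_capacity: float) -> bool:
    """Check conditions (i)-(iii) for an ordered sequence of picking-task demands."""
    if not demands:
        return True  # assumption: an empty sequence is trivially feasible
    # (i) The first task must be absolutely feasible.
    if demands[0] > max_capacity:
        return False
    remaining = max_capacity - demands[0]
    # (ii) Each later task must fit the capacity left after the previous task.
    for d in demands[1:]:
        if d > remaining:
            return False
        remaining -= d
    # (iii) The total demand must not exceed the maximum capacity.
    return sum(demands) <= max_capacity
```

For example, with a maximum capacity of 10, the demands `[3, 4, 2]` form a feasible sequence, whereas `[3, 4, 4]` does not, because the last task no longer fits the remaining capacity of 3.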

Such a definition of a feasible sequence implies that robots must restore the maximum load capacity between two sequences, which only occurs after a delivery task. As our goal is to reduce costs, we determine that every delivery task after a feasible sequence $\tau$ must be the one whose delivery station is closest to the last task of $\tau$. Now, let $A_i = (\alpha_{i:1}, \ldots, \alpha_{i:a})$ be a route containing picking and delivery tasks assigned to $r_i$. $A_i$ is a feasible route only if:

where:

1. $\tau_i$ is a feasible sequence that makes up $A_i$; and

2. $\{\beta\}$ is a unitary set whose element $\beta$ is the delivery task that represents the closest delivery station to $\tau_{i:q}$ (the last task of $\tau_i$).

Note that $A_i$ is one of the $t$ routes of $r_i$ that make up the ideal solution of HFMDTA-DR (Section 3.3). Each task $\alpha_{i:k} \in A_i$ comprises traversing the edge connecting the corresponding vertices in the graph $G_i$. The first task comprises traversing the edge that connects the vertex $\alpha_{i:1}$ (or $v_0^i$) with the vertex that corresponds to the first task. Thus, the cost $k_i^{v \to u}$ of traveling from the vertex $v$ to the vertex $u$ is equal to the weight of the edge $(v, u) \in E_i$, i.e.:

where $\upsilon_i$ is the traffic speed of $r_i$.

Let $\Delta_v^u$ be the Manhattan distance between $v$ and $u$; then the cost $k_i^{v \to u}$ of $r_i$ to travel from $v$ to $u$ is given by Equation 11.

Equation 11 depends on five different cases:

1. Case 1: the vertex $u$ corresponds to a picking task ($u \in T$), but the maximum load capacity of $r_i$ is insufficient to execute $u$ ($u \notin \theta_i^*$), so $cost(v, u, \upsilon_i) = \infty$;

2. Case 2:

(a) the vertex $u$ corresponds to a delivery task ($u \in H$); or

(b) the vertex $u$ corresponds to a picking task ($u \in T$), the maximum and current load capacities of $r_i$ are sufficient to execute $u$ ($u \in \theta_i^*$ and $f_{fact}(u, r_i) = \text{true}$), and $r_i$ does not need to visit a delivery station after executing $u$, as there will be at least one feasible task besides $u$ ($\exists c \in T \mid c \neq u, f_{fact}(c, r_i) = \text{true}$);

3. Case 3: the vertex $u$ corresponds to a picking task ($u \in T$), the maximum and current capacities of $r_i$ are sufficient to execute $u$ ($u \in \theta_i^*$ and $f_{fact}(u, r_i) = \text{true}$), and $r_i$ needs to visit a delivery station after executing $u$, as no other task besides $u$ will be feasible to $r_i$ ($\forall c \in T \mid c \neq u, f_{fact}(c, r_i) = \text{false}$);

4. Case 4: the vertex $u$ corresponds to a picking task ($u \in T$), the current capacity of $r_i$ is insufficient to execute $u$ ($f_{fact}(u, r_i) = \text{false}$), even though the maximum capacity is enough to perform $u$ ($u \in \theta_i^*$). Also, $r_i$ does not need to visit a delivery station after executing $u$, as there will be at least one feasible task besides $u$ ($\exists c \in T \mid c \neq u, f_{fact}(c, r_i) = \text{true}$);

5. Case 5: the vertex $u$ corresponds to a picking task ($u \in T$), the current capacity of $r_i$ is insufficient to execute $u$ ($f_{fact}(u, r_i) = \text{false}$), even though the maximum capacity is enough to perform $u$ ($u \in \theta_i^*$). Also, $r_i$ needs to visit a delivery station after executing $u$, as no other task besides $u$ will be feasible to $r_i$ ($\forall c \in T \mid c \neq u, f_{fact}(c, r_i) = \text{false}$).

Generally, cost estimators conceived in other works only take into account the cost between the current location and the target location, without considering the possibility that the robots will not have enough load capacity to perform the target task at any given time. Our cost estimator contrasts with the more common models because it was designed to penalize visits to delivery stations and thus optimize the robots' usage and reduce costs.
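The case analysis can be illustrated with the following sketch for picking tasks. Several points are assumptions rather than the authors' exact formula: the travel cost is taken as Manhattan distance divided by traffic speed, a visit to a delivery station is charged as the travel cost to the nearest station, and feasibility of the other tasks after executing $u$ is judged against the capacity left after $u$. The names `Robot`, `travel_cost`, `nearest_station`, and `estimate_cost` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class Robot:
    pos: Point
    speed: float         # traffic speed
    max_capacity: float  # maximum load capacity
    capacity: float      # current (variable) load capacity


def manhattan(a: Point, b: Point) -> float:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def travel_cost(a: Point, b: Point, speed: float) -> float:
    """Assumed edge cost: Manhattan distance divided by traffic speed."""
    return manhattan(a, b) / speed


def nearest_station(p: Point, stations: Sequence[Point]) -> Point:
    return min(stations, key=lambda s: manhattan(p, s))


def estimate_cost(robot: Robot, task_pos: Point, demand: float,
                  other_demands: Sequence[float],
                  stations: Sequence[Point]) -> float:
    """Estimated cost for `robot` to execute one picking task."""
    if demand > robot.max_capacity:
        return math.inf  # Case 1: the task is not absolutely feasible
    cost, start, capacity = 0.0, robot.pos, robot.capacity
    if demand > capacity:
        # Cases 4 and 5: deliver first to restore the load capacity.
        station = nearest_station(start, stations)
        cost += travel_cost(start, station, robot.speed)
        start, capacity = station, robot.max_capacity
    cost += travel_cost(start, task_pos, robot.speed)
    remaining = capacity - demand
    if not any(d <= remaining for d in other_demands):
        # Cases 3 and 5: no other task fits afterwards, so add a delivery.
        station = nearest_station(task_pos, stations)
        cost += travel_cost(task_pos, station, robot.speed)
    return cost
```

Under these assumptions, a robot whose remaining capacity is low receives a higher estimate for the same task than a robot with spare capacity, which is the penalization effect described above.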

DoNe-CPTA is a task-allocation heuristic that applies the proposed cost estimator to create action groups, optimizing the allocation of tasks to a team of heterogeneous robots, and reducing visits to delivery stations. Section 5 presents the details of this novel heuristic to solve HFMDTA-DR in smart warehouses.

DoNe-CPTA is a task allocation algorithm guided by an adaptation of the Voronoi Tessellation [36], whose definition is as follows:

Concerning DoNe-CPTA, sites and tessellation space characterize robots and tasks, respectively. A robot cell corresponds to tasks that a given robot can perform at a lower cost than any other robot. Our cost estimator (Equation

    if $\kappa_{current} = \kappa_{best}$ then
    else if $\kappa_{current} < \kappa_{best}$ then
        $\kappa_{best} \Leftarrow \kappa_{current}$

$\Psi$ and $\Phi$ are valid sets as long as robots' position and capacity remain the same as when both sets were computed. However, as DoNe-CPTA iteratively adjusts robots' position and capacity, $\Phi$ becomes invalid when robots receive new tasks. Likewise, all $\Psi$ sets originating from $\Phi$ also become invalid. In this scenario, our strategy requires that robot domains are constantly updated.
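The idea of a robot cell can be sketched as follows: each task is placed in the domain of the robot that can perform it at the lowest estimated cost. This is an illustration only. The function name `compute_domains`, the rule of keeping the first robot on ties, and the simple distance-over-speed cost in the example are assumptions and do not reproduce the authors' algorithm.

```python
import math
from typing import Callable, Dict, List, Sequence


def compute_domains(robots: Sequence[str], tasks: Sequence[str],
                    cost: Callable[[str, str], float]) -> Dict[str, List[str]]:
    """Group tasks by the robot with the lowest estimated cost for each task."""
    domains: Dict[str, List[str]] = {r: [] for r in robots}
    for t in tasks:
        best_robot, best_cost = None, math.inf
        for r in robots:
            current_cost = cost(r, t)
            if current_cost < best_cost:  # assumption: ties keep the first robot
                best_robot, best_cost = r, current_cost
        if best_robot is not None:  # tasks infeasible for every robot stay out
            domains[best_robot].append(t)
    return domains


# Hypothetical data: robot id -> (position, speed), task id -> position.
robot_info = {"r1": ((0, 0), 1.0), "r2": ((10, 0), 2.0)}
task_pos = {"t1": (2, 0), "t2": (8, 0), "t3": (5, 0)}


def simple_cost(r: str, t: str) -> float:
    (rx, ry), speed = robot_info[r]
    tx, ty = task_pos[t]
    return (abs(rx - tx) + abs(ry - ty)) / speed


domains = compute_domains(list(robot_info), list(task_pos), simple_cost)
```

In this example the faster robot `r2` obtains the task `t3` even though both robots are equally distant from it, which mirrors how a speed-aware cost shapes the cells. Because the costs depend on the robots' positions and capacities, the domains would have to be recomputed whenever those change.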

Therefore, we propose a simulation-based approach to determine domain recomputation periods and prevent DoNe-CPTA from assigning lower-cost tasks from an invalid domain. Such a simulation employs an arrival queue (AQ) that stores the time robots will take to reach their next tasks. AQ contains a slot for each robot, and each slot stores information about the robot itself and the arrival time until its next task. Therefore, robots' movements are simulated by updating the arrival times in their respective slots. We use the information

The AQ is updated all the time between lines 8 and 33. The slot order dynamically changes depending on the distance and speed of each robot. The test in line 16 checks if $t_{current}$ is not conditionally feasible to $r_{current}$. This is done by comparing the task demand and the robot's current capacity. If the test is true, the robot must visit the nearest delivery station. If the test in line 16 is false, the robot will be able to run $t_{current}$.

At line 24, $t_{current}$ is appended to the route of $r_{current}$. The cost and current capacity of the robot are also updated (lines 25 and 26). Also, $t_{current}$ is removed from $T$ and $r_{current}$ is appended to $\lambda$. setArrivalTime($r_{current}$, $t_{current}$) updates the slot of $r_{current}$ in AQ.
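The arrival queue mechanism can be illustrated with a priority queue ordered by arrival time. The sketch below is deliberately simplified and is not the authors' algorithm: it gives the robot that arrives first the nearest remaining task instead of the lowest-cost task of its domain, it omits the final delivery of each robot, and it assumes the cost is Manhattan distance divided by speed. The function name `allocate` and the data layout are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Point = Tuple[int, int]


def manhattan(a: Point, b: Point) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def allocate(robots: Dict[str, dict], tasks: Dict[str, Tuple[Point, float]],
             stations: List[Point]) -> Dict[str, dict]:
    """robots: id -> {pos, speed, max_capacity}; tasks: id -> (pos, demand)."""
    state = {rid: {"pos": r["pos"], "cap": r["max_capacity"],
                   "route": [], "cost": 0.0} for rid, r in robots.items()}
    queue = [(0.0, rid) for rid in sorted(robots)]  # one slot per robot
    heapq.heapify(queue)
    remaining = dict(tasks)
    while remaining and queue:
        now, rid = heapq.heappop(queue)  # robot with the earliest arrival
        r, s = robots[rid], state[rid]
        feasible = {t: v for t, v in remaining.items()
                    if v[1] <= r["max_capacity"]}
        if not feasible:
            continue  # this robot leaves the queue
        tid = min(feasible,
                  key=lambda t: (manhattan(s["pos"], feasible[t][0]), t))
        pos, demand = feasible[tid]
        travel = 0.0
        if demand > s["cap"]:  # not conditionally feasible: deliver first
            station = min(stations, key=lambda h: manhattan(s["pos"], h))
            travel += manhattan(s["pos"], station) / r["speed"]
            s["route"].append(("deliver", station))
            s["pos"], s["cap"] = station, r["max_capacity"]
        travel += manhattan(s["pos"], pos) / r["speed"]
        s["route"].append(("pick", tid))
        s["pos"], s["cap"] = pos, s["cap"] - demand
        s["cost"] += travel
        del remaining[tid]
        heapq.heappush(queue, (now + travel, rid))  # update the robot's slot
    return state


robots = {"r1": {"pos": (0, 0), "speed": 1.0, "max_capacity": 5.0},
          "r2": {"pos": (10, 0), "speed": 2.0, "max_capacity": 5.0}}
tasks = {"t1": ((1, 0), 3.0), "t2": ((9, 0), 3.0), "t3": ((2, 0), 3.0)}
result = allocate(robots, tasks, stations=[(0, 0)])
```

In this run the faster robot `r2` reaches its first task earlier than `r1`, so it is served first by the queue and receives the third task, with a delivery visit inserted because its remaining capacity is too low.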

If 

Research on MRTA in smart warehouses has grown every year, and new solutions are constantly being proposed. However, despite our best efforts, we could not find a benchmark that considers all the aspects we aim to study in this work, all of which are very plausible in smart warehouse scenarios. Overall, environments and task distributions are randomized, and because the code of many algorithms is almost always closed and not available to the public, it becomes difficult to replicate and reproduce the results for an adequate comparison.

As a first step to solving this issue, we have developed and made available a new dataset for testing developed solutions for MRTA in smart warehouses.

Authors can use this novel dataset to validate their work on operations research and other MRTA scenarios. The instances we have created are based on datasets widely used in operations research works [58] . We use instances of a dataset designed to cover several characteristics of real applications, called dataset X [59] . Also, dataset X is ideal for our adaptations because it is a CVRP, which sets up a very simple reduction of HFMDVRP-DV (Section 3).

From now on, we refer to dataset X as the base-dataset. Each instance of the base-dataset defines the locations of tasks and robots, the tasks' demands, and a common load capacity for all robots. So, inserting information that characterizes

Table 2 shows the comparison between the novel datasets. Each dataset instance is named as

<dataset>-t<m>-r<n>-d<p>

where <dataset> is the dataset prefix (RMT, SMT, WMT or XMT), <m> is the number of tasks, <n> is the number of robots, and <p> is the number of delivery stations.

For example, instances RMT-t181-r23-d1, SMT-t181-r23-d4, WMT-t181-r23-d4 and XMT-t181-r23-d1 are adaptations of the base-dataset instance X-n181-k23, whose number of picking tasks, robots and delivery tasks are

All WMT and XMT instances contain generic robots, with load capacities and traffic speeds equal to each other. In each instance, robots have the same capacity as in the equivalent instance of the base-dataset. RMT and SMT instances contain data about load capacity and traffic speed from several robots designed to transport items in many settings, including smart warehouses. In addition to robots, we have also inserted new delivery tasks into HFMDVRP-DV instances. The total number of delivery tasks p is proportional to the number of picking tasks, and its value is computed by Equation 14. The number of delivery tasks in XMT and RMT instances is one.
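The naming convention described above can be decoded mechanically. The following sketch assumes that names follow the pattern seen in the examples (a three-letter dataset prefix followed by `-t`, `-r` and `-d` fields); `InstanceInfo` and `parse_instance_name` are hypothetical helper names and are not part of the published dataset tooling.

```python
import re
from typing import NamedTuple


class InstanceInfo(NamedTuple):
    dataset: str   # RMT, SMT, WMT or XMT
    tasks: int     # <m>
    robots: int    # <n>
    stations: int  # <p>


# Pattern inferred from names such as "SMT-t181-r23-d4".
_NAME = re.compile(r"^(RMT|SMT|WMT|XMT)-t(\d+)-r(\d+)-d(\d+)$")


def parse_instance_name(name: str) -> InstanceInfo:
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance name: {name!r}")
    prefix, m, n, p = match.groups()
    return InstanceInfo(prefix, int(m), int(n), int(p))
```

For instance, `parse_instance_name("SMT-t181-r23-d4")` yields a record with 181 tasks, 23 robots and 4 delivery stations.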

All picking tasks' demands in the new instances with homogeneous robots remained the same as in the base-dataset. However, we adapted the demands for the datasets with heterogeneous robots to ensure that at least one robot had enough capacity to perform all tasks. The adapted demand of $t_j$ in instance $e$ of the datasets for heterogeneous robots is given by Equation 15, where $\mu_e(\Gamma)$ is the average of the robots' load capacities in instance $e$ of the new dataset, $d_e(t_j)$ is the demand of task $t_j$ in instance $e$ of the base-dataset, and $C_e$ is the capacity of the robots in instance $e$ of the base-dataset.
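Equation 15 itself is not reproduced in this text. As an illustration only, the sketch below assumes a simple proportional rescaling built from the three quantities defined above, that is, the base demand multiplied by the mean fleet capacity and divided by the base capacity. The authors' exact formula may differ (for example in rounding), and `adapt_demands` is a hypothetical name.

```python
def adapt_demands(base_demands, base_capacity, fleet_capacities):
    """Rescale base-dataset demands to a heterogeneous fleet.

    Assumed form (not confirmed by the source):
        adapted = d_e(t_j) * mu_e(Gamma) / C_e
    """
    mean_capacity = sum(fleet_capacities) / len(fleet_capacities)
    return [d * mean_capacity / base_capacity for d in base_demands]


# Hypothetical numbers: base capacity 100, fleet capacities 20, 40 and 60.
adapted = adapt_demands([10, 50, 100], 100, [20, 40, 60])
```

Under this assumption, a base demand never exceeds the base capacity, so every adapted demand is at most the mean fleet capacity and therefore fits the largest robot, which is consistent with the stated goal of the adaptation.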

The picking and delivery tasks' locations are the same for all datasets. In SMT and WMT, the position of all new delivery tasks was defined randomly, as many were generated. SMT and WMT also define all robots' positions randomly. All robots and tasks occupy a different point on the map, except in RMT and XMT instances, where robots start and end at the same location.

We compared the performance of DoNe-CPTA with a state-of-the-art algorithm, adapted to the proposed HFMDVRP-DV scenario, which is represented by the datasets detailed in Section 6. According to its authors, the state-of-the-art nearest-neighbor based Clustering And Routing (nCAR) algorithm [9] was designed to meet the requirements of smart warehouses and has been formulated based on one of the VRP variations, more specifically, the CVRP. We adapted nCAR to meet the properties of HFMDVRP-DV and named the resulting algorithm Adapted nCAR (a-nCAR). Below, we briefly present the differences between a-nCAR and nCAR:

• In nCAR, all robots depart from the same location, the location of the (single) delivery station. In a-nCAR, robots depart from different locations according to the novel dataset specifications.

• In each main iteration of nCAR, the algorithm creates a feasible route for each robot, as they all have the same capability. The lowest cost route is chosen. In a-nCAR, each robot's main iteration is performed, respecting its capabilities, and the chosen route is the one with the lowest cost.

• In nCAR, when robots receive a delivery task, it corresponds to visiting the (single) delivery station. In a-nCAR, when a robot receives a delivery task, it corresponds to visiting the closest delivery station to the last picking task of such a robot.
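The third difference listed above, the choice of delivery station in a-nCAR, can be sketched as follows. The sketch assumes Euclidean distances on a 2D map; `closest_station` and `close_route` are hypothetical names and do not come from the nCAR or a-nCAR code.

```python
import math


def closest_station(last_pick, stations):
    """Return the name of the delivery station nearest to the last picking task.

    last_pick: (x, y) of the robot's last picking task
    stations:  mapping of station name -> (x, y)
    """
    return min(stations, key=lambda s: math.dist(last_pick, stations[s]))


def close_route(route, task_points, stations):
    """Append the delivery stop that follows the last picking task in route."""
    last_pick = task_points[route[-1]]
    return route + [closest_station(last_pick, stations)]
```

With a single delivery station the selection is trivial, which matches the observation that a-nCAR behaves like nCAR on CVRP instances.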

Note that such differences are designed to make nCAR suitable for serving different instances of CVRP. In other words, if the input instances meet the CVRP properties, then a-nCAR will work similarly to nCAR. We compared DoNe-CPTA with a-nCAR on dataset X, the original CVRP formulation, and observed comparable route costs generated in a shorter runtime.

Fleets with high-capacity robots will take longer to perform tasks, while those with high-speed robots will visit delivery stations more often, both making routes more costly.

In SMT (heterogeneous fleet), the highest-cost routes occurred on instances with more than 900 tasks, whereas similar costs already occurred on instances with roughly 500 tasks in WMT (homogeneous fleet).

DoNe-CPTA is also superior to a-nCAR in terms of execution time, as can be seen in Figure 5.

Even so, each XMT instance was run 30 times to compute the average execution time of the algorithms. Therefore, the results discussed in this section were taken from 6000 runs. Since the results indicated that DoNe-CPTA and a-nCAR perform very similarly in this scenario, detailed cost results per dataset were only included in the supplementary material.

The execution time overview (Figure 10) shows that, although DoNe-CPTA performs similarly to a-nCAR, it still presents superior performance in route and 92% in XMT and RMT, respectively. Compared to the setup with dispersed robots, DoNe-CPTA was up to 5 times slower (15 seconds for RMT and 16 seconds for XMT). Note that since robots are initially located at the same point, the AQ becomes invalid in the initial iterations of DoNe-CPTA, increasing the number of calls to computeDomain() and consequently the execution time. The a-nCAR behavior was similar to the previous scenario, taking up to 200 seconds to run most instances and over 1000 seconds in some other cases. Regarding the number of robots, DoNe-CPTA is about 18% more efficient than a-nCAR with heterogeneous fleets. Also, our strategy employed 21% fewer robots in RMT compared to XMT, on average. Nevertheless, we found no evidence to assert that DoNe-CPTA employed fewer robots than a-nCAR (Mann-Whitney U, p = 0.07911). Also, we did not observe notable differences by nCAR in any dataset. Note that such results are not statistically significant, as we noted in Section 7.1.3. Further details can be found in the supplementary material.
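For readers unfamiliar with the Mann-Whitney U test cited above, a minimal pure-Python sketch follows. It uses average ranks for ties and a two-sided normal approximation without tie or continuity correction; whether the reported p-value was one-sided or two-sided is not stated in the text, so the two-sided choice is an assumption. The function name `mann_whitney_u` is hypothetical, and a statistical package should be preferred for real analyses.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the normal approximation."""
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1      # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))    # two-sided tail probability
    return u, p
```

Applied to two samples of robot counts (one per algorithm), a p-value above the chosen significance level, as with the reported p = 0.07911 at the conventional 0.05 level, means the difference cannot be asserted.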

In this paper, we present a task allocation algorithm for Multi-Robot Systems (MRS) considering a smart warehouse with automated picking [21].

Our motivation is the assumption that heterogeneous fleets will be a constant in such warehouses. We also hypothesized that a cost estimator computed as a function of all system constraints generates more accurate estimates and, consequently, more efficient task assignments. The problem was

1. Present a mathematical formulation for HFMDVRP-DV since, to the best of our knowledge, this is the first work to deal with this VRP variation.

2. Deploy a novel cost estimator that employs all HFMDVRP-DV properties. Thus, our algorithm manages to postpone delivery task assignments because such a cost estimator penalizes costs for robots with low variable load capacity, giving way to robots performing picking tasks without visiting a delivery station.

3. Introduce a new dataset for HFMDVRP-DV adapted from another CVRP dataset enhanced with real-world smart warehouse features.

4. Develop DoNe-CPTA, an efficient algorithm to generate low-cost routes, with a minimum number of robots and low execution time, to solve real instances of HFMDVRP-DV.

We also validated DoNe-CPTA against an adapted version of a state-of-the-art algorithm (a-nCAR), both running the novel HFMDVRP-DV dataset.

Results showed that our strategy generates routes costing up to 33% less than the routes generated by a-nCAR, over 90% faster and using up to 18% fewer robots.

Our next step is to optimize domain recalculation with the expectation that DoNe-CPTA will yield better results. Our strategy currently employs a global time queue (AQ) for all robots and determines the domains' validity by comparing the time values in the queue and checking whether each robot is performing tasks. As a next step, we will determine dynamic domains using an estimate of the position of the other robots at the moment a particular robot reaches its task. Our future work also includes extending this work's contributions to deal with other real smart warehouse specifications, such as (i) order processing, (ii) delivering products from the same order at the same station, (iii) balancing delivery station visits to avoid overloading and (iv) supporting task reallocation.

at Biomimetics and Intelligent Systems Group -BISG, University of Oulu, Finland; Finnish UAV Ecosystem (No. 338080).

Conflict of interest. The authors declare that they have no conflict of interest.

Code and Data Availability. The codes and datasets generated during this study, as well as the data resulting from the experiments, are available at https://github.com/geosoliveira/DoNe-CPTA.

Ethics approval. Not applicable.

Consent to participate. Not applicable.

Consent for publication. Not applicable.

Regardless of the dataset and strategy used, costs tend to increase when more robots are used.

Figure 3 Cost of routes computed by DoNe-CPTA and a-nCAR with WMT. (Plot title: average cost by DoNe-CPTA and a-nCAR (WMT); x-axis: WMT instances; y-axis: cost; series: a-nCAR, DoNe-CPTA.)

Models for warehouse management: Classification and examples

Robotic mobile fulfillment systems: A survey on recent developments and research opportunities

Adaptive task planning for multi-robot smart warehouse

Coordinating hundreds of cooperative, autonomous vehicles in warehouses

Warehousing in the e-commerce era: A survey

The warehouse manager's guide to automated storage and retrieval systems (as/rs) -everything you need to know

A survey and analysis of multirobot coordination

Multi-robot task allocation: A review of the state-of-the-art

A scalable multi-robot task allocation algorithm

Assembly process planning and its future in collaborative manufacturing: A review

Genetic scheduling and reinforcement learning in multirobot systems for intelligent warehouses

Task allocation of intelligent warehouse picking system based on multi-robot coalition

2016 16th International Conference on Control, Automation and Systems (ICCAS)

Research on task allocation in multiple logistics robots based on an improved ant colony algorithm

Fast scheduling of robot teams performing tasks with temporospatial constraints

Task allocation and path planning of many robots with motion uncertainty in a warehouse environment

Multi-robot multi-goal motion planning with time and resources

Cannot avoid penalty? let's minimize

Decentralised online planning for multi-robot warehouse commissioning

Fetch robotics introduces fetch and freight: Your warehouse is now automated

Towards a conceptualisation of order picking 4.0

Multiple Mobile Robot Systems

A formal analysis and taxonomy of task allocation in multi-robot systems

A taxonomy for task allocation problems with temporal and ordering constraints

Resource-based task allocation for multi-robot systems

A mechanism for scheduling multi robot intelligent warehouse system face with dynamic demand

A novel warehouse multi-robot automation system with semi-complete and computationally efficient path planning and adaptive genetic task allocation algorithms

The vehicle routing problem, RAIRO -Operations Research

The truck dispatching problem

Optimal indoor goods delivery using drones

A tabu search heuristic for periodic and multi-depot vehicle routing problems

Solving a heterogeneous fleet vehicle routing model -a practical approach

Metaheuristic scheduling of multiple picking agents for warehouse management

A vrp-based route planning for a mobile robot group

Itinerary optimisation approach inside hypermarkets

Voronoi diagrams -inventor, method, applications

New benchmark instances for the capacitated vehicle routing problem

Genetic algorithms

Deep reinforcement learning: A brief survey

Some methods for classification and analysis of multivariate observations

A formal basis for the heuristic determination of minimum cost paths

Optimization, Learning and Natural Algorithms

Algorithm 97: Shortest path

A theorem on boolean matrices

Algorithm 141: Path matrix

Efficient selectivity and backup operators in monte-carlo tree search

Effective approximations for multi-robot coordination in spatially distributed tasks

Worst-case analysis of a new heuristic for the travelling salesman problem

Computational results with a branch and cut code for the capacitated vehicle routing problem

Nested monte-carlo search

Nested rollout policy adaptation for monte carlo tree search

Algorithms for the vehicle routing and scheduling problems with time window constraints

Distributed algorithms for multirobot task assignment with task deadline constraints

Practical optimization: A gentle introduction

Ibm ilog cp optimizer for scheduling

Capacitated Vehicle Routing Problem Library

New benchmark instances for the capacitated vehicle routing problem

Mann Whitney U test calculator (Wilcoxon ranksum