DOTS: An Open Testbed for Industrial Swarm Robotic Solutions
Simon Jones, Emma Milner, Mahesh Sooriyabandara, Sabine Hauert
2022-03-25

We present DOTS, a new open access testbed for industrial swarm robotics experimentation. It consists of 20 fast agile robots with high sensing and computational performance, and real-world payload capability. They are housed in an arena equipped with private 5G, motion capture, and multiple cameras, and are openly accessible via an online portal. We reduce barriers to entry by providing a complete platform-agnostic pipeline to develop, simulate, and deploy experimental applications to the swarm. We showcase the testbed capabilities with a swarm logistics application, autonomously and reliably searching for and retrieving multiple cargo carriers.

I. INTRODUCTION

Many robots (10-1000+) working together to facilitate intralogistics in real-world settings promise to improve productivity through the automatic transport, storage, inspection, and retrieval of goods. Yet, existing systems typically require sophisticated robots, carefully engineered infrastructure to support the operations of the robots, and often central planners to coordinate the many-robot system (e.g. Amazon, Ocado). Once operational, these solutions offer speed and precision, but the pre-operational investment cost is often high and the flexibility can be low. Consequently, there is a critical unmet need in scenarios that would benefit from logistics solutions that are low-cost and usable out-of-the-box. These robot solutions must adapt to evolving user requirements, scale to meet varying demand, and be robust to the messiness of real-world deployments. Examples include small and medium enterprises, local retail shops, pop-up or flexible warehouses (COVID-19 distribution centres, food banks, refugee camps, airport luggage storage), and manufacturing of products with high variance or variation (e.g. small series, personalised manufacturing). Flexible solutions for such scenarios also have the potential to translate to other applications and settings such as construction or inspection.

Swarm robotics offers a solution to this unmet need. Large numbers of robots, following distributed rules that react to local interactions with other robots or local perception of their environment, can give rise to efficient, flexible, and coordinated behaviours. A swarm engineering approach has the potential to be flexible to the number of robots involved, enabling easy scalability, adaptation and robustness. While traditional logistics solutions prioritise throughput, speed and operating cost, swarm logistics solutions will be measured against additional 'Swarm Ideals', targeting zero setup, reconfiguration time, and scaling effort, zero infrastructure, zero training, and zero failure modes. It is these attributes that make swarm robotics the natural solution for logistics in more dynamic and varied scenarios.

To explore this potential, we present a new 5G-enabled testbed for industrial swarm robotic solutions. The testbed hosts 20 custom-built 250 mm robots called DOTS (Distributed Organisation and Transport System) that move fast, have long battery life (6 hours), are 5G enabled, house a GPU, can sense the environment locally with cameras and distance sensors, as well as lift and transport payloads (2 kg per robot).
The platform is modular and so can easily be augmented with new capabilities based on the scenarios to be explored (e.g. to manipulate items). To monitor experiments, the arena is fitted with overhead cameras and a motion capture system, allowing for precise telemetry and replay capabilities. The testbed is accessible remotely through custom-built cloud infrastructure, allowing for experiments to be run from anywhere in the world and enabling future fast integration between the digital and physical world through digital twinning. In addition, we present an integrated development environment usable on Windows, Linux and OSX, lowering the barriers to entry for users of the system.

Beyond characterising the testbed, making use of the development pipeline illustrated in Figure 2, we demonstrate the steps and processes required to take a conceptually simple swarm logistics algorithm and successfully run it on real-world robots to reliably find, collect, and deposit five payload carriers in an entirely decentralised way.

Fig. 2: Using the DOTS IDE, controllers are developed and tested locally in simulation. When ready, controllers are validated using the online portal and deployed to the physical testbed for an experimental run. Experimental data captured is available for download from the portal and subsequent analysis.

This paper is structured in the following way: in Section II we provide background and discuss related work. In Section III we detail the physical, electronic, and software systems of the testbed, with characterisation of the performance of various subsystems. In Section IV we build a complete demonstration of a distributed logistics task. Section V concludes the article.

II. BACKGROUND AND RELATED WORK

Two recent reviews of multi-robot technologies for warehouse automation show that amongst over 100 papers in the area, only very few considered decentralised, distributed, or swarm approaches [1], [2]. Of those that did, solutions were only partially decentralised [3]. This suggests swarm solutions for logistics have largely remained unexplored, although the interest in distributed multi-robot systems for intralogistics is growing, as shown in a recent review by [4]. At the same time, swarm robotics [5] has 30 years of history, mostly focussing on conceptual problems in the laboratory, with translation to applications increasing in the last couple of years [6]. We now need fresh strategies to design and deploy swarm solutions that eschew the mantra of emergence from interaction of simple agents and embrace the use of high specification robots at the agent level to enhance the performance of a robot swarm without sacrificing its inherent scalability and adaptability in real-world applications. This is made possible now due to the convergence of technologies including high individual robot specifications (fast motion, high-computation, and high-precision sensing), 5G networking to power communication between humans and interconnected robots, access to high onboard computational power that allows for sophisticated local perception of the world, and new algorithms for swarm control.

Swarm testbeds for research do exist. The Robotarium [7] is a complete system of small robots, with associated simulator, online access, tracking, and automated charging and management. The individual robots are not autonomous though: controller code executes on a central server, with the robots acting as peripherals.
Duckietown [8] is an open source platform of small cheap autonomous robots designed for teaching autonomous self-driving, but the emphasis is on users building their own testbed. The Fraunhofer IML Loadrunner [9] swarm is technically sophisticated and physically considerably larger than our design but does not appear to be open. In the following sections, we detail the robots, the arena, the online portal for remote experimentation, and the crossplatform integrated development environment. The robots were designed from the start to be low-cost, around £1000 per robot, simple to construct, to allow relatively large numbers to be built, and high capability, by using commodity parts in innovative ways. Each robot is 250 mm in diameter with a holonomic drive capable of rapid omnidirectional movement, up to 2 ms −1 and typically 6 hours of battery life. They are equipped with multiple sensors; 360°vision with four cameras and a further camera looking vertically upwards, laser time-of-flight sensors for accurate distance sensing of surrounding obstacles, and multiple environment sensors. Each robot has a lifting platform, allowing the transport of payloads. Onboard computation is provided with a high-specification Single Board Computer (SBC), with six ARM CPUs and a GPU. And there are multiple forms of communication available -WiFi, two Bluetooth Low Energy (BLE5) programmable radios, an ultrawideband (UWB) distance ranging radio, and the ability to add a 5G modem. 1) Cost minimisation: A central driving factor in the design process was cost minimisation while still achieving good performance. The rapid progress in mobile phone capabilities means that high performance sensors such as cameras are now available at very low cost. Communications advances means that fully programmable Bluetooth Low Energy (BLE5) modules cost only a few pounds. Single board computers based on mobile phone SoCs are widely available, high definition cameras cost £4. And the market for consumer drones has made very high power-to-weight ratio motors available at much lower cost than specialised servo motors. We leverage these advances to reduce the cost per robot to £1000, so the complete testbed can have many robots. 2) Robot chassis: The mechanical chassis, shown in Figure 3 , is designed to be easy to fabricate in volumes of a few 10s. Custom parts are made using 3D printing or simple milling operations. It is constructed around two disks of 2 mm thick aluminium, 250 mm in diameter, for the base and top surfaces. Between the base and top plates and mechanically joining them 100 mm apart are three omniwheel assemblies, positioned at 120°separation. Each wheel assembly consists of an omniwheel, a drone motor with rotational position encoder, and a 5:1 timing belt reduction drive between motor and wheel. Mounted on the base plate are the single board computer (SBC) and the power supply PCBs, in both cases thermally bonded to the base plate to provide passive cooling. The battery pack sits in the central area of the base plate. On 55 mm standoffs above the base plate is the mainboard PCB, which contains all the sensors, four cameras, motor drives, and associated support electronics. The top plate holds the payload lifting mechanism and an upward-facing camera, connecting via USB and power leads to the mainboard. Alternate top plates, with different actuators, for example a small robot arm, can easily be fitted. 
3) System architecture: Many subsystems within the robot need to be integrated and connected to the single board computer. The system architecture is shown in Figure 4. Communications between sensors, actuators, and the single board computer take place on three bus types: USB for high bandwidth, mostly MJPEG compressed video from the cameras; I2C for low bandwidth, mostly various sensors; and SPI for medium bandwidth with deterministic latency, used for controlling the motor movements. An FPGA is used for glue logic, for example to interface to the MEMS microphones and the programmable LEDs. The BLE5 and UWB radios are reprogrammable from the SBC via their single wire debug (SWD) interfaces [10].

Vision is the primary modality by which the robot can understand its environment. Miniature cameras have been revolutionised by progress in mobile devices. It is now possible to buy a 5 megapixel camera also capable of multiple video resolutions up to 1080p for £4. A central design issue is how to best use this cheap data: image processing is computationally heavy. We designed the system to have all-round vision for the robot, achieved by using four cameras with a 120° field of view (FOV) around the perimeter of the robot. The two front-facing cameras have overlapping fields of view to allow for the possibility of stereo depth extraction. A fifth camera is fitted in the centre of the lifting platform looking upwards so that visual navigation under load carriers can be achieved with suitable targeting patterns.

The amount of data in an uncompressed video stream is high; 640x480 at 30 fps takes around 18 MBytes/s of bandwidth. For five cameras, this is close to 100 MBytes/s, which would require a high performance interconnect such as gigabit Ethernet, impractical at low cost and indicating a need for image compression. Rather than using fixed function webcam hardware to perform this compression, we looked for flexibility. The approach we took was to use a local Raspberry Pi Zero computer for each camera to perform image compression and then send the data over USB. This gives a cost per camera of around £15. The Raspberry Pi Zero is a low cost small form factor Single Board Computer with a camera interface. Although the CPU has quite a low performance (ARM1176 700MHz), it is not generally appreciated how powerful the associated VPU (Video Processing Unit) and GPU (Graphics Processing Unit) are. This allows for the possibility of utilising this power to perform local image processing, e.g. to perform fiducial recognition locally, rather than loading the main central processor. In this paper, we use standard applications to stream MJPEG compressed frames with low latency from the photon arrival at the camera to the presence of data in a buffer at the central processor.

Although cameras are cheap, processing image streams to give useful information is not, and although the single board computer is quite capable, it is not in the same class as a desktop PC. This means we are not free to easily run more complex algorithms without considerable optimisation efforts to, for example, utilise GPU processing. As an example, most visual SLAM algorithms are evaluated on PC-class systems. ORB-SLAM2 was evaluated in a system with an Intel Core i7-4790 and this is just fast enough to run in real time with a single camera at 640x480 30 Hz [11].
With this in mind, we can ease the task of understanding the environment by using ubiquitous fiducial markers -future systems will be able to use greater processing power. The ArUco library [12] has a far lower computational load than visual SLAM, and this allows us to process all five camera streams and extract marker poses from each. a) Vision latency: A basic performance metric when using vision for motion control is the time delay between photons arriving at the camera, and when data corresponding to those photons is available in a memory buffer on the single board computer, known as Glass-to-Algorithm [13] . Measuring the vision latency is non-trivial, see [14] . We used a set of eight LEDs in approximately the centre of the field of view of one camera, and performed image processing on the destination buffer to extract the binary data displayed on the LEDs. By setting a code on the LEDs then timing how long before the code was visible in the buffer data we can measure the latency. After each code is detected, the next in a Gray-code sequence is displayed to accumulate multiple timings. Data from each camera is transmitted to the single board computer via a USB hub on the mainboard. We load the system by running a varying number of cameras simultaneously. Measurements are taken with different camera refresh rates. The results are shown in Table II for different refresh rates with varying numbers of other cameras running and streaming data at the same time. Times averaged over 1000 samples. Because the LEDs are positioned at approximately the centre vertically of the camera field of view, and the camera performs a raster scan to read the sensed image, there is a minimum of half a frame before data is available to the Raspberry Pi for processing and compression. Since the LEDs are changed uncorrelated to the frame rate, there may be between zero and one additional frame of delay uniformly distributed, giving a mean of one frame delay before the start of processing, compression, transmission, and then decompression into the destination image buffer. The worst-case figures of 79 ms at 30 Hz and 58 ms at 60 Hz for the most heavily loaded cases of five cameras streaming image data compare well with state-of-the-art systems, which is reported as 50-80 ms by [13] . b) Camera calibration: In order that the perimeter cameras can be used to extract pose information from fiducial markers, it is necessary that they be calibrated so that their intrinsic parameters are known. We initially calibrated several cameras using the ArUco calibration target and software tools. The process involves taking multiple images of the calibration target then using the software tool to find intrinsics that minimise the reprojection error. This showed that each camera had significant differences in their intrinsic parameters, perhaps not surprising given their very low cost, but meaning we had to calibrate each camera individually, rather than using a standard calibration. We needed an automated approach -20 robots, each with four cameras to calibrate would require many hundreds of images captured from different angles. By attaching a calibration target to the arena wall at a fixed and known location, we could execute a calibration process automatically taking multiple pictures per camera in about 5 minutes per robot. 
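As an illustration of the per-camera intrinsic calibration step, the sketch below follows the standard OpenCV workflow. It is not the ArUco-based tool chain used on the testbed: the use of a chessboard target, the board geometry, and the file paths are assumptions made purely for the example.

```python
# Minimal intrinsic calibration sketch using OpenCV's chessboard workflow.
# Illustrative only: the DOTS calibration uses the ArUco library's own
# target and tools; board size, square size, and file paths are hypothetical.
import glob
import cv2
import numpy as np

BOARD = (9, 6)          # inner corners of the chessboard (cols, rows)
SQUARE = 0.025          # square edge length in metres

# Object points: the 3D corner locations in the board frame (z = 0 plane).
objp = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

obj_points, img_points = [], []
for fname in glob.glob("calib_images/*.png"):
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, BOARD)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# Solve for the intrinsic matrix and distortion coefficients that minimise
# the reprojection error over all captured views (assumes at least one view).
rms, K, dist, _, _ = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print(f"RMS reprojection error: {rms:.3f} px")
print("Camera matrix:\n", K)
```

The automated testbed procedure differs mainly in how the views are gathered: the robot rotates in front of a fixed target rather than a person moving a board by hand.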
Because the only angle that can be varied between the camera and the target is the rotation about robot Z, the calibration is less complete, but sufficient given the robot will always be horizontal on the arena floor.

c) Vision processing for localisation: To demonstrate the ability to usefully process vision from multiple cameras we built a simple localisation system, shown in Figure 5, based on ArUco markermaps. This is a feature of the ArUco library that allows the specification of a map of the locations of an arbitrary number of arbitrarily placed fiducial markers, with library functions to return the 3D pose of a calibrated camera which can see at least one of the markers in the map. We fixed twelve fiducial markers around the walls of the arena, and encoded the locations in a markermap. Each camera was calibrated automatically, as described above. The video stream from each of the four perimeter cameras is analysed and, if there are any visible fiducials in the correct ID range, the pose of the robot in the global map frame is generated. This stream of poses is fed to an Extended Kalman Filter (EKF; robot_localization [15]), along with IMU and wheel odometry information. The output of the EKF is a stream of poses in the map frame. The robot was commanded to move twice around a square of side 1.6 m, with velocities of 0.3 ms−1 and 0.5 ms−1, using ground truth as the position feedback. Ground truth from the motion capture system and estimated position were recorded. Figure 6 shows the actual positions and the absolute error in the x and y axes. Maximum positional error in either axis never exceeds 62 mm, with σ(x) = 22.3 mm and σ(y) = 16.1 mm.

In addition to the vision system, there is a wide variety of sensors to apprehend the state of the environment and the robot itself.

a) Proximity sensors: Surrounding the perimeter of the robot are 16 equally-spaced ST Microelectronics VL53L1CX infra-red laser time-of-flight distance sensors (IRToF). These are capable of measuring distance to several metres with a precision of around 10 mm at an update rate of 50 Hz, giving a pointcloud of the environment around the robot. Each detector has a field of view of approximately 22°, which can be partitioned into smaller regions, allowing for a higher resolution pointcloud at the cost of lower update rate. As well as returning distance measurements, the VL53L1CX devices are capable of being operated in a mode that returns raw photon counts per temporal bin. This opens up intriguing possibilities: it is possible to classify materials by their temporal response to incident illumination, and this is demonstrated in [16] with a custom-built time-of-flight camera. [17] shows that this is also possible with the VL53L1CX device, despite the much lower cost and limitations on output power and temporal resolution. They demonstrate successful identification of five different materials. There is no reason in principle that this robot could not function as a mobile material identification platform, for example in inspection applications.

There is little published data on the performance of the VL53L1CX sensors, outside the manufacturer's specifications, so we wanted to characterise this. The sensors were set up according to the manufacturer's recommendations using the VL53L1X Ultra Lite driver (https://www.st.com/en/embedded-software/stsw-img009.html) in 'short' distance mode with an update rate of 50 Hz. This notionally gives a maximum distance measurement of 1.3 m.
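To illustrate how the ring of IRToF returns is typically consumed, the sketch below converts one scan of the 16 perimeter sensors into a 2D point cloud in the robot frame. The mounting radius and angular offsets are assumptions for the example; the real values depend on the mainboard layout.

```python
# Convert one scan of the 16 perimeter IRToF sensors into robot-frame points.
# Sketch only: sensor mounting radius and angular offsets are assumptions.
import numpy as np

N_SENSORS = 16
BODY_RADIUS = 0.125          # sensors assumed at the 250 mm robot's perimeter
ANGLES = np.arange(N_SENSORS) * 2 * np.pi / N_SENSORS   # assumed offsets

def tof_to_points(ranges_m, max_range=1.3):
    """Return an (M, 2) array of obstacle points in the robot frame.

    ranges_m: length-16 array of distances in metres; invalid or
    out-of-range returns (NaN or > max_range) are dropped.
    """
    r = np.asarray(ranges_m, dtype=float)
    valid = np.isfinite(r) & (r <= max_range)
    d = BODY_RADIUS + r[valid]           # distance from the robot centre
    a = ANGLES[valid]
    return np.stack([d * np.cos(a), d * np.sin(a)], axis=1)

# Example: a wall roughly 0.5 m in front of the robot seen by three sensors.
scan = np.full(N_SENSORS, np.nan)
scan[[15, 0, 1]] = [0.55, 0.50, 0.55]
print(tof_to_points(scan))
```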
We commanded a robot to move repeatedly between locations 0 m and 3.5 m from a vertical wall made of polypropylene blocks at a speed no greater than 0.5 ms−1 along a path at right angles to the wall. The robot pose was such that one sensor was directly pointing at the wall. The ground truth of the robot position and the distance readings from the IRToF sensor orthogonal to the wall were captured. A total of 2455 measurement samples were collected. Figure 7 shows the results. Each measured distance is plotted against the ground truth distance of the sensor from the wall. We divided the measurements into bins 0.125 m wide and calculated the mean and standard deviation of the error in each bin. The performance of the sensor is remarkably good over most of its range. Once above 0.25 m, up to 2.7 m, the mean measurement error does not exceed 12 mm and is mostly less than 5 mm. Even out to 3.2 m, well beyond the notional range for its mode of operation, the mean error never exceeds 30 mm. Of some interest is the behaviour of the sensor at close distances, less than 0.25 m. Here, the mean error is much higher, up to 75 mm, and examining the measured points in more detail (see inset in Figure 7) we can see that many of the points do not fall on a line of slope 1, which would be expected from a perfect sensor, but on lines of slope 2, and possibly slope 3. This would be characteristic of multipath reflections, e.g. reflecting off the wall, then the robot body, then the wall again before being sensed. We intend to investigate other device setup possibilities to see if it is possible to mitigate this behaviour.

b) Additional sensors: In addition to the cameras and proximity sensors, the robot is equipped with other sensors to understand the robot state or the state of the environment: a 9 degree-of-freedom (DoF) Inertial Measurement Unit (IMU), temperature, pressure, and humidity sensors, two MEMS microphones, robot health sensors, and auxiliary ADCs for future use. The IMU consists of a 6 DoF LSM6DSM accelerometer and gyroscope and a 3 DoF LIS2MDL magnetometer. The IMU, together with information about wheel position, is the main way the robot performs odometry to estimate its local movement. We currently acquire samples at 100 Hz and filter with an EKF to match the motion control update rate. The temperature, pressure and humidity sensors allow us to monitor the state of the robot's environment, useful in a warehouse scenario, for example, noting locations with excessive humidity or temperatures outside desired limits. The two IM69D130 MEMS microphones are processed by the on-board FPGA to an I2S stream which is sent to the SBC. The audio is available using the standard Linux audio framework. We intend to use these to experiment with audio-based localisation, sending ultrasonic signals at known times and measuring time-of-arrival to estimate distances. Robot health sensors monitor the state of various subsystems on board. We measure the voltages and currents of each power supply, the state of each cell of the battery pack, and the temperatures of the battery charging system, power supplies, motor drives, and single board computer. All this information is available through the Linux hwmon sensor framework (https://www.kernel.org/doc/html/latest/hwmon/hwmon-kernel-api.html).

6) Communication: Local communication is important for swarm robotics, and is an area where there is much potential for novel approaches. To facilitate experimentation, we include multiple different radios.
As well as the WiFi built in to the single board computer, each robot is equipped with two nRF52840-based BLE5 (Bluetooth Low Energy) USB-connected radio modules, and a DWM1001 UWB (Ultra-Wideband) module. A private 5G modem can be added. All of these modules can be reprogrammed with custom firmware under the control of the single board computer, allowing for on-the-fly installation of novel communication protocols. We currently have firmware for the BLE radios that continually advertises the robot's unique name and a small amount of data, and scans for other BLE radios. Each scan result contains the name of any nearby robot sensed, along with its received signal strength (RSSI), which can be used as a proxy measure for distance. The DWM1001 radio is designed to perform two-way ranging between devices, measuring the distance between a pair of radios to a claimed accuracy of around 0.1 m. It is interfaced to the SBC using an SPI bus.

7) Mobility: For the collective transport of large loads, it is necessary for multiple robots to be able to move together in a fixed geometric relationship with each other. The only way to achieve this with arbitrary trajectories is for the robots to have holonomic motion. We use a three omniwheel system with the wheels equally spaced 120° apart. The kinematics for this type of drive are well known [18]:

v_i = −v_x sin θ_i + v_y cos θ_i + R ω,  i = 1, 2, 3

where v_1, v_2, v_3 are the tangential wheel velocities, θ_i the wheel mounting angles (spaced 120° apart), v_x, v_y, ω the robot body linear and angular velocities, and R the radius from the centre of the robot to the wheels.

Another important requirement for the robots is that they can move fast and accurately. A limitation of other lab-based swarm systems is that the locomotion is often based on stepper motors or small geared DC motors. These are relatively cheap and accurate but are slow and heavy. Much higher performance is possible with Brushless DC (BLDC) servo motors. These servo motors are paired with a position encoder and drive electronics that modulate the coil current to achieve accurate torque, velocity and position control with much higher performance to weight and size ratios than is typically possible with stepper motors. This comes at a cost: a typical motor, encoder, and driver with the performance we require costs around £350. There has been recent interest in using commodity drone motors in place of dedicated servo motors [19]-[22]. The high power-to-weight ratios and low costs due to the large market size make this an interesting alternative. There are disadvantages: the motors are not designed for use as servos and certain parameters which are more important in servo motors are not controlled, e.g. the amount of torque cogging, and the motors often use lower voltages and higher currents than typical servos, but these deficiencies can be compensated for with clever software, e.g. [23]. We designed drive circuitry suitable for running the Odrive Robotics open source BLDC servo controller and tested various cheap drone motors, selecting for the motor with the least cogging that met the required form factor and price. We replaced the costly traditional optical position encoder with a high resolution magnetic field angle encoder IC. By gluing a small diametrically magnetised disc magnet to the end of a motor shaft and positioning the encoder IC correctly in line with the motor axis, we can sense absolute motor shaft angle with high precision and low cost.
These innovations reduced the cost per robot of the motor drives from over £1000 to less than £200 with comparable performance. Table III illustrates this, and Figure 8 shows a single complete wheel assembly.

Many systems aimed at swarm robotics research, e.g. e-pucks, Pheeno, Kilobots etc. (cite), are physically quite small and move at low speeds. Collisions are not of high enough energy to cause damage and the mass of the robots is low enough that little attention has to be paid to concerns that are important in larger robots, such as trajectory generation respecting acceleration limits. The DOTS robots, however, have a mass of 3 kg, with a potential payload of 2 kg, so a proper trajectory generation and motion control loop is important. We want to be able to generate new trajectory points in real time with low computational cost, and also to be able to revise the goals on-the-fly. This capability is important for agile quadrotor drones, so there has been recent work in this area. We use the Ruckig trajectory generator described in [24], which is available as a C++ library. The on-board motion control system is shown in Figure 9. The action server can accept a path consisting of multiple waypoints with required positions and velocities; this is played out by the trajectory generator producing intermediate goal positions at a rate of 100 Hz. Three PID controllers generate velocity commands in the robot frame to satisfy the current goal on the path. Position feedback uses the Inertial Measurement Unit (IMU) and wheel odometry fed through an EKF. At any point, the action server can accept a new trajectory or the cancellation of the existing trajectory, and goals are updated on-the-fly while respecting acceleration and jerk limits. Cancellation of a trajectory results in maximum acceleration braking to zero velocity. The supplementary material shows various examples of agile trajectory following.

8) Power: The power system consists of two Toshiba SCiB 23 Ah Lithium Titanate cells, with a total energy capacity of 100 Wh. These are combined into a battery pack, with a battery management system to provide under- and over-voltage protection and monitoring. This is converted by the PSU module to the required voltages: 3.3 V@2 A, 5 V@4 A, and 12 V@4 A. All voltages, currents, and the battery state are monitored. A battery charger circuit appropriate to the chemistry of the SCiB cells is integrated into the PSU module, enabling the robot battery to be recharged with a commodity AC-DC adaptor capable of providing 12 V@5 A.

Table IV shows the power consumption of various subsystems. The baseline idle consumption gives access to all sensors except for the vision system, and includes WiFi and Bluetooth communication. At this level, the endurance of a robot with a full charge is around 16 hours. We can see that the consumption increases considerably when using the camera vision system. The motor system figure is for all wheels delivering maximum torque, so would not be a continuous state. Average motor power when performing a random walk with a maximum speed of 0.5 ms−1 was around 2 W. With full camera usage and moderate movement and processing we get an endurance of around 6 hours. Battery life is limited by the point at which the output voltage falls below the undervoltage protection limit. The largest dynamic load component is the motor drives, so we compromise on the maximum allowed acceleration in order to limit voltage drop and extend effective endurance.
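To make the mobility pipeline concrete, the following sketch combines the inverse kinematics of the three-omniwheel drive given earlier (wheel mounting angles and radius are assumed here; the actual convention is set by the mechanical layout) with a simple acceleration-limited velocity ramp. It is an illustration only, not the on-board Ruckig/PID controller.

```python
# Sketch: body-frame velocity command -> tangential wheel velocities, plus a
# simple acceleration-limited ramp. Wheel mounting angles and radius are
# assumptions; the robots use the Ruckig trajectory generator and PID loops.
import numpy as np

R_WHEEL = 0.1                                    # centre-to-wheel radius (assumed)
WHEEL_ANGLES = np.deg2rad([90.0, 210.0, 330.0])  # 120 deg spacing (assumed)
A_MAX = 2.0                                      # m/s^2 acceleration limit

def body_to_wheels(vx, vy, omega):
    """v_i = -vx*sin(theta_i) + vy*cos(theta_i) + R*omega for each wheel."""
    return (-vx * np.sin(WHEEL_ANGLES)
            + vy * np.cos(WHEEL_ANGLES)
            + R_WHEEL * omega)

def ramp_towards(v_now, v_goal, dt, a_max=A_MAX):
    """Step the commanded body velocity towards the goal, capping the
    translational acceleration at a_max."""
    v_now, v_goal = np.asarray(v_now, float), np.asarray(v_goal, float)
    dv = v_goal - v_now
    step = a_max * dt
    norm = np.linalg.norm(dv[:2])
    if norm > step:
        dv[:2] *= step / norm
    return v_now + dv

# 100 Hz control-loop fragment: accelerate from rest towards 0.5 m/s forward.
v = np.zeros(3)                                  # [vx, vy, omega]
for _ in range(5):
    v = ramp_towards(v, [0.5, 0.0, 0.0], dt=0.01)
    print(np.round(body_to_wheels(*v), 4))
```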
The available torque implies a maximum loaded acceleration of around 4 ms−2 but we limit this to 2 ms−2.

9) Manipulation: In order to demonstrate novel algorithms for logistics and collective object transport, the robots need a means of carrying payloads. For this purpose, each robot is equipped with a lifter: a platform on top of the robot that can be raised and lowered by the use of a miniature scissor lift mechanism actuated by a high-powered model servo. In the centre of the lifter platform there is an upwards-facing camera that can be used to visually navigate. To hold payloads, there are carriers, square trays with four legs that the robot can manoeuvre underneath before raising the lifter. In order that a robot can position itself correctly to lift the carrier, we place a visual target on the underside. Each carrier weighs about 200 g, so with the lifting capacity of 2 kg we can take a payload of around 1.8 kg. In order to demonstrate collective transport, the carriers can be joined together along their edges to form arbitrary shapes, to be moved with an equal number of robots, one under each carrier.

The industrial swarm testbed is the collection of DOTS robots, the physical arena (a 5 m x 5 m main room overlooked by an adjacent control room), the communications infrastructure comprising WiFi and a private 5G network, the motion tracking system, video cameras, and the system software architecture. It also has an associated high fidelity simulation system, based on the widely used Gazebo simulator. Controllers can be safely developed in simulation and transferred without modification to run on the real robots. This section details the individual components making up the testbed, before tying them together with a description of the complete system architecture.

We use ROS2 Galactic [25] for the testbed. ROS2 is the next generation of ROS [26], and, at first glance, would appear to have several advantages over ROS1 for the type of decentralised swarm testbed we describe here. Firstly, there is no ROS master; ROS2 is completely decentralised. With multiple, potentially communicating, robots, we don't need to choose one to be the master, nor do we need to use multi-master techniques (e.g. [27]) as were necessary with ROS1. This is facilitated by the second major difference: the communications fabric of ROS2 uses middleware conforming to the DDS standard [28], of which there are several suppliers. DDS supports automatic discovery of the ROS graph and supports a rich set of Quality of Service (QoS) policies that can be applied to different ROS topics, or communication channels. This maps onto a swarm perspective quite naturally; communications within the ROS graph of a single robot may be marked as reliable, meaning data will always be delivered, even if retries are necessary, whereas inter-robot communications could be marked as best effort, meaning data may be lost and dropped with no errors or retries if the physical channel is unreliable.

However, this enticing vision has yet to be borne out in reality. The DDS discovery process generates network traffic that increases approximately as O(n), where n is the number of participants, when using UDP multicast, or as O(n²) when using unicast [29], [30]. A participant is typically a ROS2 process, of which there might be many per robot. Complete information is maintained for all participants at every location, even if they will never communicate.
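The reliable versus best-effort split described above maps directly onto per-topic QoS profiles in ROS2. A minimal rclpy sketch (topic names are hypothetical) might look like this:

```python
# Sketch: per-topic QoS in rclpy, mirroring the reliable (intra-robot) vs.
# best-effort (inter-robot) split discussed above. Topic names are made up.
import rclpy
from rclpy.node import Node
from rclpy.qos import QoSProfile, ReliabilityPolicy, HistoryPolicy
from std_msgs.msg import String

class SwarmComms(Node):
    def __init__(self):
        super().__init__('swarm_comms')
        reliable = QoSProfile(history=HistoryPolicy.KEEP_LAST, depth=10,
                              reliability=ReliabilityPolicy.RELIABLE)
        best_effort = QoSProfile(history=HistoryPolicy.KEEP_LAST, depth=10,
                                 reliability=ReliabilityPolicy.BEST_EFFORT)
        # On-robot data: delivery is retried if a message is lost.
        self.local_pub = self.create_publisher(String, 'robot_status', reliable)
        # Swarm broadcast: dropped silently if the wireless link is poor.
        self.swarm_pub = self.create_publisher(String, '/swarm/broadcast',
                                               best_effort)

def main():
    rclpy.init()
    rclpy.spin(SwarmComms())

if __name__ == '__main__':
    main()
```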
Additionally, although the standard discovery protocol uses UDP multicast, this is very unreliable on WiFi networks [31], forcing the use of unicast discovery. Building robot systems with many mobile agents linked using wireless communication is becoming more common, and the limitations of ROS2 in this regard have resulted in several possible solutions. One approach, the discovery server, uses a central server to hold information about all participants, greatly reducing traffic. This is so far limited to a single DDS vendor, eProsima, and is the opposite of the decentralised vision. Another approach uses bridges between DDS and a more suitable protocol; this is the approach we used. Zenoh is an open-source project supported by the Eclipse Foundation. The Zenoh approach is agnostic of the particular DDS middleware used, and is intended to be part of the next release of ROS2. The Zenoh protocol is designed to overcome the deficiencies of DDS regarding discovery scalability [32]. By using a Zenoh-DDS bridge on each agent (robots and other participants) and disallowing DDS traffic off robot, we can achieve transparent ROS2 connectivity with far lower discovery traffic levels, granular control over the topics that can be seen outside the robot, true decentralisation, and no use of problematic multicast.

A number of ROS nodes are always running on each robot, interfacing to the hardware and providing a set of ROS topics that constitute the robot API. This API is available on both the simulated and real robot, and is intended to be the only way a controller has access to the hardware.

2) Coordinate frames: We adhere to the ROS standard for coordinate frames, REP-105 [33] (see the arena diagram, Figure 13), with the extension that each frame name that is related to an individual robot is prefixed with the unique robot name. The global frame map is fixed to the centre of the arena, odom is fixed to the starting position of the robot, and base_link is attached to the robot body, with +x pointing forward, between the overlapping cameras. The base_link frames are updated relative to the odom frames using each robot's inertial measurements and wheel velocities. There is no transform between map and any odom unless some form of localisation is used on a robot, which is not necessary for many swarm applications. The ROS transform library TF [34] is used to manage transforms. Each robot has its own transform tree, and a transform relay node is used to broadcast a subset of the local transforms to global topics for use in visualisation and data gathering.

3) Containerisation: Controllers for the robots are deployed as Docker [35] containers. Docker provides a lightweight virtualised environment making it easy to package an application and associated dependencies into a container, which can then be deployed to computation resources using an orchestration framework such as Kubernetes; we use the lightweight implementation K3S. Containers only have controlled access to the resources of the compute node, facilitating security and abstraction: as far as the container is concerned, it has no knowledge of whether it has been deployed onto a real robot, or a compute node communicating with a simulated environment.

4) Real world: The arena consists of a 5 m x 5 m main room, overlooked by an adjacent control room. The main room is equipped with an OptiTrack [36] motion capture system with 8x Flex 13 120 Hz cameras for analysis of experiments. There are separate high resolution video cameras.
Communications with the robots are provided with a dedicated 5 GHz WiFi access point and a private 5G base station. A Linux server is used for data collection, Kubernetes orchestration, video streaming, and general experiment management. The arena server and 5G base station are connected to the UMBRELLA network [37] with a fibre link.

The standard robot simulator Gazebo is used. The simulation environment consists of a configurable model of the real arena with various props such as carriers the robots can pick up, and obstacles; a Unified Robot Description Format (URDF) model of the robot and some of its senses; and a ROS node that provides the same topic API that is present in the real robots. The robot is modelled taking into account the trade-off between speed and fidelity. Rather than modelling the physically complex holonomic drive omniwheels, we instead modelled the motion of the robot as a disc with friction sliding over the floor, with a custom plugin using a PID loop to apply force and torque to the robot body in order to meet velocity goals. This has the advantage over directly applying velocity of avoiding unphysical effects such as infinite acceleration, and results in realistic behaviour at low computational cost. The cameras, time-of-flight proximity sensors, and IMU (Inertial Measurement Unit) are modelled using standard Gazebo ROS plugins, and the ROS node emulates hardware sensors such as the battery state and controls actuation of the simulated lifting platform, presenting the same interface as the real hardware. Simulation is not reality; there are always differences. The reality gap [38] means it is important to keep the limitations of simulation in mind when designing robot behaviours that will transfer well to real robots. For example, collisions are hard to simulate, and bad for robots, so behaviours relying on them in simulation should be avoided.

6) Online portal for remote experimentation: The physical arena and the simulation environment are designed to be usable remotely. The UMBRELLA Platform, detailed in [37], provides cloud infrastructure to facilitate this. An online portal is used for managing experiments, through which users may upload controller containers and simulator and experiment configuration files (controlling, for example, the number and starting positions of the robots), and download data generated during an experimental run. The user can schedule or queue experiments to be run in simulation or on the real robots. Because the real robots could be damaged by collisions, the controller containers used for experiments on them need to be verified in a simulation run that checks robot trajectories for dangerous approach velocities and other potential indicators of hazard. This is an open area of research [39]-[41]. Controller containers for simulation are run on cloud machines of the same processor architecture as the robots (ARM 64 bit), communicating over ROS topics on a virtual network with a server running a Gazebo simulator instance. When the controller container has been verified and is to be run on the real robots, it is passed to the testbed Linux server and queued to be run by the testbed technician, who ensures the physical arena is correctly set up and the right number of robots positioned. As well as data collected by the experiment containers, the testbed server collects ground truth data from the motion capture system and video from the arena cameras, to be streamed to the UMBRELLA cloud system and made available through the online portal.
Fig. 10: System architecture. Each robot has a stack of low-level software interfacing to the sensors and actuators, with some ROS nodes running natively and a Zenoh-DDS bridge to handle DDS traffic over the arena communications fabric. Docker containers with ROS user applications are deployed by container orchestration from the server, which also handles Optitrack, cameras, and Umbrella integration.

7) System Architecture: Figure 10 shows the testbed system architecture. Robots in the arena all run a stack of hardware interface code, housekeeping and low-level ROS nodes that provide the robot API, and communication bridges. User applications or controllers are deployed to the robots as Docker containers using Kubernetes orchestration. A dedicated PC runs an Optitrack motion capture system. A Linux server PC is responsible for running container orchestration, arena system control, camera, ROS, and Optitrack data capture, and the interface to the UMBRELLA Platform.

In order to facilitate wider use of the DOTS system, we created a Docker-based development and simulation environment, capable of running on Windows, Linux, and OSX operating systems. There is a large barrier to entry in learning to program ROS-based robots: versions of ROS are closely tied to particular releases of Ubuntu, dependencies are not always straightforward to install, custom plugins for the simulator need to be compiled, and there are few public examples of complete systems. Because Docker containers can run on multiple operating systems, we can package up the particular version of Ubuntu, ROS, and all the difficult-to-install dependencies and provide easy access to a complete ROS2 DOTS code development and simulation environment. All access is provided via web interfaces using open-source technologies developed for cloud applications. The local file system is mounted within the Docker container.

1) Code-server: Code-server (https://github.com/cdr/code-server) takes the popular open-source Microsoft editor VSCode (https://code.visualstudio.com/) and makes it available via a browser window. We install this within the Docker container. By connecting to a particular port on the localhost, a standard VSCode interface is available in the browser. This not only allows browsing and editing of files, but also provides a terminal for command line access to the environment.

2) Gazebo: The Gazebo simulator is architecturally split into gzserver, which runs the simulation, and gzclient, which presents the standard user interface on a Linux desktop. We make available GZWeb, which is another client for gzserver that presents a browser-based interface using WebGL [42]. This allows browser access and is more performant than using the standard interface via VNC.

3) VNC: VNC is a technology for allowing remote desktop access. We run a virtual framebuffer within the Docker container together with a simple window manager and a standard VNC server, and use the noVNC client (https://novnc.com/info.html) to make this available through a browser window. This gives access to a standard Linux desktop and allows the use of other graphical applications such as the ROS data visualiser rviz.

These three applications packaged in a Docker container enable rapid access to a complete DOTS development and simulation environment in a platform-agnostic way. Once a controller application has been prototyped, it can be converted into a Docker container for upload to the UMBRELLA online portal to be verified and for deployment to the real robots.
In order to demonstrate the whole system, we implemented a conceptually simple swarm intralogistics task. Imagine a cloakroom, similar to the scenario described in [43], where users can deposit and collect bags and jackets, and a swarm of robots will move these to and from a storage area. It is possible to perform this search and retrieval task in an entirely distributed, decentralised manner. Here we implement one aspect of the task, item retrieval, in this decentralised fashion. The arena, shown in Figure 17, has two regions: the search zone on the left-hand side, everywhere with x < 0, and the drop zone in the rightmost 0.6 m, everywhere with x > 1.25. Robots start randomly placed in the drop zone, and carriers in the search zone. The task is for the swarm to find and move the carriers into the drop zone. Apart from the earlier described sensors, we make available two synthetic senses: a compass that gives the absolute orientation of a robot, and a zone sense, indicating if the robot is in either of the two zones. Currently these are derived from the visual localisation described in Section III-A4c, but could use, for example, the magnetometer and Bluetooth beacons.

Although conceptually simple, it is non-trivial for a robot to detect a carrier, manoeuvre underneath it, position itself correctly at the centre, and then lift it. Many swarm foraging tasks demonstrated with real robots have used virtual objects [44], objects that can be moved just by collision and pushing [45], or have an actuator that mates with a particular structure [46]. These approaches abstract the difficulty of an arbitrary object collection task to various extents. We argue the system we demonstrate here is closer to real-world practical use because the location and lifting of the carriers require the sorts of sensory processing and navigation typical of many non-trivial robotics tasks.

We develop using the pipeline shown in Figure 2. The DOTS integrated development environment is used to write controller code, in a mixture of Python and C++, which can be quickly iterated to run on the local Gazebo simulator. Simple sub-behaviours are built up into the complete controller until multiple robots are running successfully in simulation. Once satisfied with simulation, the controllers are packaged into Docker containers for validation on the remote UMBRELLA cloud service portal. This allows both larger simulations, with resources limited only by cloud availability, and state-of-the-art validation of the simulated controller for safety. Once validated, the controller may be deployed onto the real robots in the testbed for an experiment run. During the run, multiple data sources are captured and these are then made available for download from the portal.

Fig. 13: base_link is the robot frame, with +x being forward. Each carrier has four fiducials around the sides, with frame cfiducial, and the carrier as a whole has the frame carrier, centred in the markermap on its underside. predock and dock are frames relative to cfiducial that the robot uses when navigating under a carrier.

We use a Behaviour Tree controller [47]-[51]. The top level is shown in Figure 12. There is a sequence of actions to be performed; once each action has completed successfully, the next is started.
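A minimal sketch of this top-level sequence is shown below. The Sequence node, leaf stubs, and dictionary blackboard are a hand-rolled stand-in for illustration only; they are not the controller's actual behaviour tree implementation.

```python
# Sketch of the top-level behaviour: explore -> pick up -> take to drop zone,
# ticked at the 10 Hz controller rate. Hand-rolled stand-in, not the real
# controller; leaf behaviours are stubs driven by a dict-based blackboard.
import time
from enum import Enum

class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3

class Sequence:
    """Tick children in order; fail fast, succeed once all have succeeded."""
    def __init__(self, children):
        self.children = children
        self.index = 0

    def tick(self, bb):
        while self.index < len(self.children):
            status = self.children[self.index].tick(bb)
            if status is Status.RUNNING:
                return Status.RUNNING
            if status is Status.FAILURE:
                self.index = 0          # revert to the start (explore)
                return Status.FAILURE
            self.index += 1
        self.index = 0
        return Status.SUCCESS

class Explore:                          # stub: real leaf drives the robot API
    def tick(self, bb):
        return Status.SUCCESS if bb.get('carrier_seen') else Status.RUNNING

class PickUp:
    def tick(self, bb):
        return Status.SUCCESS if bb.get('lifted') else Status.RUNNING

class TakeToDropZone:
    def tick(self, bb):
        return Status.SUCCESS if bb.get('in_drop_zone') else Status.RUNNING

root = Sequence([Explore(), PickUp(), TakeToDropZone()])
blackboard = {'carrier_seen': False, 'lifted': False, 'in_drop_zone': False}
while root.tick(blackboard) is not Status.SUCCESS:
    time.sleep(0.1)                     # 10 Hz tick rate
    # In the real controller these flags come from the vision and lift systems.
    blackboard.update(carrier_seen=True, lifted=True, in_drop_zone=True)
```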
Firstly, the robot explores the environment, searching for a cargo carrier. If one is found, the pick up behaviour is started. This involves manoeuvring under the carrier, positioning itself centrally, then raising the lifting platform. Finally, the robot performs take to drop zone, moving in the direction of the nest region and, once there, lowering the lifting platform. If there are any failures, or once a robot has succeeded in all tasks, the behaviour reverts to explore.

1) Behaviour trees: Behaviour Trees are hierarchical structures that combine leaf nodes that interact with the environment into subtrees of progressively more complex behaviours using various composition nodes such as sequence and selector; see [52] for more information. The whole tree is ticked at a regular rate, in this case 10 Hz, corresponding to the controller update rate, with nodes returning success, failure, or running. A blackboard is used to interface between the tree and the robot, holding conditioned and abstracted senses and means of actuation. The three top-level behaviours are described below in more detail.

a) Explore: The explore behaviour sends the robot on a ballistic random walk. At the start, or whenever a potential collision is detected, a new direction is chosen and the robot moves in that direction at 0.5 ms−1. The direction is randomly picked from a Gaussian distribution over heading, with parameters chosen such that the probability of moving in the −x direction is p ≈ 0.9 when outside the search zone, and p ≈ 0.6 when in the search zone, so robots will tend to move towards the search zone and then randomly within it. The robot continues in the same direction until another potential collision is detected. If, at any point, a carrier ID fiducial (on the four sides) or a carrier markermap (on the underside) is detected, the robot motion is stopped and the explore behaviour returns success, moving to the pick up behaviour; otherwise it returns running.

b) Pick up: This behaviour is started if the robot has seen an ID or markermap. In order of priority, the robot will try to: move to the centre position under the carrier if a markermap has been seen; move underneath the carrier to the dock position (see frame diagram, Figure 13) if an ID has been seen and the robot is close to the predock position; or move to the predock position if an ID has been seen. If the centre is reached successfully, the lifting platform is raised and the behaviour returns success and starts the take to drop zone behaviour. If the vision system loses track of the fiducial ID or the markermap, the behaviour returns failure; otherwise it returns running. The consequence of failure is reversion to the explore behaviour.

c) Take to drop zone: This behaviour is started if the pick up behaviour has succeeded. It is similar to the explore behaviour but biased to move towards the drop zone. Movement is slower, at 0.3 ms−1, with collision detection suited to the situation where the legs of the carrier can obscure some of the IRToF sensors. This behaviour will return running until the robot reaches the drop zone with no collision detection, at which point it will lower the lifting platform and return success, reverting to the explore behaviour.

2) Cargo carrier detection: Carriers have an ArUco fiducial marker on each side, labelled cfiducial in Figure 13. These markers have an ID unique to the carrier, and can be detected by the perimeter cameras of the robot from about 1 m away.
On the underside of the carrier is a markermap, shown in Figure 14, so the robot can locate the centre of the carrier using its upwards-facing camera. The size and density of the fiducials making up the map are chosen such that there is always at least one complete marker visible in the camera field-of-view for all physically possible positions of the robot under the carrier. When a carrier fiducial has been seen by any camera, the transform from the robot to that fiducial is made available on the blackboard. The detection system then focusses on that ID for a short time (3 s) and ignores all other IDs; this is to prevent flipping of attention when multiple IDs are visible. The fiducial transform can be used by the controller to navigate to the predock and dock locations relative to the carrier. When a carrier markermap is visible, the transform to the centre of the carrier is made available on the blackboard. This can be used to navigate the robot to the correct central location under the carrier in order to safely lift it.

3) Collision: Information from the IRToF sensors is used to build a dynamic robot-relative map of obstacles. The map is an array of cells, centred on the robot, with values in the range [0, 1]. The value of a cell continually reduces by exponential decay with rate λ, and is increased when a cell is affected by a sensor return. Each sensor return increases a cell by a Gaussian of the distance between the return location (x_i, y_i) and the cell location (x, y):

S(x, y) = Σ_{i=1}^{n} exp(−((x − x_i)² + (y − y_i)²) / (2σ_sensor²))

where S(x, y) is the sensor return function, giving the total effect all n sensor returns have at this location, with decay rate λ = 1 s and σ_sensor = 25 mm. Static objects result in corresponding cells in the map reaching the upper limit of 1.0, while transient or false returns tend towards zero. Collisions are predicted by projecting the current velocity vector forward by a lookahead time, then taking the average value of the cell contents over a circular area related to the robot size. The scalar value is a collision likelihood which is made available on the blackboard.

4) Navigation: The various behaviours need to navigate and choose a path to follow. All navigation is performed in the local frame of the robot; there is no global map. For example, when exploring, a direction is chosen as described above, then a lower-level navigation behaviour is started that creates a trajectory, then continually monitors the collision map of the locality to abort the trajectory playout if necessary. If there are no paths available that don't trigger a collision warning, for example if a robot is under a carrier, then a fallback behaviour of picking a least-worst (lowest sum of nearby collision cells) direction and moving slowly for short distances is used to safely emerge into more open space.

We ran the task five times; in each case the five robots were positioned with random orientation in the drop zone, and the carriers in the search zone. Robots were allowed to run until the task was completed and all robots had escaped the drop zone, or more than 10 minutes elapsed. The minimum time the task could take, assuming each robot goes directly to the nearest carrier, with a robot-to-carrier distance of 3 m, a search speed of 0.5 ms−1, a carry speed of 0.3 ms−1, and pickup and drop times of 5 s, is 26 s. Obviously, this assumes each robot moves directly underneath a separate carrier, there are no collisions, and they move directly to the drop zone. Table V shows the results.
In every case, the swarm was able to successfully retrieve all the carriers within the allowed 10 minutes. The average time to complete the task was 178 seconds. All the runs are shown in the supplementary video material (logistics task runs).

V. CONCLUSION

In this work, we have introduced DOTS, our new open access industrial swarm testbed. With a purpose-built arena, 20 high specification robots that move fast, have long endurance and high computational power, a platform-agnostic development environment, and a cloud portal, this system breaks new ground in enabling experimentation. The intralogistics scenario demonstrates the abilities of the robot swarm to successfully complete a non-trivial task.

Fig. 17: Tracks of the robots in one run of the logistics task, with thicker tracks when a robot is carrying a load. Loads are carried from left to right. In this example, robot 3 never picks up a load.

Although conceptually simple, the underlying processing necessary to handle the five concurrent video streams, the time-of-flight pointcloud, and other senses, and to entirely autonomously search for, find, manoeuvre under with precision, lift, carry, and deposit multiple cargo carriers, is considerable. More importantly, we demonstrate a process. The DOTS Testbed pipeline reduces barriers to entry by making it easy to get started, using the cross-platform IDE to experiment with ideas in simulation, then making available a real physical robot swarm to see these ideas translated into reality.

There are many directions this research can go in. Currently, there is necessarily some manual management of the robots: they need to be manually charged, set up for experimental runs, and watched over. We are currently designing automatic charging stations to aid this. Validation of controllers in simulation is difficult: although the simulator is relatively high-fidelity, it still suffers from differences to reality, and it is not possible to guarantee the safety of a general-purpose controller; even defining safety for a swarm is an open question. At the moment, we are not utilising the full processing power of the robotic platform; we intend to accelerate image processing, moving it onto the GPUs, both on the single board computer and on the peripheral Raspberry Pi Zeros. We look forward to exploring the possibilities for localisation and short-range communication of the BLE5 and UWB radios, and the two microphones. The possibilities for decentralised stochastic swarm algorithms to reliably perform logistics tasks are touched upon here, and are open to further work. Designing toolkits of useful, working, and validated sub-behaviours from which to more easily construct full applications is another step towards wider real-world use of swarm techniques. Building swarm robotic applications for the real world is hard, and one of the biggest obstacles is the lack of actually existing robot swarms with the types of capabilities necessary to experiment with ideas.
By making available such a swarm, together with an accessible development pipeline, we hope to reduce the barriers to entry and stimulate further research.

REFERENCES
[1] Flexible automated warehouse: a literature review and an innovative framework
[2] Automated order picking systems and the links between design and performance: a systematic literature review
[3] Decentralized control of multi-AGV systems in autonomous warehousing applications
[4] Planning and control of autonomous mobile robots for intralogistics: literature review and research agenda
[5] Swarm robotics: from sources of inspiration to domains of application
[6] Swarm intelligence and cyber-physical systems: concepts, challenges and future trends
[7] The Robotarium: a remotely accessible swarm robotics research testbed
[8] Duckietown: an open, inexpensive and flexible platform for autonomy education and research
[9] Technical report: LoadRunner, a new platform approach on collaborative logistics services
[10] Low pin-count debug interfaces for multi-device systems
[11] ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras
[12] Speeded up detection of squared fiducial markers
[13] On the minimization of glass-to-glass and glass-to-algorithm delay in video communication
[14] Are today's video communication solutions ready for the tactile internet?
[15] A generalized extended Kalman filter implementation for the Robot Operating System
[16] Material classification using raw time-of-flight measurements
[17] Low-cost SPAD sensing for non-line-of-sight tracking, material classification and depth imaging
[18] Holonomic control of a robot with an omnidirectional drive
[19] Mini Cheetah: a platform for pushing the limits of dynamic quadruped control
[20] Empirical characterization of a high-performance exterior-rotor type brushless DC motor and drive
[21] An open torque-controlled modular robot architecture for legged locomotion research
[22] Stanford Doggo: an open-source, quasi-direct-drive quadruped
[23] Cogging torque reduction in brushless motors by a nonlinear control technique
[24] Jerk-limited real-time trajectory generation with arbitrary target states
[25] Next-generation ROS: building on DDS
[26] ROS: an open-source Robot Operating System
[27] Multi-master ROS systems
[28] OMG Data-Distribution Service: architectural overview
[29] Content-based filtering discovery protocol (CFDP): scalable and efficient OMG DDS discovery protocol
[30] Bloom filter-based discovery protocol for DDS middleware
[31] RFC 9119: multicast considerations over IEEE 802 wireless media
[32] Eclipse Foundation (2021). Minimizing discovery overhead in ROS2
[33] Coordinate frames for mobile platforms
[34] tf: the transform library
[35] Docker: lightweight Linux containers for consistent development and deployment
[36] Motion capture systems
[37] UMBRELLA collaborative robotics testbed and IoT platform
[38] 20 years of reality gap: a few thoughts about simulators in evolutionary robotics
[39] Safe, remote-access swarm robotics research on the Robotarium
[40] Engineering safety in swarm robotics
[41] Complete agent-driven model-based system testing for autonomous systems
[42] WebGL 1.0 Specification, Khronos Group
[43] Distributed situational awareness in robot swarms
[44] Information exchange design patterns for robot swarm foraging and their application in robot control algorithms
[45] Onboard evolution of understandable swarm behaviors
[46] Swarmanoid: a novel concept for the study of heterogeneous robotic swarms
[47] Behavior trees for next-gen game AI
[48] Increasing modularity of UAV control systems using computer game behavior trees
[49] Behavior trees for UAV mission management
[50] Evolving behaviour trees for swarm robotics
[51] Behavior trees in robotics
[52] Towards a unified behavior trees framework for robot control