Swarm learning for decentralized healthcare

Matthias Becker

Hautarzt, 2022-01-14. DOI: 10.1007/s00105-021-04940-z

Machine learning is revolutionizing medicine by enabling novel applications as well as supporting physicians in routine tasks. These techniques rely on large datasets for development, which are hard to acquire in the decentralized world of healthcare. Privacy and data-safety requirements slow development in the medical field compared to other domains. Even if a sufficiently large dataset can be created, it is only a static snapshot and cannot reflect newly emerging diseases.

Machine learning is playing a growing role in dermatology [1, 2] and will become a major technology to assist (and not replace) physicians, improving the quality of care and reducing workload. Medicine itself is inherently based on learning from one another; the long-established mentor principle during training is a good example. Similarly, tools that support physicians with machine learning techniques need to be trained as well. Unlike humans, these tools have no general understanding of the world (as a strong artificial intelligence would) and must compensate for this by learning from a large number of training examples. The performance of a machine learning model correlates with the amount of training data, so acquiring ever-growing training datasets is crucial. However, data privacy laws and patient consent limit data collection and thereby model performance.

Training separate models at different sites, e.g., per hospital, was the first approach when artificial intelligence was established in medicine. To build more robust models, or models for diseases with only few cases, central resources such as a cloud [3] have been used: the training data are accumulated at a central point and a single model is trained. While this improves model performance, increased interest in data privacy has made such centralized data accumulation more difficult, and laws like the US Health Insurance Portability and Accountability Act (HIPAA) or the EU General Data Protection Regulation (GDPR) [4] further limit this approach. Federated machine learning techniques [5] were established to avoid this central data accumulation: training is performed at the participating sites, the so-called nodes, using only local data, and the resulting local models are merged by a central instance. In this setup, all nodes still need to trust the central player that merges and distributes the model parameters.

Swarm learning (SL) [6] aims to overcome the dependency on central components or participants. Unlike the previous approaches, there is no central player collecting either data or trained models. In SL, the nodes (the participants in the swarm) train an agreed-upon machine learning model with their local data. After a specified number of training steps, the trained model parameters are shared with the swarm, merged, and then redistributed to all nodes. Merging can be achieved by simple averaging, by more complex functions, or by code that weights the nodes, e.g., by their dataset size. The merging is done by a single node, but for each iteration a new node is elected for this task. This removes the dependency on a central player and the risk of such a node manipulating the model parameters.
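To make the merging step concrete, the following minimal Python sketch illustrates a weighted parameter average together with a simple round-robin election of the merging node. It is illustrative only; the names merge_parameters and elect_merger are hypothetical and not taken from the swarm learning library, which may use a different merge function and election scheme.

```python
import numpy as np

def merge_parameters(node_params, node_weights):
    """Weighted average of per-node model parameters.

    node_params:  one parameter set per node; each set is a list of
                  numpy arrays (e.g., one array per layer).
    node_weights: relative weight of each node, e.g., its dataset size.
    """
    total = float(sum(node_weights))
    merged = []
    for layer_idx in range(len(node_params[0])):
        layer = sum(w * p[layer_idx]
                    for p, w in zip(node_params, node_weights))
        merged.append(layer / total)
    return merged

def elect_merger(iteration, n_nodes):
    """Round-robin election: a different node merges in each iteration."""
    return iteration % n_nodes

# Toy example: two nodes with one weight matrix each;
# the second node contributes twice as much data.
params_a = [np.array([[1.0, 2.0]])]
params_b = [np.array([[3.0, 4.0]])]
merged = merge_parameters([params_a, params_b], node_weights=[1, 2])
print(merged[0])           # [[2.333..., 3.333...]]
print(elect_merger(5, 3))  # node 2 is elected to merge in iteration 5
```

In the actual SL framework, the merge step and the election of the merging node are coordinated through the blockchain layer described next.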
A distributed ledger (based on the Ethereum blockchain) enforces a smart contract between the nodes. This contract can specify the conditions for joining and finishing a swarm experiment. Furthermore, it tracks the contributions of each node for proper attribution and traceability of the learning process. The continuous updates between the nodes generate a shared model that encompasses the properties of all nodes' datasets, e.g., device properties or study biases. Unlike a model trained in isolation, no transfer learning is needed to apply the swarm model to new datasets.

To join the swarm, a node first agrees to the conditions and then begins training on its local data. Once the swarm has finished an iteration, all nodes share their model parameters and afterwards receive the merged parameters. The new node then has the same model parameters as all other nodes in the swarm and continues training from them. A swarm experiment does not require all nodes to have the same size or properties; e.g., in a multi-disease classifier, not all nodes need to have training data for all diseases.

Initially, swarm learning was tested on the classification of COVID-19 in transcriptomic profiles [6]. In these experiments, the swarm always performed better than or similarly to individual nodes. These results were demonstrated on swarms with up to six nodes and with different disease prevalences at the nodes. Furthermore, two different techniques for acquiring the transcriptomic profiles were used at different nodes, and the swarm was able to robustly merge the input from both. Next, it was shown that swarm learning also works on a completely different data space: a popular chest X-ray image dataset, openly available on Kaggle [7], was used to successfully classify different lung diseases. Such an application uses the same underlying techniques as dermatological applications, as demonstrated in [1]. Swarm learning has also been successfully applied by other groups [8] to the analysis of pathology images.

Health data are inherently distributed, as each patient can be considered a single data source. Local healthcare centers, doctors' offices, and hospitals accumulate individual silos of data in many different modalities [2], including imaging, molecular profiles, and omics together with patient records, phenotypes, and diagnoses. With traditional approaches these silos cannot easily be pooled, but to advance the fields of personalized and precision medicine, they need to be tapped. Swarm learning allows joint learning on this vast data pool without compromising patient data and has been shown to integrate different modalities and machine learning approaches. Tapping all silos provides a dataset that helps to overcome study biases and differences between acquisition methods. Furthermore, rare diseases with insufficient local training data can be detected more robustly. A large swarm could continuously monitor health data from different countries worldwide and help in the early detection of pandemics [9]. Overall, medical research and clinical applications will benefit from swarm learning, as collaboration becomes faster with reduced legal overhead because no data are shared.

So, how can swarm learning be used? There are two cases to consider: first, you have a use case and want to initiate a swarm experiment; second, you have data and want to join an existing swarm experiment.
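Both cases are walked through in the next section. As a preview of the central "swarm enabling" step described there, the sketch below shows how a callback-based integration into a Keras training script might look. This is only a minimal sketch modeled on the style of the published examples [10]: the module path swarmlearning.tf, the class name SwarmCallback, and the parameters syncFrequency and minPeers are assumptions here and should be checked against the current documentation.

```python
# Hedged sketch of a swarm-enabled Keras training script.
# The swarmlearning import, the SwarmCallback name, and its parameters
# are assumptions modeled on the published examples [10], not a
# verified API; consult the documentation for the exact interface.
import numpy as np
import tensorflow as tf
from swarmlearning.tf import SwarmCallback  # assumed module/class name

# Placeholder local data; in practice, load the node's own dataset here.
x_train = np.random.rand(256, 32).astype("float32")
y_train = np.random.randint(0, 2, size=(256,))

# The model architecture is agreed upon by all nodes in the swarm.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(32,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Assumed parameters: share parameters every 64 training batches and
# wait until at least two peers have joined before training starts.
swarm_callback = SwarmCallback(syncFrequency=64, minPeers=2)

# Training proceeds as usual; the callback handles parameter sharing,
# merging, and redistribution at each synchronization interval.
model.fit(x_train, y_train, epochs=5, batch_size=32,
          callbacks=[swarm_callback])
```

Joining an existing experiment looks the same from a node's perspective: the agreed, swarm-enabled model code is distributed, and only the data-loading part needs to be adapted to the local data, as described below.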
To create a swarm experiment, one can follow the same steps as for a local machine learning experiment. For a specific use case, a fitting machine learning approach needs to be selected, and sufficient initial training data are needed. The training data must be checked for their general properties and potential biases. If the data can be expected to differ at other nodes (e.g., different data acquisition protocols or devices), such data should also be included in training to estimate how well the model generalizes. The goal of this local training is not a perfect model, as only local data are used; it is rather to test the feasibility of the selected model configuration. Next, the model needs to be swarm-enabled. This is achieved by adding the swarm library and the necessary callbacks, which can be found in the documentation as well as in the examples [10]. Afterwards, the swarm is deployed, and as soon as the minimum number of peers has joined, training begins.

To join a swarm experiment, local data need to be prepared and then the swarm can be joined. The initiator of the swarm experiment defines the data requirements, such as minimum quality thresholds or resizing image data to a specific resolution. Additionally, the swarm-enabled model code is distributed and only needs to be adjusted to load the correct local data. Then the swarm can be joined by training the model. All participants in the swarm receive the merged model parameters after each iteration, so everyone uses the same initial parameters for the next iteration. The smart contracts in the distributed ledger control the start and stop conditions and log the contributions of each node, which makes the training transparent and accountable.

Swarm learning is a novel collaborative machine learning approach to leverage health data acquired in decentralized environments. By using the power of the swarm, smaller nodes with less training data can also contribute and obtain a robust model, while the privacy-preserving properties of SL protect patient data. Starting from an existing local machine learning model, new swarm experiments can be set up quickly and with minimal code modifications, and joining an experiment only requires preparing the data.

References

1. A deep learning system for differential diagnosis of skin diseases
2. Artificial intelligence in dermatology - where we are and the way to the future: a review
3. Genomic cloud computing: legal and ethical points to consider
4. What does the GDPR mean for the medical community
5. Secure, privacy-preserving and federated machine learning in medical imaging
6. Swarm learning for decentralized and confidential clinical machine learning
7. ChestX-Ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases
8. Swarm learning for decentralized artificial intelligence in cancer histopathology
9. Artificial intelligence cooperation to support the global response to COVID-19
10. A simplified library for decentralized privacy-preserving machine learning

For this article, no studies with human participants or animals were performed by any of the authors. All studies performed were in accordance with the ethical standards indicated in each case.