Abstract
While hospitals routinely gather patient data, such as X-ray images, sharing these data across multiple institutions to create a comprehensive and large dataset is hampered by privacy concerns. Consequently, this limitation affects the effectiveness of state-of-the-art deep neural networks for tasks like identifying lung diseases in medical images, as they require substantial annotated data. Federated Learning offers a solution by enabling collaborative training across multiple edge devices or sites, where updates (e.g., neural network weights) are aggregated without sharing patient data, thus maintaining privacy. This work introduces a federated-learning-based approach for automatically detecting lung diseases in chest X-ray images, focusing on preserving data privacy and enhancing robustness. Our approach follows the federated learning protocol: decentralized training of neural networks on data from multiple sites (hospitals) and centralized aggregation of knowledge on the server. The solution presents promising results in identifying fourteen lung diseases compared to three baselines within a simulated environment comprising chest X-ray images from five distinct sites.
1 Introduction
The significant generation of data, which is fundamental to modern systems, has increasingly become a powerful force in integrating the physical and digital worlds. This transformation not only changes the way we interact with devices around us but also reshapes entire industrial sectors, opening up new opportunities for countless innovations. The Internet of Things (IoT) market is expected to reach a value of USD 1.39 trillion by 2026, making it a major contributor to the large-scale generation of data. In Brazil, the technology has significant potential, with the National Bank for Economic and Social Development (BNDES) estimating that the country will generate around USD 200 billion in revenue from IoT implementations by 2025 [5]. This growth reflects not only technological advancement but also the increasing need for personalized solutions to address specific challenges, particularly those related to privacy.
Even simple activities generate important inputs for the use of Machine Learning (ML) and thus empower the development of intelligent tasks according to business needs. Hospitals routinely collect patient data (e.g., X-ray images), which constitutes a valuable resource for training ML models that aim at detecting diseases [14]. Yet, the volume of disease-specific data (e.g., pneumonia) collected by each hospital often falls short of supporting robust supervised ML models, particularly sophisticated methodologies like Convolutional Neural Networks (CNNs) [8, 9]. So, sharing data from multiple hospitals to establish a comprehensive repository (large dataset) offers a promising avenue to overcome this limitation, facilitating the development of more accurate ML models. Nevertheless, keeping patients’ privacy is paramount.
Recently, the American government issued a presidential executive order on safe, secure, and trustworthy Artificial Intelligence (AI) [18] which mentions the intention to protect Americans’ privacy, and explicitly provides mechanisms to strengthen research into privacy-preserving technologies prioritizing federal support to accelerate the development and use of such techniques. This fact alongside many others, such as the publication of regulations like the GDPR [1], highlights the need for new technologies capable of meeting privacy preservation requirements while using techniques widely recognized in the context of AI.

Fig. 1. Federated learning training protocol. Adapted from Bonawitz et al. [3].
Federated Learning (FL) [7, 11] is a machine learning methodology tailored to situations where there are decentralized clients and/or when data privacy is a major concern, as exemplified by sensitive medical examination data. The primary objective is to aggregate the knowledge acquired from diverse sources throughout the learning process. In practice, training is executed collaboratively across multiple edge devices or servers, which share their updates (i.e., CNN weights) for aggregation without the need to centralize the data, thereby preserving privacy and reducing communication overhead [10, 12, 15]. Figure 1 illustrates the FL training approach. The server selects participating clients from the n available clients (a client may range from a mobile device to an institution, such as a hospital or a company). After selection, the model weights and settings are sent from the server to the clients to initiate the local model training. The training time may differ from client to client, depending on the amount of data and the computing power available at each site. Following the training stage, all the clients report their results to the server by sending their model updates. After receiving all the updates, the server aggregates the knowledge from all clients into one single model, starting a new federated iteration.
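The round just described can be sketched in a few lines. The sketch below is illustrative rather than the paper's implementation: it represents each model as a flat list of weights, replaces real local training with a stand-in function, and aggregates updates with a sample-weighted average in the spirit of Federated Averaging [12]; the names `local_train` and `federated_round` are our own.

```python
def local_train(global_weights, client_data):
    # Stand-in for real local training: shift each weight by the mean of
    # the client's (toy, scalar) data so each client returns a distinct
    # update. A real client would run several epochs of SGD/Adam here.
    mean = sum(client_data) / len(client_data)
    return [w + mean for w in global_weights], len(client_data)

def federated_round(global_weights, clients):
    # Server side: broadcast the global weights, collect each client's
    # updated weights and sample count, then average the weights in
    # proportion to how much data each client trained on.
    updates = [local_train(global_weights, data) for data in clients]
    total = sum(n for _, n in updates)
    return [sum(w[i] * n for w, n in updates) / total
            for i in range(len(global_weights))]

# Three clients with differently sized toy datasets.
clients = [[1.0] * 10, [2.0] * 20, [3.0] * 30]
global_weights = [0.0, 0.5]
for _ in range(3):  # a few federated rounds
    global_weights = federated_round(global_weights, clients)
```

Note that the raw data never leaves `local_train`; only the weight lists cross the client/server boundary, which is precisely the privacy property the protocol relies on.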
This work presents a federated-learning-based solution to train a supervised neural network with data from multiple sites (hospitals) for lung disease identification in chest X-ray images in a decentralized way. Throughout successive learning iterations, our approach individually trains a local model for each site, leveraging solely its own data. At the end of each round, the learned weights from all participating sites are shared with a central server, ensuring data privacy by refraining from sharing images. The central server then aggregates these weights into a global model, subsequently disseminating it back to each site for further refinement as part of a new federated iteration. This iterative process runs across multiple rounds.
We evaluated our solution within a simulated environment containing chest X-ray images from five distinct sites. Our approach relies on a dense convolutional neural network, DenseNet [6], based on the CheXNet architecture proposed by Rajpurkar et al. [14] for lung diseases identification. Our solution presented promising results in identifying fourteen lung diseases when compared against three baselines trained on the whole dataset.
1.1 Motivation
We were motivated by two main factors in our work. The first is related to the promotion of techniques that prioritize collaborative approaches over private data strategies. While there are many open datasets available from different medical fields [16], a model trained only on such datasets is expected to be less effective than a collaborative model, since a federated dataset built through widespread adoption by multiple entities would be far larger. This could usher in a new era of improved diagnostic accuracy.
The other motivating factor is related to the importance of human life. A recent study from the Johns Hopkins Armstrong Institute has estimated that more than 795,000 Americans suffer serious harm due to misdiagnosis every year [13]. Some are left permanently disabled, while others lose their lives. We believe that, to overcome this situation, the medical field can utilize the benefits of AI to support its professionals in making diagnoses. As shown in a recent study [17], even simple techniques like logistic regression and statistical analysis, when applied to tabular data, can provide crucial information to support better medical decision-making.
1.2 Organization
The remainder of this article is organized as follows. In Sect. 2, we present the materials and methods used in the experiments conducted in this work. In Sect. 3, we detail the obtained results and discuss them in comparison to other works. Finally, in Sect. 4, we conclude the article with a summary of our main findings and future work.
2 Materials and Methods
The presented solution employs the principles of federated learning to train a decentralized supervised neural network specifically designed to automate the identification of lung diseases within chest X-ray images. Tailored for multilabel classification, wherein each patient image can be assigned zero or more labels (in our case, lung diseases), the proposed solution is application-independent, thus adaptable for other domains and image modalities, such as identifying brain lesions in Magnetic Resonance (MR) images, for instance. However, a fundamental requirement for its implementation remains the utilization of neural networks.
2.1 Data
While our solution is application-independent, evaluating it proves challenging due to the absence of publicly available large annotated datasets for various medical conditions. In this context, we opted to utilize the ChestX-Ray14 dataset [19] for assessment, comprising a substantial collection of 112,120 frontal-view chest X-ray images sourced from 30,805 distinct patients. Figure 2 presents a few examples of images for multiple diseases. Each image was annotated with as many as 14 distinct lung disease labels, acquired through automated extraction methods applied to radiology reports. These labels encompass Atelectasis, Cardiomegaly, Consolidation, Edema, Effusion, Emphysema, Fibrosis, Hernia, Infiltration, Mass, Nodule, Pleural Thickening, Pneumonia, and Pneumothorax. Images with no diseases are labeled as ‘No Finding’.
2.2 Model
Our model is a convolutional neural network with 121 layers. The network is initialized with weights pre-trained on the ImageNet dataset [4] and trained using the Adam optimizer with default parameters (\(\beta _{1}\) = 0.9 and \(\beta _{2}\) = 0.999). We train the model using mini-batches of size 16. Since the dataset has fourteen labels (see Sect. 2.1), we replaced the final layer with a 14-output fully connected layer, to which we applied an element-wise sigmoidal function.
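The role of the element-wise sigmoid in the 14-output head is worth spelling out: unlike a softmax, which would force the fourteen labels to compete for probability mass, an independent sigmoid per output lets an image carry several diseases at once (or none). A minimal sketch of this multilabel head, with an assumed decision threshold of 0.5 (the threshold is our illustrative choice, not a value stated in the paper):

```python
import math

def sigmoid(z):
    # Logistic function mapping a logit to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def multilabel_predict(logits, threshold=0.5):
    """Turn the raw logits of the final 14-output layer into
    independent per-disease probabilities and binary predictions."""
    probs = [sigmoid(z) for z in logits]
    preds = [p >= threshold for p in probs]
    return probs, preds
```

For example, logits of -2.0, 0.0, and 2.0 map to probabilities of roughly 0.12, 0.50, and 0.88, so the last two labels would be marked present at the 0.5 threshold, independently of one another.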
2.3 Federated Training
Following the federated learning protocol illustrated in Fig. 1, we conducted a simulation involving five hospitals, leveraging X-ray images from each hospital to facilitate the identification of lung diseases. Our methodology uses local hospital datasets to train individual local models, one per hospital. After local training, the resultant model weights from all hospitals are sent to a central server and aggregated through the Federated Averaging [12] method into a single global model that synthesizes the learned knowledge. This global model is then returned to all participating hospitals, perpetuating the federated training process. This process is shown in Fig. 3.
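Concretely, the Federated Averaging step computes the new global weights as the sample-weighted mean of the local weights: with \(n_k\) denoting the number of training images at hospital \(k\) and \(n = \sum_{k=1}^{K} n_k\), the server sets \(w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n} \, w_{t+1}^{k}\), where \(w_{t+1}^{k}\) are the weights returned by hospital \(k\) after its local training in round \(t\) [12]. In our setup the hospital datasets are of roughly equal size, so this aggregation is close to a plain average.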
For our experimental setup, we initially partitioned the ChestX-Ray14 dataset into two distinct subsets: a training set denoted as \(D_{train}\) (comprising \(80\%\) of the data) and a separate testing set labeled as \(D_{test}\) (accounting for \(20\%\) of the data). Subsequently, we further subdivided \(D_{train}\) into five distinct subsets, simulating a scenario involving five distinct hospitals. Each hospital dataset, referred to as \(H^i\), comprised approximately 18,000 images. Finally, we divided each hospital’s dataset into two distinct portions: a hospital training subset, designated as \(H^i_{train}\) (constituting \(75\%\) of \(H^i\)) to train the local model, and a separate hospital validation subset, denoted as \(H^i_{val}\) (comprising \(25\%\) of \(H^i\)) for local validation.
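The split arithmetic can be made explicit. The sketch below reproduces the partition sizes implied by the percentages above using integer arithmetic; actual splits may differ by a few images depending on rounding and shuffling.

```python
TOTAL = 112_120                     # ChestX-Ray14 images

n_train = TOTAL * 80 // 100         # D_train (80%)
n_test = TOTAL - n_train            # D_test (20%)

per_hospital = n_train // 5         # H^i: ~18,000 images per hospital

h_train = per_hospital * 75 // 100  # H^i_train (75% of H^i)
h_val = per_hospital - h_train      # H^i_val (25% of H^i)

print(n_train, n_test, per_hospital, h_train, h_val)
```

This yields 89,696 training and 22,424 test images overall, and roughly 17,939 images per hospital, of which about 13,454 feed local training and 4,485 local validation.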
It is worth highlighting that after \(D_{train}\) is split into five subsets, the images themselves remain localized within each hospital site. Only the weights of each neural network, trained individually at each site, are shared with the central server. The federated learning process encompassed 30 rounds, each comprising 10 local epochs, resulting in a cumulative total of 300 training iterations per site. In each round, the global model, with the aggregated weights, was evaluated using \(D_{test}\).
The orchestration of training and communication across these training sites is coordinated via the Flower framework [2] in simulation mode. Our simulation required approximately 72 h to complete the training phase on a single computer equipped with the following hardware specifications: an Intel Core i7 CPU clocked at 1.80GHz (8th generation), with 16 GB of RAM, and 4 GB of dedicated GPU memory (Nvidia GeForce).
3 Results and Discussion
We used the per-class AUROC (Area Under the Receiver Operating Characteristic Curve) to evaluate and compare our results against three baselines. These baselines are: 1) Wang et al. [19], the work that released the ChestX-Ray14 dataset and former state-of-the-art for 1 class; 2) Yao et al. [20], the former state-of-the-art for 13 classes; and 3) CheXNet [14], which, as far as we know, is the current state-of-the-art model for lung disease identification in the ChestX-Ray14 dataset. We highlight that all baselines were trained over the full training set \(D_{train}\), while our solution used roughly one-fifth of the training data per site. Although we do not expect our results to surpass the state-of-the-art performance, we do expect them to be comparable.
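For reference, the per-class AUROC admits a simple pairwise formulation: it equals the probability that a randomly drawn positive example receives a higher score than a randomly drawn negative one. The sketch below implements that definition directly; the paper presumably uses a standard library routine, and this O(P·N) version is for illustration only.

```python
def auroc(labels, scores):
    """AUROC from its probabilistic definition: the fraction of
    (positive, negative) pairs in which the positive example is
    scored higher, with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For multilabel evaluation, this is computed once per disease, treating that disease's column of the label matrix as the binary ground truth against the model's sigmoid output.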
Table 1 shows the per-class AUROC of our solution (FL) compared to the baselines. We have highlighted in bold the diseases for which our method outperformed Wang et al. [19] and underlined the diseases for which we obtained virtually the same results when compared to other methods.
Our solution demonstrated comparable performance to Wang et al. [19], yielding improved scores (in bold) for specific diseases such as Atelectasis, Infiltration, Nodule, Pneumonia, Consolidation, and Hernia, and virtually the same performance (underlined values) for Effusion, Mass, Pneumothorax, Fibrosis, and Pleural Thickening. Conversely, it presented inferior results to Yao et al. [20] and CheXNet [14] in all diseases. This discrepancy can be attributed to the fact that both baselines were trained using the complete training set \(D_{train}\), while FL utilized substantially less data: each hospital’s training set is approximately one-fifth the size of \(D_{train}\). Hence, this comparison lacks parity, and suboptimal results are to be expected. Nevertheless, FL presented promising results for certain diseases in comparison to Yao et al. and CheXNet, particularly evident in the cases of ‘Infiltration,’ ‘Consolidation,’ and ‘Fibrosis,’ as highlighted in Table 1.
It is important to highlight a few considerations. Firstly, the effectiveness of the baseline methods, notably CheXNet, is contingent upon access to a large annotated dataset (e.g., encompassing tens of thousands of labeled images). Yet, this prerequisite proves impracticable in most clinical routines, given the challenges of obtaining such a volume of annotated data for a specific disease, coupled with the labor-intensive and time-consuming nature of data annotation. Consequently, these baseline methods tend to exhibit poorer results when these conditions cannot be met. In contrast, while prioritizing data privacy, FL operates under the premise of a substantially smaller annotated dataset and strives to harness collective learning from diverse sources. This underscores both the potential inherent in our proposed solution and the substantial room for further enhancement.
4 Conclusion
This work presented a federated-learning-based solution for automatically identifying lung diseases in chest X-ray images, effectively tackling privacy challenges inherent to medical imaging and enhancing identification robustness through federated learning. This approach facilitates decentralized and collaborative model training, embracing data from diverse sources. Our solution was trained on considerably fewer data per site than the baselines, mirroring a more realistic clinical setting. Yet, it endeavors to compete with state-of-the-art methods that may not be feasible in a clinical scenario since they demand large annotated sensitive datasets due to privacy constraints. Our federated learning model exhibits significant potential for improvements (e.g., fine-tuning), allowing it to contend with state-of-the-art benchmarks following further refinements.
While hospitals routinely collect patient data, sharing these data across multiple hospitals to establish a comprehensive large dataset is impractical due to privacy constraints. Our solution followed the federated learning protocol to train decentralized dense convolutional neural networks across data from multiple sites (hospitals). Throughout successive learning rounds, each hospital’s learned model weights (knowledge) are shared with a central server and aggregated to build a global model, thus preserving data privacy and enhancing robustness.
Our solution reported promising results in identifying fourteen lung diseases on a comprehensive chest X-ray dataset. Although it presented inferior results to state-of-the-art methods, those methods’ efficacy depends on a large annotated dataset, a demand often unfeasible within practical clinical workflows. The hurdles of assembling a substantial volume of annotated data for a particular ailment, combined with the laborious and time-intensive data annotation process, underscore these limitations. In contrast, our solution improves lung disease identification by aggregating the knowledge learned from smaller datasets across multiple sites, mirroring a more realistic clinical setting.
For future work, we first intend to refine our dense CNN through fine-tuning, regularization methods, and addressing the class imbalance. Second, we aspire to assess other deep neural networks for lung disease identification and consider different data volumes per site, non-IID scenarios, and various numbers of sites. Finally, we intend to evaluate our solution in other medical imaging problems, such as detecting brain lesions in MR images.
References
Regulation (EU) 2016/679 of the European parliament and of the council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing directive 95/46/EC (general data protection regulation). Official Journal of the European Union (2016). https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32016R0679. Accessed 08 July 2023
Beutel, D.J., et al.: Flower: a friendly federated learning research framework. arXiv preprint arXiv:2007.14390 (2020)
Bonawitz, K., et al.: Towards federated learning at scale: system design. Proc. Mach. Learn. Syst. 1, 374–388 (2019)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
GR01: As 9 maiores tendências da IoT para o mundo moderno [The 9 biggest IoT trends for the modern world] (2022). https://gr1d.io/2022/09/02/inteligencia-artificial-3. Accessed 13 Dec 2023
Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Kairouz, P., et al.: Advances and open problems in federated learning. Found. Trends® Mach. Learn. 14(1–2), 1–210 (2021)
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Lim, W.Y.B., et al.: Federated learning in mobile edge networks: a comprehensive survey. IEEE Commun. Surv. Tutor. 22(3), 2031–2063 (2020)
McMahan, B., Moore, E., Ramage, D., Hampson, S., Arcas, B.A.Y.: Communication-efficient learning of deep networks from decentralized data. In: Singh, A., Zhu, J. (eds.) Proceedings of the 20th International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research, vol. 54, pp. 1273–1282. PMLR (2017). https://proceedings.mlr.press/v54/mcmahan17a.html
McMahan, H.B., Moore, E., Ramage, D., Arcas, B.A.: Federated learning of deep networks using model averaging. arXiv preprint arXiv:1602.05629 (2016)
Newman-Toker, D.E., et al.: Burden of serious harms from diagnostic error in the USA. BMJ Qual. Saf. (2023). https://doi.org/10.1136/bmjqs-2021-014130
Rajpurkar, P., et al.: Chexnet: radiologist-level pneumonia detection on chest x-rays with deep learning. arXiv preprint arXiv:1711.05225 (2017)
Ryffel, T., et al.: A generic framework for privacy preserving deep learning. arXiv preprint arXiv:1811.04017 (2018)
Sheller, M.J., et al.: Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data. Sci. Rep. 10(1), 12598 (2020)
Takao, M.M.V., et al.: Artificial intelligence in allergy and immunology: comparing risk prediction models to help screen inborn errors of immunity. Int. Arch. Allergy Immunol. 183(11), 1226–1230 (2022)
The White House: FACT SHEET: President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence (2023). https://www.whitehouse.gov/briefing-room/statements-releases/2023/10/30. Accessed 13 Dec 2023
Wang, X., Peng, Y., Lu, L., Lu, Z., Bagheri, M., Summers, R.M.: Chestx-ray8: hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2097–2106 (2017)
Yao, L., Poblenz, E., Dagunts, D., Covington, B., Bernard, D., Lyman, K.: Learning to diagnose from scratch by exploiting dependencies among labels. arXiv preprint arXiv:1710.10501 (2017)
Acknowledgments
This work was made possible through the invaluable support of SiDi, a Brazilian Institute of Science and Technology committed to innovating everyday applications. We also extend our sincere gratitude to the Federal Institute of São Paulo for their collaboration throughout the conception and development of this project. Furthermore, we would like to thank the State University of Campinas, where three of the authors are currently pursuing their PhD and master’s degrees, for its ongoing academic support.
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cunha, W.L., Castelo-Fernandez, C., Simionato, R., de Lacerda, M.S., Martins, S.B. (2025). Preserving Privacy, Enhancing Robustness: Federated Learning for Lung Disease Identification in Chest X-Ray Images. In: Paes, A., Verri, F.A.N. (eds) Intelligent Systems. BRACIS 2024. Lecture Notes in Computer Science(), vol 15414. Springer, Cham. https://doi.org/10.1007/978-3-031-79035-5_28