key: cord-1016723-0aocxz3s
authors: Gueguim Kana, E. B.; Zebaze Kana, M. G.; Donfack Kana, A. F.; Azanfack Kenfack, R. H.
title: A web-based Diagnostic Tool for COVID-19 Using Machine Learning on Chest Radiographs (CXR)
date: 2020-04-24
journal: nan
DOI: 10.1101/2020.04.21.20063263
sha: fb4b6265f1f52b5a0deff36ea54f16eb17c7ee4d
doc_id: 1016723
cord_uid: 0aocxz3s

This paper reports the development and web deployment of an inference model for Coronavirus COVID-19 using machine vision on chest radiographs (CXR). The transfer learning from the Residual Network (RESNET-50) was leveraged for model development on CXR images from healthy individuals, bacterial and viral pneumonia, and COVID-19 positives patients. The performance metrics showed an accuracy of 99%, a recall valued of 99.8%, a precision of 99% and an F1 score of 99.8% for COVID-19 inference. The model was further successfully validated on CXR images from an independent repository. The implemented model was deployed with a web graphical user interface for inference (https://medics-inference.onrender.com ) for the medical research community; an associated cron job is scheduled to continue the learning process when novel and validated information becomes available

Radiology observations of COVID-19 infected individuals show bilateral multifocal consolidations (fluid or other products of inflammation filling pulmonary air space) that may progress to involve entire lungs. There is also a presence of small pleural effusions which is an abnormal fluid developed in the spaces around the lungs (Kampalath, 2020) . Thus there is presence of glass patterned areas, which affect both lungs even in the initial stages of the infection, in particular the lower lobes, and especially the posterior segments, with a fundamentally peripheral and subpleural distribution ( Figure 1 Currently, to detect and diagnose an infected individual upon clinical suspicion, or monitor the disease, a real-time reverse transcription polymerase chain reaction (rRT-PCR) test is done (CDC b, 2020) followed by chest radiographs (CXR) and/or computed tomography (CT) scans. rRT-PCR test results may take several hours or days and there is a pronounced shortage of RT-PCR tests kit, unlike the chest CXR which are routinely available with an instant outcome. Italian and British hospitals are beginning to employ CXR as a first-line triage tool due to long reverse transcription polymerase chain reaction (RT-PCR) turnaround times (Wong et al., 2020) . There is a growing interest in the use of CXR and CT scans for the screening, diagnosis and management of patients with suspected or known COVID-19

infection. CT scans are more sensitive than RT-PCR, thus a potential advantage of using them (Milagros, 2020) . However, the interpretation of CXR and CT-scan is laborious, time-All rights reserved. No reuse allowed without permission.

(which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020 . . https://doi.org/10.1101 consuming and requires radiologist expertise. An immediate access to such expertise may not always be available in all clinical settings. These challenges could be addressed with computer vision.

There is an increasing interest in the use of Artificial Intelligence (AI) technologies to support disease diagnosis and management in areas of ophthalmology, pathology, cancer detection, radiology or prediction and personalized medicine. The use of Convolutional Deep Neural Network (CDNN) allows for impressive discovery of patterns in medical image datasets, achieving image classification sometimes difficult for human experts ( Zhou et al., 2019) . The CDNN uses multiple processing layers to learn and represent data with multiple levels of abstraction by applying image analysis filters, or convolutions. The abstracted representation of images within each layer is constructed by convolving multiple filters across the image, producing a feature map that is used as input to the next layer. Thus with this architecture, the image pixels are used as input, then processed to generate the desired This study aimed at developing and deploying an inference model for rapid detection of COVID-19 using transfer learning and Machine vision on chest radiographs (CXR).

A total of 2507 bacterial or viral pneumonia infected, 2487 healthy individuals and 161 COVID-19 positives CXR images were sourced from two repositories and used for model All rights reserved. No reuse allowed without permission.

(which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020 . . https://doi.org/10.1101 development (Kermany et al., 2018; Cohen, 2020) .

Additional 2585 background class images labelled as "Not Certain" were included.

To address the challenge of class imbalance for COVID-19 positive (1:13), data augmentation through contrast transformation (scale: 0.5-1.9) with probability of 0.5 was invoked for this class to achieve a total of 2093.

Further data augmentation such as random rotation, horizontal flipping, horizontal and vertical shifting and magnification were not considered for this dataset. Thus a total of 4 classes of healthy individual, bacterial and Viral pneumonia, COVID-19 infected and "Not certain" with 7254 training and 2418 validations image were used. These images were reviewed for their quality, then sized to 350 pixels and normalized according to imagenet statistics.

The transfer learning of Resnet50 architecture was leveraged for model development.

Resnet50 has shown a high performance on benchmark dataset such as Imagenet dataset, a huge database of labelled images containing more than 14 million images which have been hand-annotated and distributed into more than 20,000 classes suitable for training Convolutional Neural Networks. Transfer learning has proven to be a highly effective approach which addresses the limitation in data sizes and lowers the computing cost (Yosinski et al., 2014) . It leverages previous knowledge acquired while solving one problem and applying it to a different but related problem . Thus, through adopting Restnet architecture, the optimized weights gained from previous, and more general trainings were fixed in the lower layers, and only the weights of the upper layers were retrained on chest radiographs ( Figure 2 ).

All rights reserved. No reuse allowed without permission.

(which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . Figure 2 : Illustrative diagram of transfer learning using pre-trained model (Nayak, 2019) .

The model was developed using python language and trained on the subset of training data with differential learning rate (10 -4 -10 -2 ) for 15 epochs on Tesla K80 GPU from google cloud resources. Each epoch corresponded to one complete presentation of the training set to the model.

The performance of the model was assessed using the accuracy, precision, recall and Fscore. The precision metric is particularly useful when the false positives are of great concern.

The recall is important when the impact of false negatives on decision making is high. These were computed using Equations 1 and 2 respectively (Olson, 2008) . In some instances, the observed accuracy could be largely contributed by the vast number of true negatives, which may be acceptable for some classes of business decision, whereas false negative and false positive could pose irreversible damage in medical decision, thus F-score was computed using Equation 3.

All rights reserved. No reuse allowed without permission.

(which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. 

The developed model was further validated on a sample data from an additional repository of CXR images (Chowdhury et al., 2020) . 

The availability of a large labelled dataset of medical images required for the machine vision model development remains a critical challenge due to privacy concerns in data sharing, and the laborious task of data labelling. Transfer learning enables model development using fewer datasets. In this study, Resnet was trained on fewer instances (9672 images) for 30 epochs to obtain an accuracy of 99%. Training was stopped to avoid overfitting as there were no further indications of learning.

The confusion matrix is depicted in Figure 4 based on unseen images for a bacterial or viral pneumonia, out of 2418 instances of images, 2383 were accurately predicted (true positive) and only 35 were wrongly predicted (false positives and false negatives). For COVID-19 class, out of 540 validation instances, 539 were accurately predicted (true positive) and 1 was wrongly predicted as viral or bacterial pneumonia. Zu et al., (2019) reported that in patients with severe disease, their X-ray readings may resemble pneumonia or acute respiratory distress syndrome (ARDS). Notwithstanding, in the case of COVID -19, a single case of a false-All rights reserved. No reuse allowed without permission.

(which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . negative outcome for COVID-19 infected individual is a serious concern as it could compromise global efforts to contain the pandemic; similarly, a false positive can cause undue psychological distress, possible contamination if quarantined with infected individuals or unnecessary investigation which places extra burdens on the healthcare system. However, with the current COVID-19 diagnostic procedure, radiography results are used to support the clinical symptomatic examination and could be further confirmed through PCR results. The associated precision and recall values for each class are depicted in table 1.

A false-positive result occurs when an individual is inaccurately assigned to a class, such as a healthy individual categorized as COVID-19 patient and a false negative occurs when an individual who accurately belongs to a given class is excluded from such group. All rights reserved. No reuse allowed without permission.

(which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. 

When the developed model was assessed on CXR data from an additional repository using a subset of 600 CXR images of COVID-19 positives, viral or bacterial pneumonia and normal persons (200 images per class), accuracies of 97%, 85% and 98% respectively were observed.

A comparison of the developed model with other reported machine vision models for medical image inference is shown in Table 2 and the corresponding chart for the performance of classifiers is displayed in Figure 4 . A noticeable trend in the use of machine vision for medical diagnosis has been observed in the last few years as evidenced in Table   2 . The purpose of this comparison was to highlight the similarities in performance metrics and effectiveness of the developed COVID-19 model with other reported machine vision models for medical image inference.

All rights reserved. No reuse allowed without permission.

(which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . https://doi.org/10. 1101 highlight the potential of the developed COVID-19 model to accurately discern the disease pattern on novel CXR images.

The model's ability to recognise pixels of glass patterned areas on CXR as distinguishing features of COVID-19 and to differentiate these from other viral and pneumonia infected lungs or healthy images resulted from the knowledge gained from the repeated exposures of the neural network to positive training instances, thus providing prediction similar to human experts.

With the rapidly expanding COVID-19 pandemic, the demand for chest radiographs has grown exponentially and proportionally with the number of patients visiting the emergency departments (Milagros, 2020) thus generating a large number of CRX images. Immediate access to expert interpretation may not available or affordable in all clinical settings, thus computer-assisted decision support may be useful to clinicians and radiologists, and for prioritizing cases in a large worklist.

Furthermore, considering that if a massive screening of the population is performed as recommended by the WHO in attempts to curb the infection curve, assuming only 5% the global population (7.8 billion inhabitants) is eventually screened (Worldometer,2020 ) that would generate 390 million images which need to be reviewed. This will place an unprecedented burden on human experts. Thus automated image analysis tool based on machine learning algorithm, with friendly GUI (Figure 5 ) will be a key enabler to meet this goal All rights reserved. No reuse allowed without permission.

(which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. .

Figure 5: Web graphical User Interface (GUI) All rights reserved. No reuse allowed without permission.

(which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020 . . https://doi.org/10.1101 

Deep Learning for Computer Vision: A Brief ReviewComputational Intelligence and Neuroscience

Artificial intelligence applied on chest Xray can aid in the diagnosis of COVID-19 infection: a first experience from

Training and validating a deep convolutional neural network for computer-aided detection and classification of abnormalities on frontal chest radiographs

Centers for Disease Prevention and Control (CDC a ) (2020), How COVID-19 Spreads

Centers for Disease Prevention and Control (CDC b ,) 2020 CDC Diagnostic Test for COVID-19

Development and Evaluation of an AI System for COVID-19 Diagnosis

COVID Chest X-ray dataset

Clinical features of patients infected with 2019novel coronavirus in

Classification using Deep Learning Neural Networks for Brain Tumors

Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images for Classification

Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks

Can AI help in screening Viral and COVID-19 pneumonia

Greenspan GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification

Imaging the coronavirus disease COVID-19

Advanced Data Mining Techniques

The M, E, and N structural proteins of the severe acute respiratory syndrome coronavirus are required for efficient assembly, trafficking, and release of virus-like particles

Levy-Bruhl, D. and Team, I. First cases of coronavirus disease 2019 (COVID-19) in France: surveillance, investigations and control measures

Image Classification using Transfer Learning in PyTorch

Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: implication for development of RBD protein as a viral attachment inhibitor and vaccine

Spring Research Presentation: A Theoretical Foundation for Inductive Transfer

Frequency and Distribution of Chest Radiographic Findings in COVID-19 Positive Patients.Radiology, Radiology

WHO Director-General's opening remarks at the media briefing on COVID-19-11

Automatic thyroid nodule recognition and diagnosis in ultrasound imaging with the YOLOv2 neural network

Automatic Diagnosis for Thyroid Nodules in Ultrasound Images by Deep Neural Networks

Automatic thyroid nodule recognition and diagnosis in ultrasound imaging with the YOLOv2 neural network

WHO Director-General's opening remarks at the media briefing on COVID-19 -11

Generalizing from a Few Examples: A Survey on Few-Shot Learning

How transferable are features in deep neural networks?

A review: Deep learning for medical image segmentation using multi-modality fusion

Coronavirus Disease 2019 (COVID-19): A Perspective from China

No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted