key: cord-286485-tt9ysg0w authors: Lucius, M.; Belvisi, M.; Galmarini, C. M. title: ROBUST COVID-19-RELATED CONDITION CLASSIFICATION NETWORK date: 2020-05-26 journal: nan DOI: 10.1101/2020.05.19.20106336 sha: doc_id: 286485 cord_uid: tt9ysg0w COVID-19 can exponentially precipitate life-threatening emergencies as witnessed during the recent spreading of a novel coronavirus infection which can rapidly evolve into lung collapse and respiratory distress (among other various severe clinical conditions). Our study evaluates the performance of a tailor-designed deep convolutional network on the tasks of early detection and localization of radiological signs associated to COVID-19 on frontal chest X-rays. We also asses the frameworks capacity in differentiating the above-mentioned signs, which are usually confused with the more usual common bacterial and viral pneumonias. Open-source chest X-ray images categorized as Normal, Non-COVID-19 and COVID-19 pneumonias were downloaded from the NIH (n=2,259), RSNA (n=600) and HM Hospitales (n=2,307). Our algorithmic framework was able to precisely detect the images with COVID19- related radiological findings (mean Accuracy: 90.5%; Sensitivity: 80.6%; Specificity: 98.0%), whilst correctly categorizing images deemed as Non-COVID-19 pneumonias (mean Accuracy: 88.4%; Sensitivity: 93.3%; Specificity: 92.0%) and normal chest X- rays (mean Accuracy 92.1%; Sensitivity: 91.8%; Specificity: 94.3%). The associated results show that our AI framework is able to classify COVID-19 accurately, making of it a potential tool to improve the diagnostic performance across primary-care centres and, to grant priority to a subset of algorithmic selected images for urgent follow-on expert review. This would sensibly accelerate diagnosis in remote locations, reduce the bottleneck on specialized centres, and/or help to alleviate the needs on situations of scarcity in the availability of molecular tests. Reverse-transcription polymerase chain reaction (rt-PCR) tests remain the goldstandard for COVID-19 diagnosis in the acute phase of infection 1, 2 . However, this molecular diagnostic tool has some limitations: (i) its false negative rate is relatively high (20-60%); (ii) obtention of results varies from 4-6 hours to, sometimes, a few days; (iii) it requires to be performed by certified laboratories with trained personnel, expensive equipment and availability of reagents; (iv) a large demand can easily overcome supply, particularly at times when its needed the most 3, 4 . The situation is even more complicated in remote areas affected by the pandemic or in those with access only to low complexity healthcare centres. These, generally, have difficulties in accessing the rt-PCR tests forcing non-specialized practitioners to diagnose the COVID-19 condition based solely on clinical and/or radiological data. Drawbacks like these, have lately driven the debate on using chest imaging as a primary diagnostic tool. Indeed, two studies have reported high sensitivity of non-contrast chest CT compared to rt-PCR 5, 6 . However, because of the associated operational costs and limited equipment availability, chest CT can hardly be relied upon for initial diagnosis in mass. As a consequence, most clinical guidelines only recommend it when an alternative diagnosis is paramount and/or when COVID-19 testing kits are scarcely available. Similar attempts were recently performed, all based on chest x-rays which are widely available and less expensive to perform than CT equivalents. However, chest x-rays taken in patients with confirmed and symptomatic COVID-19 condition can induce to confusion in cases associated to other lung infections or pathologies (including the absence of them) making it difficult for non-trained physicians to differentiate among these patterns. Even in the case of expert radiologists, it has proven challenging to be precise on diagnosis (particularly with early stage patients). . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 26, 2020. Artificial intelligence (AI) tools aim to reproduce human cognition and processes involved in the analysis of complex data 7 . Their use in assisting physicians has recently made inroads in various medical fields. Specifically, in the context of radiology, image recognition using a set of deep neural networks ("DNNs") has proven to be a valuable aid to physicians diagnosing among many different pathologies. In radiology, these complex algorithms achieve accuracy rates comparable (or, in some instances, beyond) to those related to the ones by radiologists in most of the specific fields 8 . Thus, AI-based analytical tools help securing increased precision along the entire spectrum of diagnostic radiology and, improving workflow prioritization in the case of large-scale screenings. Since the beginning of the current pandemic a few studies have been published proving the use of AI systems in diagnosing COVID-19 condition using radiological images [9] [10] [11] [12] [13] . In differentiation to the existing research on the field, ours focuses on achieving improved measures of generalisation and stability across the different type of images provided by varied sources and equipment through a series of innovative algorithmic design features (details of which are beyond the scope of this work). These features also facilitate the accurate classification when images to be used for interpretation are gathered using an ordinary mobile phone camera exposed to chest X-rays projected by a negatoscope. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 26, 2020. . https://doi.org/10.1101/2020.05. 19.20106336 doi: medRxiv preprint Our study is based on an aggregated pool of images sourced from HM Hospitales (n=2,307), the National Institute of Health ("NIH"; n=2,259) and the Radiological Society of North America ("RSNA"; n=600). In all cases, the files include anonymous frontal chest X-rays, whilst the dataset provided by HM Hospitales contains anonymized records related to the 2,307 patients admitted with a confirmed (n=2,075) or pending (n=232) of COVID-19 diagnosis performed by rt-PCR. All relevant ethical guidelines have been followed for the use of this material. The NIH dataset was Table 1 . A Dense Convolutional Network architecture ("DenseNet") has been the one accomplishing the most out of a comprehensive architectural network search initially performed. Such a design is known for delivering significant performance improvements over most of the alternative constructions, whilst requiring less memory . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 26, 2020. . https://doi.org/10.1101/2020.05.19.20106336 doi: medRxiv preprint and computational support. As described in previous publications, the DenseNet configuration is one where each layer is connected to every other layer through a feedforward linkage 16 . At every level, feature-maps from all preceding layers are used as inputs, whilst its output feeds all subsequent deeper layers. The framework is composed by 4 dense blocks of 6, 12, and 24 layers each. In order to improve the performance and to optimize training, we have run calibration routines a total of 5 times (folds), each consisting on 200 epochs. Overall, training using the curated image patches took 55 h and 156k iterations to complete, using a 4 GeForce GTX 1080 GPU configuration. Maximum accuracy (100%) was reached after 125 epochs, whereas the pretrained version took 32 epochs for convergence to begin. The network was trained, validated and tested using images adjusted to 1024x1024x1 pixels/channel. When a deep convolutional neural network overfits, it performs extremely well on a training set but poorly on data outside of the calibration spectrum. In our design, two steps were taken to address this issue: (i) a dropout layer was added (set to 0.25) . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 26, 2020. which results in 25% of the neurons to be randomly turned off during the training process, therefore reducing the likelihood of overfitting; (ii) a data augmentation process was performed where randomly selected data points were duplicated and modified to account for the variability found across different image capturing methods. To address variations related to grid location, size of the radiological finding and the image angle, all of the training images were altered using a combination of random rotation, zoom, shear, and flipping. The model was calibrated initially once just using dropout and, in a second occasion, both with data augmentation and dropout activated. Following the Training stage, a Validation process was performed in which images across the three different chest X-ray's categories were used. Finally, a Test phase was run, which results were statistically analysed. A Confusion Matrix was constructed comparing the framework's Classification Accuracy across each of the labels. Additionally, Accuracy, Error rate, Sensitivity, Specificity and Geometric Mean were . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 26, 2020. We have had access to two basics demographic descriptors, gender and age, which were included on the original datasets. In the case of Normal X-rays, the data distribution related 43.7% to women and a 56.3% to men. The Non-COVID-19 split CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 26, 2020. . https://doi.org/10.1101/2020.05.19.20106336 doi: medRxiv preprint 9 corresponded 40.6% to women and 59.4% to men, and in the case of COVID-19, 40.3% and 59.7% to women and men respectively. In terms of age distribution, the mean/SD estimated (expressed in years) was 46.7±16.4, 45.0±17.2 and 67.8±16.0 for each of the corresponding classification classes. As expected, and due to the COVID-19 suspected condition, the dataset originated on hospitalized patients shows a higher mean age when compared to the other two datasets. Table 2 . The overall Accuracy performance for the multi-classification task reached 90.4%. Our system was able to differentiate COVID-19 positive related chest X-ray images from the other two classes (common pneumonia and normal) with Accuracy, TPR and TNR of 90%, 80% and 98%, respectively. Of the 23 false negative patients, 9 were wrongly classified as Normal chest X-rays but, it is not clear whether these errors were a true misclassification or Non-COVID19 classes, Accuracy rate reached 92% and 88%, respectively. In the . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 26, 2020. Our results demonstrate that a deep learning framework trained on a diverse set of images can achieve classification Accuracy Rates in line (or beyond) the ones by specialized radiologists. The robust multi-class classification system presented has proven to be accurate across each and all of the three main classes (Normal, Non-COVID-19 and COVID-19). The use of AI applications as a diagnostic aiding tool is a growing trend in radiology providing a significative help among specialized professionals as well as general practitioners inducing the shortening of diagnostic workflow time and, an increase in diagnosis precision. These tools can also ease the exponential demand for diagnostic expertise during pandemic periods, letting radiologists to focus on the most urgent flow whilst allowing suspected patients within remote areas to get an immediate preliminary diagnosis. In addition, our system is . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 26, 2020. . https://doi.org/10.1101/2020.05.19.20106336 doi: medRxiv preprint designed to be able to process images obtained by standard mobile devices and nonprofessional cameras on chest X-rays projected on the negatoscope. Additionally, the framework could be also integrated into any existing teleradiology platform. The advent of machine learning algorithms has made automated classification of radiological signs on chest X-rays an achievable target milestone. For example, CheXNet is a 121-layer CNN trained with ChestX-ray14, a large publicly available chest X-ray dataset containing over 110,000 frontal view X-ray images from 14 different diseases 8 . Not only did CheXNet performed better than radiologists at diagnosing pneumonia, but it also overperformed when identifying other 13 diseases including cancer, pleura thickening and tuberculosis 8 . Different machine learning systems have demonstrated their potential for diagnosing pediatric pneumonia using chest X-rays as well 18 CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 26, 2020. . https://doi.org/10.1101/2020.05. 19.20106336 doi: medRxiv preprint with the additional assurances provided by the disproportionally higher number of positive cases. Also, based on the above, it is clear that an AI-based tool like ours can be of particular utility during stress and pandemic periods. This relies on the fact that COVID-19 condition exhibits particular radiological signatures and image patterns which can be observed in medical imagery and used for differential diagnosis from other pathologies 19, 20 . Additionally, chest X-rays are the most widely utilised imaging method to establish the diagnosis of patients with respiratory problems, especially in rural and isolated areas with limited access to molecular tests or advanced medical equipment (such as CT scanners). Finally, in addition to the difficulties described, the demand overflow and workload pressure on expert radiologists usually witnessed during pandemic periods could potentially induce to perceptual and cognitive biases, all of which leads to random diagnostic errors 21 . Besides, in hospitals and general healthcare centres lacking resident specialized radiologists, AI systems could prove useful in delivering an initial real-time diagnosis in any suspected patient just relying on X-ray equipment, a mobile application and access to a tele-radiology platform 22 . Our framework was bound to only three different classes, which does not reflect the clinical reality of the many more conditions to be taken into account when diagnosing respiratory diseases on a chest X-ray. As a consequence, the use of our classification system should be regarded as an assisting tool for radiologists and/or general practitioners, aiming at improving accuracy within a limited context but, not as a replacement of qualified physicians diagnosis (when and if available). On the other hand, deep learning models are powerful "black box" models which remain relatively uninterpretable compared to the statistical kind traditionally used in medical practice. Computer vision models combine pixel-based visual information in a highly intricated . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 26, 2020. 10 . We intend to address these limitations in future editions of this work. In summary, the results exposed show that AI-based tools can be used to discriminate between COVID-19 positive patients and the ones with other (or no) pulmonary infections. The system is designed to equally process images obtained by photographing X-rays using mobile devices (cell phones or tablets) or, scanning chest radiographs. Additionally, the framework could be integrated into tele-radiology systems via an automated-programming interface (API). In any case, the presented technology probes to be of extreme help in the context of accelerating the initial and preliminary detection of COVID-19-related pulmonary conditions when traditional molecular methods are temporally or permanently unavailable. Furthermore, the . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 26, 2020. . https://doi.org/10.1101/2020.05.19.20106336 doi: medRxiv preprint system can be adapted to assess the future evolution of the disease, that is to estimate probability that any given patient will aggravate from an initial diagnosed clinical condition potentially demanding a future admittance into specialized care units. Though a secondary feature, this could substantially ease the pressure on existing limited healthcare infrastructure by allowing practitioners to objectively filter through (for admittance) only patients whose condition is predicted to worsen. Ultimately, systems based on DNNs will allow general practitioners across undeserved and/or demand overflown regions, to act as a first point of diagnosis with expected accuracy rates similar, or in some instances better, to those achieved by specialized professionals and/or by means of utilising dedicated test kits. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 26, 2020. . https://doi.org/10.1101/2020.05.19.20106336 doi: medRxiv preprint Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement False negative of RT-PCR and prolonged nucleic acid conversion in COVID19: Rather than recurrence Variation in false negative rate of rt-PCR based SARS-CoV-2 tests by time since exposure Correlation of chest CT and RT-PCR testing in coronavirus disease 2019 (COVID-19) in China: a report of 1014 cases Sensitivity of chest CT for COVID-19: comparison to RT-PCR Artificial intelligence: a disruptive tool for a smarter medicine Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network Chest CT findings in coronavirus disease-19 (COVID-19): relationship to duration of infection The diagnostic evaluation of Convolutional Neural Network (CNN) for the assessment of chest X-ray of patients infected with COVID-19 explainable COVID-19 predictions based on chest X-ray images International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted COVID-Net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest radiography images ChestX-ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases Augmenting the National Institutes of Health chest radiograph dataset with expert annotations of possible Pneumonia Classification assessment methods Identifying medical diagnoses and treatable diseases by imagebased deep learning Imaging Profile of the COVID-19 Infection: Radiologic Findings and Literature Review International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted Radiological Society of North America Expert Consensus Statement on Reporting Chest CT Findings Related to COVID-19 Error in radiology COVID-MobileXpert: on-Device COVID-19 screening using snapshots of chest X-ray International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity We acknowledge the NIH Clinical Center, the RSNA and HM Hospitales for the access to their image datasets. The copyright holder for this preprint this version posted May 26, 2020. . https://doi.org/10.1101/2020.05.19.20106336 doi: medRxiv preprint