key: cord-189307-qb0s06tl
authors: Wang, Linda; Wong, Alexander
title: COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest X-Ray Images
date: 2020-03-22
journal: nan
DOI: nan
sha:
doc_id: 189307
cord_uid: qb0s06tl

The COVID-19 pandemic continues to have a devastating effect on the health and well-being of the global population. A critical step in the fight against COVID-19 is effective screening of infected patients, with one of the key screening approaches being radiology examination using chest radiography. Motivated by this and inspired by the open source efforts of the research community, in this study we introduce COVID-Net, a deep convolutional neural network design tailored for the detection of COVID-19 cases from chest X-ray (CXR) images that is open source and available to the general public. To the best of the authors' knowledge, COVID-Net is one of the first open source network designs for COVID-19 detection from CXR images at the time of initial release. We also introduce COVIDx, an open access benchmark dataset that we generated comprising 13,975 CXR images across 13,870 patient cases, with the largest number of publicly available COVID-19 positive cases to the best of the authors' knowledge. Furthermore, we investigate how COVID-Net makes predictions using an explainability method in an attempt to not only gain deeper insights into critical factors associated with COVID cases, which can aid clinicians in improved screening, but also audit COVID-Net in a responsible and transparent manner to validate that it is making decisions based on relevant information from the CXR images.
By no means a production-ready solution, the hope is that the open access COVID-Net, along with the description on constructing the open source COVIDx dataset, will be leveraged and built upon by both researchers and citizen data scientists alike to accelerate the development of highly accurate yet practical deep learning solutions for detecting COVID-19 cases and accelerate treatment of those who need it the most.

The COVID-19 pandemic, caused by the infection of individuals by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), continues to have a devastating effect on the health and well-being of the global population. A critical step in the fight against COVID-19 is effective screening of infected patients, such that those infected can receive immediate treatment and care, as well as be isolated to mitigate the spread of the virus. The main screening method used for detecting COVID-19 cases is reverse transcriptase-polymerase chain reaction (RT-PCR) [14] testing, which can detect SARS-CoV-2 RNA from respiratory specimens (collected through a variety of means such as nasopharyngeal or oropharyngeal swabs). While RT-PCR testing is the gold standard as it is highly specific, it is a very time-consuming, laborious, and complicated manual process, and it is in short supply. An alternative screening method that has also been utilized for COVID-19 screening is radiography examination, where chest radiography imaging (e.g., chest X-ray (CXR) or computed tomography (CT) imaging) is conducted and analyzed by radiologists to look for visual indicators associated with SARS-CoV-2 viral infection. It was found in early studies that patients present abnormalities in chest radiography images that are characteristic of those infected with COVID-19 [11, 7], with some suggesting that radiography examination could be used as a primary tool for COVID-19 screening in epidemic areas [3].
In particular, there are several advantages to leveraging CXR imaging for COVID-19 screening amid the global COVID-19 pandemic:
• Rapid triaging: CXR enables rapid triaging of patients suspected of COVID-19 and can be done in parallel with viral testing (which takes time) to help relieve the high volumes of patients, especially in the most affected areas where capacity has run out (e.g., New York, Spain, and Italy), or even as a standalone tool when viral testing is not an option (low supplies). Furthermore, CXR can be quite effective for triaging in geographic areas where patients are instructed to stay home until the onset of advanced symptoms (e.g., New York City), since abnormalities are often seen at time of presentation when patients suspected of COVID-19 arrive at clinical sites [12].
• Availability and accessibility: CXR is readily available and accessible in many clinical sites and imaging centers, as it is considered standard equipment in most healthcare systems.
• Portability: The existence of portable CXR systems means that imaging can be performed within an isolation room, thus significantly reducing the risk of COVID-19 transmission during transport to fixed systems such as CT scanners, as well as within the rooms housing the fixed imaging systems [12].

As such, radiography examination can be conducted faster and has greater availability given the prevalence of chest radiology imaging systems in modern healthcare systems, making it a good complement to PCR testing (in some cases, even exhibiting higher sensitivity [5]). However, one of the biggest bottlenecks faced is the need for expert radiologists to interpret the radiography images, since the visual indicators can be subtle. As such, computer-aided diagnostic systems that can aid radiologists to more rapidly and accurately interpret radiography images to detect COVID-19 cases are highly desired.
Motivated by the need for faster interpretation of radiography images, a number of artificial intelligence (AI) systems based on deep learning [8] have been proposed, and results have been shown to be quite promising in terms of accuracy in detecting patients infected with COVID-19 via radiography imaging [6, 15, 10]. However, to the best of the authors' knowledge, these developed AI systems have been closed source and unavailable to the research community to build upon for deeper understanding and extension of these systems. Furthermore, such systems are unavailable for public access and use. As a result, there have been recent efforts to push for open access and open source AI solutions for radiography-driven COVID-19 case detection [1, 4], with an exemplary effort being the open source COVID-19 Image Data Collection, an effort by Cohen et al. [1] to build a dataset consisting of COVID-19 cases (as well as SARS and MERS cases) with annotated CXR and CT images, so that the research community and citizen data scientists can leverage the dataset to explore and build AI systems for COVID-19 detection.

Motivated by the urgent need to develop solutions to aid in the fight against the COVID-19 pandemic and inspired by the open source and open access efforts by the research community, this study introduces COVID-Net, a deep convolutional neural network design tailored for the detection of COVID-19 cases from CXR images that is open source and available to the general public. We also describe the dataset leveraged to train COVID-Net, which we will refer to as COVIDx and which comprises 13,800 CXR images across 13,725 patient cases, created as a combination and modification of three open access data repositories containing chest radiography images (i.e., [1], [4], and [17]), one of which we introduced [4]. Furthermore, we investigate how COVID-Net makes predictions using an explainability method in an attempt to gain deeper insights into critical factors associated with COVID cases, which can aid clinicians in improved screening.

Figure 2. COVID-Net Architecture. High architectural diversity and selective long-range connectivity can be observed as it is tailored for COVID-19 case detection from CXR images. The heavy use of a projection-expansion-projection design pattern in the COVID-Net architecture can also be observed, which strikes a strong balance between computational efficiency and representational capacity.

The paper is organized as follows. First, Section 2 discusses the strategy leveraged to create the proposed COVID-Net, the architecture design of COVID-Net, the strategy used to create the COVIDx dataset, and the strategy leveraged to audit COVID-Net via explainability. Section 3 presents and discusses the results of experiments conducted to evaluate the efficacy of the proposed COVID-Net in both a quantitative and qualitative manner. Finally, conclusions are drawn and future directions discussed in Section 4.

Here, we will discuss the architecture design methodology behind the proposed COVID-Net, the resulting network architecture, the process of creating the COVIDx dataset, as well as the implementation details in creating COVID-Net.

In this study, a human-machine collaborative design strategy is leveraged to create COVID-Net, where human-driven principled network design prototyping is combined with machine-driven design exploration to produce a network architecture tailored for the detection of COVID-19 cases from CXR images. Each of the two design stages is described below.

The first stage of the human-machine collaborative design strategy employed to create the proposed COVID-Net is a principled network design prototyping stage, where an initial network design prototype is constructed based on human-driven design principles and best practices.
More specifically, in this study we leveraged residual architecture design principles [16], as they have been shown time and again to enable reliable neural network architectures that are easier to train to high performance and to enable deeper architectures to be built successfully. In this study, we construct the initial network design prototype to make one of the following three predictions: a) no infection (normal), b) non-COVID19 infection (e.g., viral, bacterial, etc.), and c) COVID-19 viral infection (see Fig. 1 for example CXR images of non-COVID19 and COVID-19 infections). The rationale for choosing these three possible predictions is that it can aid clinicians to better decide not only who should be prioritized for PCR testing for COVID-19 case confirmation, but also which treatment strategy to employ depending on the cause of infection, since COVID-19 and non-COVID19 infections require different treatment plans.

The second stage of the human-machine collaborative design strategy employed to create the proposed COVID-Net is a machine-driven design exploration stage. More specifically, at this stage, the initial network design prototype and data, along with human-specified design requirements, act as a guide to a design exploration strategy to learn and identify the optimal macroarchitecture and microarchitecture designs with which to construct the final tailor-made deep neural network architecture. Such a machine-driven design exploration stage enables much greater granularity and much greater flexibility than is possible through manual human-driven architecture design, while still ensuring that the resulting deep neural network architecture satisfies domain-specific operational requirements. This is especially important for the design of COVID-Net, where high sensitivity to COVID-19 cases is needed to limit the number of missed COVID-19 cases as much as possible.
In this study, we leverage generative synthesis [13] as the machine-driven design exploration strategy, which is based on an intricate interplay between a generator-inquisitor pair that works in tandem to garner insights and learn to generate deep neural network architectures that best satisfy human-specified design requirements. More specifically, the following human-specified design requirements were employed in this study to enable the generative synthesis process to learn and identify the optimal macroarchitecture and microarchitecture designs for the final COVID-Net network architecture: (i) COVID-19 sensitivity ≥ 80%, and (ii) COVID-19 positive predictive value (PPV) ≥ 80%.

The proposed COVID-Net network architecture is shown in Fig. 2, and is available publicly for open access at https://github.com/lindawangg/COVID-Net. It can be observed that the COVID-Net network architecture makes heavy use of a lightweight residual projection-expansion-projection-extension (PEPX) design pattern, which consists of:
• First-stage Projection: 1×1 convolutions for projecting input features to a lower dimension,
• Expansion: 1×1 convolutions for expanding features to a higher dimension that is different from that of the input features,
• Depth-wise Representation: efficient 3×3 depth-wise convolutions for learning spatial characteristics to minimize computational complexity while preserving representational capacity,
• Second-stage Projection: 1×1 convolutions for projecting features back to a lower dimension, and
• Extension: 1×1 convolutions that finally extend channel dimensionality to a higher dimension to produce the final features.
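To give a rough sense of why the PEPX pattern is lightweight, the following sketch counts the convolution weights of the five stages listed above and compares them with a single standard 3×3 convolution. The channel ratios (projecting to half of the input channels and expanding to three quarters of them) are assumptions chosen for illustration; the text does not state the actual widths, which were determined by the machine-driven design exploration. Biases, normalization, and the residual connections are ignored, and the function names are hypothetical.

```python
def conv_weights(c_in: int, c_out: int, k: int = 1) -> int:
    """Weights in a standard k x k convolution (biases ignored)."""
    return c_in * c_out * k * k


def depthwise_weights(channels: int, k: int = 3) -> int:
    """Weights in a k x k depth-wise convolution: one filter per channel."""
    return channels * k * k


def pepx_weights(c_in: int, c_out: int) -> dict:
    """Per-stage weight counts for one PEPX block.

    The channel ratios below are illustrative assumptions, not values
    taken from the paper.
    """
    c_proj = c_in // 2        # assumed width after each projection
    c_exp = (3 * c_in) // 4   # assumed width after expansion (differs from c_in)
    return {
        "first_projection": conv_weights(c_in, c_proj),     # 1x1
        "expansion": conv_weights(c_proj, c_exp),           # 1x1
        "depthwise_3x3": depthwise_weights(c_exp),          # 3x3 depth-wise
        "second_projection": conv_weights(c_exp, c_proj),   # 1x1
        "extension": conv_weights(c_proj, c_out),           # 1x1
    }


stages = pepx_weights(c_in=64, c_out=128)
pepx_total = sum(stages.values())
standard_total = conv_weights(64, 128, k=3)

for name, count in stages.items():
    print(f"{name:>18}: {count}")
print(f"PEPX total: {pepx_total}, standard 3x3 conv: {standard_total}")
```

Under these assumed ratios, a block mapping 64 input channels to 128 output channels needs 9,648 weights, against 73,728 for a single standard 3×3 convolution with the same input and output widths.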
Furthermore, it can be observed that there is considerable architectural diversity and selective long-range connectivity in the COVID-Net architecture, which reflects the fact that the machine-driven design exploration has tailored the network architecture at a very fine level of granularity for COVID-19 case detection from CXR images to achieve strong representational capacity for a specific task.

The dataset used to train and evaluate the proposed COVID-Net, which we will refer to as COVIDx, comprises a total of 13,800 CXR images across 13,725 patient cases. To generate the COVIDx dataset, we combined and modified three different publicly available datasets: 1) COVID-19 Image Data Collection [1], 2) Figure 1 COVID-19 Chest X-ray Dataset Initiative [4], which we established in collaboration with Figure 1 (Toronto, Canada), and 3) RSNA Pneumonia Detection Challenge dataset [17], which used publicly available CXR data from [18]. The choice of these three datasets from which to create COVIDx is guided by the fact that all three are open source and fully accessible to the research community and the general public, and as these datasets grow we will continue to grow COVIDx accordingly.

The distribution of images and patient cases amongst the different infection types is shown in Figs. 3 and 4, respectively. The most noticeable trend is the limited number of COVID-19 infection cases and associated CXR images, which reflects the scarcity of COVID-19 case data available in the public domain and also highlights the need to improve the dataset as more COVID-19 case data becomes available. More specifically, the COVIDx dataset contains 183 CXR images from 121 COVID-19 patient cases. For CXR images with no pneumonia and non-COVID19 pneumonia, there are significantly more patient cases and corresponding CXR images.
More specifically, there are a total of 8,066 patient cases who have no pneumonia (i.e., normal) and 5,538 patient cases who have non-COVID19 pneumonia. Dataset generation scripts for constructing the COVIDx dataset are available publicly for open access at https://github.com/lindawangg/COVID-Net.

Due to the mission-critical nature of clinical applications such as COVID-19 detection that can affect the health and well-being of patients, it is important to design deep neural network architectures such as COVID-Net with responsibility and transparency in mind. Therefore, in this study, we perform an explainability-driven audit on COVID-Net to validate that it is making detection decisions based on relevant information rather than improper information (e.g., erroneous visual indicators outside of the body, embedded markup symbols, imaging artifacts, etc.). More specifically, we audit COVID-Net via a qualitative analysis to study the critical factors leveraged by COVID-Net in making detection decisions. Here, we leveraged GSInquire [9], an explainability method that is a critical aspect of the generative synthesis strategy [13] leveraged in the machine-driven exploration strategy used to create the proposed COVID-Net network architecture. A brief summary of GSInquire is provided as follows.

GSInquire revolves around the notion of an inquisitor $I$ within a generator-inquisitor pair $\{G, I\}$, with $G$ denoting a generator, that work in tandem to obtain improved insights about deep neural networks as well as learn to generate networks. The insights gained by $I$ can not only be used to improve $G$ to generate better networks, but also be subsequently transformed into an interpretation of decisions made by a network. More specifically, a deep neural network is defined as a graph $N = \{V, E\}$, comprising a set $V$ of vertices $v \in V$ and a set $E$ of edges $e \in E$ that form the network.
A generator function is defined as $G(s; \theta_G)$, parameterized by $\theta_G$, that, given a seed $s \in S$, generates a deep neural network $N_s = \{V_s, E_s\}$ (i.e., $N_s = G(s)$), where $S$ is the set of possible seeds. Finally, an inquisitor function is defined as $I(G; \theta_I)$, parameterized by $\theta_I$, that, given a generator $G$, produces a set of parameter changes $\Delta\theta_G$ (i.e., $\Delta\theta_G = I(G)$). In the scenario where the underlying goal is to obtain an interpretation $z$ of a decision made by a reference network $N_{ref}$ (in this case, COVID-Net) for an input signal $x$ (in this case, a CXR image), both $\theta_G$ and $\theta_I$ are initialized based on $\{V_{ref}, E_{ref}\}$, a universal performance function $U$ (e.g., [19]), and an indicator function $1_r(\cdot)$. Given the generated $N_s = G(s)$, the inquisitor $I$ probes $\{\hat{V}_s, \hat{E}_s\}$, where $\hat{V}_s \subseteq V_s$ and $\hat{E}_s \subseteq E_s$, with the targeted stimulus signal as $x$, and the corresponding set $Y_{G(s)}$ of reactionary response signals $y \in Y_{G(s)}$ is observed. The parameters $\theta_I$ are updated based on $Y_{G(s)}$, $U(G(s))$, and $1_r(G(s))$, leading to the inquisitor $I$ learning from the insights that are derived from $Y_{G(s)}$. Following the update of $\theta_I$, a set of parameter changes $\Delta\theta_G = I(G)$ is generated, which can not only be leveraged to update $\theta_G$ to improve $G$, but can also be transformed and projected into the same subspace as $x$ via a transformation $T(\Delta\theta_{G(s)})$ to produce an interpretation $z(x; N_{ref})$. In this study, the produced interpretation indicates the critical factors leveraged by COVID-Net in making a detection decision based on a CXR image, and can be visualized spatially relative to the CXR image for greater insights into whether COVID-Net is making the right decisions for the right reasons and to validate its performance.

The proposed COVID-Net was pretrained on the ImageNet [2] dataset and then trained on the COVIDx dataset using the Adam optimizer with a learning rate policy where the learning rate decreases when learning stagnates for a period of time (i.e., 'patience').
The following hyperparameters were used for training: learning rate=2e-5, number of epochs=22, batch size=8, factor=0.7, patience=5. Furthermore, data augmentation was leveraged with the following augmentation types: translation, rotation, horizontal flip, and intensity shift. Finally, we introduce a batch re-balancing strategy to promote better distribution of each infection type at a batch level. The initial COVID-Net prototype was built and evaluated using the Keras deep learning library with a TensorFlow backend. The proposed COVID-Net architecture was built using generative synthesis [13], as described in Section 2.1.2.

To evaluate the efficacy of the proposed COVID-Net, we perform both quantitative and qualitative analysis to get a better understanding of its detection performance and decision-making behaviour.

To investigate the proposed COVID-Net in a quantitative manner, we computed the test accuracy, as well as sensitivity and positive predictive value (PPV) for each infection type, on the aforementioned COVIDx dataset. The test accuracy, along with the architectural complexity (in terms of number of parameters) and computational complexity (in terms of number of multiply-accumulate (MAC) operations), is shown in Table 1. It can be observed that COVID-Net achieves good accuracy, with a test accuracy of 92.6%, thus highlighting the efficacy of leveraging a human-machine collaborative design strategy for creating highly customized deep neural network architectures in an accelerated manner, tailored around task, data, and operational requirements. This is especially important for scenarios such as disease detection, where new cases and new data are collected continuously and the ability to rapidly generate new deep neural network architectures tailored to the ever-evolving knowledge base over time is highly desired.
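The learning rate policy used for training (initial learning rate 2e-5, factor 0.7, patience 5, 22 epochs) can be sketched in a few lines of plain Python. This is a minimal illustration of a reduce-on-plateau rule, not the training code itself: the class name is hypothetical, the monitored quantity is assumed to be a validation loss (the text does not say which metric was tracked), and the loss values in the simulated run are made up.

```python
class ReduceOnPlateau:
    """Multiply the learning rate by `factor` after `patience` epochs
    without improvement in the monitored loss (assumed: validation loss)."""

    def __init__(self, lr: float = 2e-5, factor: float = 0.7, patience: int = 5):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")  # best loss seen so far
        self.wait = 0             # epochs since the last improvement

    def step(self, loss: float) -> float:
        """Record one epoch's loss and return the learning rate to use next."""
        if loss < self.best:
            self.best = loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr *= self.factor
                self.wait = 0
        return self.lr


# Simulated 22-epoch run with made-up losses: three improving epochs,
# followed by a long plateau.
val_losses = [0.90, 0.70, 0.60] + [0.60] * 19
scheduler = ReduceOnPlateau()
for epoch, loss in enumerate(val_losses, start=1):
    final_lr = scheduler.step(loss)
    print(f"epoch {epoch:2d}  loss {loss:.2f}  lr {final_lr:.3e}")
```

With a Keras and TensorFlow setup such as the one mentioned above, the same behaviour is typically obtained with the built-in `ReduceLROnPlateau` callback rather than a hand-written class.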
Next, we take a deeper look at the current limitations of the proposed COVID-Net by studying the sensitivity and PPV for each infection type, which are shown in Table 2 and Table 3, respectively, and the confusion matrix in Fig. 5. A number of interesting observations can be made about how COVID-Net performs under the different scenarios. First, it can be observed that COVID-Net can achieve good sensitivity for COVID-19 cases (87.1% sensitivity), which is important since we want to limit the number of missed COVID-19 cases as much as possible. While promising, it should be noted that the number of COVID-19 patient cases available is limited compared to the other infection types in COVIDx, and as such a clearer picture of its effectiveness will emerge as more COVID-19 patient cases become available. Second, it can be observed that COVID-Net achieves high PPV for COVID-19 cases (96.4% PPV), which indicates very few false positive COVID-19 detections (for example, as seen in Fig. 5, one patient with non-COVID19 infection was misidentified as having COVID-19 viral infection). This high PPV is important given that too many false positives would increase the burden for the healthcare system due to the need for additional PCR testing and additional care. Third, it can also be observed that sensitivity is noticeably higher for normal and non-COVID19 infection cases than for COVID-19 infection cases. This observation may be primarily attributed to the significantly larger number of images for both normal and non-COVID19 infection cases. Therefore, based on these results, it can be seen that while COVID-Net performs well as a whole in detecting COVID-19 cases from CXR images, there are several areas of improvement that can benefit from collecting additional data, as well as from improving the underlying training methodology to generalize better across such scenarios.
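For readers less familiar with the two metrics discussed above, the sketch below shows how per-class sensitivity and PPV follow from a confusion matrix, and how they could be checked against the 80% design requirements stated for the generative synthesis process. The counts are invented purely for illustration and are not the values in Fig. 5; the function name and the row/column convention (rows are true classes, columns are predicted classes) are likewise assumptions.

```python
def sensitivity_and_ppv(confusion, labels):
    """Per-class sensitivity and PPV from a square confusion matrix.

    confusion[i][j] is the number of cases of true class i predicted as
    class j (an assumed convention).
    """
    results = {}
    for k, name in enumerate(labels):
        true_positives = confusion[k][k]
        actual = sum(confusion[k])                     # all cases truly of class k
        predicted = sum(row[k] for row in confusion)   # all cases predicted as k
        results[name] = {
            # sensitivity: fraction of true class-k cases that were detected
            "sensitivity": true_positives / actual if actual else float("nan"),
            # PPV: fraction of class-k predictions that were correct
            "ppv": true_positives / predicted if predicted else float("nan"),
        }
    return results


labels = ["normal", "non-COVID19", "COVID-19"]
# Hypothetical counts, NOT the paper's confusion matrix.
confusion = [
    [90, 8, 2],
    [6, 92, 2],
    [1, 3, 16],
]

results = sensitivity_and_ppv(confusion, labels)
for name, metrics in results.items():
    print(f"{name:>12}: sensitivity={metrics['sensitivity']:.3f}  ppv={metrics['ppv']:.3f}")

covid = results["COVID-19"]
meets_requirements = covid["sensitivity"] >= 0.80 and covid["ppv"] >= 0.80
print("COVID-19 design requirements met:", meets_requirements)
```

The two metrics answer different questions: sensitivity is computed along a row (how many true cases were caught), while PPV is computed along a column (how many positive calls were correct), which is why a class with few examples can score well on one and poorly on the other.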
As mentioned earlier, we performed an audit on the proposed COVID-Net to gain better insights into how COVID-Net makes decisions, and to validate that it is making detection decisions based on relevant information rather than on erroneous information, such as irrelevant visual indicators, that could bias its decisions. The critical factors identified by GSInquire [9] in several example CXR images of COVID-19 cases are shown in Fig. 6. It can be observed that, based on the interpretation produced by GSInquire, the proposed COVID-Net primarily leverages areas in the lungs in the CXR images as the main critical factors in determining whether a CXR image is of a patient with a SARS-CoV-2 viral infection, as shown in red in Fig. 6. As such, we were able to validate that COVID-Net was not relying on improper information to make decisions (e.g., erroneous visual indicators outside the body, embedded markup symbols, imaging artifacts, etc.), which could lead to scenarios where the right decisions are made for the wrong reasons. Such 'right decision, wrong reason' scenarios are very difficult to track and identify without the use of such an explainability-driven auditing strategy, which highlights the value of explainability in improving the reliability of deep neural networks for clinical applications.

In addition to performance validation for more responsible and transparent design, the ability to interpret and gain insights into how the proposed COVID-Net detects COVID-19 infections is also important for a number of other reasons:
• Transparency. By understanding the critical factors being leveraged in COVID-19 case detection, the predictions made by the proposed COVID-Net become more transparent and trustworthy for clinicians to leverage during their screening process to aid them in making faster yet accurate assessments.
• New insight discovery. The critical factors leveraged by the proposed COVID-Net could potentially help clinicians discover new insights into the key visual indicators associated with SARS-CoV-2 viral infection, which they can then leverage to improve screening accuracy.

In this study, we introduced COVID-Net, a deep convolutional neural network design for the detection of COVID-19 cases from CXR images that is open source and available to the general public. We also described COVIDx, a CXR dataset leveraged to train COVID-Net that comprises 13,800 CXR images across 13,725 patient cases from three open access data repositories. Moreover, we investigated how COVID-Net makes predictions using an explainability method in an attempt to gain deeper insights into critical factors associated with COVID cases, which can aid clinicians in improved screening as well as improve trust and transparency when leveraging COVID-Net for accelerated computer-aided screening.

By no means a production-ready solution, the hope is that the promising results achieved by COVID-Net on the COVIDx test dataset, along with the fact that it is available in open source format alongside the description on constructing the open source dataset, will lead it to be leveraged and built upon by both researchers and citizen data scientists alike to accelerate the development of highly accurate yet practical deep learning solutions for detecting COVID-19 cases from CXR images and accelerate treatment of those who need it the most. Future directions include continuing to improve sensitivity and PPV for COVID-19 infections as new data is collected, as well as extending the proposed COVID-Net to risk stratification for survival analysis, predicting risk status of patients, and predicting hospitalization duration, which would be useful for triaging, patient population management, and individualized care planning.
Figure 6. Example CXR images of COVID-19 cases from several different patients and their associated critical factors.

[1] COVID-19 image data collection
[2] 2009 IEEE Conference on Computer Vision and Pattern Recognition
[3] Correlation of chest CT and RT-PCR testing in coronavirus disease 2019 (COVID-19) in China: A report of 1014 cases
[4] Figure 1 COVID-19 chest X-ray data initiative
[5] Sensitivity of chest CT for COVID-19: Comparison to RT-PCR
[6] Rapid AI development cycle for the coronavirus (COVID-19) pandemic: Initial results for automated detection and patient monitoring using deep learning CT image analysis
[7] Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. The Lancet
[8] Deep learning
[9] Do explanations reflect decisions? A machine-centric strategy to quantify the performance of explainability algorithms
[10] Artificial intelligence distinguishes COVID-19 from community acquired pneumonia on chest CT
[11] Imaging profile of the COVID-19 infection: Radiologic findings and literature review
[12] The role of chest imaging in patient management during the COVID-19 pandemic: A multinational consensus statement from the Fleischner Society
[13] Learning generative machines to generate efficient neural networks via generative synthesis
[14] Detection of SARS-CoV-2 in different types of clinical specimens
[15] Deep learning system to screen coronavirus disease 2019 pneumonia
[16] Deep residual learning for image recognition
[17] RSNA pneumonia detection challenge
[18] ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases
[19] NetScore: Towards universal metrics for large-scale performance analysis of deep neural networks for practical usage

We would like to thank the Natural Sciences and Engineering Research Council of Canada (NSERC), the Canada Research Chairs program, and DarwinAI Corp.