key: cord-347333-h899xkfy authors: Li, Z.; Zhong, Z.; Li, Y.; Zhang, T.; Gao, L.; Jin, D.; Sun, Y.; Ye, X.; Yu, L.; Hu, Z.; Xiao, J.; Huang, L.; Tang, Y. title: From Community Acquired Pneumonia to COVID-19: A Deep Learning Based Method for Quantitative Analysis of COVID-19 on thick-section CT Scans date: 2020-04-23 journal: nan DOI: 10.1101/2020.04.17.20070219 sha: doc_id: 347333 cord_uid: h899xkfy Background: Thick-section CT scanners are more affordable for the developing countries. Considering the widely spread COVID-19, it is of great benefit to develop an automated and accurate system for quantification of COVID-19 associated lung abnormalities using thick-section chest CT images. Purpose: To develop a fully automated AI system to quantitatively assess the disease severity and disease progression using thick-section chest CT images. Materials and Methods: In this retrospective study, a deep learning based system was developed to automatically segment and quantify the COVID-19 infected lung regions on thick-section chest CT images. 531 thick-section CT scans from 204 patients diagnosed with COVID-19 were collected from one appointed COVID-19 hospital from 23 January 2020 to 12 February 2020. The lung abnormalities were first segmented by a deep learning model. To assess the disease severity (non-severe or severe) and the progression, two imaging bio-markers were automatically computed, i.e., the portion of infection (POI) and the average infection HU (iHU). The performance of lung abnormality segmentation was examined using Dice coefficient, while the assessment of disease severity and the disease progression were evaluated using the area under the receiver operating characteristic curve (AUC) and the Cohen's kappa statistic, respectively. 
Results: The Dice coefficients between the segmentation of the AI system and the manual delineations of two experienced radiologists for the COVID-19 infected lung abnormalities were 0.74 ± 0.28 and 0.76 ± 0.29, respectively, which were close to the inter-observer agreement, i.e., 0.79 ± 0.25. The two computed imaging bio-markers can distinguish between the severe and non-severe stages with an AUC of 0.9680 (p-value < 0.001). Very good agreement (κ = 0.8220) between the AI system and the radiologists was achieved in evaluating the changes of infection volumes. Conclusions: A deep learning based AI system built on thick-section CT imaging can accurately quantify the COVID-19 associated lung abnormalities and assess the disease severity and its progression.

Coronavirus Disease 2019 (COVID-19) has spread rapidly all over the world since the end of 2019, and 1,436,198 cases have been confirmed as COVID-19 to date (9 April 2020) [1]. Reverse-transcription polymerase chain reaction (RT-PCR) is used as the standard diagnostic method. However, it suffers from low sensitivity, as reported in [2, 3]. Computed tomography (CT) imaging is often adopted to confirm COVID-19 in China and some European countries, e.g., the Netherlands. CT plays a key role in the diagnosis and treatment assessment of COVID-19 due to its high sensitivity [2, 4].

The explosively growing number of COVID-19 patients requires automated AI-based computer aided diagnosis (CAD) systems that can accurately and objectively detect the disease infected lung regions and assess the severity and the progression. Recently, several deep learning based AI systems were developed to differentiate COVID-19 from community acquired pneumonia (CAP) [5] or other viral pneumonia [6, 7], and to quantify the infection regions [8, 9, 10, 11].
However, all these previous AI systems were built upon high resolution thin-section CT images, which have high radiation doses and require higher costs. In contrast, thick-section CT images from affordable CT scanners have relatively low radiation doses and are widely used in hospitals worldwide, especially in primary care. Hence, it is worthwhile to develop an AI-based CAD system using thick-section CT images.

In this study, we developed a fully automated AI system to quantify COVID-19 associated lung abnormalities and to assess the disease severity and the disease progression using thick-section chest CT images. Specifically, the lung and infection regions were first segmented by a deep learning based model, where the labels came from another multi-center annotated CAP CT dataset, knowing that COVID-19 shares similar abnormal lung patterns with other pneumonia such as ground glass opacity (GGO), consolidation, bilateral infiltration, etc. Using the lung and infection segmentation masks, we computed the portion of infection (POI) and the average infection HU (iHU) as two imaging bio-markers, which were applied to distinguish the COVID-19 severity. Moreover, the changes of POI and iHU in a patient's longitudinal CT scans were calculated to evaluate the COVID-19 progression. For evaluation, the AI based lung abnormality segmentation was compared to the manual delineations of two experienced radiologists, while the AI based assessment of disease severity and progression was compared to the patients' diagnosis status extracted from clinical and radiology reports. To the best of our knowledge, this is the first AI-based study to quantitatively assess the COVID-19 severity and disease progression using thick-section CT images.

China National Health Commission [12], the severity of COVID-19 includes mild, common, severe and critical types.
Since there were few mild and critical cases, we categorized all the CT scans into a severe group (including severe and critical) and a non-severe group (mild and common). In total, we had 79 severe CT scans from 32 patients, and 452 non-severe CT scans from 164 patients. It should be noted that some patients were in the non-severe phase when they entered the hospital, but may have developed into the severe phase during treatment. All the COVID-19 patients were used to test the AI system performance.

This version posted April 23, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

To train the deep learning model for lung abnormality segmentation, another multi-center pneumonia dataset was collected, consisting of 558 CT scans with manual annotations. The informed consent waiver for the training data was approved by the Ethics Committees of multiple institutes.

All COVID-19 patients underwent CT scanning using the GE Brivo CT325 scanner (General Electric, Illinois, the United States). The scanning protocol was as follows: 120 kV; adaptive tube current (30 mAs-70 mAs); pitch = 0.99-1.22 mm; slice thickness = 10 mm; field of view: 350 mm²; matrix, 512×512; and breath hold at full inspiration. CT images were reconstructed with 5 mm slice thickness and the soft reconstruction kernel. Note that the radiation dose (3.43 mGy) from the thick-section CT imaging is considerably lower than that of conventional high resolution chest CT imaging (6.03 mGy, Siemens SOMATOM go.Top). For the multi-center pneumonia dataset, the 558 CT scans were from Siemens, Hitachi, GE,

The computed POI and iHU are consistent with the latest version (the seventh) of the COVID-19 diagnostic guideline released by the National Health Commission of China [12].
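As an illustration of how the two bio-markers might be derived from the segmentation masks, the following Python sketch assumes that the POI is the fraction of lung voxels labelled as infected and that the iHU is the mean HU value over those infected voxels. The function name `poi_and_ihu`, the flattened-list volume layout, and the toy values are hypothetical and are not taken from the AI system described here.

```python
from typing import Optional, Sequence, Tuple


def poi_and_ihu(
    hu: Sequence[float],
    lung_mask: Sequence[int],
    infection_mask: Sequence[int],
) -> Tuple[float, Optional[float]]:
    """Compute (POI, iHU) from a flattened CT volume and two binary masks.

    Assumed definitions (not confirmed by the source text):
      POI = infected lung voxels / all lung voxels
      iHU = mean HU over the infected lung voxels (None if there are none)
    """
    if not (len(hu) == len(lung_mask) == len(infection_mask)):
        raise ValueError("volume and masks must have the same length")

    lung_voxels = sum(1 for m in lung_mask if m)
    if lung_voxels == 0:
        raise ValueError("lung mask is empty")

    # Only count infection voxels that also lie inside the lung mask.
    infected_hu = [
        h for h, in_lung, infected in zip(hu, lung_mask, infection_mask)
        if in_lung and infected
    ]

    poi = len(infected_hu) / lung_voxels
    ihu = sum(infected_hu) / len(infected_hu) if infected_hu else None
    return poi, ihu


# Toy example: six voxels, five inside the lung, two of them infected.
hu_values = [-850, -800, -600, -200, 40, -900]
lung = [1, 1, 1, 1, 0, 1]
infection = [0, 0, 1, 1, 1, 0]
print(poi_and_ihu(hu_values, lung, infection))  # (0.4, -400.0)
```

Restricting the infection voxels to the lung mask is a design choice of this sketch; it keeps a stray infection label outside the lung from inflating either value.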
The guideline states that the POI is one of the criteria for differentiating severe from non-severe patients. It also reports that lung findings in chest CT may progress from small subpleural GGO to crazy paving pattern and consolidation as a patient's condition gets worse, which corresponds to an increase in iHU.

Statistical analysis was performed using SAS (version 9.4) and Matlab (version 2018b). Sensitivity and specificity were calculated at cutoffs determined by the Youden index generated from the receiver operating characteristic (ROC) curve. Cohen's kappa statistic was used to measure the agreement between the disease progression assessments from the AI and the radiologists. The χ² test was used to compare differences among different groups. A two-sided p value less than 0.05 was considered to be statistically significant. The Dice coefficient was computed to evaluate agreement between the automatic infection region segmentation and the manual infection delineations by radiologists.

and two radiologists were 0.74±0.28 (median=0.79) and 0.76±0.29 (median=0.84), respectively. The inter-observer variability between the two radiologists was also assessed using the Dice coefficient, which was 0.79±0.25 (median=0.85).

Based on the clinical diagnosis reports, 79 CT scans had been identified as belonging to the severe group, while 452 scans were in the non-severe group. Figure 3 shows the box-plot of the computed POI and iHU for the severe and non-severe groups. Note that both the POI and iHU show a significant difference between the severe and non-severe groups with p-value < 0.001.

Predictive probabilities were generated using the logistic regression model.
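To illustrate how a cutoff can be chosen with the Youden index, the sketch below scans every observed score as a candidate threshold and keeps the one that maximizes J = sensitivity + specificity - 1. The analysis in this study was run in SAS and Matlab, so this Python function (`youden_cutoff`), the rule "score >= cutoff means severe", and the toy probabilities are all hypothetical stand-ins rather than the code or data actually used.

```python
from typing import Sequence, Tuple


def youden_cutoff(
    scores: Sequence[float], labels: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    labels: 1 = positive (e.g. severe), 0 = negative (e.g. non-severe).
    A case is predicted positive when its score is >= cutoff.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best = None  # (J, cutoff, sensitivity, specificity)
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1
        # Strict '>' keeps the lowest cutoff when several share the best J.
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]


# Toy predicted probabilities for seven scans (three of them severe).
probabilities = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.80]
severe = [0, 0, 0, 1, 0, 1, 1]
print(youden_cutoff(probabilities, severe))  # (0.35, 1.0, 0.75)
```

In practice the scores would be the predicted probabilities of a fitted model, and a library ROC routine would replace the explicit loop.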
Comparisons of different imaging bio-markers for the assessment of severe and non-severe exams are shown in Table 2. Using the POI as input, the sensitivity and specificity for identifying the severe group are 92.4% and 90.5%, respectively. Using the iHU as input, the sensitivity and specificity for identifying the severe group are 91.1% and 41.6%, respectively. When combining the POI with iHU, the sensitivity and

Figure 5 shows a qualitative example of the automatically segmented infection regions in a severe patient's longitudinal CT scans. We calculated the changes of the POI and iHU for each consecutive CT scan pair of the patients. The key phrases extracted from the patients' radiology reports were used as the ground-truth reference. The correspondence of the computed bio-marker changes with the radiologists' assessment is described in Table 3.

To measure the agreement between the AI computed imaging bio-marker changes and the radiologists' assessment, we first binarized the bio-marker changes. The value 1 (or 0) represented an increase (or decrease) of a bio-marker and the corresponding phrases in the radiology reports. Cohen's kappa was then used to measure the agreement, and the results are shown in Table 4. Very good and moderate agreement were achieved between the two AI imaging bio-markers and the radiologists' assessment when only changes at the whole lung level were considered (ignoring the cases with the phrase 'partially changes'). The change of POI showed overall better agreement (very good and good) with the radiologists' assessment than the change of iHU (moderate and fair).

Our diagnosis system is a multi-stage AI system. The key step is to extract the infection regions. It is interesting that these processing modules were trained using CAP cases, yet the detection and segmentation accuracy is still close to radiologist level.
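For reference, the Dice coefficient behind this comparison is twice the overlap of two masks divided by the sum of their sizes. The following is a minimal Python sketch, assuming binary masks flattened to equal-length sequences; the function name `dice_coefficient`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions of this example and not details reported in the study.

```python
from typing import Sequence


def dice_coefficient(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")

    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2 * overlap / total


# Toy example: each mask marks three voxels, two of which coincide.
ai_mask = [0, 1, 1, 1, 0, 0]
radiologist_mask = [0, 0, 1, 1, 1, 0]
print(round(dice_coefficient(ai_mask, radiologist_mask), 3))  # 0.667
```

Computing the same quantity between the two radiologists' masks gives the inter-observer agreement against which the AI scores are judged.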
The Dice coefficients between the COVID-19 infected region segmentation of the AI system and those of two experienced radiologists were 0.74±0.28 and 0.76±0.29, respectively, which were close to the inter-observer agreement, i.e., 0.79±0.25.

Among our computed imaging bio-markers, only the POI shows both high sensitivity and high specificity for differentiating the severe from the non-severe COVID-19 group. This indicates that the POI is an effective imaging bio-marker for assessing the severity of COVID-19 patients. Although the iHU value is also able to reflect infection progress, it is affected by several other disease irrelevant factors, such as the reconstruction slice thickness and the respiration status [15, 16]. For instance, consolidation on HRCT images might be displayed as GGO on thick-section CT images.

The changes of volume and density of the infection region are two key indicators used by radiologists for COVID-19 progression assessment. However, it is time consuming (or even impractical) for radiologists to produce the quantitative measurements for this longitudinal analysis. Our AI system provides a quantitative and objective measurement, i.e., the POI, which shows strong agreement with the radiologists' qualitative judgements. More importantly, the AI based longitudinal disease quantification is precise, reproducible and fast, which can reduce the reading time of radiologists for each COVID-19 patient and improve the quality of the disease progression assessment [10].

This study has several limitations. Firstly, we only evaluated changes of imaging bio-markers at the whole lung level for certain phrases. Although our model can compute the bio-markers at the lobe level, the standard phrases from the radiology reports were mostly at the whole lung level. Furthermore, some phrases in the reports like 'lesion absorption' might correspond to either a decrease of the infection region or a reduction of the HU value.
Thus a more sophisticated and precise analysis is needed to evaluate our model in the future. Secondly, motion artifacts due to respiration and heart motion may cause false positive segmentation in the AI system. We noticed that some false positive segmentations affected the longitudinal infection evaluations (Figure 6). One possible solution is to identify the motion artifacts before applying the infection segmentation. Finally, our model was only tested on COVID-19 positive patients. A recent study has shown that a deep learning based AI classification model can detect COVID-19 and distinguish it from community acquired pneumonia and other non-pneumonic lung diseases using thin-section HRCT [5]. As the next step, it would be interesting to see if our model can also differentiate pneumonia caused by COVID-19 from pneumonia caused by other factors using thick-section CT imaging.

In conclusion, a deep learning based AI system was developed to quantify COVID-19 abnormal lung patterns and to assess the disease severity and the progression using thick-section chest CT images. The imaging bio-markers computed by the AI system could be used to reproduce several findings of infection change from the radiologists' reports. These results demonstrate that the deep learning based tool has the ability to assist radiologists in the diagnosis and follow-up treatment of COVID-19 patients based on CT scans.
Figure 5. The lesion segmentation of six consecutive CT scans taken from Jan. 27 to Feb. 12 for a severe patient. The red dot corresponds to the time at which the 'severe' diagnosis was given and the green dot corresponds to the time at which the 'non-severe' diagnosis was given.

Figure 6. The false positive segmentation from an exam with motion artifacts.

[1] Coronavirus disease 2019 (COVID-19) situation report-80.
[2] Correlation of chest CT and RT-PCR testing in coronavirus disease 2019 (COVID-19) in China: a report of 1014 cases.
[3] Sensitivity of chest CT for COVID-19: comparison to RT-PCR.
[4] Chest CT for typical 2019-nCoV pneumonia: relationship to negative RT-PCR testing.
[5] Artificial intelligence distinguishes COVID-19 from community acquired pneumonia on chest CT.
[6] Deep learning-based model for detecting 2019 novel coronavirus pneumonia on high-resolution computed tomography: a prospective study. medRxiv.
[7] A deep learning algorithm using CT images to screen for corona virus disease (COVID-19). medRxiv.
[8] Lung infection quantification of COVID-19 in CT images with deep learning.
[9] Severe COVID-19 pneumonia: assessing inflammation burden with volume-rendered chest CT.
[10] Serial quantitative chest CT assessment of COVID-19: deep-learning approach.
[11] Longitudinal assessment of COVID-19 using a deep learning-based quantitative CT pipeline: illustration of two cases.
[12] National Health Commission of PRC. Diagnosis and treatment protocol for novel coronavirus pneumonia.