key: cord-0866638-q2e9yq60
authors: Bartoli, Axel; Fournel, Joris; Maurin, Arnaud; Marchi, Baptiste; Habert, Paul; Castelli, Maxime; Gaubert, Jean-Yves; Cortaredona, Sebastien; Lagier, Jean-Christophe; Million, Matthieu; Raoult, Didier; Ghattas, Badih; Jacquier, Alexis
title: Value and prognostic impact of a deep learning segmentation model of COVID-19 lung lesions on low-dose chest CT
date: 2022-03-31
journal: Research in Diagnostic and Interventional Imaging
DOI: 10.1016/j.redii.2022.100003
sha: da2f133e6dad164c9a1fbaa3f51153d16ab8c26c
doc_id: 866638
cord_uid: q2e9yq60

Objectives 1) To develop a deep learning (DL) pipeline allowing quantification of COVID-19 pulmonary lesions on low-dose computed tomography (LDCT). 2) To assess the prognostic value of DL-driven lesion quantification.

Methods This monocentric retrospective study included training and test datasets taken from 144 and 30 patients, respectively. The reference was the manual segmentation of 3 labels: normal lung, ground-glass opacity (GGO) and consolidation (Cons). Model performance was evaluated with technical metrics, disease volume and extent. Intra- and interobserver agreement were recorded. The prognostic value of DL-driven disease extent was assessed in 1621 distinct patients using C-statistics. The end point was a combined outcome defined as death, hospitalization >10 days, intensive care unit hospitalization or oxygen therapy.

Results The Dice coefficients for lesion (GGO+Cons) segmentations were 0.75±0.08, exceeding the values for human interobserver (0.70±0.08; 0.70±0.10) and intraobserver measures (0.72±0.09). DL-driven lesion quantification had a stronger correlation with the reference than inter- or intraobserver measures. After stepwise selection and adjustment for clinical characteristics, quantification significantly increased the prognostic accuracy of the model (0.82 vs. 0.90; p<0.0001).
Conclusions A DL-driven model can provide reproducible and accurate segmentation of COVID-19 lesions on LDCT. Automatic lesion quantification has independent prognostic value for the identification of high-risk patients.

In December 2019, an outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spread worldwide from Asia to Europe [1, 2]. SARS-CoV-2 is responsible for coronavirus disease 2019 (COVID-19). It was declared a worldwide pandemic by the World Health Organization on March 11th, 2020. One of the main risks is the congestion of the health care system due to an unusually rapid inflow of patients, especially in the intensive care unit (ICU). Thus, there is a need for precise patient selection and risk stratification to focus on severe cases [3]. This stratification is based on clinical criteria, viral load on reverse-transcription polymerase chain reaction (RT-PCR) and pulmonary lesions on chest CT. Low-dose computed tomography (LDCT) is more effective than chest X-ray for depicting ground-glass opacity (GGO) and consolidation (Cons), with a lower dose of radiation than conventional chest CT [4-7]. Some investigators have shown that a semiquantitative clinical score reflecting the extent of lesions might be useful for patient risk stratification [8, 9]. Nevertheless, the computation of semiquantitative scores remains a time-consuming process that is prone to intra- and interobserver variability. Hence, there is a need for a fast, reproducible and fully automated COVID-19 lung lesion segmentation method that can be applied to a large cohort as a risk stratification tool in disease management and prediction.

Deep learning (DL) techniques, especially convolutional neural networks (CNNs), have shown promising results in the automation of medical imaging measures [10]. In thoracic imaging, these techniques have shown excellent performance in nodule detection, lesion segmentation and disease classification [11, 12].
The main purpose of this study was to develop and evaluate a complete DL pipeline that allows fully automated segmentation of COVID-19 pulmonary lesions on LDCT and the computation of lesion volume and extent. Our secondary purpose was to investigate whether automatic lesion quantification was associated with adverse events among COVID-19 patients.

This single-center retrospective study was conducted from March 3rd to July 2nd, 2020, and approved by the local Institutional Review Board (N°: 2020-0012, RGPD/Ap-Hm: 2020-48). Training, validation and test datasets comprising LDCT examinations from 124, 20 and 30 patients, respectively, were used to build and assess a CNN-based pipeline for the automatic segmentation and quantification of COVID-19 lesions on LDCT, including the computation of lesion volume and extent. A flow diagram of the procedure is shown in Fig. 1. Then, we evaluated the predictive value of deep learning (DL)-driven quantification of lung lesions for adverse event occurrence in a dataset of 1621 patients, excluding data from the training, validation and test datasets. Among those 1621 patients, 983 have been previously reported [13, 14]. The authors did not receive any financial or material support from any industrial company in the execution of this study.

All patients were enrolled from a single center (La TIMONE Hospital, Assistance Publique Hôpitaux de Marseille (APHM)). All patients who presented between March 30th and June 2nd, 2020 with a confirmed COVID-19 infection using SARS-CoV RNA detection from a nasopharyngeal swab sample [13, 15] and were eligible for unenhanced LDCT were retrospectively included. LDCT was performed on all patients who were over 55 years old or had risk factors for adverse outcomes for COVID-19, such as hypertension, diabetes, obesity (BMI>30), dyspnea or abnormal lung auscultation. The exclusion criteria were refusal to participate in the protocol and an age below 18 years.
The following clinical parameters were recorded by infectiologists (M.M. and J-C.L., with 25 and 20 years of experience, respectively) on the same day as the LDCT: age, sex, date of the first symptoms, temperature, heart rate, systolic and diastolic blood pressures, respiratory rate, oxygen saturation, cough, rhinorrhea, dyspnea, diarrhea, myalgia, and lung auscultation abnormalities. Medical history was recorded: heart disease, tobacco use, chronic obstructive pulmonary disease, asthma, diabetes, obesity, sleep apnea syndrome, oncological status and immunosuppression status. The time between the first symptoms and the LDCT was recorded. Patient follow-up lasted 10 days for patients with no adverse events, and the follow-up period was extended to cover the in-hospital stay for patients who required hospitalization. The primary endpoint of the second objective was a combined outcome consisting of a need for oxygen therapy, a need for transfer to the ICU, hospitalization ≥10 days and/or death.

All patients underwent unenhanced, deep-inspiration LDCT on the same system (Revolution EVO, GE Healthcare, WI, USA) with parameters detailed in Appendix A.

To develop our pipeline, we used a training dataset composed of 124 LDCT examinations (68767 CT slices) and a validation dataset of 20 LDCT examinations (6317 CT slices) from consecutive patients in clinical care. To obtain a training dataset including all types of lesions and with a homogeneous distribution of lesion extent and severity, the chest tomography severity score (CT-SS) developed by Yang et al. was used on the whole cohort [16]. This score, ranging from 0 to 40, has been validated as a semiquantitative clinical method to quantify the extent and severity of lung abnormalities in COVID-19. All CT-SS images were evaluated by two experienced chest radiologists (J-Y.G. and P.H., with 25 and 7 years of experience, respectively).
Patients for the training and validation datasets were chosen depending on their CT-SS, resulting in 13/124 (10.5%) and 2/20 (10%) severe patients (CT-SS >19.5) and 111/124 (89.5%) and 18/20 (90%) mild patients (CT-SS <19.5). The test dataset was composed of 30 consecutive patients (15587 CT slices) from clinical care and did not overlap with the training or validation datasets.

Manual image segmentation was undertaken for the combined training, validation and test datasets by a single observer (Observer 1 (O1), A.B., with 5 years of experience). For each patient, all images from the lung window LDCT were anonymized. Images were imported in DICOM format into the validated postprocessing software 3D Slicer (https://www.slicer.org, 2014) [17]. Manual segmentation of the lung window CT was applied to the entire lung volume, including all slices, using thresholding, painting and erasing methods to obtain the segmentation masks of three distinct labels: GGO, Cons, and normal pixels within the lungs (LungN). GGO and Cons were distinguished using a threshold based on the attenuation values in HU compared to that of the pulmonary artery [18]. Distal vascular and bronchial trees were not extracted from the labels. The non-segmented part of the image was classified under a fourth label: background (BG). After being validated by one experienced chest radiologist (J-Y.G.), the obtained segmentation masks were considered the ground truth, especially for GGO and Cons.

Clinical parameters were obtained from the ground-truth segmentations as follows: lung volume (cm³) was the sum of the LungN, GGO, and Cons labels. The GGO and Cons volumes (cm³) were extracted from the respective labels. The GGO and Cons extents (%) were the ratios of the GGO and Cons volumes, respectively, to the total lung volume. Lesion extent (%) was the sum of the GGO and Cons extents. The user interaction time was recorded for all manual segmentations.
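The derivation of the clinical parameters from a labeled mask can be illustrated with a minimal Python sketch. The integer label codes, the nested-list mask layout, the voxel volume and the function name `quantify` are assumptions made for illustration only; they are not taken from the study's software.

```python
from collections import Counter

# Illustrative label codes (assumed, not the authors' encoding).
BG, LUNG_N, GGO, CONS = 0, 1, 2, 3


def quantify(mask_slices, voxel_volume_mm3):
    """Return volumes (cm^3) and extents (%) from a labeled 3D mask.

    mask_slices: list of 2D slices, each a list of rows of integer labels.
    voxel_volume_mm3: volume of a single voxel in mm^3.
    """
    counts = Counter(label for sl in mask_slices for row in sl for label in row)
    to_cm3 = voxel_volume_mm3 / 1000.0  # 1 cm^3 = 1000 mm^3

    ggo_vol = counts[GGO] * to_cm3
    cons_vol = counts[CONS] * to_cm3
    # Lung volume is the sum of the LungN, GGO and Cons labels.
    lung_vol = (counts[LUNG_N] + counts[GGO] + counts[CONS]) * to_cm3
    if lung_vol == 0:
        raise ValueError("mask contains no lung voxels")

    ggo_ext = 100.0 * ggo_vol / lung_vol
    cons_ext = 100.0 * cons_vol / lung_vol
    return {
        "lung_volume_cm3": lung_vol,
        "ggo_volume_cm3": ggo_vol,
        "cons_volume_cm3": cons_vol,
        "ggo_extent_pct": ggo_ext,
        "cons_extent_pct": cons_ext,
        # Lesion extent is the sum of the GGO and Cons extents.
        "lesion_extent_pct": ggo_ext + cons_ext,
    }


# Toy mask: two 2x4 slices, hypothetical voxel volume of 500 mm^3.
toy_mask = [
    [[0, 1, 1, 2], [1, 1, 2, 3]],
    [[1, 1, 1, 2], [0, 1, 2, 3]],
]
result = quantify(toy_mask, voxel_volume_mm3=500.0)
```

In practice the voxel volume would come from the DICOM pixel spacing and slice spacing; here it is simply passed in as a number.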
All ground-truth manual segmentations and extracted clinical measures were labeled O1a.

Our pipeline was composed of three 2D slice-based CNN models and aimed to produce automated segmentation of GGO and Cons on LDCT images with corresponding measures in terms of volume (cm³) and extent (%). All automated segmentations and extracted measures were labeled Auto.

To assess the segmentation accuracy of our model, we compared the manual ground-truth segmentations (O1a) to the automatically obtained segmentations (Auto) in terms of technical metrics and clinical parameters on the test dataset (n=30). For the technical metrics, we evaluated the model performance with the Dice similarity coefficient (DSC) and mean volume similarity function (MVSF) [19]. Our DSC calculation method was identical to that used in [20] and [21]. A 2D CNN is trained and then used to segment the volumetric (3D) image of a patient. This is achieved through slice-by-slice 2D inference, and the resulting 2D segmentations are concatenated along the z-axis to produce the final 3D segmentation. The metrics (i.e., DSC) can then be computed at the 3D level. The O1a and Auto clinical parameters were compared in terms of lesion volume (cm³) and lesion extent (%) using mean absolute error (MAE), bias and correlation. Significance of the bias was evaluated by the Wilcoxon signed-rank test. Efficiency, defined in terms of the user interaction time, was evaluated and compared.

The reproducibility of the Auto method was compared to the inter- and intraobserver segmentation performances. Observer 1 performed a second analysis, labeled O1b, 2 weeks after the ground-truth segmentation; the tasks within the second analysis were performed in randomized order to minimize bias. Two other independent observers (Observer 2 (O2), A.M., with 3 years of experience; Observer 3 (O3), B.M., with 3 years of experience) manually segmented the same test dataset; their segmentations were labeled O2 and O3, respectively.
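The 3D-level DSC computation described above can be sketched as follows. This is a minimal illustration on nested Python lists, not the study's implementation; the label codes, the helper names `stack_slices` and `dice_3d`, and the convention of returning 1.0 when both masks are empty are assumptions.

```python
# Illustrative label codes (assumed, not the authors' encoding).
BG, LUNG_N, GGO, CONS = 0, 1, 2, 3


def stack_slices(slices_2d):
    """Concatenate per-slice 2D label maps along z into one 3D volume."""
    return [[list(row) for row in sl] for sl in slices_2d]


def dice_3d(reference, prediction, labels):
    """Dice similarity coefficient over a whole 3D volume.

    Voxels whose label is in `labels` are treated as foreground, so
    labels={GGO, CONS} gives the combined lesion DSC.
    """
    label_set = set(labels)
    ref_n = pred_n = overlap = 0
    for ref_slice, pred_slice in zip(reference, prediction):
        for ref_row, pred_row in zip(ref_slice, pred_slice):
            for r, p in zip(ref_row, pred_row):
                in_ref = r in label_set
                in_pred = p in label_set
                ref_n += in_ref
                pred_n += in_pred
                overlap += in_ref and in_pred
    if ref_n + pred_n == 0:
        return 1.0  # assumed convention when both masks are empty
    return 2.0 * overlap / (ref_n + pred_n)


# Toy example: two 2x2 slices per volume.
manual = stack_slices([[[1, 2], [2, 3]], [[1, 1], [3, 0]]])
auto = stack_slices([[[1, 2], [1, 3]], [[2, 1], [3, 0]]])

lesion_dsc = dice_3d(manual, auto, labels={GGO, CONS})
ggo_dsc = dice_3d(manual, auto, labels={GGO})
```

Computing the coefficient once over the stacked volume, rather than averaging per-slice values, avoids undefined scores on slices that contain no lesion in either mask.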
The observers were blinded to the subjects' characteristics and the segmentations made by the other observers.

To assess the prognostic performance of the radiological quantification of lesion extent and type for adverse events among COVID-19 patients, we evaluated both forms of radiological quantification: the CT-SS and the automatic quantification, corresponding to disease extent (%) obtained with the presented DL pipeline. For automatic quantification, we evaluated GGO, Cons and lesion extent scores. Lesion extent was the sum of the GGO and Cons extents. To assess the predictive performance of these quantifications, we performed multivariate logistic regression on the combined outcomes. To this end, we used all the patients fulfilling our inclusion criteria and included in the study between March 3rd and July 2nd, excluding patients from the training and validation datasets.

Quantitative variables are expressed as the mean ± standard deviation and range or median, Q1-median-Q3 and range. Categorical data are expressed as raw numbers, proportions and percentages. To assess the predictive performance of the DL-driven automatic lesion extent quantification on the prognostic value dataset, we performed multivariate logistic regression on the following outcome: "transfer to ICU and/or death and/or hospitalization ≥10 days and/or oxygen therapy". We randomly divided the prognostic value dataset (n=1621) into a training subset (70% of the initial sample size, n=1135) and a validation subset (30% of the initial sample size, n=486). Model parameters were estimated on the training subset, and prognostic performance was assessed on the validation subset. A reference model (A) was first tested and adjusted for the following covariates: age, sex, comorbidities (cancer, diabetes, coronary artery disease, hypertension, chronic respiratory diseases, obesity), and time from symptom onset to scan date.
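The 70%/30% random division of the prognostic value dataset can be reproduced in outline with the sketch below. The function name, the seed and the use of integer patient indices are illustrative assumptions; the original analysis was carried out in SAS, and its randomization procedure is not described.

```python
import random


def split_train_validation(patient_ids, train_fraction=0.70, seed=0):
    """Randomly split patient identifiers into training and validation subsets."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (assumed) for reproducibility
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# 1621 hypothetical patient indices, as in the prognostic value dataset.
train_ids, validation_ids = split_train_validation(range(1621))
```

With 1621 patients, rounding 70% of the sample gives 1135 training patients and leaves 486 for validation, matching the subset sizes reported above.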
Next, we tested a second model (B) where we added CT-SS as an independent variable and a third model (C) where we added the automatic lesion extent quantification obtained by DL-driven segmentation. Second-order interaction terms between the scores and the covariates were tested in Models B and C. We used likelihood ratio tests for comparing models. To estimate the models' ability to discriminate individuals, we computed the C-statistic on the validation dataset [22]. The optimal cutoff value for the automatic lesion extent quantification was selected based on the Youden index to maximize accuracy (sensitivity + specificity - 1). A two-sided α of less than 0.05 was considered statistically significant. All analyses were carried out using SAS 9.4 statistical software (SAS Institute, Cary, NC).

A total of 1785 patients were included, and the clinical characteristics, CT-SS and pulmonary lesion distributions of the training, validation, test and prognostic value datasets are shown in Table 1. Pulmonary lesions evaluated on LDCT from the training, validation and test datasets were extracted from the manual ground-truth segmentations (O1a). Those from the prognostic value dataset were derived from the model segmentation. An example of the automated segmentation results is shown in Fig. 3. The overall test dataset of LDCT scans had a median mean dose-length product of 38.75 ± 39.9 mGy.cm.

The results for the DSC and clinical parameters between the automatic and manual segmentations, as well as a comparison to the inter- and intraobserver performances, are shown in Table 2. The correlations between automatic and manual measures of lesion extent are presented in Table 2 and Fig. 4. The DSC was 0.75 ± 0.08 for the overall lesion segmentations, 0.71 ± 0.10 for GGO segmentations, and 0.64 ± 0.09 for Cons segmentations. The MVSF results are presented in Appendix C.
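The C-statistic and the Youden-index cutoff described in the statistical analysis can be illustrated with a short sketch. It assumes binary outcomes coded 0/1 and a score for which higher values indicate higher risk; the function names and the toy data are hypothetical and unrelated to the study data.

```python
def c_statistic(scores, outcomes):
    """Probability that a randomly chosen patient with the outcome has a
    higher score than a randomly chosen patient without it (ties count 0.5).
    This equals the area under the ROC curve."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def youden_cutoff(scores, outcomes):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1,
    classifying a patient as positive when score >= cutoff."""
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, outcomes) if s >= cut and y == 1)
        tn = sum(1 for s, y in zip(scores, outcomes) if s < cut and y == 0)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:  # strict comparison keeps the lowest tied cutoff
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical lesion extents (%) and combined-outcome indicators.
extent = [2, 5, 8, 12, 15, 20, 25, 30]
event = [0, 0, 0, 1, 0, 1, 1, 1]

auc = c_statistic(extent, event)
cutoff, j_index = youden_cutoff(extent, event)
```

The pairwise definition is quadratic in the number of patients, which is acceptable for a validation subset of a few hundred patients.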
The MAE was 70.3 ± 65.8 cm³ for the GGO volume, 29.5 ± 35.9 cm³ for the Cons volume and 71.4 ± 72.6 cm³ for the lesion volume. The biases were -18.3 ± 95.4 cm³ for the GGO volume, 14.4 ± 44.4 cm³ for the Cons volume and -3.9 ± 102.6 cm³ for the lesion volume, and none of these biases was found to be significant. In terms of disease extent, the MAE was 2.2 ± 2.1% for the GGO extent, 1.0 ± 1.3% for the Cons extent and 2.1 ± 2.4% for the lesion extent. The bias was not significant for the lesion extent quantification (-0.1 ± 3.2%; p = 0.59). Disease extent measures were highly correlated with the ground truth, with a lesion extent correlation of 0.947 (p<0.001). Concerning segmentation efficiency, the mean interaction time per patient differed significantly between manual and automated segmentation: 14.74 ± 2.9 min versus 19 seconds (p<0.001).

For lesion segmentation, the DSC was higher for the Auto vs. O1a evaluation (0.75 ± 0.08) than for the interobserver (O1a vs. O2: 0.70 ± 0.08; O1a vs. O3: 0.70 ± 0.08) or intraobserver agreement (0.72 ± 0.09). The same was true for the GGO and Cons segmentations. The automated lesion volume measures had an MAE of 71.4 ± 72.6 cm³; the interobserver MAEs were as follows: O1a vs. O2: 105.1 ± 102.6 cm³; O1a vs. O3: 122.8 ± 105.4 cm³. The intraobserver MAE was 117.0 ± 82.7 cm³. The correlation with the ground truth was higher for the automated measures of lesion volume (0.94) than for the interobserver (O1a vs. O2: 0.88; O1a vs. O3: 0.87) or intraobserver (0.91) measures. For lesion extent, the MAE was lower for the Auto vs. O1a evaluation (2.1 ± 2.4%) than for the interobserver (O1a vs. O2: 3.1 ± 2.9%; O1a vs. O3: 3.9 ± 3.7%) or intraobserver evaluations (3.5 ± 2.7%). The lesion extent correlation r was 0.947 for automated measures versus 0.909, 0.872 and 0.920 for the inter- and intraobserver measures. There were statistically significant biases in lesion extent for the O1a vs. O3 interobserver and intraobserver measures.
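The agreement measures reported above (MAE, bias and correlation between paired measurements) can be computed as in the following sketch. The sign convention for the bias (measured minus reference), the use of the Pearson coefficient and the toy values are assumptions; the text does not state which correlation coefficient was used, and the Wilcoxon signed-rank test is not reproduced here.

```python
import math
from statistics import mean


def agreement(reference, measured):
    """Return MAE, bias and Pearson correlation for paired measurements."""
    diffs = [m - r for r, m in zip(reference, measured)]
    mae = mean(abs(d) for d in diffs)  # mean absolute error
    bias = mean(diffs)                 # mean signed difference

    ref_mean, meas_mean = mean(reference), mean(measured)
    cov = sum((r - ref_mean) * (m - meas_mean)
              for r, m in zip(reference, measured))
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    meas_ss = sum((m - meas_mean) ** 2 for m in measured)
    pearson_r = cov / math.sqrt(ref_ss * meas_ss)
    return mae, bias, pearson_r


# Hypothetical lesion volumes (cm^3): manual reference vs. automated measure.
manual_volumes = [10.0, 20.0, 30.0, 40.0]
auto_volumes = [12.0, 19.0, 33.0, 38.0]

mae, bias, r = agreement(manual_volumes, auto_volumes)
```

A small bias together with a larger MAE, as in this toy example, indicates errors that largely cancel in sign, which is the pattern reported for the automated lesion volume.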
Bland-Altman plots are presented in Appendix D.

There were 227 patients (14%) in the prognostic value dataset who presented with the combined outcome (Table 3). After adjustment for baseline clinical characteristics, the global scores were significantly associated with outcome occurrence ("transfer to ICU and/or death and/or hospitalization ≥10 days and/or oxygen therapy"), and the addition of GGO or Cons did not modify the prognostic prediction for either the human or automatic radiological score. The adjusted odds ratios were 3.02 (95% CI: 2.44; 3.73) for the CT-SS and 3.86 (95% CI: 2.96; 5.05) for automatic quantification. The C-statistic was 0.82 (0.79-0.88) in Model A excluding all radiological scores, 0.89 (0.95-0.93) in Model B including CT-SS and 0.90 (0.86-0.94) in Model C including DL-driven quantification. The differences between Models A and B and between Models A and C were statistically significant (likelihood ratio tests: p<0.001). ROC curve analysis for the DL-driven quantification of lesion extent is shown in Appendix E.

The main finding of the study was that the proposed automatic quantification pipeline provides an accurate and reproducible segmentation of GGOs and consolidations in COVID-19 infection. With respect to the human ground-truth segmentation, the variability of the model was lower than the inter- or intraobserver variability. The presented model was computationally efficient, requiring less than 20 seconds for complete DL-driven segmentation. Its accuracy was similar regardless of the extent of the lesions. Furthermore, the presented data showed that the automatic quantification of lesion extent provides a strong prognostic marker of adverse events during COVID-19 infection.

During the COVID-19 pandemic, diagnostic imaging has had multiple roles, including diagnosis, prognosis, and follow-up [23]. One potential method to obtain a precise evaluation of disease-related lesions and prognosis is to quantify the extent of the lesions.
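The likelihood ratio comparison of nested logistic models reported in the results can be sketched for the simplest case, in which the larger model adds a single parameter (for example, one radiological score without interaction terms). This one-degree-of-freedom setting and the log-likelihood values below are assumptions for illustration; they are not results from the study.

```python
import math


def likelihood_ratio_test_1df(loglik_reduced, loglik_full):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns the test statistic and its p-value under a chi-square
    distribution with 1 degree of freedom. For 1 df, the survival
    function equals erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    if stat < 0:
        raise ValueError("the full model cannot fit worse than the reduced model")
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for a reference model and a model adding one score.
lr_stat, lr_p = likelihood_ratio_test_1df(loglik_reduced=-420.0, loglik_full=-380.0)
```

When interaction terms are retained, the models differ by more than one parameter and the statistic must be referred to a chi-square distribution with the corresponding degrees of freedom.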
This study proposes a distinct segmentation of different COVID-19 lesions, differentiating GGO from consolidation. Most previously published works have focused on automated algorithms that help distinguish COVID-19 infection from other pulmonary infections [24, 25]. One of the main strengths of the present paper was the use of LDCT as input data. COVID-19 patients might undergo multiple CT examinations for diagnosis, follow-up and evaluation of complications of SARS-CoV-2 infection. At a time when LDCT is encouraged in pneumonia diagnosis, automated algorithms should be adapted to these technical modifications [26]. The training dataset had substantial variability in pulmonary lesion extent and disease severity (from 0% to 36%). Another strength of our study was that manual segmentation was conducted on all LDCT images in the training, validation and test datasets. Contrary to many segmentation models, the algorithm and obtained results were tested on all images in the test dataset (15587 images for the 30 patients) rather than on selected slices.

The literature contains a large number of CNN-based methodologies for automatic segmentation of lung abnormalities on CT scans. These works can be divided into three categories: those that base the training on CT scans fully annotated by experts [21, 27], those that make use of weak/noisy labels to lower the annotation load [20, 28, 29] and those using transfer learning to transfer knowledge from non-COVID-19 lesions [30]. Regarding the network architectures, 2D CNNs [20, 21, 27] and 3D CNNs [27, 28, 30] are both represented. Some researchers focus on the details of the architectures and advocate for additional modules, such as attention blocks [21, 29]. Despite the vast number of papers proposing new architectures and modules, 8 out of the 10 finalists in the COVID-19 Lung CT Lesion Segmentation Challenge chose a UNet architecture, as we did [27].

In 2020, Belfiore et al. highlighted the need to quantify the percentage of ventilated lung parenchyma as distinguished from the affected lung parenchyma [31]. Here, we propose a segmentation tool that differentiates normal from affected lungs (GGO and Cons). Cons DSC, volume and extent measures were always lower than the GGO measures, which illustrates the difficulty of producing Cons segmentations. This could be due to the anatomic presentation of COVID-19 consolidations, which mostly have a subpleural distribution and affect the lower segments [9]. Hence, consolidations are sometimes in continuity with the subpleural fat and the chest wall, which can lead to segmentation failure. Consolidations were the only measure whose correlation was lower for the automated measure (Auto vs. O1a) than for the interhuman measure (O1a vs. O3). This finding suggests that our model might fail partially in cases of peripheral and lower lobe consolidation.

Liu et al. proposed CT quantification of pneumonia lesions to predict the progression of severe disease and distinguished three labels: consolidation, semiconsolidation and ground glass [32]. They used a simple threshold to differentiate these labels. The authors obtained a DSC of 0.82 for COVID-19 pneumonia but did not publish the algorithmic details, biases or correlations. Chassagnon et al. presented a COVID-19 segmentation algorithm with a mean lesion DSC between automated and manual segmentation of 0.69 [25]. For GGO lesion segmentation, the DSC of 0.71 ± 0.10 in the present study was below that of Jung et al. for the automated segmentation of GGO (0.78 ± 0.07) [33]. This discordance can probably be explained by the difference in morphological patterns between parenchymal lesions and nodules.

Among all tested factors, age remained the best predictor of clinical outcome.
However, the C-statistic was significantly improved when DL-driven quantification was added for the combined outcome, which confirms the benefit of adding the radiological score to evaluate the prognosis. DL-driven quantification was not superior to the CT-SS in predicting the occurrence of clinical outcomes but did not require any human input. Male sex was no longer a risk factor after adjustment for the CT-SS and automatic CT scores. This was due to a significant difference in CT scores between men and women. The same statistical explanation applies to the hypertension results.

The present code is protected (IDDN.FR.001.220003.000.S.C.2020.000.31235) and can be shared upon the signing of a collaboration agreement.

Our study has some limitations. All CT images were acquired on the same CT scanner in one clinical center. Additionally, the presented algorithm cannot provide a segmentation of the distal vascular and bronchial trees. A future goal of our work should be to include arterial and bronchial segmentation in our algorithm for even more precise lesion segmentation.

A complete DL-driven pipeline for LDCT, which allows minimum radiation exposure, was developed to segment GGOs and consolidation due to COVID-19 lung involvement. The algorithm produces automatic lesion volume and extent measures that can be directly provided to physicians. DL-driven segmentation was more reproducible than human measures, achieving lower biases and mean absolute errors than human inter- and intraobserver comparisons of lesion volume and extent. Lung involvement as quantified by our DL-driven pipeline was significantly associated with the occurrence of adverse events. This framework should be tested on multicenter datasets to evaluate disease severity at the time of the first LDCT evaluation.
Institutional review board statement The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Institutional Review Board of ASSISTANCE-PUBLIQUE DES HOPITAUX DE MARSEILLE (AP-HM) (N°: 2020-0012, RGPD/Ap-Hm: 2020-48).

None.

The authors state that this work has not received any funding.

The scientific guarantor of this publication is Pr. Alexis Jacquier.

One of the authors has significant statistical expertise.

Note: a: Adjusted odds ratios with 95% confidence intervals. b: The C-statistic is a measure of goodness of fit for binary outcomes in a logistic regression model. It is equal to the area under the receiver operating characteristic (ROC) curve and ranges from 0.5 to 1. Models were based on the training set of the prognostic value dataset (n=1135), and the C-statistic was estimated on the validation set (n=486) of the prognostic value dataset. All scores were standardized (mean=0, standard deviation=1) prior to the analysis. LDCT: low-dose computed tomography; ICU: intensive care unit.
Institutional Review Board approval was obtained
Analysis and/or interpretation of data: Sebastien CORTAREDONA, Axel BARTOLI
Drafting the manuscript: Alexis JACQUIER, Matthieu MILLION, Didier RAOULT
Revising the manuscript critically for important intellectual content

[1] A novel coronavirus from patients with pneumonia in China
[2] A novel coronavirus emerging in China - key questions for impact assessment
[3] Primary stratification and identification of suspected Corona virus disease 2019 (COVID-19) from clinical perspective by a simple scoring proposal
[4] CT imaging features of 2019 novel coronavirus (2019-nCoV)
[5] Sensitivity of Chest CT for COVID-19: comparison to RT-PCR
[6] Chest CT in COVID-19 pneumonia: a review of current knowledge
[7] COVID-19 pneumonia: a review of typical CT findings and differential diagnosis
[8] The Clinical and chest CT features associated with severe and critical COVID-19 pneumonia
[9] Association of radiologic findings with mortality of patients infected with 2019 novel coronavirus in Wuhan
[10] A survey on deep learning in medical image analysis
[11] Automated quantification of radiological patterns predicts survival in idiopathic pulmonary fibrosis
[12] Automatic nodule detection for lung cancer in CT images: a review
[13] Full-length title: early treatment of COVID-19 patients with hydroxychloroquine and azithromycin: a retrospective analysis of 1061 cases in Marseille, France
[14] Clinical and microbiological effect of a combination of hydroxychloroquine and azithromycin in 80 COVID-19 patients with at least a six-day follow up: a pilot observational study
[15] Rapid viral diagnosis and ambulatory management of suspected COVID-19 cases presenting at the infectious diseases referral hospital
[16] Chest CT severity score: an imaging tool for assessing severe COVID-19
[17] 3D Slicer as an image computing platform for the quantitative imaging network
[18] Fleischner society: glossary of terms for thoracic imaging
[19] Methodologic guide for evaluating clinical performance and effect of artificial intelligence technology for medical diagnosis and prediction
[20] A noise-robust framework for automatic segmentation of COVID-19 pneumonia lesions from CT images
[21] SCOAT-Net: a novel network for segmenting COVID-19 lung opacification from CT images
[22] Applied logistic regression
[23] Favorable changes of CT findings in a patient with COVID-19 pneumonia after treatment with tocilizumab
[24] Artificial intelligence distinguishes COVID-19 from community acquired pneumonia on chest CT
[25] AI-Driven CT-based quantification, staging and short-term outcome prediction of COVID-19 pneumonia
[26] Impact of ultra-low dose CT acquisition on semi-automated RECIST tool in the evaluation of malignant focal liver lesions
[27] Rapid artificial intelligence solutions in a pandemic - the COVID-19-20 lung CT lesion segmentation challenge
[28] Federated semisupervised learning for COVID region segmentation in chest CT using multinational data from China, Italy
[29] Inf-Net: automatic COVID-19 lung infection segmentation from CT images
[30] Does non-COVID-19 lung lesion help? investigating transferability in COVID-19 CT image segmentation
[31] Artificial intelligence to codify lung CT in Covid-19 patients
[32] CT quantification of pneumonia lesions in early days predicts progression to severe illness in a cohort of COVID-19 patients
[33] Ground-glass nodule segmentation in chest CT images using asymmetric multi-phase deformable model and pulmonary vessel removal

Authors would like to thank all the paramedical staff of our department who are managing the COVID-19 crisis with professionalism and effectiveness.

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.redii.2022.100003.

The authors of this manuscript declare no relationships with any companies, whose products or services may be related to the subject matter of the article.