key: cord-0838684-64xoz01d authors: Kumar, S.N.; Ahilan, A.; Fred, A. Lenin; Kumar, H. Ajay title: ROI extraction in CT lung images of COVID-19 using Fast Fuzzy C means clustering date: 2021-06-11 journal: Biomedical Engineering Tools for Management for Patients with COVID-19 DOI: 10.1016/b978-0-12-824473-9.00001-x sha: fcf1e0cedf676ab35c4b28002351e0ebd9ad994f doc_id: 838684 cord_uid: 64xoz01d

The outbreak of coronavirus is intense in most countries around the world. Region of Interest (ROI) extraction in medical images plays a vital role in disease diagnosis and therapeutic planning. Clustering is extensively used in data mining applications for the grouping of data. CT imaging is one of the diagnostic tools for COVID-19 and serves as a primary screening tool prior to confirmation by reverse-transcription polymerase chain reaction (RT-PCR) lab testing. The FCM algorithm has gained importance in medical image processing for the segmentation of anomalies. This research work proposes Fast Fuzzy C means clustering for ROI extraction in CT lung images of coronavirus pneumonia. Prior to segmentation, preprocessing was performed with a median filter. The fast FCM was validated using the partition coefficient and partition entropy. The computational complexity of the Fast Fuzzy C means algorithm was low when compared with the classical FCM algorithm.

vital since it should not induce any artifacts, thereby blurring the resultant image. Fuzzy clustering is a data mining concept that has gained importance in widespread applications (Atkinson, 2005; Kannan, Ramathilagam, Sathya, & Pandiyarajan, 2010). A hybrid algorithm comprising Harmony search and Otsu thresholding was proposed for the extraction of COVID-19 infected regions on X-ray images. From the extracted ROI, the infection rate was calculated. The lung region was segmented using a multiclass segmentation model based on deep multitask learning. 
The model was evaluated with metrics such as Dice, sensitivity, specificity, and MAE, and outperformed other deep learning network models (Elharrouss, Subramanian, & Al-Maadeed, 2020). Kadry et al. (2020) found that an SVM gave a higher classification accuracy (89.80%) for distinguishing normal and COVID-19 lung CT images than classifiers such as Naive Bayes, KNN, Decision Tree, and Random Forest. The images were first enhanced with the Chaotic-Bat-Algorithm and Kapur's entropy thresholding. A fully automated deep neural network, COVID-CXNet, was proposed for the detection of COVID-19 in chest X-ray images; it achieved an accuracy of 87.21% and an F-score of 0.92, better than other related works (Haghanifar, Majdabadi, & Ko, 2020). For the initial screening of patients affected with COVID-19, a CNN-based deep learning method was proposed (Wang et al., 2020). Features randomly selected from the ROIs were trained with a fully connected layer of the M-Inception network model, and predictions were made with Decision Tree and Adaboost classifiers.

A COVID-19 detection neural network (COVNet) was developed with 4356 chest CT datasets for the detection of COVID-19, community-acquired pneumonia, and other chest diseases. The deep learning model identified COVID-19 with an area under the ROC curve of 0.96 and community-acquired pneumonia with an area under the ROC curve of 0.95. A CNN-based ResNet-50 model was suggested for the classification of COVID-19 and non-COVID-19 CT images. The classification model incorporated gradient-weighted class activation mapping for visual verification, and applying a wavelet transformation to the input CT images gave better accuracy, sensitivity, specificity, PPV, NPV, F1 score, and Matthews Correlation Coefficient than the direct CNN model (Matsuyama, 2020). Satapathy et al. (2020) proposed cuckoo search optimization-based multithresholding for the segmentation of COVID-19 lesions on CT images. The hybrid technique comprising Chan Vese and Otsu thresholding generated efficient results when compared with the other algorithms. A deep reinforcement learning algorithm was proposed (Chaganti et al., 2020) for the automatic quantification of COVID-19 lesions on CT images. The model generates 3D contours of the COVID-19 lesions on CT images, with results validated by correlation and regression techniques.

For the detection of COVID-19 in chest X-ray images, a Parallel-Dilated COVIDNet (PDCOVIDNet), which applies convolutions with different dilation rates in parallel to detect features, was proposed (Chowdhury, Rahman, & Kabir, 2020). A transfer learning deep CNN model for the diagnosis and detection of COVID-19 in X-ray images was proposed by Apostolopoulos and Mpesiana (2020). This model was validated with X-ray images of confirmed COVID-19, bacterial pneumonia, and normal subjects, and achieved an overall accuracy of 96.78%, sensitivity of 98.66%, and specificity of 96.46% with 10-fold cross-validation. A deep residual U-Net architecture was proposed for the volumetric extraction of ROI on COVID-19 CT images; a mean Dice coefficient of 0.85 was obtained in comparison with the existing techniques (Usman et al., 2020). A CNN coupled with a clustering technique was employed for the segmentation of lung parenchyma in CT images (Xu et al., 2019). A novel deep learning architecture termed Inf-Net was proposed for the automatic segmentation of COVID-19 lesions on CT images, and produced efficient results when compared with the existing models (Fan et al., 2020).

Section 6.2 describes the Fast Fuzzy C means clustering, Section 6.3 describes the results and discussion, and finally the conclusions are drawn in Section 6.4. 
The Fast Fuzzy C means clustering ROI extraction is proposed in this work for lesion detection in COVID-19 CT images. The CT images are taken from http://coronacases.org and comprise 10 data sets of confirmed coronavirus disease. The results of typical slices from the data set are depicted here.

Clustering algorithms have gained importance in image segmentation. They fall into two categories: hard clustering and soft clustering. The K-means algorithm is a hard clustering technique and FCM is a soft clustering technique. Many modifications of conventional K-means and FCM have been proposed to improve the segmentation result. The conventional FCM algorithm produces a good result for noise-free images. The cluster centroids are initialized randomly in the classical FCM algorithm, which therefore often gets stuck at a local minimum. The Fuzzy C means algorithm was first developed by Dunn and modified by Bezdek (Dunn, 1973; Hathaway & Bezdek, 1988). The objective function of FCM is given in Eq. (6.1).

$$J = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, d_{ij}^{2} \qquad (6.1)$$

where N is the number of pixels, C is the number of clusters, m is the fuzzy weighted exponent factor (m > 1), $d_{ij}$ is the Euclidean distance between the pixel ($x_i$) and the cluster center ($V_j$), and $u_{ij}$ is the membership function. The Euclidean distance is represented in Eq. (6.2).

$$d_{ij} = \lVert x_i - V_j \rVert \qquad (6.2)$$

The membership function satisfies the following constraint, represented in Eq. (6.3).

$$\sum_{j=1}^{C} u_{ij} = 1, \qquad u_{ij} \in [0, 1] \qquad (6.3)$$

From the perspective of image processing, clustering is the process of grouping pixels into classes whose members are similar. Each pixel is assigned to a class based on its membership function. In FCM, a pixel can belong to more than one cluster, with a degree of membership that lies in the range 0 to 1. The Euclidean distance is the commonly used distance metric, and clustering seeks to minimize it. The steps of the FCM algorithm for segmentation are summarized as follows.

Step 1: Divide the input medical image of size X × Y pixels into C clusters (C ≥ 2).

Step 2: Fix the cluster fuzziness value (q), which plays the role of the exponent m in Eq. (6.1); it must be a real number greater than 1 and its default value is 2.

Step 3: Initialize the membership matrix U at random, such that $U_{ijp} \in [0, 1]$ and $\sum_{j=1}^{C} U_{ijp} = 1.0$ for each i and a fixed value of p.

Step 4: Estimate the cluster center $C_{jp}$ for the j-th cluster.

$$C_{jp} = \frac{\sum_{i} U_{ijp}^{q} \, x_{ip}}{\sum_{i} U_{ijp}^{q}} \qquad (6.4)$$

Step 5: Estimate the Euclidean distance between the i-th data point and the j-th cluster center.

$$ED_{ijp} = \lVert x_{ip} - C_{jp} \rVert \qquad (6.5)$$

Step 6: Update the fuzzy membership matrix U according to $ED_{ijp}$. If $ED_{ijp} > 0$, then

$$U_{ijp} = \frac{1}{\sum_{k=1}^{C} \left( ED_{ijp} / ED_{ikp} \right)^{2/(q-1)}} \qquad (6.6)$$

If $ED_{ijp} = 0$, then the data point coincides with the j-th cluster center $C_{jp}$ and has the full membership value, that is, $U_{ijp} = 1.0$.

Step 7: Repeat Step 4 to Step 6 until the change in U is less than ε, where ε is a prespecified termination criterion.

The Fast Fuzzy C means clustering proposed in this work is based on histogram analysis (Caldairou, Passat, Habas, Studholme, & Rousseau, 2011). The histogram is a plot of the count of pixels with a particular gray value (for an 8-bit gray scale image, 0 ≤ gray value ≤ 255). Consider an image of arbitrary size $X_i \times Y_j$; F(i, j) is the gray level intensity at the pixel position (i, j). Here h(k) denotes the number of pixels having gray value k (k = 1, 2, ..., L − 1), where L is the number of gray levels in the image. The computation procedure is similar to that of the conventional FCM; however, the number of data handled is reduced. For a gray scale image of size 512 × 512 with 8-bit gray levels (L = 256), the Fast FCM algorithm involves 1024 data points ((512 × 512)/256). Thus in the improved FCM, the number of data is reduced from $X_i \times Y_j$ to $(X_i \times Y_j)/L$. The interval $[l_i, h_i]$ for the Fast Fuzzy C means clustering was estimated with the technique used by Zhang, Zhang, Tang, and Wei (2012). 
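The iteration in Steps 1 to 7 and the histogram idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the chapter's Matlab implementation: the function names `fcm` and `fast_fcm`, the fixed random seed, and the iteration cap are hypothetical choices, and the fast variant shown here is the generic histogram-weighted form of FCM (one data point per gray level, weighted by its pixel count). The interval-based reduction attributed to Zhang et al. (2012) is not reproduced.

```python
import numpy as np


def fcm(values, weights=None, n_clusters=3, q=2.0, eps=1e-5, max_iter=200, seed=0):
    """Fuzzy C-means on 1-D data (illustrative sketch).

    `weights` lets one value stand for many pixels, which is how a
    histogram-based variant can reuse the same loop.
    """
    x = np.asarray(values, dtype=float)
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
    rng = np.random.default_rng(seed)

    # Step 3: random memberships, each row normalized to sum to 1.
    u = rng.random((x.size, n_clusters))
    u /= u.sum(axis=1, keepdims=True)

    for _ in range(max_iter):
        # Step 4: cluster centers as membership-weighted means.
        uq = (u ** q) * w[:, None]
        centers = (uq * x[:, None]).sum(axis=0) / uq.sum(axis=0)

        # Step 5: distance of every data point to every center.
        dist = np.abs(x[:, None] - centers[None, :])

        # Step 6: membership update; zero distances get full membership.
        zero = dist == 0
        inv = np.where(zero, 1.0, dist) ** (-2.0 / (q - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        hit = zero.any(axis=1)
        new_u[hit] = zero[hit] / zero[hit].sum(axis=1, keepdims=True)

        # Step 7: stop once the memberships no longer change noticeably.
        converged = np.abs(new_u - u).max() < eps
        u = new_u
        if converged:
            break
    return centers, u


def fast_fcm(image, n_clusters=3, q=2.0, levels=256):
    """Histogram-weighted FCM: cluster gray levels instead of pixels."""
    img = np.asarray(image).astype(np.int64)
    hist = np.bincount(img.ravel(), minlength=levels)  # h(k): pixels per gray level
    gray = np.arange(hist.size)
    centers, u = fcm(gray, weights=hist, n_clusters=n_clusters, q=q)
    # Look up the most likely cluster of each pixel from its gray level.
    labels = u.argmax(axis=1)[img]
    return centers, labels
```

Calling `fcm(image.ravel())` runs the classical pixel-wise algorithm, whereas `fast_fcm(image)` runs the same update over at most `levels` histogram bins regardless of image size, which is where the saving in computation comes from in this sketch.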
The steps in Fast FCM are similar to those of conventional FCM; however, the cluster centroid update and the centroid initialization are expressed differently. The computational complexity of the Fast Fuzzy C means clustering algorithm is low when compared with the classical FCM algorithm.

The algorithms were developed in Matlab 2015a and tested on real-time lung CT DICOM images. Prior to segmentation, preprocessing was performed with a median filter of kernel size 3 × 3, and the number of classes was set to three for all input images. The parameters of the Fast Fuzzy C means clustering algorithm are C = 3 and q = 2. C represents the number of clusters and its default value is 2; q represents the fuzzy weighting exponent and its default value is 2. The proposed Fast Fuzzy C means clustering requires only one tunable parameter, C, and in most cases its value is set to 3. Fig. 6.1 depicts the input images; Figs. 6.2 and 6.3 depict the clustering outputs corresponding to three clusters.

In this chapter, 12 DICOM CT images of lungs are taken for the analysis of the Fast Fuzzy C means algorithm. For the first six DICOM CT images (ID1-ID6), the intermediate outputs and the fuzzy membership function are plotted. For the second set of six DICOM images (ID7-ID12), the final extracted ROI is depicted. The Fast Fuzzy C means algorithm was found to be superior in performance to the classical FCM algorithm and was validated in terms of the performance metrics (Figs. 6.4 and 6.5). The fuzzy class membership detects the three intensity peaks (marked in red, green, and blue). The intensity peaks are used for partitioning the data into three classes, thereby generating the ROI (Figs. 6.6 and 6.7).

There are various metrics for evaluating the quality of a clustering algorithm. This research work employs the Partition Coefficient (PC) and Partition Entropy (PE) measures. 
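Both indices are computed directly from the membership matrix. The sketch below assumes the standard definitions due to Bezdek, namely PC as the mean of the squared memberships summed over clusters and PE as the mean of the negative membership-weighted log-memberships; the function names, the use of the natural logarithm, and the small clipping constant `tiny` are illustrative assumptions rather than details taken from the chapter.

```python
import numpy as np


def partition_coefficient(u):
    """PC: 1.0 for a crisp partition, 1/C for a completely fuzzy one."""
    u = np.asarray(u, dtype=float)  # shape (n_points, n_clusters), rows sum to 1
    return float((u ** 2).sum() / u.shape[0])


def partition_entropy(u, tiny=1e-12):
    """PE: 0.0 for a crisp partition, log(C) for a completely fuzzy one."""
    u = np.asarray(u, dtype=float)
    # Clip before the log so that zero memberships contribute exactly zero.
    return float(-(u * np.log(np.clip(u, tiny, 1.0))).sum() / u.shape[0])
```

A membership matrix with one-hot rows gives PC = 1 and PE = 0, while rows of equal memberships over three clusters give PC = 1/3 and PE = ln 3, which is why a high PC together with a low PE indicates a well-separated partition.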
The cluster number for all the images is chosen as 3. The expression for PC is as follows

$$PC = \frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{2}$$

The expression for PE is as follows

$$PE = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij} \log u_{ij}$$

where N is the number of data points, C is the number of clusters, and $u_{ij}$ is the membership of the i-th data point in the j-th cluster. The PC measures the average strength of membership of the data, and a good clustering algorithm should have a high value of PC. The PE measures the entropy of the fuzzy partition, and a good clustering algorithm should have a low value of PE. From Fig. 6.8, it is evident that the PC value is high and the PE value is low, indicating the efficiency of the clustering algorithm.

For the evaluation of computational complexity, the FCM and Fast FCM algorithms are compared. The system specification is as follows: 1.8 GHz Intel Core i5 processor with 8 GB of 1600 MHz DDR3 memory, running the macOS High Sierra operating system. From Fig. 6.9, it is evident that Fast FCM has a very low computation time when compared with the classical FCM algorithm.

The Fast Fuzzy C means clustering algorithm is proposed in this work for COVID-19 lesion detection in CT images. An overview of segmentation algorithms for medical images is given by Pham et al. (2000). The classical algorithms used in medical image segmentation are thresholding, region growing, and edge detection. The results of the classical algorithms for the input images are depicted here. The thresholding outputs corresponding to the input images ID1 to ID6 are depicted in Fig. 6.10. The region growing outputs corresponding to the input images ID1 to ID6 are depicted in Fig. 6.11. The Canny edge detector outputs corresponding to the input images ID1 to ID6 are depicted in Fig. 6.12. Pham et al. (2000) evaluated ten widely used segmentation algorithms for medical images and estimated an overall score based on their characteristics. The clustering algorithm was found to have an excellent score when compared with the other techniques. 
The traditional thresholding technique is sensitive to noise, and in the case of region growing, the seed points have to be manually selected by the user. The Canny edge detector traces only the boundary of the ROI, and the watershed segmentation algorithm suffers from undersegmentation and oversegmentation. The Fast Fuzzy C means clustering algorithm is an automatic ROI extraction technique and generates proficient results when compared with the results of the classical segmentation algorithms. The future work will be the classification of the ROI using a novel deep learning technique for the detection of the stage of the disease.

This research work proposes a Fast Fuzzy C means clustering algorithm for ROI extraction in CT images of lungs affected by coronavirus. The Fast FCM algorithm based on histogram analysis yields efficient ROI extraction results, and its computational complexity is low when compared with the classical FCM algorithm. CT is being considered as a primary screening tool for COVID-19 prior to RT-PCR lab testing, so ROI extraction gains much importance. The automatic ROI extraction proposed in this research work yields proficient results for 2D DICOM CT images. The future work is the development of a deep learning model for the automatic classification of COVID-19 CT images based on the severity of the regions affected in the lung CT DICOM images. The ROI extracted by the Fast Fuzzy C means algorithm can be used as an input to the deep learning classification model for the automatic classification.

The author S.N. Kumar would also like to acknowledge the support provided by Schmitt Centre for Biomedical Instrumentation (SCBMI) of Amal Jyothi College of Engineering.

Covid-19: Automatic detection from x-ray images utilizing transfer learning with convolutional neural networks
Sub-pixel target mapping from soft-classified, remotely sensed imagery. Photogrammetric Engineering & Remote Sensing
A non-local fuzzy segmentation method: Application to brain MRI
Automated quantification of CT patterns associated with COVID-19 from chest CT
PDCOVIDNet: A parallel-dilated convolutional neural network architecture for detecting COVID-19 from chest X-ray images
A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters
An encoder-decoder-based method for COVID-19 lung infection segmentation
Inf-Net: Automatic COVID-19 lung infection segmentation from CT images
Fuzzy-crow search optimization for medical image segmentation. Applications of hybrid metaheuristic algorithms for image processing
A method for modeling noise in medical images
COVID-CXNet: Detecting COVID-19 in frontal chest X-ray images using deep learning
Recent convergence results for the fuzzy c-means clustering algorithms
Development of a machine-learning system to classify lung CT scan images into normal/COVID-19 class
Effective fuzzy c-means based kernel function in segmenting medical images
An overview of segmentation algorithms for the analysis of anomalies on medical images
Artificial intelligence distinguishes COVID-19 from community acquired Pneumonia on chest CT
Coronavirus disease 2019 (COVID-19): Role of chest CT in diagnosis and management
A deep learning interpretable model for novel Coronavirus disease (COVID-19) screening with chest CT images
Correlation between universal BCG vaccination policy and reduced morbidity and mortality for COVID-19: An epidemiological study
Current methods in medical image segmentation
Harmony-search and otsu based system for coronavirus disease (COVID-19) detection using lung CT scan images
Segmentation and evaluation of COVID-19 lesion from CT scan slices-A study with Kapur/Otsu function and Cuckoo Search Algorithm
Image processing, analysis, and machine vision
Volumetric lung nodule segmentation using adaptive roi with multi-view residual learning
A deep learning algorithm using CT images to screen for Corona Virus Disease (COVID-19). MedRxiv
Segmentation of lung parenchyma in CT images using CNN trained with the clustering algorithm-generated dataset
Medical image segmentation using improved FCM