key: cord-0746795-tnyvzhil
authors: Amin, Javaria; Anjum, Muhammad Almas; Sharif, Muhammad; Saba, Tanzila; Tariq, Usman
title: An intelligence design for detection and classification of COVID19 using fusion of classical and convolutional neural network and improved microscopic features selection approach
date: 2021-05-08
journal: Microsc Res Tech
DOI: 10.1002/jemt.23779
sha: 05f1ae7c84a73387d7e6ff93949e489442011680
doc_id: 746795
cord_uid: tnyvzhil
Coronavirus disease 2019 (COVID-19) involves infection of the respiratory system. It is caused by an RNA virus that can infect both animals and humans. In its severe stage, it causes pneumonia in humans. In this research, hand-crafted and deep microscopic features are used to classify lung infection. The proposed work consists of two phases. In phase I, the infected lung region is segmented using the proposed U-Net deep learning model. Hand-crafted features, namely histogram of oriented gradients (HOG), noise-to-harmonic ratio (NHr), and segmentation-based fractal texture analysis (SFTA), are extracted from the segmented image, and the optimum features are selected from each feature vector using entropy. In phase II, local binary patterns (LBP), speeded-up robust features (SURF), and deep learning features are extracted from the input CT images, the deep features using pretrained networks such as InceptionV3 and ResNet101, and the optimum features are again selected using entropy. Finally, the entropy-selected features are fused in two ways: (i) the hand-crafted features (HOG, NHr, SFTA, LBP, SURF) are horizontally concatenated, and (ii) the hand-crafted features (HOG, NHr, SFTA, LBP, SURF) are fused with the deep features. The fused feature vector is passed to ensemble models (boosted tree, bagged tree, and RUSBoosted tree) for COVID-19 classification in two configurations: (i) classification using the fused hand-crafted features, and (ii) classification using the fusion of hand-crafted and deep features.
The proposed methodology is evaluated on three benchmark datasets. Two datasets are employed for the experiments, and the results show that fusing hand-crafted and deep microscopic features provides better results than fusing hand-crafted features alone.
RT-PCR is not effective because its testing process is time-consuming, which may delay the proper treatment of patients (Fang et al., 2020; Khan et al., 2021). Computed Tomography (CT) is an effective, noninvasive imaging protocol (Huang et al., 2020). It is an essential diagnostic tool for early screening; however, it is difficult to differentiate COVID-19 features from those of other pneumonia types. Early screening helps to diagnose and quarantine the patient and to control the spread of the disease.
Object detection and image classification are central tasks in deep learning. Artificial intelligence is applied to data to recognize and analyze patterns (Rehman, Sabad, et al., 2021; Wang, Hu, et al., 2020). It can therefore be beneficial for COVID-19 detection and classification. Different AI methods have been used for COVID-19 detection. However, there is still a gap in this domain, because small lesion regions are detected as normal regions and existing algorithms fail when a lesion appears at the border of the lung region (Haimed, Saba, Albasha, Rehman, & Kolivand, 2021; Saba, 2021). Optimal feature extraction and selection is another challenge, and noisy features reduce the overall model accuracy (Acharya et al., 2019; Albahri et al., 2020a, 2020b). Therefore, this study presents a new approach based on semantic segmentation and on the analysis of fused hand-crafted and deep features. The core contributions of the proposed method are:
1. The affected lung region is segmented using a modified U-Net deep learning model.
2.
Hand-crafted and deep features are extracted, the optimal features are selected using entropy, and the selected features are fused serially and supplied to the ensemble learners.
Artificial intelligence (AI) approaches are commonly used in biomedical research (Kang et al., 2020; Saba, 2019; Saba, Bokhari, Sharif, Yasmin, & Raza, 2018; Salman, Ahmed, Khan, Raza, & Latif, 2017; Wang, Muhammad, et al., 2020; Wang, Sun, et al., 2020; Wang, Tang, et al., 2019). AI methods are used in applications such as object detection, segmentation, and classification (Deepa, Devi, & Technology, 2011). COVID-19 patients suffer from pneumonia because the RNA virus infects the lung region (Lai, Shih, Ko, Tang, & Hsueh, 2020). Deep learning is widely used to detect infected lung regions (Shan et al., 2020). Deep learning models are used in three ways: pretrained with fine-tuning, pretrained without fine-tuning, and trained from scratch. Among pretrained models, InceptionV3, VGG16, SqueezeNet, MobileNetV2, and ResNet-50 are widely used for pneumonia detection (Kassani, Kassasni, Wesolowski, Schneider, & Deters, 2020). A large body of work addresses COVID-19 detection; however, there is still a need for an optimized feature extraction and selection approach that supports accurate detection of COVID-19 (Albahri et al., 2020a, 2020b). Therefore, in this research, hand-crafted and deep features are extracted and fused for better classification of COVID-19. The existing literature on COVID-19 is summarized in Table 1.
In this retrospective study, a U-Net-based semantic deep learning model is proposed for segmenting the infected lung region.
In the proposed method, hand-crafted and deep features are extracted for COVID-19 classification.
Five hand-crafted features are extracted: noise-to-harmonic ratio (NHr), histogram of oriented gradients (HOG), local binary patterns (LBP), SFTA, and speeded-up robust features (SURF). The best features are selected based on their maximum entropy scores and fused serially.
The NHr features (Kim, Moreau, & Sikora, 2006) are extracted from the segmented region. NHr is a strong feature for capturing variation in the lesion region, because lung lesions vary in shape and size. The NHr features are defined in terms of N, the number of samples, and M, the maximum lag. The NHr feature vector has length 1 × 128 and is illustrated in Figure 4.
SURF features, expressed in terms of points, orientation, and scale, are extracted from the input CT images and used to detect interest points. The feature vector has length 1 × 96, as illustrated in Figure 5.
The lesion regions have an irregular shape. Therefore, HOG (shape-based) features are extracted from the segmented lesion region. The feature vector has length 1 × 8,100 and is shown in Figure 6.
The lung CT images vary in texture; therefore, LBP features are extracted from the gray-level images. The feature vector has length 1 × 59 and is shown in Figure 7.
3.2.6 | SFTA
SFTA features are extracted from the segmented lesion region, which creates its own segment pattern and helps classification. The feature vector has length 1 × 22.
Features are selected using entropy, as given in Equation (2), from the features extracted from the input images. The complete process of feature extraction, selection, and fusion is shown in Figure 10.
The resulting fused feature vector is fed to the classifiers. The ensemble classifiers are used with three kernels, namely a boosted tree (Freund, 2009), a bagged tree (Breiman, 1996), and a RUSBoosted tree (Seiffert, Khoshgoftaar, Van Hulse, & Napolitano, 2008).
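The entropy-based selection and serial fusion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the text does not reproduce Equation (2), so the sketch assumes that each feature column is scored by the Shannon entropy of an equal-width histogram of its values across samples and that the k highest-scoring columns are kept. The function names (`column_entropy`, `select_by_entropy`, `fuse_serially`), the bin count, and the toy data are hypothetical.

```python
import math
from collections import Counter


def column_entropy(values, bins=8):
    """Shannon entropy (in bits) of one feature column, from an equal-width histogram."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant feature carries no information
    width = (hi - lo) / bins
    # Assign each value to a bin; the maximum value is clamped into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def select_by_entropy(matrix, k, bins=8):
    """Indices of the k highest-entropy columns (assumed criterion), in ascending order."""
    n_cols = len(matrix[0])
    scores = [column_entropy([row[j] for row in matrix], bins) for j in range(n_cols)]
    top = sorted(range(n_cols), key=lambda j: scores[j], reverse=True)[:k]
    return sorted(top)


def fuse_serially(*blocks):
    """Serial (horizontal) fusion: concatenate the per-sample rows of each descriptor."""
    return [[value for row in rows for value in row] for rows in zip(*blocks)]


# Toy data: 4 samples, a 3-column "HOG-like" block and a 1-column "LBP-like" block.
hog = [[5, 0, 0], [5, 0, 1], [5, 1, 2], [5, 1, 3]]
lbp = [[9], [8], [7], [6]]

keep = select_by_entropy(hog, k=2)  # the constant column 0 is dropped
hog_selected = [[row[j] for j in keep] for row in hog]
fused = fuse_serially(hog_selected, lbp)
print(keep)      # [1, 2]
print(fused[0])  # [0, 0, 9]
```

In this sketch, selection is applied to each descriptor separately before fusion, mirroring the description that optimum features are selected from each feature vector and then concatenated.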
The ensemble classifiers are built by combining weak decision tree classifiers to create a powerful prediction model. A brief overview of the selected classifiers and their learning parameters is given in Table 3.
The proposed research is evaluated using three benchmark datasets. The proposed model is implemented in MATLAB on a Core i7 CPU with an Nvidia graphics card. Two experiments are performed to test the performance of the proposed method. In Experiment 1, the pixel-based segmentation results are evaluated against the ground truth. In Experiment 2, classification is performed using the fused hand-crafted features and using the fusion of hand-crafted and deep features. In this experiment, the training/testing data is split using 10-fold cross-validation. The performance of the proposed approach is assessed using the performance measures described in Table 5.
In the segmentation experiment, the segmented lesion region is evaluated against the ground truth. IoU is the ratio of the overlap between the segmented region $S$ and the ground truth $G$ to their union: $\mathrm{IoU} = \frac{|S \cap G|}{|S \cup G|}$. The segmentation model is trained with the selected parameters, which were finalized after extensive experiments and are listed in Table 6.
The classification experiment is performed by extracting the hand-crafted features NHr, SFTA (Costa, Humpire-Mamani, & Traina, 2012), HOG (Dalal & Triggs, 2005), LBP (Ojala, Pietikainen, & Maenpaa, 2002), and SURF (Bay, Ess, Tuytelaars, & Van Gool, 2008). Informative features are selected using entropy and fused serially into a vector of length 1 × 4,055, which is provided to the ensemble method with three powerful learning models: the boosted, bagged, and RUSBoosted trees. The ensemble model prediction is shown in Figure 13. The quantitative classification outcomes are also reported. A feature vector of length 1 × 4,255 is passed to the ensemble learning classifiers for classification. The results with class labels are presented in Figure 14, where d denotes the disease class and h the healthy class.
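As a minimal sketch of the pixel-based evaluation in Experiment 1, the IoU between a predicted lesion mask and its ground truth could be computed as shown here. The function name `iou`, the nested-list representation of binary masks, and the convention that two empty masks score 1.0 are illustrative assumptions; the authors' own evaluation was carried out in MATLAB and is not reproduced in the text.

```python
def iou(pred_mask, true_mask):
    """Intersection over union of two equally sized binary masks (nested lists of 0/1)."""
    same_shape = len(pred_mask) == len(true_mask) and all(
        len(p) == len(t) for p, t in zip(pred_mask, true_mask)
    )
    if not same_shape:
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1  # pixel is lesion in both masks
            if p or t:
                union += 1  # pixel is lesion in at least one mask

    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if union == 0 else intersection / union


# Toy 2 x 3 masks: 2 overlapping lesion pixels, 4 lesion pixels in the union.
predicted = [[0, 1, 1],
             [0, 1, 0]]
ground_truth = [[0, 1, 0],
                [0, 1, 1]]
print(iou(predicted, ground_truth))  # 0.5
```

An IoU of 1.0 means the segmented region matches the ground truth exactly, and 0.0 means the two regions do not overlap at all.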
The model prediction is performed using the combination of classical and deep features, as illustrated in Figure 15. The empirical results are provided in Tables 11-13. The ROC curves and AUC values are also plotted in Figure 16. In Figure 16, the maximum achieved AUC is 1.00 using the bagged tree.
Automated detection of Alzheimer's disease using brain MRI images-a study with various feature extraction techniques
Systematic review of artificial intelligence techniques in the detection and classification of COVID-19 medical images in terms of evaluation and benchmarking: Taxonomy analysis, challenges, future solutions and methodological aspects
Systematic review of artificial intelligence techniques in the detection and classification of COVID-19 medical images in terms of evaluation and benchmarking: Taxonomy analysis, challenges, future solutions and methodological aspects
Computer Vision and Image Understanding
Bagging predictors
Computer-aided covid-19 patient screening using chest images (X-Ray and CT scans)
An Efficient Algorithm for Fractal Analysis of Textures
Histograms of oriented gradients for human detection
A survey on artificial intelligence approaches for medical image classification
Targeted Self Supervision for Classification on a Small COVID-19 CT Scan Dataset
Sensitivity of chest CT for COVID-19: comparison to RT-PCR
A more robust boosting algorithm
Viral reverse engineering using artificial intelligence and big data COVID-19 infection with long short-term memory (LSTM)
Delving Deep Into Rectifiers: Surpassing Human-level Performance on Imagenet Classification
Deep Residual Learning for Image Recognition
First case of 2019 novel coronavirus in the United States
COVID-19 detection through transfer learning using multimodal imaging data
Clinical features of patients infected with 2019 novel coronavirus in
A heuristic neural network structure relying on fuzzy logic for images scoring
Automatic Detection of Coronavirus Disease (COVID-19
Prediction of COVID-19-Pneumonia based on selected deep features and one class kernel extreme learning machine, computers & electrical engineering
MPEG-7 audio and beyond: Audio content indexing and retrieval
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and corona virus disease-2019 (COVID-19): The epidemic and the challenges
Multiresolution gray-scale and rotation invariant texture classification with local binary patterns
Deep learning-based COVID-19 detection using CT and X-ray images: Current analytics and comparisons
Real-time diagnosis system of COVID-19 using X-ray images and deep learning
Automated lung nodule detection and classification based on multiple classifiers voting
Computer vision for microscopic skin cancer diagnosis using hand-crafted and non-handcrafted features
Machine learning techniques to detect and forecast the daily total COVID-19 infected and deaths cases under different lockdown types
Fundus image classification methods for the detection of glaucoma: A review
Artificial intelligence in bio-medical domain
RUSBoost: Improving Classification Performance When Training Data is Skewed
Lung infection quantification of covid-19 in ct images with deep learning
Rethinking the Inception Architecture for Computer Vision
Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan
Cerebral micro-bleeding identification based on a nine-layer convolutional neural network with stochastic pooling
Cerebral micro-bleeding detection based on densely connected neural network
Alcoholism identification via convolutional neural network based on parametric ReLU, dropout, and batch normalization
Unilateral sensorineural hearing loss identification based on double-density dual-tree complex wavelet transform and multinomial logistic regression
Contrastive cross-site learning with redesigned net for COVID-19 CT classification
COVID-CT-dataset: a CT scan dataset about COVID-19. ArXiv e-prints
Covid-CT-dataset: A CT scan dataset about covid-19
No specific funding was received for this research. All authors declare that there is no conflict of interest. All authors contributed equally to this manuscript. The data that support the findings of this study are openly available in (covid-CT-dataset) at (https://covid-19.conacyt.mx/jspui/handle/1000/4157).
https://orcid.org/0000-0001-6718-3866