key: cord-286887-s8lvimt3 authors: Nour, Majid; Cömert, Zafer; Polat, Kemal title: A Novel Medical Diagnosis model for COVID-19 infection detection based on Deep Features and Bayesian Optimization date: 2020-07-28 journal: Appl Soft Comput DOI: 10.1016/j.asoc.2020.106580 sha: doc_id: 286887 cord_uid: s8lvimt3
A pneumonia of unknown cause, which was detected in Wuhan, China, and spread rapidly throughout the world, was named Coronavirus disease 2019 (COVID-19). Thousands of people have lost their lives to this disease, and its negative effects on public health are ongoing. In this study, an intelligent computer-aided model that can automatically detect positive COVID-19 cases is proposed to support daily clinical applications. The proposed model is based on the convolutional neural network (CNN) architecture and can automatically reveal discriminative features on chest X-ray images through its convolution with rich filter families, abstraction, and weight-sharing characteristics. Contrary to the generally used transfer learning approach, the proposed deep CNN model was trained from scratch. Instead of pre-trained CNNs, a novel serial network consisting of five convolution layers was designed. This CNN model was utilized as a deep feature extractor. The extracted deep discriminative features were fed to three machine learning algorithms: k-nearest neighbor, support vector machine (SVM), and decision tree. The hyperparameters of the machine learning models were optimized using the Bayesian optimization algorithm. The experiments were conducted on a public COVID-19 radiology database. The database was divided into training and test sets with 70% and 30% rates, respectively. The best results were obtained with the SVM classifier, with an accuracy of 98.97%, a sensitivity of 89.39%, a specificity of 99.75%, and an F-score of 96.72%.
Consequently, a cheap, fast, and reliable intelligent tool has been provided for COVID-19 infection detection. The developed model can be used to assist field specialists, physicians, and radiologists in the decision-making process. Thanks to the proposed tool, misdiagnosis rates can be reduced, and the proposed model can be used as a retrospective evaluation tool to validate positive COVID-19 infection cases.
COVID-19, caused by a new type of coronavirus, has created a critical and chaotic situation, causing a large number of deaths and negatively affecting people's lives worldwide. It first appeared in Wuhan, China, in December 2019 and has spread to approximately 200 countries. In many countries, governments have taken new measures and introduced new ways of living to combat COVID-19. Today's science and technology have made a valuable contribution to the implementation of these new policies in this unknown and unpredictable process. As an example of technological developments, robots and drones have been used to transport food and medicines to hospitals [1, 2]. While many researchers in the medical field are developing vaccines to prevent the disease, many medicines and medical practices are being developed to heal infected patients and prevent them from infecting others [3]. Meanwhile, artificial intelligence and computer scientists have proposed and implemented real-life hybrid systems based on X-ray images and computed tomography (CT) to detect COVID-19. These artificial intelligence (AI) applications have been successfully applied in many areas [4]. Some studies and diagnostic methods regarding COVID-19 in the literature are briefly summarized below, and a more detailed overview is given in the form of a table. In Y. Pathak et al.
study [5], they used chest computed tomography (CT) images and a deep transfer learning (DTL) method to detect COVID-19 and obtained a high diagnostic accuracy. Mesut Toğaçar et al. proposed a hybrid method that combines the Fuzzy Color technique and deep learning models (MobileNetV2, SqueezeNet) with the Social Mimic optimization method to classify COVID-19 cases, and achieved a high success rate [6]. Ali Abbasian Ardakani et al. [7] used deep learning models including AlexNet, VGG-16, VGG-19, SqueezeNet, GoogLeNet, MobileNet-V2, ResNet-18, ResNet-50, ResNet-101, and Xception to diagnose COVID-19 and compared them with respect to the obtained classification accuracy. Ferhat Ucar et al. proposed a method called COVIDiagnosis-Net, based on Deep Bayes-SqueezeNet, to classify cases as COVID-19 or normal (healthy) [8]. Tulin Ozturk et al. [9] suggested a model called DarkCovidNet for diagnosing COVID-19 cases. Table 1 presents the works conducted on COVID-19 detection and diagnosis in the literature. The contributions of the proposed model can be listed as follows: (1) CNNs with rich filter families, convolution, abstraction, and weight sharing provide an effective deep feature extraction engine. (2) The deep features extracted from the deep layers of the CNN are applied as the input to machine learning models to further improve COVID-19 infection detection. (3) As a result, a cheap, fast, and reliable intelligent tool is provided for COVID-19 infection detection. (4) The developed model can be used to assist field specialists, physicians, and radiologists in the decision-making process. (5) Thanks to this study, misdiagnosis rates can be reduced, and the proposed model can be used as a retrospective evaluation tool. The rest of this study is organized as follows: the dataset and the related methods are presented in Section 2.
The results are reported in Section 3. A discussion is presented in Section 4, and lastly, concluding remarks are given in Section 5.
Not only the structure of the samples in a database but also the distribution of the recordings among the classes has a great impact on the model to be developed. Morphological, color, shape, and texture-based features directly affect the achievements of intelligent computer-aided models [16]. Besides, it is important to have an equal number of samples for each class, covering all situations or cases, to produce a consistent and robust model. Recently, many studies have pointed out that chest CT images can be a vital means of evaluation for diagnosing COVID-19 infection [6] [7] [8] [9]. Several specific patterns observed on chest CT images, including bilateral, peripheral, and basal predominant ground-glass opacity (GGO), multifocal patchy consolidation, and a crazy-paving pattern with a peripheral distribution, have been adopted as findings of COVID-19 infection [17] [18] [19]. A subsample of the recordings belonging to the COVID-19, normal, and viral pneumonia classes is shown in Fig. 1. An open-access database that covers posterior-to-anterior chest X-ray images was used in this study [20]. The COVID-19 Radiology database was generated by combining samples from four different resources: the Italian Society of Medical and Interventional Radiology (SIRM) COVID-19 Database [21], the Novel Corona Virus 2019 Dataset [22], COVID-19 positive chest X-ray images from different articles, and chest X-ray pneumonia images [23]. In total, 2905 images in three classes are presented in this database, as shown in Table 2.
CNNs are architectures consisting of a large number of sequenced layers. Layers that perform different functions are used in these architectures to reveal the distinctive features of the data applied as input [24].
In general, the tasks of these layers can be summarized as follows:
(1) Convolution layer: This layer is the main building block of CNN architectures, and it is used to reveal the discriminative features of the input data. It applies filter families to the data so as to reveal low- and high-level features [25]. After the convolution process, the size of the input data changes. These changes vary depending on the stride and padding. The outputs of the convolution layers are called activation maps and are defined as follows:
$$x_j^{l} = f\left(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\right) \quad (1)$$
The convolution process is defined as in Eq. (1). Herein, the previous layer is shown with $x^{l-1}$, the learnable kernels are $k_{ij}^{l}$, and the bias term is $b_j^{l}$. $M_j$ matches the input map selection.
(2) Non-linearity layer: The convolution layer is ordinarily followed by the non-linearity layer, also called the activation layer, which gives the system its non-linear character. Without it, the neural network would act as a single perceptron, since its outputs could be calculated using linear combinations of its inputs; activation functions are therefore used [26]. The most commonly used activation function is the rectified linear unit (ReLU), defined as follows:
$$f(x) = \max(0, x) \quad (2)$$
(3) Pooling (down-sampling) layer: This layer is often added between consecutive convolutional layers to reduce the number of computational nodes. Average pooling, maximum pooling, and L2-norm pooling are frequently used.
(4) Flattening layer: This layer collects the data in a single vector and prepares the data for the neural network.
(5) Fully-connected layers: These layers transfer the activations obtained by passing the data through the network to the next unit. Fully connected layers are located at the end of the architecture to ensure the connections between all activations and computational nodes in these layers [27, 46, 47, 48]. These layers are exploited when CNNs are used as feature extractors.
Table 3.
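The layer operations summarized above can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this study (the experiments were run in MATLAB); the function names, the 6×6 toy input, and the 3×3 kernel are assumptions made only for illustration. The sketch shows how stride and padding determine the output size of a convolution, followed by ReLU, max pooling, and flattening.

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Output length along one axis for input size n and kernel size k."""
    return (n + 2 * padding - k) // stride + 1


def pad2d(x, p):
    """Zero-pad a 2-D list of floats by p cells on every side."""
    width = len(x[0]) + 2 * p
    border = [[0.0] * width for _ in range(p)]
    body = [[0.0] * p + list(row) + [0.0] * p for row in x]
    return border + body + [row[:] for row in border]


def conv2d(x, kernel, bias=0.0, stride=1, padding=0):
    """Single-channel convolution with a square kernel.

    As in most deep learning toolboxes, the kernel is not flipped,
    so this is strictly a cross-correlation.
    """
    x = pad2d(x, padding)
    k = len(kernel)
    out_h = (len(x) - k) // stride + 1
    out_w = (len(x[0]) - k) // stride + 1
    return [[sum(x[i * stride + a][j * stride + b] * kernel[a][b]
                 for a in range(k) for b in range(k)) + bias
             for j in range(out_w)]
            for i in range(out_h)]


def relu(x):
    """Element-wise f(x) = max(0, x)."""
    return [[max(0.0, v) for v in row] for row in x]


def max_pool2d(x, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [[max(x[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(x[0]) - size + 1, size)]
            for i in range(0, len(x) - size + 1, size)]


def flatten(x):
    """Collect a 2-D map into a single feature vector."""
    return [v for row in x for v in row]


# Toy input (hypothetical values, not an X-ray image).
image = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
# Hypothetical vertical-edge kernel.
kernel = [[1.0, 0.0, -1.0]] * 3

activation = relu(conv2d(image, kernel, bias=0.0, stride=1, padding=1))
pooled = max_pool2d(activation, size=2)
features = flatten(pooled)
```

With a 3×3 kernel, stride 1, and padding 1, the 6×6 input keeps its size after convolution; 2×2 pooling then halves each dimension, and flattening yields a 9-element feature vector.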
Offline or online data augmentation techniques can be used to realize more efficient training of computational models [24]. However, it is essential to be aware that data augmentation techniques should not be applied to the test set because of the overfitting problem. In the experiment, the whole dataset was divided into training and test sets with 70% and 30% rates, respectively. The distribution of the samples over the classes is imbalanced. To overcome this issue, the data augmentation approach was used. To this aim, we focused only on the COVID-19 class, since the number of samples in this class was lower than in the other classes, as shown in Table 3. The overall block diagram of the proposed model is given in Fig. 4. After the 70%-30% split, only the number of samples in the COVID-19 class is increased by using the offline data augmentation approach, and then the proposed CNN model is trained and tested. Next, the deep features extracted from the proposed CNN model are considered. A combination of deep feature extraction and machine learning techniques is utilized to achieve a consistent and robust model for COVID-19 infection diagnosis. Three different classification algorithms were used to detect COVID-19 infection in this study. These algorithms differ in structure and have high performance. Each classifier was trained and tested using the 70-30% training and testing data partition. The classifiers are explained in the following subsections. Support vector machines (SVM) is a supervised machine learning algorithm that can be (3) Where is the distance between data points of and .
For more information about the multi-class SVM classifier, readers can refer to [28] [29] [30]. The decision tree classifier is mostly used to solve simple classification problems. It has a structure consisting of a root, branches, and leaves, descending from top to bottom. The most used decision tree classification algorithms are ID3, C4.5, and C5. In our applications, we used the C4.5 decision tree classifier. For more information about the decision tree classifier, readers can refer to [31-33, 44, 45, 49]. The k-NN (k-nearest neighbor) algorithm is one of the simplest and most widely used classification algorithms. k-NN is a non-parametric, lazy learning algorithm. Unlike eager learning, lazy learning does not have a training phase: it does not learn from the training data; instead, it "memorizes" the training dataset. When a prediction is needed, it looks for the nearest neighbors in the whole dataset [34]. To run the algorithm, a k value is determined; this value is the number of neighbors to be considered. When a new sample arrives, the distances between it and the stored samples are calculated, and the k nearest elements are taken. The Euclidean distance is generally used; as alternatives, the City Block, Minkowski, and Chebychev distances can also be used [35]. After the distances are calculated, they are sorted, and the incoming sample is assigned to the class most common among its nearest neighbors. The parameters of the k-NN classifier were optimized by using the Bayesian optimization method in our study. To evaluate the proposed model, we used the confusion matrix and some commonly used performance metrics, namely accuracy (Acc), sensitivity (Se), specificity (Sp), and F-score. The experiments were carried out on a workstation with Intel ® Xeon ® Gold 6132 CPU @2.60 GHz and NVIDIA Quadro P6000 GPU.
The simulation environment was MATLAB. For the prediction, the confusion matrix is given in Fig. 6 (a). As mentioned before, the test set was separated and frozen at the start of the experiment. The number of samples belonging to the COVID-19 class in the test set was 66, and 59 of these samples were identified correctly by the proposed CNN model. The classification rates for normal and viral pneumonia cases were rather satisfactory. The final validation accuracy and final validation loss were 97.25% and 0.2032, respectively. The Se, Sp, and F-score were 94.61%, 98.29%, and 95.75%, respectively. The ROC curves of the proposed CNN model are also presented in Fig. 6 (b). The AUCs were 0.9942, 0.9956, and 0.9955 for COVID-19, normal, and viral pneumonia cases, respectively. As a result, an efficient CNN model was obtained for the diagnosis of COVID-19 infection. In the second step of the experiment, we focused on the activation maps in the proposed CNN architecture. These activation maps at different levels keep the discriminative features of the input data, which are finally collected in the fully connected layers. The activations may help us to understand what the model has learned. A visual representation of the activation maps is given in Fig. 7. and set to 6 for the fc1 deep feature set, as shown in Fig. 9 (e). The Acc, Se, Sp, and F-score were 93.35%, 90.55%, 96.29%, and 90.06%, respectively. In addition, the best estimated feasible point considering the DT algorithm was 675 for the fc2 deep feature set, as shown in Fig. 9 (f). The Acc was 96.10%, Se was 93.81%, Sp was 97.70%, and F-score was 94.56%. All scores of the classifiers are reported in Table 4, considering the two different deep feature sets. The SVM classifier was superior to the k-NN and DT machine learning algorithms. The SVM model provided an improvement in the automated COVID-19 infection detection task.
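The metrics reported above can be derived from a confusion matrix, as in the following minimal sketch. The function names and the 3×3 toy matrix are hypothetical and are not the results of this study; only the 59-out-of-66 figure is taken from the text. How the study averaged per-class scores into a single Se, Sp, or F-score is not stated here, so the sketch computes one-vs-rest values for a single class.

```python
def one_vs_rest_counts(cm, c):
    """TP, FN, FP, TN for class c; rows are true labels, columns predictions."""
    tp = cm[c][c]
    fn = sum(cm[c]) - tp
    fp = sum(row[c] for row in cm) - tp
    tn = sum(sum(row) for row in cm) - tp - fn - fp
    return tp, fn, fp, tn


def class_metrics(cm, c):
    """Sensitivity, specificity, and F-score of class c against the rest."""
    tp, fn, fp, tn = one_vs_rest_counts(cm, c)
    se = tp / (tp + fn)          # sensitivity (recall)
    sp = tn / (tn + fp)          # specificity
    precision = tp / (tp + fp)
    f_score = 2 * precision * se / (precision + se)
    return {"Se": se, "Sp": sp, "F": f_score}


def accuracy(cm):
    """Fraction of all samples on the main diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / sum(sum(row) for row in cm)


# Hypothetical toy matrix for illustration only (not data from the study).
toy_cm = [[8, 1, 1],
          [0, 9, 1],
          [1, 0, 9]]
toy = class_metrics(toy_cm, 0)
toy_acc = accuracy(toy_cm)

# Figure taken from the text: 59 of the 66 COVID-19 test samples were correct.
covid_sensitivity = 59 / 66
```

For the COVID-19 class, 59 correct detections out of 66 test samples correspond to a per-class sensitivity of about 89.39%.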
In contrast, the classification performance decreased slightly when the classification task was performed by k-NN and DT. In this section, we evaluate the strengths as well as the limitations of the proposed model by taking into account the state-of-the-art models. However, it is important to be aware that a one-to-one comparison is not feasible due to differences in datasets, methods, and simulation environments. The samples in these datasets were collected from different resources, as inferred from Table 5. Recently, the scientific community has focused on chest X-ray images in order to contribute to the clinical evaluation of COVID-19 cases, which have increased day by day. Many computational models based on the CNN architecture have been proposed. The greatest advantage of these models is that they provide an end-to-end learning scheme that eliminates the handcrafted feature extraction engine. To this aim, the transfer learning approach has generally been adopted to train the CNNs. Some of the computational studies have focused on the deep features provided by pre-trained models [39]. In this respect, our study offers a novel CNN model that was trained from scratch rather than through transfer learning. Also, instead of using pre-trained CNNs, the fully-connected layers in the proposed architecture were considered, examined, and used for the COVID-19 infection detection task. Our study contains innovative components in this respect. Besides, the proposed model works according to the end-to-end learning principle, and a handcrafted feature extraction engine is not needed. It can be argued that the database is not large enough. However, we do not consider this a major concern, because the performance of CNNs increases with the number of samples used in the training process; in such a case, it is only necessary to consider the computation time and hardware resources.
Another important issue is that when positive COVID-19 cases are detected using X-ray images, the infection may have already advanced significantly. In other words, X-ray images may be a very significant means to confirm positive COVID-19 cases, but may not be clinically relevant for early diagnosis.
Public health, the global economy, and our daily routines continue under new norms due to COVID-19. The number of people affected by this infection is still increasing significantly. In this study, an automated COVID-19 diagnostic system has been proposed to contribute to clinical trials. The proposed model is based on the CNN architecture, and it is trained from scratch, as opposed to the transfer learning approach. Thanks to its convolution with rich filter families, abstraction, and weight-sharing features, it automatically provides highly efficient, distinctive deep features. Thus, a handcrafted feature extraction engine is not needed. As a result, positive COVID-19 cases can be detected easily and with high sensitivity via the proposed tool using chest X-ray images, and a cheap, fast, and reliable diagnostic tool was obtained. The model provided an accuracy of 98.97%, a sensitivity of 89.39%, a specificity of 99.75%, and an F-score of 96.72%. From a clinical perspective, the developed model can support the decision-making processes of field specialists, physicians, and radiologists. With this model, the misdiagnosis rate can be reduced, and positive COVID-19 cases can be detected quickly without having to wait for days.
[1] Estimation of COVID-19 prevalence in Italy
[2] Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
[3] Death and contagious infectious diseases: Impact of the COVID-19 virus on stock market returns
[4] AI-assisted CT imaging analysis for COVID-19 screening: Building and deploying a medical AI system in four weeks
[5] Deep Transfer Learning based Classification Model for COVID-19 Disease
[6] COVID-19 detection using deep learning models to exploit Social Mimic Optimization and structured chest X-ray images using fuzzy color and stacking approaches
[7] Application of deep learning technique to manage COVID-19 in routine clinical practice using CT images: Results of 10 convolutional neural networks
[8] COVIDiagnosis-Net: Deep Bayes-SqueezeNet based diagnosis of the coronavirus disease 2019 (COVID-19) from X-ray images
[9] Automated detection of COVID-19 cases using deep neural networks with X-ray
[10] Predicting the growth and trend of COVID-19 pandemic using machine learning and cloud computing
[11] An automated Residual Exemplar Local Binary Pattern and iterative ReliefF based corona detection method using lung X-ray image
[12] Diagnosis of Coronavirus Disease 2019 (COVID-19) with Structured Latent Multi-View Representation Learning
[13] A Weakly-supervised Framework for COVID-19 Classification and Lesion Localization from Chest CT
[14] Deep Learning COVID-19 Features on CXR using Limited Training Data Sets
[15] Data Augmentation Using Auxiliary Classifier GAN for
[16] Application of breast cancer diagnosis based on a combination of convolutional neural networks, ridge regression and linear discriminant analysis using invasive breast cancer images processed with autoencoders
[17] Essentials for Radiologists on COVID-19: An Update-Radiology Scientific Expert Panel
[18] Clinical characteristics and diagnostic challenges of pediatric COVID-19: A systematic review and meta-analysis
[19] CO-RADS - A categorical CT assessment scheme for patients with suspected COVID-19: definition and evaluation
[20] COVID-19 Radiology Database. Can AI Help Screen Viral
[21] Radiology IS of M and I. Italian Society of Medical and Interventional Radiology n
[22] COVID-19 Image Data Collection
[23] Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning
[24] Convolutional neural network approach for automatic tympanic membrane detection and classification
[25] BrainMRNet: Brain tumor detection using magnetic resonance images with a novel convolutional neural network model
[26] Waste Classification using AutoEncoder Network with Integrated Feature Selection Method in Convolutional Neural Network Models
[27] Computer-aided diagnosis system combining FCN and Bi-LSTM model for efficient breast cancer detection from histopathological images
[28] Support-vector networks
[29] A Comprehensive Survey on Support Vector Machine in Data Mining Tasks: Applications & Challenges
[30] Support Vector Machines for Classification BT - Efficient Learning Machines: Theories, Concepts, and Applications for Engineers and System Designers
[31] A survey of decision tree classifier methodology
[32] Decision Trees
[33] Instance-based learning algorithms
[34] Nearest Neighbours without k: A Classification Formalism based on Probability
[35] An Improved k-Nearest Neighbor Classification Using Genetic Algorithm
[36] Application of Deep Learning for Fast Detection of COVID-19 in X-Rays using nCOVnet
[37] Covid-19: automatic detection from X-ray images utilizing transfer learning with convolutional neural networks
[38] COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest X-Ray Images
[39] Detection of Coronavirus Disease (COVID-19) Based on Deep Features
[40] COVIDX-Net: A Framework of Deep Learning Classifiers to Diagnose COVID-19 in X-Ray Images
[41] Automatic Detection of Coronavirus Disease (COVID-19) Using X-ray Images and Deep Convolutional Neural Networks
[42] Deep learning Enables Accurate Diagnosis of Novel Coronavirus (COVID-19) with CT images
[43] A deep learning algorithm using CT images to screen for Corona Virus Disease (COVID-19)
[44] Binary particle swarm optimization (BPSO) based channel selection in the EEG signals and its application to speller systems
[45] Deep Learning Applications for Hyperspectral Imaging: A Systematic Review
[46] Hybrid Computerized Method for Environmental Sound Classification
[47] Time-Frequency Representation and Convolutional Neural Network based Emotion Recognition
[48] Robust Approach Based on Convolutional Neural Networks for Identification of Focal EEG Signals
[49] Surface EMG signals and deep transfer learning-based physical action classification