key: cord-0780969-t2onsn7e
authors: Gayathri, J.L.; Abraham, Bejoy; Sujarani, M.S.; Nair, Madhu S.
title: A computer-aided diagnosis system for the classification of COVID-19 and non-COVID-19 pneumonia on chest X-ray images by integrating CNN with sparse autoencoder and feed forward neural network
date: 2021-12-14
journal: Comput Biol Med
DOI: 10.1016/j.compbiomed.2021.105134
sha: d57a99f7c248ccfe71e92928c8c85a59fbe7bc4a
doc_id: 780969
cord_uid: t2onsn7e

Several infectious diseases have affected the lives of many people and have caused great dilemmas all over the world. COVID-19, caused by a newly discovered virus named Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), was declared a pandemic by the World Health Organisation in 2020. RT-PCR is considered the gold standard for COVID-19 detection. Due to the limited RT-PCR resources, early diagnosis of the disease has become a challenge. Medical images such as ultrasound, CT scans and X-rays can be used for the detection of the deadly disease. Developing deep learning models using such images for detecting COVID-19 can assist in countering the outbreak of the virus. This paper presents a computer-aided detection model utilizing chest X-ray images for combating the pandemic. Several pre-trained networks and their combinations have been used for developing the model. The method uses features extracted from pre-trained networks along with a sparse autoencoder for dimensionality reduction and a Feed Forward Neural Network (FFNN) for the detection of COVID-19. Two publicly available chest X-ray image datasets, consisting of 504 COVID-19 images and 542 non-COVID-19 images, have been combined to train the model. The method achieved an accuracy of 0.9578 and an AUC of 0.9821 using the combination of InceptionResnetV2 and Xception. Experiments show that the accuracy of the model improves when the sparse autoencoder is used as the dimensionality reduction technique.

The novel coronavirus outbreak was reported by officials in December 2019 in Wuhan City, China. COVID-19 has affected more than 200 million lives and killed more than four million people across the world [1]. The Reverse Transcription-Polymerase Chain Reaction (RT-PCR) test is treated as the gold standard for detection of the SARS-CoV-2 virus. The rapid increase in the number of patients and the lack of sufficient RT-PCR test facilities in several parts of the world cause delays in testing and detection of the disease. Accessible, fast and affordable methods could play an important role in the diagnosis of the disease. Radiographic methods are easily available and affordable compared to the RT-PCR test. A computer-aided diagnosis method using X-ray images could assist medical practitioners in the detection of COVID-19 at an early stage.

Modalities for COVID-19 detection include Computed Tomography (CT), X-ray and ultrasound imaging. Common abnormal X-ray findings in COVID-19 include ground glass and consolidative opacities in the peripheral lung regions, nodular opacities, and bilateral patchy and confluent patterns. Some images show small amounts of pleural effusion, which is uncommon. Viral pneumonia exhibits bilateral patchy areas of consolidation, thickening of bronchial walls, and ground glass opacities or poorly defined centrilobular nodules. Since COVID-19 and other pneumonia share some of these X-ray characteristics, it is not easy to differentiate COVID-19 images from those of other pneumonia. The typical radiological COVID-19 pattern, which includes bilateral peripheral or focal round ground glass opacities with or without consolidation, differentiates COVID-19 from other pneumonia [2].

However, manual examination of the images is time-consuming as the number of cases is increasing day by day. The application of machine learning in the biomedical field can assist physicians in the computer-aided diagnosis (CAD) of medical images efficiently and effectively. CAD systems help radiologists interpret medical images. Hence computer-aided detection could assist radiologists in distinguishing COVID-19 infected radiographic images. CAD systems using Artificial Neural Networks (ANNs) and Deep Learning (DL) have shown tremendous success in the field of medical data analysis [3]. Deep learning technologies widely used in disease diagnosis include CNNs, autoencoders, Deep Belief Networks (DBN) and Generative Adversarial Networks (GAN) [3]. Several works using ANNs and deep learning have been published on disease diagnosis, including detection of interstitial lung disease [4], depression screening [5], schizophrenia [6], ECG arrhythmia classification [7], and ischemic heart disease [8].

Various works related to computer-aided detection of COVID-19 have been published. Abraham et al. [9] [10] developed models comprising an ensemble of CNNs to detect COVID-19. Ardakani et al. [11] discussed the usage of ten convolutional networks for COVID-19 detection. Horry et al. [12] highlighted the use of different image modalities to help faster detection of the disease. Shaban et al. [13] adopted a new methodology for feature selection by integrating filter and wrapper methods and classifying using an ensemble learning technique. Phankokkruad et al. [14] developed a transfer learning technique that involves fine-tuning of pre-trained networks, and experimented with three of them: VGG16, Xception and InceptionResnetV2. Rekha Hanumanthu et al. [15] discussed different deep learning and transfer learning methods adopted for the early diagnosis of the disease. Wang et al. [16] adopted a method where the features are extracted using a UNet and the classification is then performed using a progressive classifier. Hassantabar et al. [17] used a convolutional neural network whose softmax layer detects SARS-CoV-2 infection. Rahimzadeh and Attar [18] used a transfer learning methodology that involves fine-tuning a concatenation of Xception and Resnet50V2 for diagnosing COVID-19. Ucar and Korkmaz [19] presented an artificial intelligence structure based on the pre-trained Squeezenet network accompanied by Bayesian optimization. Li et al. [20] explored multi-task contrastive learning for COVID-19 diagnosis; the contrastive learning task is implemented using supervised neural networks and involves aggregation through a contrastive loss. Pandit et al. [21] fine-tuned the VGG-16 network for the diagnosis of COVID-19 from chest radiographs. Chandra et al. [22] utilized an ensemble learning methodology, based on majority voting among different weak learners, for the detection of coronavirus. Saufi et al. [23] used stacked sparse autoencoders to extract features from X-ray and CT scans for the detection of COVID-19. Lazrag et al. [24] explored wavelet analysis for feature extraction followed by an autoencoder for feature modelling to detect COVID-19. Behura et al. [25] used XGBoost and a sparse autoencoder for feature selection and classification. Ismael and Sengur [26] developed a model that classifies features extracted from X-ray images by Resnet50 using a Support Vector Machine (SVM) for the diagnosis of COVID-19. Toraman et al. [27] proposed a method for detecting COVID-19 infections from X-ray lung images using capsule networks.

Most of the existing works have used Convolutional Neural Networks for the detection of COVID-19. No methods have explored the combination of a CNN with a sparse autoencoder for the diagnosis of COVID-19. The proposed method explores the sparse autoencoder as a dimensionality reduction method. Sparse autoencoders have been found successful in the field of disease diagnosis, including Alzheimer's disease [28], Parkinson's disease [29], heart disease [30], identification of neonatal sleep state [31], glaucoma [32] and cerebral microbleeds [33], to name a few. A sparse autoencoder enforces a sparsity constraint that directs the single-layer network in code learning, minimizing the error during code reconstruction [34]. The sparsity penalty imposed on the hidden layers on top of the reconstruction error reduces overfitting [35]. Sparse representation of data has benefits in robustness to noise and improved classification performance in high-dimensional latent spaces [36]. The proposed work has the following contributions.

• The method uses an ensemble of Xception and InceptionResnetV2 for feature extraction. The features are passed to a custom-made sparse autoencoder for reducing the dimensionality of the feature vector, followed by a Feed Forward Neural Network (FFNN) for classification. No state-of-the-art methods have employed such a pipeline for the detection of COVID-19.

• The proposed method uses neural network techniques for all stages of computer-aided diagnosis of the disease: a CNN for feature extraction, a sparse autoencoder for dimensionality reduction, and an FFNN for classification. Most of the existing methods have either performed transfer learning using a CNN or combined features extracted using a CNN with non-neural-network methods for feature selection and classification.

The method has chosen the ensemble of CNNs, sparse autoencoder and FFNN empirically based on experimental analysis. The experiments we performed prove the effectiveness of the novel framework composed of CNN, sparse autoencoder, and FFNN in diagnosing COVID-19. The rest of the paper is organized as follows. Section 2.1 discusses datasets used for training the model. Section 2.2 describes the architecture of the proposed model. Section 3 is a result analysis phase utilizing single-CNN and different combinations of pretrained networks. The section also gives an overview of the comparison of the proposed model with other classifiers and dimensionality reduction techniques. Section 4 presents the conclusion of our work based on the result analysis phase.

Two publicly available datasets have been used to train the model. The first dataset is a public dataset by Cohen et al. [37], available on GitHub, consisting of both CT and chest X-ray images of COVID-19 cases, other types of pneumonia and healthy patients. From this dataset, the X-ray images are filtered for training the model. The resulting set consists of 783 X-ray images, among which 504 are COVID-19 images and 279 are non-COVID-19 images. The second dataset is a public open dataset from Kaggle created by Paul Mooney [38]. The dataset consists of 390 chest X-ray images of bacterial and other viral pneumonia. It was constructed before the COVID-19 outbreak.

A balanced dataset is essential for building an effective model [39] . To balance the dataset for achieving an effective model, the first 263 X-ray images of Pneumonia affected patients have been extracted from the second dataset. The combined dataset consists of 504 COVID-19 images and 542 non-COVID-19 images. The non-COVID-19 images consist of both normal and pneumonia images.
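The class counts above can be checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text; the variable names are illustrative and not taken from the authors' code.

```python
# Image counts as reported in the text (variable names are illustrative).
cohen_covid = 504            # COVID-19 X-rays from the first dataset
cohen_non_covid = 279        # non-COVID-19 X-rays from the first dataset
kaggle_pneumonia_used = 263  # pneumonia X-rays taken from the second dataset

non_covid = cohen_non_covid + kaggle_pneumonia_used
total = cohen_covid + non_covid
covid_fraction = cohen_covid / total

print(non_covid, total, round(covid_fraction, 3))
```

The two classes end up close to balanced, with COVID-19 images making up a little under half of the combined dataset.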

The proposed architecture is divided into three phases. The model consists of feature extraction, dimensionality reduction and the classification phase.

The feature extraction phase reduces the dimension of the initial raw dataset into manageable groups for optimizing the processing. Many recent research studies have worked on models that use pre-trained networks as feature extractors [9]. In the proposed model, pre-trained networks are used as the feature extractor. The pre-trained networks are trained on the ImageNet database [40], which consists of 1000 image classes. Even though they are trained on non-biomedical images, pre-trained CNNs in combination with off-the-shelf classifiers were found successful in the detection of a wide range of diseases from X-ray images, including tuberculosis [41], breast cancer [42] and pneumonia [43]. The convolutional layers, built on top of each other, learn increasingly complex features for reliable classification. Automated feature extraction by CNNs makes these networks highly efficient for classification tasks.

In the proposed model, images are pre-processed according to the input size of the input layer of the chosen pre-trained model, and then the dataset is fed into the network. Both single-CNN and multi-CNN configurations have been utilized for the analysis. InceptionResnetV2 [44], Xception [45], EfficientnetB0 [46], Darknet-53 [47] and Resnet101 [48] are used for the experimentation. The input size differs between pre-trained networks. Table 1 lists some of the pre-trained networks used for the analysis, together with their depth and input size. The images are resized to the respective input sizes of the pre-trained models before the feature extraction phase. The dataset used for the proposed model consists of 1046 instances.

The method uses the CNN as a feature extractor and not for transfer learning, where the parameters of an end-to-end pre-trained CNN are fine-tuned to suit the data at hand. When a pre-trained CNN is used as a feature extractor, activations from any of the deep layers except the softmax layer can be used as features for classification by an off-the-shelf classifier such as an FFNN. The layer from which features are extracted is a design choice [49] [50]. However, selecting features from the fully connected layer right before the softmax classification layer is a good option [49] [50]. The activations of the last fully connected layer form a global feature representation of the image [51] [52]. Another function of the last fully connected layer is dimension reduction [53]. The softmax layer outputs the vector of probabilities of an input image belonging to each of the 1000 classes and hence cannot be used as a feature extractor.

In the study by Abidin et al. [54], features extracted from the last fully-connected layer outperformed features from the other layers. Several research works have used the last fully-connected layer to extract features for classification using an off-the-shelf classifier [64]. For these reasons, we have chosen the activations of the last fully-connected layer as the feature vector. The output of the feature extraction phase is a feature set of dimension 1046 × 1000.

The CNN includes three basic layers: a convolution layer, a pooling layer and a softmax layer. The core of the convolutional neural network is the convolution layer. This layer performs the convolution operation, which is the linear multiplication of the filter mask with the input image array to produce a feature map. Let f(x, y) be the input image and h(x, y) the filter mask. The convolution operation can be mathematically expressed as:
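For reference, the standard discrete two-dimensional convolution of f with h, which is the operation described above, is usually written as follows. The output symbol g and the summation indices s and t are notation assumed here, not taken from the source.

```latex
g(x, y) = (f * h)(x, y) = \sum_{s} \sum_{t} f(s, t)\, h(x - s,\; y - t)
```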

An activation function is applied to the output of the convolution layer. The activation function used is ReLU. The next block after the convolution layer is the pooling layer. The most commonly used pooling layer is max-pooling layer. The pooling layer reduces the number of parameters used for the computation of the network. The final layer of the convolutional neural network is the classification layer, where the instances are classified according to the respective classes. For multi-CNN the features are extracted from two CNN models and then concatenated to produce a new feature set. The dimension of the feature set is n×1000m, where n is the number of X-ray images and m is the number of pre-trained CNN models. 
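The multi-CNN concatenation described above can be sketched as follows. This is a minimal illustration using plain Python lists and dummy values; the function name `concatenate_features` and the toy feature sets are assumptions, not the authors' MATLAB implementation.

```python
from itertools import chain


def concatenate_features(feature_sets):
    """Join per-image feature vectors from several extractors.

    feature_sets[k][i] is the feature vector that extractor k
    produced for image i; all extractors must cover the same images.
    """
    n = len(feature_sets[0])
    if any(len(fs) != n for fs in feature_sets):
        raise ValueError("every extractor must provide one vector per image")
    return [list(chain.from_iterable(fs[i] for fs in feature_sets))
            for i in range(n)]


# Toy stand-ins for two pre-trained CNNs, each giving 1000 features per image.
n_images = 4
features_a = [[0.1] * 1000 for _ in range(n_images)]
features_b = [[0.2] * 1000 for _ in range(n_images)]

combined = concatenate_features([features_a, features_b])
rows, cols = len(combined), len(combined[0])
```

With n = 4 images and m = 2 extractors, the combined set has 4 rows and 2000 columns, in line with the n × 1000m dimension stated above.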

The dimensionality reduction process reduces the dimension of the feature vector by eliminating the features that contribute little to the predictor variable. The presence of these irrelevant features may decrease the overall performance of the model. The proposed model uses a sparse autoencoder for dimensionality reduction. An autoencoder is a network that imposes a bottleneck architecture, representing the input image in a compressed knowledge representation form [65]. The network follows an unsupervised learning technique for the task of representation learning [66].

The basic idea of the autoencoder is that it encodes the input data using its hidden layer and outputs the best feature expression. The autoencoder takes an unlabelled set and frames it as a supervised problem whose output $\hat{x}$ is a reconstruction of the original input $x$. The amount of information that traverses the network is constrained at the bottleneck, which drives a learned compression of the input. Only the variations in the input data are retained by the model, which avoids redundancies. An autoencoder is composed of an encoder and a decoder. The encoder maps the input to a latent space using the encoder activation function. The input is then reconstructed by the decoder using a decoder activation function. The activation of a basic autoencoder is represented as:

where $E_x$ denotes the pre-activation values, $l$ is the squared loss function, $g_e$ denotes the encoder activation function, and $g_d$ indicates the decoder function.

where $L(x, \hat{x})$, $n$, $\rho$ and $w_{ij}$ denote the reconstruction error loss, the number of training samples (X-ray images), the weight decay parameter and the weight at the $(i, j)$-th location, respectively. Different kinds of autoencoders are available, among which our proposed model uses the sparse autoencoder. A sparse autoencoder constructs a loss function under which the network learns an encoding and decoding that rely only on a small number of active neurons. Unlike conventional regularization, which acts on the weights of the network, the sparsity penalty acts on the activations of the hidden units. Activations of different nodes of the neural network are data-dependent, as different inputs will activate different neurons.

Sparse autoencoders learn patterns by imposing sparsity constraints on the hidden layers [66]. The difference between the sparse autoencoder and the basic autoencoder lies in the cost function. Global regularization is used to solve the main objective function, whereas the sparsity penalty prevents the trivial identity mapping and overfitting. In the model, the features extracted using multiple pre-trained networks, of dimension 1046 × 1000m, where m denotes the number of pre-trained networks, are reduced by the sparse autoencoder to the most relevant features, of dimension 1046 × 1000, to improve the performance of the model.
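A sparse autoencoder cost of the kind described above can be sketched as follows. The source does not give the exact form of its sparsity penalty, so this sketch assumes a common formulation: a summed squared reconstruction error averaged over samples, an L2 weight decay term, and a KL-divergence penalty that pushes the mean hidden activation towards a small target. The sigmoid activations, the parameter names `weight_decay`, `sparsity_weight` and `target`, and their default values are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(x, weights, biases):
    """Sigmoid layer: one output per weight row."""
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def kl_sparsity(target, mean_activations):
    """Sum over hidden units of KL(target || mean activation)."""
    return sum(target * math.log(target / a)
               + (1 - target) * math.log((1 - target) / (1 - a))
               for a in mean_activations)


def sparse_autoencoder_loss(X, w_enc, b_enc, w_dec, b_dec,
                            weight_decay=0.001, sparsity_weight=1.0,
                            target=0.05):
    n = len(X)
    codes = [dense(x, w_enc, b_enc) for x in X]       # encoder output
    recons = [dense(c, w_dec, b_dec) for c in codes]  # decoder output
    # Squared reconstruction error, summed per sample and averaged over samples.
    recon_error = sum(sum((a - b) ** 2 for a, b in zip(x, r))
                      for x, r in zip(X, recons)) / n
    # L2 penalty on all encoder and decoder weights.
    l2 = sum(w * w for mat in (w_enc, w_dec) for row in mat for w in row)
    # Mean activation of each hidden unit over the training samples.
    mean_act = [sum(c[j] for c in codes) / n for j in range(len(b_enc))]
    return (recon_error + weight_decay * l2
            + sparsity_weight * kl_sparsity(target, mean_act))


# Toy check: 2 inputs compressed to 1 hidden unit, all weights zero.
X = [[0.5, 0.5]]
loss = sparse_autoencoder_loss(X, [[0.0, 0.0]], [0.0],
                               [[0.0], [0.0]], [0.0, 0.0])
```

In this toy case the reconstruction and weight terms vanish, so the loss consists only of the sparsity penalty for a hidden unit that is active half of the time.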

The model utilizes Feed Forward Neural Network for classifying the predictor variables. Feed Forward Neural Networks were found successful in a wide range of medical applications including Alzheimer's disease [67] , chronic kidney disease [68] and lung cancer detection [69] to name a few. The wide usage of Feed Forward Neural Network in pattern classification is due to their prediction capability regardless of the probability distribution information of distinct labels. These networks gain efficiency from their parallel structure and their ability to improve their performance by experience. Hence, they can be used to efficiently classify the observations into different classes [67] . They can store the information in the network with less fault tolerance capacity. Advanced developments have proved these networks as the function approximators as they can approximate any arbitrary functions by fine-tuning the number of hidden layers and their parameters [70] .

The network includes connections, with each of the links designated to different weights. The information flow in the network exists only in one direction. The output of the previous layer serves as the input for its successive layer. The output of a neuron is represented as the weighted sum of its inputs. Equation 4 depicts the weighted sum of inputs of a layer.

where $\rho_p^k$ represents the weighted sum of inputs, $m_{k-1}$ represents the number of nodes in the $(k-1)$-th layer, $r_q^{k-1}$ represents the output of the $q$-th node in the $(k-1)$-th layer and $w_{pq}$ denotes the weight of the link. Equation 5 denotes the output of a layer.

where $h$ represents the output function and $p$ ranges from 1 to $m_k$. The error loss is computed from the predicted and actual outputs and is used to update the connection weights. Equation 6 depicts the error loss.

where $\delta$ represents the error loss, and $r_p$ and $t_p$ are the predicted and actual outputs, respectively. The connection weights are updated according to the error loss, such that the $\delta$ values are minimized. In each epoch of training, the weights are updated to reduce the error loss. The weight updates proceed from the last layers towards the lower layers. Figure 1 shows the diagrammatic representation of the model. The set $f_1, f_2, \ldots, f_n$ denotes the feature set extracted by the pre-trained networks, $f_1, f_2, \ldots, f_n$ denotes the features after reducing dimensionality and $h_1, h_2, h_3, h_4, \ldots, h_{10}$ denotes the 10 nodes in the hidden layer of the FFNN. The reduced features are fed as input to the Feed Forward Neural Network.
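The forward computation and error described above can be sketched as follows. The weighted sum and layer output follow the verbal definitions in the text; the sigmoid output function and the half-sum-of-squares form of the error are assumptions, since the source equations are not reproduced here, and the toy weights are purely illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(prev_outputs, weights, h=sigmoid):
    """Outputs of layer k given the outputs of layer k-1.

    weights[p][q] is the weight of the link from node q of layer k-1
    to node p of layer k.
    """
    outputs = []
    for row in weights:
        # Weighted sum of the previous layer's outputs for node p.
        weighted_sum = sum(w_pq * r_q for w_pq, r_q in zip(row, prev_outputs))
        outputs.append(h(weighted_sum))
    return outputs


def squared_error(predicted, actual):
    """Assumed error form: half the sum of squared differences."""
    return 0.5 * sum((r - t) ** 2 for r, t in zip(predicted, actual))


# Toy network: 2 input features, 10 hidden nodes, 1 output node.
features = [1.0, 2.0]
hidden = layer_output(features, [[0.5, -0.25]] * 10)
output = layer_output(hidden, [[0.0] * 10])
error = squared_error(output, [1.0])
```

Training would then adjust the weights, starting from the output layer and moving backwards, to reduce this error.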

The experimentation was performed on Intel core i5 processor with GPU support of 4 GB and 8 GB RAM. The model has been implemented using MATLAB. Neural network toolbox by Jingwei Too [71] is used for implementing FFNN. Accuracy, F1-Score, Precision, Specificity, Sensitivity, Area Under Curve(AUC) and Matthews Correlation Coefficient (MCC) have been computed to evaluate the model.

The experiments were performed on both single pre-trained networks and concatenations of multiple pre-trained networks. For a single CNN, the features were extracted and passed to the Feed Forward Neural Network. Ten-fold cross-validation has been performed for analysing the model. The data are randomly partitioned into 10 equal folds; at each iteration, 9 folds are used for training and the remaining fold serves as the testing dataset, so a single iteration takes 90% of the data for training and 10% for testing. The model performs well when a single pre-trained network is used as the feature extractor and a Feed Forward Neural Network as an off-the-shelf classifier. The pre-trained networks used for the analysis phase are EfficientnetB0, Resnet101, Darknet-53, InceptionResnetV2 and Xception. The last fully connected layers of the pre-trained networks have been used as the feature extractor, which outputs a feature set of dimension 1046 × 1000.
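The ten-fold partitioning described above can be sketched as follows. This is an illustrative standard-library version, not the authors' MATLAB code; the function name `k_fold_indices` and the seed value are assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i
                 for idx in fold]
        yield train, test


splits = list(k_fold_indices(1046, k=10, seed=0))
fold_sizes = [len(test) for _, test in splits]
```

Because 1046 is not divisible by 10, the folds hold 104 or 105 images each, so every iteration tests on roughly 10% of the data. Fixing the seed makes the partition repeatable across runs.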

EfficientnetB0, Resnet101, Darknet-53, InceptionResnetV2 and Xception achieved accuracies of 0.9321, 0.9216, 0.86043, 0.9149 and 0.9081, respectively. EfficientnetB0 achieved the highest performance among the five networks used for training. The specificity, sensitivity, precision, F1-Score and AUC of EfficientnetB0 were 0.9452, 0.9188, 0.9305, 0.9425 and 0.9756, respectively. Out of 504 COVID-19 radiographic images, 475 instances were correctly classified by the FFNN. Among 542 non-COVID-19 images, 500 instances were correctly classified. Figure 2 represents the graphical analysis of the model performance using a single CNN without the dimensionality reduction module. Table 2 indicates the performance of the model using the Feed Forward Neural Network as the off-the-shelf classifier, without dimensionality reduction using the sparse autoencoder. Using multi-CNN, the greatest performance was achieved by the combination of Xception and EfficientnetB0, with an accuracy of 0.9301. Out of 542 instances, 461 instances were predicted correctly. Figure 3 depicts the graphical analysis of the multi-CNN model without the dimensionality reduction module.

Dimensionality reduction using the sparse autoencoder was performed in the second phase of analysis, which shows an improved result. Table 3 indicates the model's performance while using the sparse autoencoder and Feed Forward Neural Network. The best performance, with an accuracy of 0.9578 and an AUC of 0.9821, was achieved by the combination of InceptionResnetV2 and Xception. To ensure that the results are statistically significant, a p-value based on the chi-square test has been computed for the best performing model of InceptionResnetV2 and Xception. The p-value obtained is less than 0.00001, which shows that the results are statistically significant at p less than 0.05. The Matthews Correlation Coefficient (MCC), a statistical model evaluation measure, has also been analysed for estimating model performance. MCC values close to one indicate a strong correlation between the predicted and the actual class. The highest MCC value of 0.9158, exhibited by the combination of Xception and InceptionResnetV2, indicates that the model is suitable for distinguishing COVID-19 from non-COVID-19 images.
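A chi-square test of this kind can be sketched as follows. The source does not state which table the test was applied to; this sketch assumes the 2 × 2 table of actual versus predicted classes for the best model, using the counts the paper reports for it (482 and 520 correct, 22 misclassified in each class), and applies no continuity correction. For one degree of freedom the p-value follows from the complementary error function, so only the standard library is needed.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns the statistic and its p-value for one degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom: P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Rows: actual COVID-19 / non-COVID-19; columns: predicted COVID-19 / non-COVID-19.
stat, p_value = chi_square_2x2(482, 22, 22, 520)
```

Under these assumptions the statistic is in the high hundreds and the p-value is far below 0.00001, consistent with the significance reported above.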

Xception stands for extreme inception, with the inception architecture as the backbone of these networks. The convolutions in the original inception modules are replaced with depthwise separable convolutions in Xception networks. This correlation scanning in 2D followed by 1D mapping is easier and more effective than full 3D mapping [72]. InceptionResnetV2, on the other hand, is an updated InceptionV3 network capable of better performance than other convolutional networks. The combination of inception blocks followed by residual blocks in the architecture, together with the shortcut connections, adds to the performance of the model. Thus the concatenation of these two efficient networks results in an improved feature set and improved results.

The combination of InceptionResnetV2 and Resnet101 achieved the highest sensitivity of 0.9644, and the combination of Xception and Resnet101 achieved the highest AUC of 0.9893. Even though these combinations achieved better sensitivity and AUC, the combination of InceptionResnetV2 and Xception outperformed the other combinations in all other performance measures. The experiments were repeated multiple times to ensure the stability of the results. As seed values were used to generate random weights and cross-validation folds, we could reproduce the same results each time.

The results show that the accuracy of the model improves with the sparse autoencoder dimensionality reduction technique. However, no pre-trained CNN was able to achieve 100% accuracy. A few X-ray images were misclassified by all the methods. The major cause of false positives and false negatives is the similarity between X-ray images of COVID-19 and pneumonia, which makes accurate prediction difficult. Among 504 COVID-19 instances, 482 images were correctly classified and the remaining 22 images were misclassified using the best performing combination of Xception and InceptionResnetV2 with sparse autoencoder and FFNN. Out of 542 non-COVID-19 images, 520 images were correctly classified, and 22 images were misclassified.
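The evaluation measures can be recomputed from the counts above using their standard definitions. The sketch below treats COVID-19 as the positive class, which is an assumption about the authors' convention; the function name `binary_metrics` is illustrative.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification measures from confusion counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "mcc": (tp * tn - fp * fn) / math.sqrt(
            (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Best model: 482 of 504 COVID-19 and 520 of 542 non-COVID-19 images correct.
best = binary_metrics(tp=482, fn=22, tn=520, fp=22)
```

With these counts the MCC evaluates to about 0.9158, matching the reported value, and the accuracy to about 0.958, in line with the reported 0.9578.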

The MCC measure indicates that the model has improved its performance by incorporating the sparse autoencoder technique. Figure 4 represents the graphical analysis of the model after performing dimensionality reduction using the sparse autoencoder. Figure 5 presents the graphical comparison of accuracy with and without dimensionality reduction.

To compare the performance of deep pre-trained CNN with shallow network, we further performed feature extraction using a single hidden layer sparse autoencoder and classification using FFNN. Feature extraction using the shallow sparse autoencoder and classification using FFNN achieved accuracy, specificity, sensitivity, f1-score, precision and AUC of 0.8528, 0.8345, 0.8755, 0.8412, 0.8095 and 0.9217, respectively. The proposed method, which used deep pretrained CNNs for feature extraction significantly outperformed feature extraction using shallow sparse autoencoder.

For training the model using the sparse autoencoder and FFNN, some parameters have to be assigned. The parameters for the model construction have been set empirically by trial and error. Table 4 denotes the parameter settings used for training the model. The hidden size parameter of the sparse autoencoder specifies the number of features to be extracted. The model is trained for 100 epochs.

3.4. Comparison of the results with other feature selection techniques

The sparse autoencoder has been empirically chosen for dimensionality reduction after experimental analysis with two major feature selection techniques, namely Principal Component Analysis (PCA) and Correlation Feature Selection (CFS). Attribute selection using CFS and PCA has been performed in Weka 3.6. The selected features were then passed to the Feed Forward Neural Network. Table 5 presents the results obtained when the features extracted from InceptionResnetV2 and Xception are passed to the different feature selection techniques. CFS and PCA achieved accuracies of 0.8556 and 0.8899, respectively. Compared with PCA and CFS, the proposed sparse autoencoder dimensionality reduction technique has proven its effectiveness with an accuracy of 0.9578. The sparse autoencoder achieved an AUC of 0.9821, while CFS and PCA achieved AUCs of 0.9326 and 0.9469, respectively. The time taken by the sparse autoencoder, CFS and PCA was 190.308 s, 26 s and 260 s, respectively.

The superior performance of the sparse autoencoder can be attributed to the following. Sparse autoencoders learn data projections more efficiently, under the dimension and sparsity constraints, than the other feature selection techniques. Autoencoder networks learn nonlinear transformations and, with their multiple layers, are also more expressive in terms of model parameters than PCA, which applies a single transformation [34].

FFNN has been empirically chosen as off-the-shelf classifier from experimental analysis with various other classifiers. Table 6 presents the performance of various classifiers with the best performing multi-CNN and sparse autoencoder.

It is evident that only FFNN achieved an accuracy above 90%. 

Different state-of-the-art methods were analyzed to assess the effectiveness of the proposed model. Table 7 presents a consolidation of the results achieved by other state-of-the-art methods and the proposed method. The comparison has considered only methods using X-ray images for the classification of COVID-19 and non-COVID-19 images. Methods employing CT scans and ultrasound for COVID-19 detection have not been considered, as they are entirely different modalities.

The number of instances used for training the model is different for multiple methods analyzed. The methods by Pandit et al. [21] , Panwar et al. [75] , Sethy et al. [73] , Ismael and Sengur et al. [26] and Hemdan et al. [76] used a held-out validation set for the evaluation, whereas the other methods have used cross-validation. The results are on par with the state-of-the-art methods. Even though all the methods have achieved significant results, the proposed method achieved a better AUC, F1-score and precision than the other methods. Fair comparison between the different results analyzed 

Even though the method achieved significant results, some of the constraints are worth noting. The method is developed for binary classification of COVID-19 and non-COVID-19 images. The model has not been explored in a multi-class scenario in classifying normal, COVID-19 and pneumonia images. Also, a wide range of sparse autoencoder parameter values were not experimented in developing the model, which can be customized using the grid search method for relatively higher model performance. The method developed is specific to the diagnosis of COVID-19 from X-ray images. However, after empirical studies, it can be extended to diagnose other diseases. The method seems to have prospects in diagnosing other lung diseases and diseases that can be detected using X-rays. The strategy can also be applied to other imaging modalities, after customization. As a future research study, we propose applying the method for diagnosing COVID-19 from other imaging modalities such as CT and Ultrasound.

This paper presents a computer-aided model for COVID-19 detection from chest X-ray images using a Sparse Autoencoder and a Feed Forward Neural Network. Features extracted from multiple pre-trained networks are concatenated, which outperforms a single CNN. The Sparse Autoencoder contributed greatly to improving the accuracy of the model: performance increased considerably with the dimensionality reduction phase compared with using the Feed Forward Neural Network alone. The analysis shows that the combination of Xception and InceptionResnetV2 achieved the greatest accuracy together with the custom-made sparse autoencoder and FFNN.
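Two ingredients of this pipeline can be sketched in a few lines: concatenating the feature vectors produced by two networks for one image, and the sparsity penalty that distinguishes a sparse autoencoder from a plain one. The sketch assumes the standard KL-divergence formulation of the penalty; the function names, the target activation `rho` and the toy vectors are illustrative assumptions, not the configuration used in this work.

```python
import math


def concatenate_features(*feature_vectors):
    """Join the per-network feature vectors of one image into a single vector."""
    combined = []
    for vector in feature_vectors:
        combined.extend(vector)
    return combined


def kl_sparsity_penalty(hidden_activations, rho=0.05, eps=1e-8):
    """KL-divergence sparsity penalty of a standard sparse autoencoder.

    hidden_activations: list of samples, each a list of hidden-unit
                        activations in (0, 1), e.g. sigmoid outputs.
    rho:                target mean activation of each hidden unit (assumed).
    Returns the sum over hidden units of KL(rho || mean activation).
    """
    n_samples = len(hidden_activations)
    n_units = len(hidden_activations[0])
    penalty = 0.0
    for j in range(n_units):
        rho_hat = sum(sample[j] for sample in hidden_activations) / n_samples
        rho_hat = min(max(rho_hat, eps), 1.0 - eps)  # keep the logs finite
        penalty += rho * math.log(rho / rho_hat)
        penalty += (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat))
    return penalty


# Toy vectors standing in for the two networks' deep features.
features_a = [0.2, 0.7, 0.1]
features_b = [0.9, 0.4]
print(concatenate_features(features_a, features_b))

# The penalty is zero when mean activations equal rho and grows as units fire more.
print(kl_sparsity_penalty([[0.05, 0.05], [0.05, 0.05]]))
print(kl_sparsity_penalty([[0.5, 0.5], [0.5, 0.5]]))
```

In training, this penalty would be weighted and added to the reconstruction error, so the encoder is pushed toward compact codes in which few hidden units are active for any given image.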

[1] MAP COVID. Coronavirus cases, deaths, vaccinations by country. BBC News. Available at: https://www.bbc.com/news/world-51235105, accessed on 21.09.2021.

[2] Liqa A Rousan, Eyhab Elobeid, Musaab Karrar, and Yousef Khader. Chest X-ray findings and temporal lung changes in patients with covid-19 pneumonia. BMC Pulmonary Medicine, 20(1):1-9, 2020.

[3] Ryad Zemouri, Noureddine Zerhouni, and Daniel Racoceanu. Deep learning in the biomedical applications: Recent and future status. Applied Sciences, 9(8):1526, 2019.

[4] A Dudhane, G Shingadkar, P Sanghavi, B Jankharia, and S Talbar. Interstitial lung disease classification using feed forward neural networks. Advances in Intelligent Systems Research, ICCASP, 137:515-521, 2017.

[5] U Rajendra Acharya, Shu Lih Oh, Yuki Hagiwara, Jen Hong Tan, Hojjat Adeli, and D Puthankattil Subha. Automated eeg-based screening of depression using deep convolutional neural network. Computer methods and programs in biomedicine, 161:103-113, 2018.

[6] Madiha J Jafri and Vince D Calhoun. Functional classification of schizophrenia using feed forward neural networks. In 2006 International conference of the IEEE engineering in medicine and biology society, pages 6631-6634. IEEE, 2006.

[7] Giovanna Sannino and Giuseppe De Pietro. A deep learning approach for ecg-based heartbeat classification for arrhythmia detection. Future Generation Computer Systems, 86:446-455, 2018.

[8] K Rajeswari, V Vaithiyanathan, and TR Neelakantan. Feature selection in ischemic heart disease identification using feed forward neural networks. Procedia Engineering, 41:1818-1823, 2012.

[9] Bejoy Abraham and Madhu S Nair. Computer-aided detection of covid-19 from x-ray images using multi-cnn and bayesnet classifier. Biocybernetics and biomedical engineering, 40(4):1436-1445, 2020.

[10] Bejoy Abraham and Madhu S Nair. Computer-aided detection of covid-19 from ct scans using an ensemble of cnns and ksvm classifier. Signal, Image and Video Processing, pages 1-8, 2021.

[11] Ali Abbasian Ardakani, Alireza Rajabzadeh Kanafi, U Rajendra Acharya, Nazanin Khadem, and Afshin Mohammadi. Application of deep learning technique to manage covid-19 in routine clinical practice using CT images: Results of 10 convolutional neural networks. Computers in Biology and Medicine, 121:103795, 2020.

Covid-19 detection through transfer learning using multimodal imaging data

A new covid-19 patients detection strategy (cpds) based on hybrid feature selection and enhanced knn classifier. Knowledge-Based Systems

Covid-19 pneumonia detection in chest x-ray images using transfer learning of convolutional neural networks

Role of intelligent computing in covid-19 prognosis: A state-of-the-art review

A weakly-supervised framework for covid-19 classification and lesion localization from chest ct

Diagnosis and detection of infected tissue of covid-19 patients based on lung x-ray image using convolutional neural network approaches

A modified deep convolutional neural network for detecting covid-19 and pneumonia from chest x-ray images based on the concatenation of xception and resnet50v2

Covidiagnosisnet: Deep bayes-squeezenet based diagnosis of the coronavirus disease 2019 (covid-19) from x-ray images

Multi-task contrastive learning for automatic CT and X-ray diagnosis of COVID-19

Automatic detection of covid-19 from chest radiographs using deep learning

Coronavirus disease (covid-19) detection in chest x-ray images using majority voting based classifier ensemble

Detection of covid-19 from chest x-ray and ct scan images using improved stacked sparse autoencoder

Sparse wavelet auto-encoder for covid-19 cases identification

A deep learning application for prediction of covid-19

Deep learning approaches for covid-19 detection based on chest x-ray images

Convolutional capsnet: A novel artificial neural network approach to detect covid-19 disease from x-ray images using capsule networks

Early diagnosis of alzheimer's disease: A multi-class deep learning framework with modified k-sparse autoencoder classification

Longitudinal and multi-modal data learning for parkinsons disease diagnosis via stacked sparse auto-encoder

Improved sparse autoencoder based artificial neural network approach for prediction of heart disease

Neonatal sleep state identification using deep learning autoencoders

A two layer sparse autoencoder for glaucoma identification with fundus images

Seven-layer deep neural network based on sparse autoencoder for voxelwise detection of cerebral microbleed

Sparse, stacked and variational autoencoder

Different types of autoencoders

Boosting sparsity-induced autoencoder: A novel sparse feature ensemble learning for image classification

Covid-19 image data collection

Pneumonia x rays

Addressing the class imbalance problem in medical datasets

Imagenet: A large-scale hierarchical image database

Pre-trained convolutional neural networks as feature extractors for tuberculosis detection. Computers in biology and medicine

Addressing architectural distortion in mammogram using alexnet and support vector machine

Artificial intelligence framework for efficient detection and classification of pneumonia using chest radiography images

Inception-v4, inception-resnet and the impact of residual connections on learning

Xception: Deep learning with depthwise separable convolutions

Rethinking model scaling for convolutional neural networks

Yolov3: An incremental improvement

Deep residual learning for image recognition

Big Data and Deep Learning

Using deep convolutional neural network architectures for object classification and detection within x-ray baggage security imagery

Transferring deep convolutional neural networks for the scene classification of high-resolution remote sensing imagery

Deep learning and parallel computing environment for bioengineering systems

Deep transfer learning for characterizing chondrocyte patterns in phase contrast x-ray computed tomography images of the human patellar cartilage

Can pre-trained convolutional neural networks be directly used as a feature extractor for video-based neonatal sleep and wake classification?

Skin lesion classification using hybrid deep neural networks

Locally supervised deep hybrid model for scene recognition

Feature extraction using traditional image processing and convolutional neural network methods to classify white blood cells: a study. Australasian physical & engineering sciences in medicine

A deep feature learning model for pneumonia detection applying a combination of mrmr feature selection and machine learning models

Waste classification using autoencoder network with integrated feature selection method in convolutional neural network models

Deep learning for identifying radiogenomic associations in breast cancer

Exploiting convolutional neural networks with deeply local description for remote sensing image classification

Pre-trained convolutional neural networks as feature extractors toward improved malaria parasite detection in thin blood smear images

A new deep cnn model for environmental sound classification

Sparse autoencoder. CS294A Lecture Notes

Sparse coding with an overcomplete basis set: A strategy employed by v1? Vision Research

Diagnosis of alzheimers disease using dual-tree complex wavelet transform, pca, and feed-forward neural network

Neural network and support vector machine for the prediction of chronic kidney disease: A comparative study

Early detection of lung cancer using wavelet feature descriptor and feed forward back propagation neural networks classifier

A seasonal feedforward neural network to forecast electricity prices

Neural network toolbox

An intuitive guide to deep network architectures

Detection of coronavirus disease (COVID-19) based on deep features and support vector machine

Covidgan: data augmentation using auxiliary classifier GAN for improved covid-19 detection

Application of deep learning for fast detection of covid-19 in X-rays using ncovnet

Covidx-net: A framework of deep learning classifiers to diagnose covid-19 in x-ray images

Covid-19 screening on chest x-ray images using deep learning based anomaly detection

