key: cord-0718730-tt113y89
authors: nan
title: Prediction of COVID-19 Using Genetic Deep Learning Convolutional Neural Network (GDCNN)
date: 2020-09-21
journal: IEEE Access
DOI: 10.1109/access.2020.3025164
sha: 5e9f06a29394f0f36e10b9a419d82d6c538352e3
doc_id: 718730
cord_uid: tt113y89

The rapid spread of coronavirus disease (COVID-19) leads to severe pneumonia and is estimated to have a high impact on the healthcare system. Early diagnosis is urgently needed for precise treatment, which in turn reduces the pressure on the healthcare system. The standard imaging modalities available for diagnosis are the Computed Tomography (CT) scan and the Chest X-Ray (CXR). Even though the CT scan is considered the gold standard in diagnosis, CXR is the most widely used because it is widely available, faster, and cheaper. This study aims to distinguish pneumonia due to COVID-19 from healthy lungs (normal person) using CXR images. Deep learning is one of the remarkable methods for extracting high-dimensional features from medical images. In this research, the technique used is the Genetic Deep Learning Convolutional Neural Network (GDCNN). It is trained from scratch to extract features for classifying images as COVID-19 or normal. A dataset consisting of more than 5000 CXR image samples is used for classifying pneumonia, normal, and other pneumonia diseases. Training the GDCNN from scratch shows that the proposed method performs better than transfer learning techniques. A classification accuracy of 98.84%, precision of 93%, sensitivity of 100%, and specificity of 97.0% are achieved in COVID-19 prediction. The top classification accuracy obtained in this research reveals the best nominal rate for identifying COVID-19 in an unbalanced environment. The proposed model proves to be better than existing models such as ResNet18, ResNet50, SqueezeNet, DenseNet-121, and Visual Geometry Group (VGG16).

I. INTRODUCTION

The novel coronavirus, formally named Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2), is responsible for causing Coronavirus Disease 2019 (COVID-19) [1]. Symptoms of COVID-19 include cough, fever, and respiratory illness, and in some cases the disease leads to pneumonia [2]. Pneumonia is an infection that inflames the air sacs in the lungs responsible for oxygen transfer. Pneumonia can also be caused by fungi, bacteria, and other viruses. Severity increases with chronic diseases such as bronchitis or asthma, an impaired or weak immune system, smoking, and old age. Infected people are treated according to the infecting organism; however, cough medicine, pain relievers, fever reducers, and antibiotics are given based on the symptoms. Severely affected patients have to be hospitalized and treated in the Intensive Care Unit (ICU), with a ventilator provided for breathing if needed [3]. COVID-19 became a pandemic because of its seriousness and its fast transmissibility [4]. The impact on the health care system is mainly due to the number of people affected day by day, since mechanical ventilators must be provided for the serious patients admitted to the ICU. Hence, the number of ICU beds also needs to be increased drastically [5]. In this situation, early diagnosis is vital for proper treatment, which in turn reduces the pressure on the health care system.

Artificial Intelligence (AI) provides a major breakthrough for the diagnosis of COVID-19 and other types of pneumonia. Pneumonia is diagnosed using standard imaging such as the Computed Tomography (CT) scan and the Chest X-Ray (CXR). CT is the more precise of the two, as CXR can lead to misdiagnosis and lower precision. However, CXR is used because it is cheaper, exposes patients to less radiation, is faster, and is readily available in all health care systems [6]. Identifying pneumonia is no easy task: the reviewer needs to look for white patches in the lungs, where most of the air sacs are filled with water or pus, and it is tedious to differentiate pneumonia from bronchitis or tuberculosis [7].

The associate editor coordinating the review of this manuscript and approving it for publication was Haris Pervaiz.

VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

This section presents the concepts that merge in our work: the COVID-19 pandemic and pneumonia disease, and hierarchical and flat classification. Background related to data imbalance is also discussed.

The first COVID-19 case was reported in Wuhan, China, and the disease spread across the rest of the world within a short time. The number of reported cases has increased exponentially, with more than 8.24 million confirmed cases worldwide at the time of writing [1]. Epidemiological characteristics are still under investigation; evidence indicates that roughly 80% of patients are in mild condition, a few of them asymptomatic, and approximately 20% are in severe condition, among whom 10% have to be in the ICU with ventilators [8]. The most important concern is the number of patients admitted to the ICU, as there are only limited beds. The major complication of COVID-19 is pneumonia, which infects the pulmonary parenchyma, the portion of the lung responsible for gas transfer. Other organisms such as fungi, bacteria, and viruses may also be present. Pneumonia is a group of diseases, so the diagnosis also needs to differ; therefore, Chest X-Ray images and CT scans are used for diagnosis [9].

Flat classification covers multi-label, binary, and multi-class classification problems; in the multi-label case, a sample can belong to multiple classes and the outputs are associated with each other. Binary classification is the task of classifying the images of a given dataset into two categories on the basis of classification rules. Methods used for classification include random forests, decision trees, support vector machines, Bayesian networks, probit models, neural networks, and logistic regression. Table 1 lists the symbols used and their explanations. The features are represented by 'x', containing a set of parameters 'x_1, x_2', as shown in equation (1). The output is represented by 'y', as in equation (2), and a decision function based on the weight of each parameter is evaluated using equation (3). The algorithm function is represented by equation (4), and the number of parameters of the real-valued function is given by equation (6).

Decision function

Algorithm function

Number of Parameter

It is clear from the above context that pneumonia diagnosis is a multi-class classification problem, as many features need to be extracted from CXR images but only one label is associated with each image. Silla et al. stated that a taxonomy is organized as a tree hierarchy defined on the basis of a partially ordered set [10]. Moreover, various ways to handle hierarchical classification problems with regard to the labeling process are also discussed [11]. The Local Classifier (LC) approach considers the hierarchy from a partial, local information perspective, thus allowing multi-class/binary classifiers to handle the problem locally [12], [13]. The Global Classifier (GC) approach builds a single classification model from the training dataset, considering the class hierarchy as a whole. Since significant information on the pneumonia labels is found in the entire class hierarchy, the GC approach is widely used [14], [15]. The multi-class classification output 'y' is given by equation (7), the number of parameters by equation (9), and the classification accuracy by equation (10).

Multi class classification

Number of parameter

Classification accuracy

Class probabilities and class scores are based on the logits of a linear model, using equation (11).

The Softmax transform is represented by equation (13).

Loss function: the multi-class loss function is computed from the class probabilities output by the model, using equation (14).

Here M = mini-batch size, f(x_i) = corresponding output of the penultimate layer of the DCNN, C = number of classes, w = last-layer weight, and b = last-layer bias. The target value for the class probabilities is given by equation (15).

The similarity between 'z' and 'p' can be measured by the cross-entropy.
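The softmax transform and the cross-entropy loss described above can be illustrated with a minimal Python sketch. The paper's equations are not reproduced in this text, so the code uses the standard textbook forms as an assumption; the function names and the example logits are hypothetical, and the target 'z' is taken to be a one-hot vector given by an integer class label.

```python
import math

def softmax(logits):
    """Turn a list of class scores (logits) into class probabilities p."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(batch_logits, labels):
    """Mean cross-entropy between one-hot targets z and softmax outputs p."""
    loss = 0.0
    for logits, label in zip(batch_logits, labels):
        p = softmax(logits)
        # With a one-hot target, only the true-class term of the sum survives.
        loss -= math.log(p[label])
    return loss / len(labels)

# Mini-batch of M = 2 samples and C = 3 classes (hypothetical logits).
batch = [[2.0, 0.5, 0.1], [0.2, 0.1, 3.0]]
targets = [0, 2]
probs = softmax(batch[0])
loss = cross_entropy(batch, targets)
```

The loss is small when the probability assigned to the true class is close to 1 and grows as that probability shrinks, which is the sense in which cross-entropy measures the similarity between 'z' and 'p'.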

Class imbalance is a problem faced by most researchers whenever they deal with real datasets [16], [17]. Because classifiers focus on minimizing the global error rate, the algorithm concentrates on the majority classes; yet the minority classes also need attention in problem domains such as medical image classification and credit card fraud detection [18], [19]. In the real world, classifying pneumonia type using CXR images is also an imbalanced learning problem, as far fewer people are affected by pneumonia than are healthy [20], [21]. In addition, the numbers of people affected by the various types of pneumonia are also imbalanced [22], [23]. Currently, the number of people affected by COVID-19 is much larger than the number affected by MERS, Severe Acute Respiratory Syndrome Coronavirus (SARS), Streptococcus, Varicella, and Pneumocystis [24], [25]. Various authors have addressed the imbalance problem in classification datasets, and one such technique is data-level solutions [26], [27]. These techniques re-balance the class distribution by resampling the dataset, which reduces the effect of class imbalance; in other words, the dataset is pre-processed before the training phase [28], [29]. Resampling is further subdivided into two types, undersampling and oversampling. Both are used to adjust the class distribution of a dataset, that is, the ratio among the various classes. Undersampling removes samples from the majority classes, whereas in oversampling a few instances are duplicated to maintain the class distribution [30].
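As a rough illustration of the two resampling types, the following Python sketch implements plain random oversampling and random undersampling. It is not the resampling procedure used in this work, which the text does not specify; the function names, the seed handling, and the toy data are all hypothetical.

```python
import random
from collections import Counter

def _group_by_class(samples, labels):
    """Collect the samples of each class in a dictionary."""
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    return by_class

def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = _group_by_class(samples, labels)
    target = max(len(v) for v in by_class.values())
    out = []
    for y, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out.extend((s, y) for s in items + extra)
    rng.shuffle(out)
    return out

def random_undersample(samples, labels, seed=0):
    """Drop majority-class samples until every class matches the smallest one."""
    rng = random.Random(seed)
    by_class = _group_by_class(samples, labels)
    target = min(len(v) for v in by_class.values())
    out = []
    for y, items in by_class.items():
        out.extend((s, y) for s in rng.sample(items, target))
    rng.shuffle(out)
    return out

# Hypothetical toy data: 8 "normal" images and 2 "covid" images.
names = [f"img{i}" for i in range(10)]
labels = ["normal"] * 8 + ["covid"] * 2
over = Counter(y for _, y in random_oversample(names, labels))
under = Counter(y for _, y in random_undersample(names, labels))
```

Oversampling keeps all the original data at the cost of repeated minority samples, while undersampling discards majority samples; both must be applied to the training split only, before the training phase.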

The state of the art in many computer vision tasks has been greatly advanced over time by deep convolutional neural networks (DCNNs) such as GoogLeNet [31], AlexNet [32], DenseNet [33], VGGNet [34], and ResNet [35]. A DCNN is an expert system that reduces the large amount of human expertise involved in the analysis of data, providing a unified feature extraction and classification model that removes the burden of handcrafted feature extraction. In designing DCNN architectures, the main ambition of artificial intelligence, to develop an autonomous learning system with less human intervention, also needs to be considered [36].

In recent years, research has paid increasing attention to the automated design of DCNN architectures, which has led to the development of many algorithms, generally categorized into four groups: (1) evolutionary optimization of DCNN architectures, (2) optimization of DCNN architectures using deep learning, (3) selection of DCNN architectures from an available group of candidates, and (4) DCNN architecture optimization using reinforcement learning. Among the four categories, evolutionary optimization has proved to be the more promising approach for multi-point global search, leading to high-quality solutions in complex search spaces [37]. Despite this success, many evolutionary methods impose restrictions on the DCNN architecture, such as fixed filter size, fixed pooling size, fixed depth, avoiding pooling operations [38], avoiding crossover operations [39], and a fixed activation function [40]. All these restrictions reduce the computational complexity, which in turn leads to performance degradation. Moreover, parallel optimization using evolutionary methods requires thousands of computers [41]. Many classifiers that incorporate Genetic Algorithm (GA) techniques use a single-phase GA such as the Non-dominated Sorting GA-II (NSGA-II) (Deb et al. 2002). A bi-phase classifier model, which includes classification rule extraction, needs to be developed. Local heuristic search techniques are used for pattern discovery with rule induction methods in data mining (Chiu et al. 2005). The major issue with local search is that it frequently gets trapped in local optima and is sensitive to the initial solution. GA overcomes this drawback, as it discovers the best classification rules and also addresses the local optimum issue. GA handles feature interaction better than the greedy rule induction algorithm (Freitas et al. 2003). Some issues remain with GA: there is no guarantee of reaching the global optimum, and its computational cost increases linearly with the search space. Fuzzy classification rules are easily understood by humans because they deal with uncertainty, but their drawback is that extracting fuzzy classification rules is more complex than extracting crisp classification rules (De Jong et al. 1988).

This research proposes a genetic-based DCNN design that automatically produces the architecture of a DCNN to solve image classification problems. The complexity of the DCNN architecture is reduced by a suitable encoding scheme in which all the operations performed in the DCNN, such as pooling, convolution, activation, batch normalization, dropout, optimizer, and full connection, are encoded in the form of integer vectors.

The main aim of this research is to explore the various types of pneumonia caused by pathogens using CXR images. CXR image samples are used because they are faster and cheaper to obtain. Even though CT scans are the better standard in the diagnosis of pneumonia, their major drawback is that they are costly and scarce. Our main aim is to predict COVID-19 pneumonia using CXR images, as the disease is widespread across the world. Precision is validated using the macro-averaged F1-score. Since the database is imbalanced, known resampling techniques are used. CXR images are analyzed by identifying texture, which is one of their main attributes, and a few texture descriptors are explored for training CNN models. A hierarchical classification approach is used for feature extraction, and a genetic-based deep learning approach is used for prediction of COVID-19 and other pulmonary diseases. A dataset of more than 5000 image samples is chosen for prediction of COVID-19 from a publicly available dataset consisting of 224,316 chest X-rays.

Section I describes the pandemic situation in the health care system due to COVID-19 and the need for early diagnosis. Section II reviews the existing techniques used for classification of images and the Deep Learning Convolutional Neural Network models used for prediction of COVID-19. Section III presents the proposed Genetic Deep Learning Convolutional Neural Network, comprising the Ordered Distance Vector population technique for optimal prediction of COVID-19. Section IV presents the experimental analysis of the proposed work and compares it with existing DCNN models. The parameters used for performance analysis include sensitivity, accuracy, specificity, recall, precision, and F1-score. The paper closes with the conclusion on the proposed GDCNN model and future work.

This section presents an in-depth study of the various techniques used for classification of images. It also discusses the existing DCNN models used for prediction of COVID-19 from CT and CXR images, with the analysis stated in terms of the accuracy of the various prediction models. Finally, it gives a comprehensive study of the automation of DCNN architecture search for classification of images.

Nanni et al. in 2010 [41] compared different handcrafted texture descriptors derived from the Local Binary Pattern (LBP) for medical applications. The three LBP variants are the Elongated Quinary Pattern (EQP), the Local Ternary Pattern (LTP), and the Elliptical Binary Pattern (EBP) [42]. These descriptors were evaluated on various medical applications: classification of cell phenotype images with the 2D-HeLa dataset [43], detection of pain expression from facial images of the COPE database, and the Papanicolaou test used for diagnosing cervical cancer. The data for the latter were collected from Herlev university and contain 917 images obtained with a microscope and digital camera. A Support Vector Machine (SVM) was used for validating the descriptors, and EQP performed comparatively better on all the tasks. Parveen et al. in 2010 [44] carried out a texture analysis of images used in radiotherapy applications, a mathematical technique that describes grey-level patterns to characterize tumor heterogeneity. The analysis focused on tissue exposed to radiation and on tumors, based on radiotherapy medical images. The major drawback of this technique is that it lacks a biological interpretation when predicting tissue affected by radiation [45].

Zhou et al. in 2020 [46] suggested a deep learning model for distinguishing influenza pneumonia from novel coronavirus pneumonia using CT images. CT images are better than CXR images because they show pulmonary infection clearly, but they are much costlier. Li et al. [47] identified COVID-19 using Artificial Intelligence (AI) on a dataset comprising images of COVID-19 patients and of patients diagnosed with various other pneumonia. The images were gathered from Chinese hospitals; the training set contains 2969 images, with 1396 viral pneumonia images, more than 400 images of COVID-19 patients, and 1173 non-pneumonia images. K. He, X. Zhang et al. in 2016 [34] described a 3D learning model for prediction of COVID-19, non-pneumonia, and various viral pneumonia, with CT images given as input. The prediction results show an AUROC of 0.96 for COVID-19 and 0.95 for other viral pneumonia. Narin et al. in 2020 [48] detected COVID-19 using CXR images with three deep neural networks: InceptionResNetV2, ResNet50, and Inception-V3. The dataset consists of 100 CXR images, comprising 50 COVID-19 positives and 50 COVID-19 negatives. The results were validated using five-fold cross-validation, achieving 87% accuracy for InceptionResNetV2, 97% for Inception-V3, and 98% for ResNet50. Gozes et al. in 2020 [49] detected COVID-19 using deep learning models with CT images as input. Evaluation is performed for patients with the help of the 3D volume, producing a corona score. The main aim of the work is to track the progress of COVID-19; the dataset consists of 157 CT images collected from the USA and China. Detection was carried out using 3D and 2D deep learning models, with a few changes to existing AI models, combined with clinical understanding. An AUROC of 0.996 was obtained in differentiating corona images from non-corona images.

Wang et al. in 2020 [50] developed COVID-Net, an open-source deep neural network for detecting COVID-19 from CXR images. The dataset was created to support COVID-Net experimentation and comprises 16,756 patients. The COVID-Net architecture was developed by merging best practices and human-driven design with network architecture design. Detection is performed with 92.4% accuracy, a sensitivity rate of 95%, and an infection rate of 80%. Khan

Many researchers have contributed considerable effort to automating DCNN architecture search for classification of images. Jin et al. [53] proposed supermodular and submodular optimization for the construction of DCNN architectures and proposed rules for setting the depth and width of a DCNN. Fernando et al. [54] proposed a new algorithm, termed PathNet, which models sub-networks of a super DCNN architecture, and showed that it is capable of supporting transfer learning in both reinforcement and supervised learning settings. Moreover, the architecture/weights of a DCNN can be optimized using another deep neural network. Ha et al. [55] used a Discrete Cosine Transform (DCT) and a hypernetwork to evolve the weights of a fixed DCNN architecture. De Brabandere et al. [56] proposed dynamic filter networks, which are divided into a filter-generating network and a dynamic filtering layer. The filter-generating network produces sample-specific filter parameters at runtime, conditioned on the input, and the dynamic filtering layer applies those filters to the input. Reinforcement learning is now also used to design DCNN architectures. Zoph and Le [57] maximized the image validation accuracy of DCNN architectures using a recurrent neural network trained by reinforcement learning; however, in this technique a fixed-depth DCNN architecture is created layer by layer, allowing only a fixed number of filters and a fixed filter size. Furthermore, it uses asynchronous parameter updates and distributed training with 800 graphics processing units (GPUs).

Baker et al. in 2016 [58] introduced the MetaQNN technique for DCNN architectures on the basis of reinforcement learning. The technique uses a Q-learning agent to explore and exploit the space of architectures based on experience replay and a greedy strategy. Many evolutionary algorithms have already been applied to the design of neural networks. Miikkulainen et al. in 2017 [59] proposed a technique in which all neurons are associated with DNA; architectures are produced by mutations of three types: (1) weight modification, (2) insertion of a new neuron whenever a connection is split, and (3) addition of a new connection to the existing connections. Suganuma et al. in 2017 [60] suggested the CoDeepNEAT algorithm, in which populations of chromosomes are created with minimal complexity and structure is added iteratively via mutation over generations. Minaee et al. [61] proposed genetic programming for DCNN architectures, encoded by Cartesian genetic programming, that is, directed acyclic graphs on a two-dimensional grid of computational neurons; it is said to be a more dominant algorithm. The algorithm uses a heuristic search for selection and the fitness function. Wang et al. in 2020 [62] proposed a Genetic DCNN based on a fixed-length binary string encoding scheme; only the pooling layer is considered for encoding and the fully connected layer is neglected, which leads to a minimum number of layers with the least filter number and filter size. Real et al. [40] designed a DCNN architecture for the CIFAR-100 dataset with the help of a GA, using a DNA encoding scheme. The architecture evolves through mutation operations, and filter size mismatches are handled by restructuring non-primary edges along with interpolation. The DCNN architectures produced by this technique are fully trained, and the distributed algorithm involves more than 250 computers. Desell et al. [39] proposed the EXACT method, which allows flexible filter sizes and connections on the basis of a GA using asynchronous evolutionary techniques. It uses more than 4500 dedicated computers. The MNIST dataset is used to train 120,000 DCNNs [53].

Matteo Polsinelli et al. in 2020 [63] proposed a light Convolutional Neural Network (CNN) for diagnosing COVID-19 using CT images. With a few changes to the SqueezeNet CNN model, they achieved 83% accuracy, 81% specificity, 81.73% precision, an F1-score of 0.8333, and 85% sensitivity. Shreshth Tuli et al. in 2020 [64] proposed a Long Short-Term Memory method based on Weibull for predicting the start and end of the COVID-19 cycle. This model is used for understanding the relationship between infection rate and deaths. The model runs in the cloud, which is useful for dynamic prediction and helpful in providing guidelines for administrators, policy makers, and the health care system. Adarsh Kumar et al. in 2020 [65] proposed a drone-based network system for identifying the number of people affected by COVID-19. The model is deployed in remote and congested areas where no internet or wireless connectivity exists. It is used by the health care system for sanitizing and for identifying infected patients. Parnian Afshar et al. in 2020 [66] proposed a framework based on capsule networks, termed COVID-CAPS, for identifying COVID-19 using X-ray images. The major drawback of the proposed method is that it is used for small datasets; however, an accuracy of 95.7%, a specificity of 95.8%, and a sensitivity of 90% are achieved.

The existing models proposed by various authors lack accuracy, and their computation time for prediction of COVID-19 is also considerable. Hence, there is a need for early prediction. The architecture needs to be a continuous and autonomous learning algorithm for early diagnosis using X-ray image samples.

We propose an independent and continuous learning algorithm that generates a DCNN architecture automatically. The DCNN is partitioned into numerous meta convolutional blocks and weighted fully connected blocks. Each block comprises operations such as pooling, convolution, batch normalization, dropout, full connection, and activation, so that the DCNN architecture can be converted into a standard integer code. The genetic operations of selection, crossover, and mutation are performed to evolve a population of DCNN architectures. The population of individuals, each encoding an admissible DCNN architecture, is improved by the proposed genetic DCNN designer. The population is initialized randomly using a random function, and the fitness of each individual is calculated from the performance of the encoded DCNN on the specific image detection problem. A new generation is produced from the existing generation using the genetic operators of selection, crossover, and mutation to improve the overall fitness. The evolution is carried out iteratively, generation by generation, until it meets the stopping criterion or reaches a given number of generations.
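The generational loop described in this paragraph can be sketched in Python as follows. This is only a skeleton under stated assumptions: the fitness here is a toy function (the sum of the genes), whereas in the proposed method the fitness would come from training and validating the encoded DCNN; the truncation-style parent choice, the operators `one_point` and `flip_one`, and all parameter names are hypothetical stand-ins.

```python
import random

def evolve(fitness, random_individual, crossover, mutate,
           pop_size=8, generations=10, target=None, seed=0):
    """Evolve a population until `target` fitness or `generations` is reached."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        if target is not None and fitness(best) >= target:
            break  # stopping criterion met
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # the fitter half become parents
        children = [best]                   # carry the best individual forward
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = children
        best = max(population, key=fitness)
    return best

# Toy stand-ins: an individual is a vector of 6 integers in 0..9.
def random_individual(rng):
    return [rng.randint(0, 9) for _ in range(6)]

def one_point(a, b, rng):
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]

def flip_one(ind, rng):
    child = list(ind)
    child[rng.randrange(len(child))] = rng.randint(0, 9)
    return child

toy_fitness = sum  # placeholder for "validation performance of the decoded DCNN"
best = evolve(toy_fitness, random_individual, one_point, flip_one, generations=20)
```

Because the best individual is copied into each new generation, the best fitness never decreases from one generation to the next.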

The proposed genetic-based DCNN architecture is encoded on the basis of loci on a chromosome. A chromosome is divided into two parts, namely the q-arm and the p-arm. A gene map is the set of known loci for a specific genome. The operations to be performed by the DCNN are regarded as the loci on a chromosome, so all the encoding of the DCNN is performed on the basis of the gene map. A convolutional block has five major operations: convolution, pooling, normalization, dropout, and activation. Table 2 shows the range of values at every locus of the code. The number of filters N_f varies from 16 to 512, and the filter size S_f is 7 × 7, 5 × 5, or 3 × 3. The pooling operation P takes three values: '0' denotes no pooling, '1' maximum pooling, and '2' average pooling. B_n takes the value '1' or '0', where '1' indicates that batch normalization is performed and '0' that it is not. 'A' varies from 0 to 5, denoting ELU [2], ReLU [8], PReLU [12], TReLU [18], softmax, and LeakyReLU [22]. The value of 'O' ranges from 0 to 6, denoting SGD [6], Adadelta [17], Adamax [31], Adam [31], Adagrad [35], and RMSprop [36]. Thus, based on the coding scheme, the p-arm contains the sequence [N_f S_f B_n P D A] [N_f S_f B_n] and the q-arm sequence is [N_f S_f B_n D A].
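To make the integer encoding concrete, the sketch below decodes a flat integer vector into a list of convolutional block descriptions, using the six loci N_f, S_f, B_n, P, D, and A of one block. Several details are assumptions, since the text does not state them: the index-to-size mapping for S_f, the order of the activation list, and the representation of the dropout locus D as tenths. The names `decode_conv_block` and `decode_arm` are hypothetical.

```python
FILTER_SIZES = {0: (3, 3), 1: (5, 5), 2: (7, 7)}   # index mapping is an assumption
POOLING = {0: None, 1: "max", 2: "average"}        # 0 = none, 1 = max, 2 = average
ACTIVATIONS = ["ELU", "ReLU", "PReLU", "TReLU", "softmax", "LeakyReLU"]

def decode_conv_block(genes):
    """Decode one convolutional block [N_f, S_f, B_n, P, D, A]."""
    n_f, s_f, b_n, p, d, a = genes
    if not 16 <= n_f <= 512:
        raise ValueError("N_f must lie in 16..512")
    if b_n not in (0, 1):
        raise ValueError("B_n must be 0 or 1")
    return {
        "filters": n_f,
        "kernel": FILTER_SIZES[s_f],
        "batch_norm": bool(b_n),
        "pooling": POOLING[p],
        "dropout": d / 10.0,   # assumed: D stored as tenths; the text gives no range
        "activation": ACTIVATIONS[a],
    }

def decode_arm(code, block_len=6):
    """Split a flat integer vector into blocks of `block_len` loci and decode each."""
    if len(code) % block_len:
        raise ValueError("code length must be a multiple of the block length")
    return [decode_conv_block(code[i:i + block_len])
            for i in range(0, len(code), block_len)]

# Two hypothetical blocks encoded back to back.
arm = [64, 0, 1, 1, 2, 1,
       128, 1, 0, 0, 5, 5]
layers = decode_arm(arm)
```

Keeping every block at the same code length means a locus can be addressed by its block index and its offset within the block, which is what the crossover and mutation operators rely on.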

The Ordered Distance Vector (ODV) population initialization technique is used, which provides individual diversity, randomness, and potential sequences. This is shown in equation (17). The individuals produced in this way have more potential permutations of images and better individual diversity. Thus, the technique yields a more effective and better solution with minimum convergence time.

Deep Convolutional Neural Network (DCNN) with convolutional block is stated as N c n and with 'n' filter it is N f n .

Selection is based on the fitness value obtained by each individual. Individuals with a higher fitness ranking are favored, and the fittest are guaranteed to survive, using the elitism roulette wheel selection scheme shown in equation (18), with the code length given by equation (19).
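A minimal Python sketch of roulette wheel selection with elitism is given below. It follows the common textbook form, in which selection probability is proportional to fitness, as an assumption about the scheme; the function name, the `n_elite` parameter, and the example fitness values are hypothetical.

```python
import random

def elitist_roulette(population, fitnesses, n_select, n_elite=1, seed=0):
    """Keep the n_elite fittest, then fill up by fitness-proportional sampling.

    Fitness values are assumed to be non-negative with a positive sum.
    """
    rng = random.Random(seed)
    order = sorted(range(len(population)), key=lambda i: fitnesses[i], reverse=True)
    selected = [population[i] for i in order[:n_elite]]   # elites survive unchanged
    total = sum(fitnesses)
    while len(selected) < n_select:
        spin = rng.uniform(0, total)      # where the wheel stops
        running = 0.0
        for individual, fit in zip(population, fitnesses):
            running += fit
            if spin <= running:
                selected.append(individual)
                break
        else:
            # Guards against floating-point round-off at the end of the wheel.
            selected.append(population[-1])
    return selected

# Hypothetical individuals with their validation accuracies as fitness.
population = ["A", "B", "C", "D"]
fitnesses = [0.98, 0.90, 0.75, 0.60]
chosen = elitist_roulette(population, fitnesses, n_select=4)
```

Elitism guarantees that the best individual found so far is never lost, while the wheel still gives weaker individuals a chance to contribute genes.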

A pair of DCNNs, P_i^ODV and P_j^ODV, is selected, and a point is located randomly in each to break the DCNN architecture into two segments. Two new DCNNs are generated by swapping the segments of P_i^ODV and P_j^ODV, so their depth differs from that of the parents. Assume that the cross point 'k_i' is chosen within the 'cp_i'-th convolutional block [N_f S_f B_n P D A] on the convolutional arm of P_i^ODV; its position is '(cp_i − 1) * l_c + x'. Similarly, on the other convolutional arm the block is cp_j and the position is '(cp_j − 1) * l_c + x'. The code lengths after the crossover operation are given by equations (20) and (21).

For example, '8' learnable layers are needed if the cross point 'k_i' is positioned at '3l_c + 1', and 11 learnable layers are required if the cross point 'k_j' is positioned at '5l_c + 1'. After the crossover operation, the numbers of layers required for the two DCNNs are 9 and 10 respectively, as shown in equations (22) and (23).
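The layer counts in this example can be checked with a short Python sketch. It assumes that each learnable layer is one unit of the chromosome, that the two parents have 8 and 11 learnable layers, and that the cross points 3l_c + 1 and 5l_c + 1 fall at the start of the fourth and sixth blocks, so the first 3 and 5 blocks stay with their parent. The block length l_c = 6 and the function names are hypothetical.

```python
def locus_position(cp, x, l_c):
    """Position of locus x inside block cp: (cp - 1) * l_c + x."""
    return (cp - 1) * l_c + x

def block_crossover(parent_i, parent_j, cut_i, cut_j):
    """Swap the tails of two variable-length parents at block boundaries."""
    child_a = parent_i[:cut_i] + parent_j[cut_j:]
    child_b = parent_j[:cut_j] + parent_i[cut_i:]
    return child_a, child_b

# Parents with 8 and 11 learnable layers; each string stands for one layer.
parent_i = [f"i{n}" for n in range(1, 9)]
parent_j = [f"j{n}" for n in range(1, 12)]

# Cross points at the first locus of block 4 (3*l_c + 1) and of block 6 (5*l_c + 1).
child_a, child_b = block_crossover(parent_i, parent_j, cut_i=3, cut_j=5)
depths = (len(child_a), len(child_b))
```

The first child keeps 3 layers of one parent and receives the remaining 6 of the other, and the second child keeps 5 and receives 5, which gives depths of 9 and 10; the total number of layers across the pair is unchanged.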

D. MUTATION

The DCNN architecture is altered by applying the mutation operator, which maintains diversity in each generation. The new DCNN architecture is produced using 'q_m' for the population 'P_ODV', which lies in the range [8/L_n, 0.5]. Once the mutation process is completed, there is a change in a convolutional block (for example, the filter size changes from 5 × 5 to 3 × 3 or from 7 × 7 to 5 × 5); in some cases the max pooling layer may be removed, and in the fifth convolutional layer there is a change in the batch normalization process, such as 327 to 513. Furthermore, the optimizer changes from Nadam to RMSprop. Figure 2 shows the proposed GDCNN designer. It tries to improve the population of individuals under the permitted encoding scheme: the population is initialized, and the fitness of each individual is evaluated on the CXR image classification problem based on its encoding. A new generation is produced from the present generation using the genetic operations of selection, crossover, and mutation. Thus, the overall fitness is improved, and the evolution proceeds generation by generation until it reaches the defined number.
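A per-locus mutation of the integer code can be sketched in Python as below. Treating 'q_m' as the probability that each locus is resampled is an assumption, as are the allowed values per locus: the filter-size index order and the range of the dropout locus are not given in the text. The names `LOCUS_CHOICES` and `mutate` are hypothetical.

```python
import random

# Allowed values for the six loci of one block [N_f, S_f, B_n, P, D, A].
LOCUS_CHOICES = [
    list(range(16, 513)),  # N_f: number of filters
    [0, 1, 2],             # S_f: index of 3x3, 5x5, 7x7 (index order assumed)
    [0, 1],                # B_n: batch normalization off/on
    [0, 1, 2],             # P: no, max, or average pooling
    list(range(0, 6)),     # D: dropout level (range assumed)
    list(range(0, 6)),     # A: activation index
]

def mutate(code, q_m, seed=None):
    """Resample each locus with probability q_m; return a new code."""
    rng = random.Random(seed)
    child = list(code)  # the parent code is left untouched
    for pos in range(len(child)):
        if rng.random() < q_m:
            choices = LOCUS_CHOICES[pos % len(LOCUS_CHOICES)]
            alternatives = [v for v in choices if v != child[pos]]
            child[pos] = rng.choice(alternatives)  # always a different value
    return child

code = [64, 1, 1, 1, 2, 1,
        128, 2, 0, 0, 5, 5]
unchanged = mutate(code, q_m=0.0, seed=1)
scrambled = mutate(code, q_m=1.0, seed=1)
```

With a small q_m only a few loci change per individual, for example a 5 × 5 filter becoming 3 × 3 or a max pooling layer being switched off, which keeps diversity in the population without destroying good architectures.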

The genetic DCNN design process starts with population initialization, where the population is initialized randomly; it then improves the population generation by generation, developing better architectures using the redefined genetic operations. The selection operation involves creating a random operation, and the batch normalization process is performed.

The population is fed to the convolutional neural network, activation is applied, and max pooling is performed; the GDCNN model is then trained to obtain its fitness value. Furthermore, model fitness is produced using a generator. Selection, crossover, and mutation are then performed. The proposed approach is evaluated on two image classification datasets for identifying pneumonia, COVID-19, normal, and other pneumonia diseases. Our results show that the proposed genetic-based DCNN architecture performs well, with performance comparable to the state of the art.

The experiment is performed using an Intel i7 2.50 GHz processor, an NVIDIA Tesla TitanXp GPU, 512 GB memory, a 240 SSD, and TensorFlow. The dataset, downloaded from a GitHub repository, consists of more than 5000 chest X-ray images covering COVID-19, normal, and other pneumonia diseases.

The dataset is collected from various parts of the world based on publications containing chest X-ray images; thus, proper care is required to verify the labels with board-certified radiologists. The dataset consists of chest X-ray samples showing a clear sign of COVID-19 as determined by radiologists.

The dataset contains only a small number of samples of COVID-19 infected cases; hence, patients with severe symptoms also need to be analyzed. Furthermore, cases with mild symptoms are missing, and some people are even quarantined without being examined. Some of the pneumonia samples were collected previously, with no suspected coronavirus infection. Finally, data on demographic characteristics and other risk factors related to the patients are not available.

Accuracy is one of the important metrics used for evaluating classification models. It states how often the model is right and is defined as the ratio of the number of correct predictions of COVID-19 to the total number of prediction samples. The confidence interval of the accuracy rates can be calculated using equation (24).
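As an illustration of a confidence interval on an accuracy rate, the sketch below uses the common normal-approximation (Wald) interval, accuracy ± z·sqrt(accuracy·(1 − accuracy)/n). This form is an assumption and may differ from the paper's equation (24); the function name and the sample numbers are hypothetical.

```python
import math


def accuracy_confidence_interval(accuracy, n_samples, z=1.96):
    """Normal-approximation interval for an accuracy measured on n_samples.

    z = 1.96 corresponds to roughly 95% confidence. This is an assumed
    form, not necessarily the one used in the paper.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    half_width = z * math.sqrt(accuracy * (1.0 - accuracy) / n_samples)
    return max(0.0, accuracy - half_width), min(1.0, accuracy + half_width)


# Hypothetical example: 90% accuracy measured on 100 test samples.
low, high = accuracy_confidence_interval(0.90, 100)
```

The interval narrows with the square root of the test-set size, which is why accuracy figures from small COVID-19 test sets carry wide uncertainty.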

Accuracy is also stated as the ratio of the sum of True Positives (TP) and True Negatives (TN) to the sum of True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN), as given in equation (25):

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (25)$$
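The definition in terms of TP, TN, FP, and FN translates directly into code. The sketch below also includes the standard definitions of precision, sensitivity (recall), and specificity, since these metrics are reported in the results. The function names and the example counts are illustrative and are not taken from the paper's confusion matrix.

```python
def accuracy(tp, tn, fp, fn):
    """(TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def precision(tp, fp):
    """Fraction of positive predictions that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def sensitivity(tp, fn):
    """Fraction of actual positives that are detected (also called recall)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """Fraction of actual negatives that are correctly rejected."""
    return tn / (tn + fp) if (tn + fp) else 0.0


# Hypothetical counts, for illustration only.
tp, tn, fp, fn = 40, 50, 5, 5
acc = accuracy(tp, tn, fp, fn)
```

On an unbalanced dataset, accuracy alone can look high even when few positives are found, which is why sensitivity and specificity are reported alongside it.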

H. F1-SCORE The F1-score measures the balance between precision and recall. It is stated as twice the product of precision and recall divided by the sum of precision and recall. Equation (30) states the F1-score calculation:

$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \quad (30)$$

The F1-score is the harmonic mean of precision and recall; 1 is the best F1-score and 0 is the worst.
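A short sketch of the F1 computation, using the standard harmonic-mean form. The function name is illustrative; the input values 0.93 and 1.00 are the precision and sensitivity reported for the proposed model in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported precision of 93% and sensitivity (recall) of 100%.
f1 = f1_score(0.93, 1.00)  # 1.86 / 1.93, approximately 0.9637
```

This reproduces the reported F1-score of 96.37%, so the three reported values are mutually consistent.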

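Log loss denotes the logarithmic loss function, and it is stated by equation (31):

$$\text{Log loss} = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right] \quad (31)$$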
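A minimal sketch of the logarithmic loss, assuming the standard binary cross-entropy form with labels y_i in {0, 1} and predicted probabilities p_i. The function name and the clipping constant `eps` are illustrative choices rather than details from the paper.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy: -(1/n) * sum(y*log(p) + (1-y)*log(1-p))."""
    if not y_true or len(y_true) != len(p_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


confident_right = log_loss([1, 0], [0.9, 0.1])  # small loss
confident_wrong = log_loss([1, 0], [0.1, 0.9])  # large loss
```

For each sample only one of the two terms is non-zero: the first when y_i = 1 and the second when y_i = 0. A confident wrong prediction is therefore penalized far more heavily than a confident correct one is rewarded.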

The four cases of log loss are as follows. Case 1: (32) When y_i = 1 and p_i is high, the model is working well, because the true value of the response agrees with the highest probability. The sum grows through the term y_i log(p_i), while the other term is zero, as

Thus, the higher the value, the greater its possible influence on the sum and on the mean; this is mainly due to

Case 2: (36) In this case, y_i = 1 and p_i is low. This is the adverse situation, as y is 1 while its predicted probability is low; furthermore, as y = 1, there is little impact on the sum. Case 3:

Case 4: (38) In case 3 and case 4 there is a drastic increase in the sum, and thus it affects the model.

The proposed model is validated using benchmark metric functions such as accuracy, precision, sensitivity, specificity, and F1-score. The experiment is carried out for 100 trials to achieve better performance. Figure 4 shows chest X-ray image samples of COVID-19 patients and of persons with healthy lungs; in the CXR images of persons affected by COVID-19 the lungs are not clearly visible, whereas the lungs of normal persons are clearly identifiable. Table 3 shows the confusion matrix for pneumonia; from the table, the number of people affected by COVID-19, normal, and other pneumonia is easily identifiable. Table 4 shows the performance metrics for evaluating the proposed model against other existing models, and Table 5 shows the accuracy analysis of the proposed and other existing models for 10 trials with 100 iterations. Figure 5 compares the accuracy of the proposed model with that of existing models such as ResNet18, ResNet50, SqueezeNet, DenseNet-121, and VGG16. The proposed model has the highest accuracy of 98.84%, whereas VGG16 has the lowest accuracy of 88.05%. ResNet18, ResNet50, and DenseNet-121 reach an accuracy of 92.7%, while SqueezeNet, with 96.60% accuracy, is comparable with the proposed model. The accuracy bar graph shows that the proposed model outperforms the rest of the existing models. Figure 6 shows a comparative analysis of precision for the proposed model and the other existing models: the GDCNN model has the highest precision of 93.0%, whereas ResNet18 and DenseNet-121 have a precision of 87%.

The other three models, ResNet50, SqueezeNet, and VGG16, have an average precision of 82.13%. The precision bar graph shows that the other existing models have an average precision of 85.34%, whereas the proposed model has a higher precision of 93.0%. Figure 7 shows the F1-score bar graph: the proposed model has an F1-score of 96.37%, whereas the other models have an average F1-score of 86.53%. The F1-score thus shows how well the model predicts COVID-19 and normal cases, and the F1-score of the proposed model indicates that better classification is performed. Figure 8 shows the sensitivity bar graph for the proposed model and the other available models. A sensitivity of 100% is achieved for the proposed model, whereas ResNet18, ResNet50, and DenseNet-121 average 87.50%, and SqueezeNet and VGG16 achieve a sensitivity of 93.53%. Sensitivity is one of the important metrics for validating the performance of the proposed model. Figure 9 shows the specificity bar graph for the proposed model and the other existing models; specificity is one of the significant benchmark metrics for performance evaluation. An average specificity of 97% is achieved for GDCNN, ResNet18, ResNet50, SqueezeNet, and DenseNet-121, and a minimum specificity of 83.30% is obtained for VGG16. Figure 10 shows the combo chart of accuracy and precision for the proposed model and the existing models, with accuracy represented as a bar graph and precision as a line graph. The proposed model achieves an accuracy of 98.84% and a precision of 93.0%, both higher than those of the existing models. Meanwhile, SqueezeNet achieves an accuracy of 96.60% but a precision of only 83.23%, and VGG16 obtains the minimum accuracy of 88.05% with a precision of 82.20%.
The combo graph shows that the proposed model obtains higher performance in both accuracy and precision. Figure 11 shows the accuracy and recall performance analysis: the GDCNN model achieves an accuracy of 98.84% and a recall of 100%, whereas SqueezeNet obtains an accuracy of 96.60% and a recall of 95%. ResNet18, ResNet50, and DenseNet-121 achieve an average recall of 87.5% with an accuracy of 92.36%, and VGG16 obtains the lowest accuracy of 88.05% with a recall of 92.80%. The same holds for Figure 13, since recall and sensitivity are the same measure. Figure 12 shows the comparative analysis of accuracy and F1-score: the proposed GDCNN model achieves an accuracy of 98.84% and an F1-score of 96.37%, whereas the values for the rest of the models are comparatively low. For the other existing models the F1-score is drastically lower than the accuracy, and VGG16 obtains an F1-score of 87.18%. Figure 14 shows the comparative analysis of accuracy and specificity for the proposed model and the other models; both accuracy and specificity are comparatively high for the proposed as well as the existing models. Figure 15 shows the bar graph comparing precision and recall for the various models. It is clear from the figure that the proposed GDCNN model performs well compared to existing models such as ResNet18, ResNet50, SqueezeNet, DenseNet-121, and VGG16. The proposed GDCNN model achieves a precision of 93.00% and a recall of 100%, whereas ResNet18 and DenseNet-121 achieve a precision of 87%. The recall value of ResNet18, SqueezeNet, and VGG16 is above 92%. The minimum values of precision and recall are obtained for ResNet50, with 83.40% and 87.51%, respectively. Figure 16 presents a comparison bar graph for precision and F1-score; it shows that the proposed GDCNN is considerably higher, with a precision of 93% and an F1-score of 96.37%.
The lowest precision and F1-score are obtained for ResNet50 and SqueezeNet, with a precision of 83.40% and an F1-score of 85.41% for ResNet50, and 83.23% and 88.73% for SqueezeNet. Figure 17 shows the comparison graph of precision and sensitivity for the various models. The proposed GDCNN model achieves a precision of 93.00% and a sensitivity of 100.00%, whereas the values for the rest of the models are lower. ResNet18, ResNet50 and DenseNet-121 precision value Figure 18 depicts the precision and specificity comparison graph; the proposed GDCNN achieves 93.00% and 97.00%, respectively. All the models perform more or less the same; however, VGG16 performs lowest of all the models, with a precision of 82.20% and a specificity of 83.30%. Figure 19 shows the sensitivity and specificity comparison graph for the proposed model and the other existing models. The proposed model performs well, with a sensitivity and specificity of 100.00% and 97.00%, respectively. The other models lag in sensitivity, which is approximately 87.50% for ResNet18, ResNet50, SqueezeNet, and DenseNet-121, whereas they achieve a specificity above 90%. Figure 20 shows the line graph of accuracy for the proposed GDCNN model with 100 iterations for 10 trials. Accuracy across the trials ranges from 95.83% to a high of 98.84%; this is mainly due to training, as better accuracy is achieved as the model becomes better trained. Figure 21 and achieved, whereas for DenseNet-121 it is 94.62%. A gradual increase in accuracy is noticeable for DenseNet-121, from an initial accuracy of 86.97% up to 94.62%. Figure 24 shows the accuracy analysis for 10 trials with 100 iterations each for VGG16. It is clear that accuracy tends to increase gradually as the number of trials increases. The line graph in Figure 25 shows the comparative analysis of accuracy for the proposed model and the available models; the proposed model performs well with respect to the number of trials.
VGG16 has the lowest accuracy, while ResNet18 and ResNet50 achieve similar accuracy.

The main goal of this research is to develop a GDCNN-based approach for predicting lung infection due to COVID-19 using chest X-ray images. Healthy versus pneumonia samples are identified with an accuracy of 98.72%.

The tool developed on the GDCNN model is very helpful for physicians, allowing them to act confidently in the treatment of a COVID-19 affected patient while they wait for second-opinion confirmation from the radiologist. Furthermore, it provides a measurable score to consider and to use in research studies.

L. DISCUSSION ON COMPLEXITY Training on the chest X-ray images for each generated DCNN is fixed to 100 epochs; thus, the proposed GDCNN has high computation and space complexity, mainly due to storing and evaluating a huge number of DCNN structures. It also lacks security, as the health care data are stored in the cloud environment; hence, a proper security mechanism needs to be implemented for retrieving data [67], [68].

The fast spread of COVID-19 has created a worldwide pandemic, with an exponential increase in the number of cases. Early diagnosis, which should be fast and cheap, is urgently needed in the treatment of COVID-19. In this context, a deep learning method is used for the prediction of COVID-19 from CXR image samples. In the real world, only a few people have been affected by pneumonia, whereas many remain unaffected. Hence, there arises an imbalance between affected and normal persons in the prediction of pneumonia. In this research, the GDCNN method is proposed for classifying COVID-19 and normal persons using CXR image samples. More than 5000 image samples are taken from a publicly available repository, consisting of pneumonia, healthy lung images, and other pneumonia diseases. The proposed method achieves an F1-score of 0.96337, a val_accuracy of 0.99 (99.0%), a loss of 0.32, and a val_loss of 0.05. Furthermore, it is compared with other existing models such as ResNet18, ResNet50, SqueezeNet, DenseNet-121, and VGG16 to evaluate its performance. It is clear from the analysis table that the proposed method outperforms the existing models. The main aim of the research is to provide a better identification rate for COVID-19 prediction in the early stage of diagnosis and to help patients receive earlier emergency treatment. Organizations can use this model for early prediction of COVID-19, as the GDCNN tool is hosted in a cloud computing environment. The health care system can use this tool for earlier diagnosis of diseases. In the future, we hope to apply this method to a large-scale database to achieve better hierarchical classification accuracy.

Our prediction model is available online at https://github.com/BABUKARTHIKRG/covid19.git.

A few interactive graphs can be seen at https://collaboration.coraltele.com/covid2/.

The authors thank Sean Mullan for helping in the data preparation process and Joseph Paul Cohen for collecting the dataset of COVID-19 CXR images. They also thank the dataset providers of CheXpert for helping with negative samples.

R. G. BABUKARTHIK is currently working as an Assistant Professor with the Department of Computer Science and Engineering, Dayananda Sagar University. Prior to that, he worked with Pondicherry University (Central University), the Pondicherry Engineering College, and the Central University of Tamil Nadu, as a System Analyst. He also worked as a Software Developer with Keane India Pvt., Ltd. He has conducted research on optimization problems used for solving Web service selection in cloud computing. With more than 12 years of experience, his research interests include cloud computing, Web service, optimization algorithms, deep learning, and data science.

V. ANANTH KRISHNA ADIGA received the B.Tech. degree in computer science from Dayananda Sagar University. His research interests include deep learning and machine learning.

G. SAMBASIVAM (Member, IEEE) received the Ph.D. degree in computer science and engineering from Pondicherry University, Puducherry, India. He is the Dean of the Faculty of Information and Communication Technology, ISBAT University Kampala, Uganda. His research interests include artificial intelligence, machine learning, deep learning, and Web service computing.

Assistant Professor with the Department of Computer Science and Engineering, Madanapalle Institute of Technology and Science, Madanapalle, India. He is working on e-waste management, disaster management, bio-inspired algorithms, a privacy-preserving generic framework for cloud data storage, and an optimization approach for minimizing agro-crops. His areas of interest include distributed Web service, Web service (evaluation) testbed, software metrics, GVANET and cloud computing, opportunistic computing, evolutionary computing, service computing, software engineering, multi-agent, pervasive and ubiquitous computing, fog and edge computing, underwater communication, and privacy and security.

Coronavirus disease 2019 (COVID-19)

Clinical characteristics of coronavirus disease 2019 in China

Community-acquired pneumonia

Influenza-associated pneumonia as reference to assess seriousness of coronavirus disease (COVID-19)

Critical care utilization for the COVID-19 outbreak in Lombardy, Italy: Early experience and forecast during an emergency response

High discordance of chest X-ray and computed tomography for detection of pulmonary opacities in ED patients: Implications for diagnosing pneumonia

The role of chest imaging in patient management during the COVID-19 pandemic: A multinational consensus statement from the Fleischner Society

COVID-19 image data collection

ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases

A survey of hierarchical classification across different application domains

Multiclass imbalance problems: Analysis and potential solutions

Early versus late fusion in semantic video analysis

The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China

COVID-19 and Italy

The definition and classification of pneumonia

The radiological diagnosis of pneumonia in children

The complete official codebook

Hierarchical classification in credit card data extraction

Medical image modality classification using discrete Bayesian networks

NHL pathological image classification based on hierarchical local information and GoogLeNet-based representations

XMIAR: X-ray medical image annotation and retrieval

Analysing the classification of imbalanced data-sets with multiple classes: Binarization techniques and ad-hoc approaches

Classification by pairwise coupling

In defense of one-vs-all classification

ADASYN: Adaptive synthetic sampling approach for imbalanced learning

SMOTE: Synthetic minority over-sampling technique

Borderline-SMOTE: A new oversampling method in imbalanced data sets learning

An experiment with the edited nearest-neighbor rule

A study of the behavior of several methods for balancing machine learning training data

Local binary patterns variants as texture descriptors for medical image analysis

ImageNet classification with deep convolutional neural networks

Very deep convolutional networks for large-scale image recognition

Going deeper with convolutions

Deep residual learning for image recognition

Densely connected convolutional networks

Mastering the game of Go without human knowledge

Evolving neural networks through augmenting topologies

Genetic CNN

Large scale evolution of convolutional neural networks using volunteer computing

Large-scale evolution of image classifiers

Introduction to neonatal facial pain detection using common and advanced face classification techniques

A multiresolution approach to automated classification of protein subcellular location images

Pap-smear benchmark data for pattern classification

Detection of pneumonia in chest X-ray images

Texture analysis of medical images for radiotherapy applications

Improved deep learning model for differentiating novel coronavirus pneumonia and influenza pneumonia

Artificial intelligence distinguishes COVID-19 from community acquired pneumonia on chest CT

Automatic detection of coronavirus disease (COVID-19) using X-ray images and deep convolutional neural networks

Rapid AI development cycle for the coronavirus (COVID-19) pandemic: Initial results for automated detection & patient monitoring using deep learning CT image analysis

COVID-net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images

CoroNet: A deep neural network for detection and diagnosis of COVID-19 from chest X-ray images

Gradient-based learning applied to document recognition

Neural network architecture optimization through submodularity and supermodularity

PathNet: Evolution channels gradient descent in super neural networks

Dynamic filter networks

Neural architecture search with reinforcement learning

Designing neural network architectures using reinforcement learning

Evolving deep neural networks

A genetic programming approach to designing convolutional neural network architectures

Deep-COVID: Predicting COVID-19 from chest X-ray images using deep transfer learning

A hybrid differential evolution approach to designing deep convolutional neural networks for image classification

A light CNN for detecting COVID-19 from CT scans of the chest

Modelling for prediction of the spread and severity of COVID-19 and its association with socioeconomic factors and virus types

A drone-based networked system and methods for combating coronavirus disease (COVID-19) pandemic

COVID-CAPS: A capsule network-based framework for identification of COVID-19 cases from X-ray images

Survey on lightweight primitives and protocols for RFID in wireless sensor networks

Survey and taxonomy of key management protocols for wired and wireless networks

J. AMUDHAVEL (Member, IEEE) received the Ph.D. degree in computer science and engineering from Pondicherry University, Puducherry, India. He is currently working as a Senior Assistant Professor with the School of Computer Science and Engineering, VIT Bhopal University, India. His research interests include artificial intelligence, machine learning, bio-inspired algorithm, evolutionary computing, and distributed systems.