key: cord-0724479-i2tg61pr
authors: Barnawi, Ahmed; Chhikara, Prateek; Tekchandani, Rajkumar; Kumar, Neeraj; Alzahrani, Bander
title: Artificial intelligence-enabled Internet of Things-based system for COVID-19 screening using aerial thermal imaging
date: 2021-05-26
journal: Future Gener Comput Syst
DOI: 10.1016/j.future.2021.05.019
sha: 3ebfc97e25701fdd0254a50189f8e49837e66e0d
doc_id: 724479
cord_uid: i2tg61pr

Internet of Things (IoT) has recently brought an influential research and analysis platform in a broad diversity of academic and industrial disciplines, particularly in healthcare. The IoT revolution is reshaping current healthcare practices by consolidating technological, economic, and social views. Since December 2019, the spreading of COVID-19 across the world has impacted the world’s economy. IoT technology integrated with Artificial Intelligence (AI) can help address COVID-19. UAVs equipped with IoT devices produce raw data that demands computing analysis to make significant actions without human intervention. To mitigate the effect of COVID-19, in this paper, we design an IoT-UAV-based scheme that collects raw data using onboard thermal sensors. The thermal image captured from the thermal camera is used to determine the potential people in the image (of the massive crowd in a city), which might have COVID-19, based on the temperature recorded. An efficient hybrid approach for a face recognition system is proposed to detect the people in the image having high body temperature from infrared images captured in real-time. Also, a face mask detection scheme is introduced, which detects whether a person has a mask on the face or not. The schemes’ performance evaluation is done using various machine learning and deep learning classifiers. We use the edge computing infrastructure (onboard sensors and actuators) for data processing to reduce the response time and real-time analytics and prediction. The proposed approach delivers an average accuracy of 99.5% by using less than half the parameters for real-time predictions.

There is currently an outbreak of respiratory infection caused by a virus named 'SARS-CoV-2', and the resulting disease has been called coronavirus disease 2019 (COVID-19). It has devastated our daily lives and caused substantial economic damage [1]. COVID-19 has affected many sectors, such as healthcare, transport, education, finance [2], and manufacturing. As many transactions now happen online, information-centric networks have shown their benefits in terms of improved reliability, efficiency, and fast information delivery, which has made them a competent interconnected networking form for Internet infrastructure [3]. On January 31, 2020, Health and Human Services (HHS) announced a notice of a public health crisis for COVID-19 and prepared the Operating Divisions of HHS. Moreover, on March 13, the United States President declared a national emergency in response to COVID-19 [4]. The typical symptoms include fever, usually appearing within two weeks after contact. Lately, people's health consciousness has drawn special attention to remote e-healthcare services [5]. The adoption of technology can be a boon to society during the current uncertainty and constant fear, and can save many lives. COVID-19 has badly impacted many people's lives and lifestyles, both personal and professional. Although technology cannot prevent the beginning of a pandemic, it can help control it more efficiently. Infrared and wireless thermometers are commonly used medical tools at toll gates, entry and exit gates of buildings, airports, marketplaces, hotels, railway terminals, stores, clinics, and various other public places. These tools estimate the body temperature of individuals without any physical contact, and they have also succeeded in recognizing people who might have COVID-19 and need further attention [6].

Designing systems such as remote surgery, e-healthcare, e-banking, or e-shopping systems is a challenging task [7] [8]. The continuity and criticality of operation in mission-critical methods depend on their delay, capacity, reliability, and energy [9] [10]. At present, there are no viable methods to diagnose and monitor all cases of COVID-19 infection. It is possible to trace the infection by reporting the cases of fever, followed by further analysis. Patients suffering from COVID-19 usually show several symptoms, such as fatigue, shortness of breath, dry cough, and high fever [11]. Fever is the most frequent symptom in patients with COVID-19 [12]. Fever is recognized as a complicated and potent biologic response to infection and injury. The rise in body temperature represents cell signaling and gene expression patterns, which influence immune system function and cell recovery [13]. Fever can be classified as acute if it lasts less than a week, sub-acute if it lasts a week or two, and chronic if it lasts beyond two weeks. Fever can also be classified by temperature: low-grade if the temperature is between 38 and 39 °C, moderate-grade if it is 39-40 °C, high-grade if it lies between 40 and 41 °C, and hyperpyrexia if it is beyond 41 °C [14].
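The two fever classifications described above can be written as simple threshold checks. The following is a minimal sketch, not part of the proposed system. The function names, the "no fever" label for readings below 38 °C, and the handling of the shared boundary values (39 °C and 40 °C are assigned to the higher grade) are assumptions made for illustration.

```python
def classify_fever_by_temperature(temp_c: float) -> str:
    """Map a body temperature in degrees Celsius to a fever grade.

    Boundary handling is an assumption: the quoted ranges share their
    end points, so each boundary value is assigned to the higher grade,
    except 41 degrees, which stays in the high-grade range.
    """
    if temp_c < 38.0:
        return "no fever"  # label assumed for illustration
    if temp_c < 39.0:
        return "low-grade"
    if temp_c < 40.0:
        return "moderate-grade"
    if temp_c <= 41.0:
        return "high-grade"
    return "hyperpyrexia"


def classify_fever_by_duration(days: float) -> str:
    """Map the duration of a fever in days to acute, sub-acute, or chronic."""
    if days < 7:
        return "acute"
    if days <= 14:
        return "sub-acute"
    return "chronic"
```

For example, a reading of 38.5 °C that has lasted three days would be reported as a low-grade, acute fever under these assumptions.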

In recent years, IoT has gained promising research ground as a new research topic in various academic and industrial disciplines, especially in healthcare. The IoT revolution has reshaped present healthcare systems by incorporating technological, economic, and social prospects [15]. Telethermographic systems, or thermal imaging systems (TIS), detect skin temperature. These systems combine a thermal infrared camera with a temperature reference source. Skin temperature is a suitable and efficient sign for an accurate estimation of human sensations and thermal states with respect to the surrounding thermal stresses. This information is gathered from thermal images collected by infrared imaging [16]. The advantage of employing thermal imaging systems for primary temperature evaluation in triage is their potential use in high-throughput areas (e.g., airports, businesses, shopping complexes, organizations' offices) and in places where other temperature assessment products may be in short supply. TIS can accurately measure a person's surface skin temperature without any physical contact with the person, whereas other systems need closer contact to measure the temperature. TIS cameras cannot recognize a virus, but they can identify raised body temperatures in public and private spaces in a quick, non-invasive way, warning security and healthcare teams to test the person for illness. There is a possibility that a person with COVID-19 may not have a fever [17]. So, TIS alone cannot effectively determine whether a person has COVID-19 or not. Hence, a diagnostic test must be conducted to confirm COVID-19. Several countries have opted for TIS during epidemics, but reports have been mixed about their credibility as part of efforts to reduce the disease's spread. Global health concerns have arisen with the spread of the virus because of the flattened mortality curve worldwide. 
To handle the disease, multi-swarm UAVs are increasingly deployed in IoT environments, such as places of mass gatherings, smart cities, smart nations, and indoor corridor environments [18] [19] [20]. IoT-enabled drones have been demonstrated to deliver medicines and other vital medical equipment to residents in developing nations. Drones have also been used for sanitizing public spaces and identifying COVID-linked symptoms. For example, UAVs can be deployed to detect violations of lockdown, stay-at-home, or social distancing directives during pandemics like COVID-19 [21].

Various fever screening methods are currently used to detect possible COVID-19 cases. These screening methods have a few limitations. Unfortunately, high-quality clinical-grade thermometers are not widely available everywhere. Hence, clinicians use oral thermometers. These thermometers operate close to the possibly infected person. The results are also affected by the temperature of anything consumed just before the checkup. Noncontact infrared thermometers (NCITs), generally known as forehead screeners, are being used at fitness centers, schools, and corporations. Even in a proper operating environment, some NCITs fail to differentiate between people with hypothermia (35 °C) and those with a severe fever (40 °C). NCIT sensors are close to accurate; however, they are easily affected by the surrounding air temperature. If NCITs were to report temperatures in real-world conditions, the readings would often be absurd. Some devices do list wild readings, while others appear to report close-to-normal temperatures most of the time. Some operators report the ridiculous readings, while others ignore impossibly low readings. That makes many NCITs ineffective for clinical purposes. Thermal imaging is a field in which many new products fulfill the need of the hour. These can work automatically from a safe distance. Thermal sensing can detect light and convert the measurement into a temperature. An NCIT uses a single-pixel sensor that is operated close to the skin. Thermal imaging systems, on the other hand, use an array of identical pixel sensors to output an image of the luminous intensity. To measure a temperature, an infrared device must first acquire an accurate surface temperature measurement from a patch of skin. Core body temperature can then be extrapolated using a previously calibrated relationship between the skin temperature, air temperature, and core body temperature. 
The system performs better as there is a consistently thin insulation level between core blood and air at the tear duct, the region where the eye joins the bridge of the nose.

IoT healthcare is a new worldview that provides services and medical data. The IoT system in medicine is now an advanced setup that includes a variety of mechanisms, such as smart sensors, medical equipment, telemedicine, clinical information systems, and many more [23]. IoT techniques are categorized into remote monitoring of patients, remote health tracking, and monitoring of mask wearing and social distancing. Since the pandemic began, authorities in New Delhi, Italy, Oman, and many other places have started experimenting with fever-finding UAVs as a mass COVID-19 screening system. The results indicate that UAVs can help assess people's health at a large scale and identify potentially infected individuals, who can then be pulled aside for further diagnostic testing. This paper aims to devise an IoT-based UAV system using a Thermal Corona Combat Drone (TCCD) to automatically detect COVID-19 from thermal images at a higher speed and with almost zero human intervention. The specifications of the TCCD are listed in Table I. Moreover, this paper presents a non-intrusive infrared thermography framework for determining an individual's thermal comfort level with the help of skin temperature data collected using thermal cameras. The thermal cameras' output for two different people, one with a fever and the other with an average temperature, is shown in Figure 1. Unlike existing systems that depend on placing sensors directly on humans for skin temperature measurement, the proposed system collects the skin temperature data using a swarm of UAVs, then extracts facial regions, and finally interprets thermal comfort conditions. The proposed scheme also tells whether an individual is wearing a mask or not. [22].

The contributions of the research work are as follows.

1) We develop a fever screening and tracking system using thermal and normal cameras mounted on the UAV. The fever symptoms can be detected in a crowded area, and the person is identified using face recognition.

2) With the proposed AI scheme, we can trace the potential patients. A real-time notification is then sent to the person's email or contact number. A feedback mechanism is also added to the scheme: if a person who was notified as having a high fever is found to have COVID-19 after the checkup at the hospital, we can use the recorded video footage to warn the people who were near the patient at that particular location and ask them to quarantine and have a checkup.

3) Besides, various thermal camera systems contain a face recognition feature for identifying anyone who triggers a warning. Hospitals, market centers, and office buildings worldwide are now collecting face images of every individual who uses their facilities, and necessary measures are taken based on them. The critical question is whether these organizations follow the necessary security norms for collecting data and storing private information securely. Our proposed system does not store the personal data (images) that the user uploads on the platform. We keep only the image embeddings in a central server.

4) A face mask detection scheme is introduced, which detects whether an individual is wearing a mask on the face or not. If a person is found not wearing a mask, then a notification in the form of a suggestion is sent to the person. The COVID-19 mask detector could potentially be used to help ensure the person's safety and the safety of others as per the guidelines issued by the government. The proposed scheme is shown in Figure 2, with IoT-linked technologies mentioned in Table II.

IoT-based systems have gained great importance among healthcare application developers with the progression of information and communication technologies [24]. The rise of IoT applications makes IoT a highly viable alternative for COVID-19 control. IoT technologies can be used for tracking the COVID-19 contamination pattern, diagnosing COVID-19 patients, providing telemedicine services, and combining these applications with wearable devices such as smartwatches and smart bands [25]. Singh et al. [26] used cutting-edge technologies such as IoT, AI, and UAVs for testing, contact tracing, spread analysis, sanitization, and protocol enforcement to prevent the spread of COVID-19. Moreover, UAVs can be used in resolving several technological difficulties arising from the pandemic [27]. As many people, irrespective of their infection status, are in quarantine for a long time, UAVs can help in scenarios such as UAV-based food delivery and automatic parcel delivery [28]. UAVs can also be used to deliver life-saving medicines to distant hospitals [29]. Small robotic systems can be placed in hospitals and other places to remind users to follow the social distancing norms and to use masks and gloves. AI models can also be used for facial emotion recognition that can recognize stress in health workers so that they can be given further attention [27]. Kumar et al. [30] examined drone-based methods and COVID-19 pandemic conditions, and introduced an architecture for handling a pandemic in various situations by applying real-time and simulation-based scenarios. The observation was that a UAV-based healthcare system can cover a large area for sanitization, thermal image acquisition, and patient identification within a short period (2 km within 10 min). Khan et al. [31] introduced an automatic COVID-19 pandemic emergency response system that can be used to efficiently provide essential items to the required location at minimal transportation cost. 
Their proposed system can recognize crowds in several cities and can recognize people without masks in public places. They have also used 3-D printing technology to create emergency equipment for COVID-19 and UAVs to deliver it to the needed locations. Dobrea et al. [32] developed a quadcopter system that follows a pre-programmed flight route while simultaneously detecting humans in a crowded area and warning the system operator to reinforce the quarantine zones. Jat et al. [33] discussed the current advancements in applications where drones are used and their connectivity in an IoT network to enhance their efficiency in COVID-19 situations. Gupta et al. [34] suggested a UAV communication system that uses a blockchain-envisioned scheme built on the intelligent connectivity of a 6G network, Terahertz (THz) frequency bands, and virtualization of link and physical-level protocols. The suggested system has lower processing delay and lower packet loss than the existing 4G/5G-based communication systems.

TIS cameras collect skin temperature data without making any contact with the object. These sensors give a complete image frame of thermographic areas from which the temperature reading at each pixel location in the frame can be extracted [35]. Silvino et al. [36] recommended the use of Information Technology as an approach to promptly and safely mass-screen skin temperature to detect febrile people who may have been infected with COVID-19. Tan et al. [37] developed a fever screening and tracking system that detects patients with fever symptoms and identifies the patient by applying face recognition. Ordun et al. [38] provided an introductory overview of recent advances in thermal Face Emotion Recognition (FER) and proposed a thermal imagery system that provides a semi-anonymous modality for computer vision. They summarized thermal FER and the constraints of accumulating and collecting thermal FER data for model training. Farooq et al. [39] discussed the importance of Infrared Thermography (IRT) and the purpose of AI in thermal-medical image analysis for the diagnosis of many diseases and human health monitoring in the early stages. They used a state-of-the-art Convolutional Neural Network (CNN) to classify breast tumors using thermal breast images. Li et al. [40] presented a framework that uses infrared thermography to interpret thermal comfort in indoor conditions on a real-time basis with minimum interference with building residents. The framework uses thermoregulatory theory, machine learning, and computer vision. The outcomes show that facial skin temperature obtained from non-intrusive infrared thermal cameras can deliver a strong thermal comfort prediction. Metzmacher et al. [41] presented a real-time scheme for the analysis of a person's skin temperatures using sensor fusion and thermal image recognition. 
They used a dynamically calibrated thermal imaging camera to track individual faces and then measure the temperatures of different facial areas. The temperature readings from the thermal camera were confirmed with a reference sensor attached to the skin. Abouelenien et al. [42] proposed a scheme that uses both thermal and physiological features to indicate stress in a person, and the scheme can be employed in clinics and a variety of applications. The results showed that the thermal features outperformed the physiological features in many cases. Ranjan et al. [43] applied various statistical approaches to correlate the thermal sensation captured by thermal cameras. Burzo et al. [44] created a thermal discomfort detection scheme using infrared thermography, which reduces a building's energy usage while improving its inhabitants' convenience. This model detects thermal discomfort by considering thermal features along with physiological features collected from an indoor atmosphere. Oliveira et al. [16] dealt with face thermographic image analysis in which thermal areas of interest are obtained, such as the left and right front, left and right cheek, and left and right periorbital regions. They examined the extracted areas of interest using the Fast Fourier Transform (FFT) power spectrum.

UAVs are mainly used to deliver face masks, medicine, and other necessary medical equipment in less time to patients who are far away from hospitals. Some research work analyzes the spread of the virus and crowdsensing. In the proposed work, we use UAVs for face detection using onboard sensors in crowded streets. The captured face image can be further used to identify whether the person has COVID-19 symptoms or not and can be used for face mask detection. Further actions can then be taken, such as quarantining the patient and advising the people in contact with him/her to have a medical checkup. The proposed work also warns people who are not wearing a mask on the streets. Also, in previous work, X-rays and CT scans are used for the detection of COVID-19. In those cases, the scan is performed first. An experienced doctor then checks for possible signs of the virus, or computer vision models are used to classify the images as COVID-19 or normal. The problem with CT scans is that they are not highly scalable; hence, checking a large number of people takes a long time. In a real-life situation, the thermal camera method is better suited to monitoring vulnerable people at high risk from COVID-19. Once the monitoring system has identified irregular breathing patterns, an alarm can be raised with a carer or family member. Therefore, the proposed approach uses the advantages of both UAVs and thermal cameras to detect possible COVID-19 patients.

This paper proposes a COVID-19 screening system based on the temperature of a person in the outdoor environment. For that, a UAV swarm is deployed in a city and is responsible for gathering the information. The information is the abnormal temperature together with the respective person's face. Each UAV is equipped with two different cameras (a normal or optical camera and a thermal camera), enabling the collection of detailed information about people's temperatures and faces. The dual-camera setup focuses on the region of interest and finally determines the individual's temperature. After a specific area has been screened, the people identified as having a high temperature, who may possibly have COVID-19, will get an independent emergency call from the central station. The location and the identity of the person will also be sent to the central server. The proposed scheme also provides a functionality in which, if a person is found to have COVID-19 after the checkup, then based on the recorded footage, we can find the people who were with that person at the time and ask them to have a checkup as well or to stay in quarantine. Using the onboard camera, we have also devised a lightweight model based on MobileNet, which uses the stored recording as input and finds the people who are not wearing masks. This information can be used to warn people to use a mask to minimize the spread of the COVID-19 virus. The data flow diagrams, the pipeline of the proposed approach, and the model training technique are described in detail in the next section.

IV. PROPOSED APPROACH

AI and IoT together have created a new concept called the artificial intelligence of things or AIoT. The idea behind AIoT is to enhance human-machine interactions, improve data management, and perform data analytics. Although AIoT is a relatively new concept, the combination of AI and IoT delivers an entirely new range of possibilities. For the proposed work, we have used IoT for the data gathering and communication part; the gathered data is then stored in a central server and is consumed by the AI models to give the appropriate actions. With AI technology, automatic pattern identification and anomaly detection in the gathered data can be performed. AI models can make operational predictions faster and with greater accuracy than traditional business intelligence tools.

The proposed information flow diagram in the AIoT environment is shown in Figure 3. For the data collection, we have used thermal cameras, which are very promising tools, often used by firefighters to track smoldering embers and search for out-of-sight suspects. This section discusses the proposed scheme that finds an individual's temperature using an onboard thermal camera on a UAV. A person showing unusual skin temperature will be notified that he/she should consult a doctor. The normal human body temperature is about 37 °C (98.6 °F). A high temperature is usually above 38 °C, but the normal temperature varies from person to person and changes throughout the day. Getting an absolute measurement of core body temperature is not simple, so mainly the face is considered while measuring the temperature. So, how can we detect the facial temperature of a person who is wearing a face mask? The heat emitted from the skin is affected by wearing a face mask; hence, most heat measurements are based on the forehead, which is usually exposed. While implementing the proposed approach, the International Organization for Standardization (ISO) guidelines are strictly obeyed due to the requirement for quality assurance, routine calibration, training, and documentation of the data obtained from the sensors. In the proposed work, thermal imaging with facial recognition is used as the first stage of the evaluation, not the final decision. The suspected person must be medically checked to confirm any infection exposure. The proposed scheme is divided into three parts. The first part is face recognition. The second part is the temperature identification system, which uses the detected people's coordinates as an input to calculate the person's temperature, and the third part is face mask detection. The workflow of the proposed face recognition scheme is shown in Figure 4.

In simple deep learning algorithms, we typically train a network that takes a single image as input and returns a label or class for that image. Problems such as estimating the AQI value in a city [45], the classification between cats and dogs, and the classification between pneumonia and normal cases [46] can be solved by picking a suitable machine learning algorithm, feeding in data, and getting the result [47]. But for the task of determining faces in an image, we use a technique called object detection and localization, which comes under deep metric learning. Instead of outputting a single label, we output a real-valued feature vector or embedding vector. The 'DLIB' facial recognition network (Dlib is a toolkit for creating real-world machine learning and data analysis applications; the library is written in C++ and can easily be bound to Python [48]) gives a 128-dimensional output feature vector that is used to distinguish between human faces with different feature value sets. Face recognition consists of a series of related problems:

1) Firstly, we have to locate all the faces in the image.

2) The face may not always be straight; it could be turned in various directions. There may be some brightness variation in the captured images, but the person is still the same and needs to be identified accurately.

3) It is necessary to extract unique features of a person's face that can be used to distinguish him/her from other people (such as the length of eyebrows, facial dimension, facial features or marks, etc.).

4) Finally, the extracted features of that face are used to determine the person's name.

1) Face Localization in an image: Face detection is the first step in our pipeline. We need to find the faces of persons in a video frame before labeling them with names. Face detection came into the picture in the early 2000s, when Paul Viola and Michael Jones developed a method for detecting human faces [49]. The method was fast enough to be deployed on cheap cameras as well. After that, in 2005, a new technique named Histogram of Oriented Gradients (HOG) [50] was developed, which performs better than the previous one. So, in the proposed scheme, the HOG features are used to encode an image to create a reduced image version. With the help of this smaller version of the image, we find the portion of the image that resembles a generic HOG encoding of a face.
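To illustrate the HOG encoding mentioned above, the sketch below computes the orientation histogram of a single image cell, which is the basic building block of a HOG descriptor. It is a simplified illustration and not the Dlib implementation used in the proposed scheme: the function name, the 9-bin unsigned orientation range, and the omission of bin interpolation and block normalization are assumptions made to keep the example short.

```python
import math


def cell_orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `cell` is a 2-D list of grayscale intensities. Gradients are taken
    with central differences, so border pixels are skipped. Orientations
    are folded into the range [0, 180) degrees.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region, no orientation to record
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# A cell whose intensity increases from left to right has purely
# horizontal gradients, so all the weight falls into the first bin.
ramp = [[x * 10 for x in range(8)] for _ in range(8)]
histogram = cell_orientation_histogram(ramp)
```

A full descriptor would concatenate such histograms over a grid of cells and compare the result against a template learned from face images.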

2) Network Architecture: The Dlib model is inspired by the ResNet-34 model [51]. A few layers are removed from the regular ResNet structure to create an architecture with 29 convolutional layers. The input to the model is a 150×150 RGB image. The network is called an embedding network because, for an input image with an aligned face, the model outputs a 128-dimensional embedding vector that defines the embedding of that particular identity. The model is pre-trained for more than two weeks on several datasets, including FaceScrub [52] and VGGFace2 [53]. The model was trained on over three million samples and has learned to find face representations. The model is tested on the Labeled Faces in the Wild (LFW) dataset [54]. The accuracy achieved is 99.38%, which is better than the human accuracy of 97.53% on the same dataset. During training, a fixed distance margin is used between different faces, which means that all the potential images of a given individual lie in a hyper-sphere whose radius is below the threshold value of 0.6.
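The 0.6 threshold described above can be read as a simple decision rule on the distance between two embedding vectors. The following is a minimal sketch under that reading; the function names are hypothetical, and the use of the Euclidean distance between the two 128-dimensional vectors is an assumption about how the threshold is applied.

```python
import math

MATCH_THRESHOLD = 0.6  # distance threshold quoted in the text


def embedding_distance(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def same_identity(a, b, threshold=MATCH_THRESHOLD):
    """Treat two embeddings as the same person if they are closer than the threshold."""
    return embedding_distance(a, b) < threshold


# Toy 128-dimensional vectors that differ in one coordinate only.
reference = [0.0] * 128
close_face = [0.5] + [0.0] * 127
far_face = [0.7] + [0.0] * 127
```

With these toy vectors, `same_identity(reference, close_face)` holds because the distance is 0.5, while `far_face` is rejected because its distance of 0.7 exceeds the threshold.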

3) YOLO to increase precision: Object detection is a subcategory of computer vision that localizes multiple objects and classifies each object detected in the image. It is a challenging computer vision task that needs robust 'object localization' to locate and create a bounding box over each detected object in an image and 'object classification' to predict the true class of the localized object. Redmon et al. [55] designed 'You Only Look Once' (YOLO), which is a deep learning model for fast object detection. The architecture includes a single deep convolutional neural network that divides the input image into a grid of cells, and each cell predicts a bounding box. This results in many candidate bounding boxes, which are incorporated into a final prediction by a post-processing step. YOLO and Faster RCNN share some similarities in that they both use an anchor box-based architecture, and both use bounding box regression. YOLO is a faster object detector than Faster RCNN as it performs classification and bounding box regression simultaneously. So, for real-time prediction applications, YOLO is preferred. In the proposed scheme, there may be cases in which the UAV's onboard normal camera detects unwanted objects (like a cat, dog, poles, etc.) as humans. This is because the face detection module scans the complete image for faces, which is a time-consuming process and can detect wrong objects. So, we use YOLO before the face detection module to capture the area of interest first. Being a fast object detector, YOLO finds the area of interest before the image is fed to the face detection module; because the model works on a real-time basis, latency is an essential factor that should be minimized. YOLOv1, YOLOv2, and YOLOv3 are the variations of this approach. The first version introduced the general architecture, whereas the second version improved the design and used predefined anchor boxes to enhance the bounding box proposal. The third version further polished the model architecture and training process. For the proposed work, we have used the YOLOv3 architecture.

4) Finding the person's name from the encoding: In this step, we find the person in our database of known people whose measurements are closest to those of the newly captured image. This can be achieved by using machine learning classification algorithms. For this, we need a classifier that takes the embeddings of a new test image as input and determines which known person is the closest match. Running these classifiers takes milliseconds, and the output of the classifier is the name of the person. In the proposed work, we have used and compared the results of five machine learning classifiers (Support Vector Machine (SVM), K-nearest neighbor (KNN), XGBoost (XGB), Logistic Regression (LR), and Multi-layered perceptron (MLP)). The comparison between them is made in Section IV.
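Of the five classifiers listed above, K-nearest neighbor is the simplest to illustrate. The sketch below shows how a name could be assigned to a new embedding by a majority vote among its closest known embeddings. It is an illustrative example only: the function name, the default value of k, and the toy three-dimensional vectors (real embeddings have 128 dimensions) are assumptions, and the classifier configuration used in the experiments is not shown here.

```python
import math
from collections import Counter


def knn_identify(query, known, k=3):
    """Return the name that wins a majority vote among the k nearest embeddings.

    `known` is a list of (name, embedding) pairs. Distances are Euclidean.
    Ties are resolved in favour of the name whose embedding is nearest.
    """
    neighbours = sorted(known, key=lambda item: math.dist(query, item[1]))[:k]
    votes = Counter(name for name, _ in neighbours)
    return votes.most_common(1)[0][0]


# Toy database with two registered people and two embeddings each.
known_faces = [
    ("alice", [0.0, 0.0, 0.0]),
    ("alice", [0.1, 0.0, 0.0]),
    ("bob", [1.0, 1.0, 1.0]),
    ("bob", [0.9, 1.0, 1.0]),
]
name = knn_identify([0.05, 0.0, 0.0], known_faces)
```

In this toy database the query lies next to both of Alice's embeddings, so two of the three nearest neighbours vote for "alice".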

The TIS has two parts for the identification of temperature as follows.

1) User's end: A web portal is created on which a person can register. Each user signs up with details such as phone number, email address, and home address, together with a few photographs of the user's face in different orientations. The user can submit a minimum of three and a maximum of nine photographs of his/her face on the web portal, each showing both eyes, the mouth, the nose, etc. The uploaded photos are never stored in any database; rather, the model converts each face into an embedding and sends it to the central server. Sending face embeddings as 128×1 vectors rather than the full images helps preserve the user's privacy. The other information, such as contact number and email, will be used to notify the user about his/her health status in the future. The data flow diagram at the user's end is shown in Figure 5.
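As a rough sketch of the registration flow described above, the following code checks the three-to-nine photograph limit and builds a record that carries only embeddings, never the photographs themselves. The function `build_registration`, the record layout, and the stand-in encoder `fake_embed` are hypothetical; a real deployment would call the actual face encoder in place of `fake_embed`.

```python
def build_registration(phone, email, address, photos, embed_face):
    """Build the record sent to the central server; raw photos are left out.

    `embed_face` is any callable that turns one photo into a 128-value
    embedding (a stand-in is used below, not a real face encoder).
    """
    if not 3 <= len(photos) <= 9:
        raise ValueError("between three and nine photographs are required")
    embeddings = [embed_face(photo) for photo in photos]
    for emb in embeddings:
        if len(emb) != 128:
            raise ValueError("each embedding must have 128 values")
    return {
        "phone": phone,
        "email": email,
        "address": address,
        "embeddings": embeddings,
    }


def fake_embed(photo):
    """Hypothetical stand-in encoder: returns a constant 128-value vector."""
    return [float(len(photo) % 7)] * 128


record = build_registration(
    "000", "user@example.com", "Some street",
    [b"img1", b"img22", b"img333"], fake_embed,
)
print(len(record["embeddings"]), "embeddings, keys:", sorted(record))
```

Because the record is assembled from the embeddings alone, the photographs can be discarded as soon as the function returns.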

2) UAV's end: A swarm of UAVs will be deployed in a city. Each UAV will carry a GPS and two cameras: a thermal camera that captures video and measures the temperature of each person in the video frames, and a normal camera that captures the same scene, whose footage is used to identify the people for whom the thermal camera recorded a temperature abnormality. Both cameras have the same resolution and a similar frame rate, so that there is no delay between the thermal and normal image frames, and the frames from the two cameras are aligned. Every second, five frames are taken for evaluation.

1) All the faces, with their coordinates, are extracted using the face recognition pipeline shown in Figure 2.
2) For each set of face coordinates extracted from the normal camera frame, the corresponding face is checked in the frame captured by the thermal camera.
3) If the forehead of a face has a higher temperature than the normal human body temperature, that person's face is extracted from the image, converted into a 128-dimensional vector, and sent to the central server.
4) The 128-dimensional vector is then compared against the existing database with the help of a machine learning classifier.
5) After a match is found, a notification is sent to the person advising a health checkup at a medical facility soon.

The data flow diagram at the UAV's end is shown in Figure 6.
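Steps 2 and 3 above can be sketched as follows, assuming the thermal frame is already aligned with the normal frame and is available as a 2D grid of temperatures in °C. The forehead is approximated as the upper-middle strip of each face box. The box layout `(top, left, bottom, right)`, the strip fractions, and the 38.0 °C threshold are illustrative assumptions; the source does not specify them.

```python
def forehead_region(box):
    """Approximate the forehead as the upper-middle strip of a face box.

    `box` is (top, left, bottom, right) in pixels; the fractions used
    here are assumptions, not values from the paper.
    """
    top, left, bottom, right = box
    height, width = bottom - top, right - left
    f_bottom = top + max(1, height // 4)
    f_left = left + width // 4
    f_right = right - width // 4
    if f_right <= f_left:  # very narrow box: fall back to the full width
        f_left, f_right = left, right
    return top, f_left, f_bottom, f_right


def forehead_temperature(thermal_frame, box):
    """Mean temperature (°C) over the forehead strip of an aligned frame."""
    t, l, b, r = forehead_region(box)
    values = [thermal_frame[y][x] for y in range(t, b) for x in range(l, r)]
    return sum(values) / len(values)


def flag_fever(thermal_frame, face_boxes, threshold_c=38.0):
    """Return the face boxes whose forehead temperature exceeds the threshold."""
    return [
        box for box in face_boxes
        if forehead_temperature(thermal_frame, box) > threshold_c
    ]


# Synthetic 8x8 thermal frame: background 36.5 °C with one hot forehead.
frame = [[36.5] * 8 for _ in range(8)]
frame[4][5] = frame[4][6] = 39.0
boxes = [(0, 0, 4, 4), (4, 4, 8, 8)]
flagged = flag_fever(frame, boxes)
print(flagged)
```

Only the flagged boxes would then be cropped from the normal frame, encoded, and sent to the central server.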

In the current pandemic, multidisciplinary efforts have been organized to reduce the spread of the disease. The AI community has also been a part of these endeavors. The proposed face recognition pipeline, shown in Figure 4, detects faces even when masks partially cover them. In addition, we created a classifier to distinguish faces with and without masks. To draw a complete picture, we created a face detection scheme as follows.

1) Detect people using the YOLOv3 model, to which the image frames captured by the UAV onboard camera are passed.
2) Identify face mask usage using the best state-of-the-art model.
3) Collect reliable statistics such as the number or the percentage of people wearing masks.

For the face mask detection classifier, we have used the transfer learning approach shown in Figure 7. Transfer learning is a type of learning in which the knowledge gained while solving one problem is saved and then applied to a different but related problem. In our case, we used state-of-the-art models pre-trained on ImageNet, that is, on millions of images labeled with 1000 classes. In the face mask detection module, we have used seven such ImageNet/CNN models (VGG-16, VGG-19, InceptionV3, ResNet50, MobileNet, DenseNet121, and InceptionResNetV2). In each model, the dense output layer of 1000 nodes is removed and replaced by a dense layer of two nodes, which represent the mask and non-mask classes. The input RGB image is resized to 196×196 before it is passed to the model. The training of the models is explained in the following steps:

1) Image augmentation, comprising horizontal flip, zooming (range = 0.2), and shearing (factor = 0.2), is applied on the fly to the input images. While training, the layers of the pre-trained models are initially kept frozen, which means their weights are not updated during backpropagation; only the weights between the last two layers are updated.
2) A learning rate finder for Keras is used to find the highest learning rate at which the loss is still noticeably decreasing, by sweeping between defined lower and upper bounds.
3) The newly added dense layer is then trained for 3-4 epochs.
4) The layers are then unfrozen, and their corresponding learning rates are reduced by a factor of ten.
5) Step 2 is repeated.
6) The full network is trained until the model overfits or achieves a good fit.

Of the steps above, steps 2, 4, and 5 concern the learning rate.
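The learning-rate steps can be sketched in plain Python as below. The code takes the result of a learning-rate sweep, given as (learning rate, loss) pairs, and picks the highest rate at which the loss is still noticeably decreasing, then divides it by ten for the unfrozen layers. The function name, the `min_drop` tolerance, and the sweep values are invented for illustration and do not come from the paper or from any particular Keras add-on.

```python
def highest_useful_lr(sweep, min_drop=0.01):
    """Pick the largest learning rate at which the loss still drops noticeably.

    `sweep` is a list of (learning_rate, loss) pairs ordered by increasing
    learning rate. `min_drop` is an assumed tolerance for what counts as a
    noticeable decrease. Returns None if the loss never drops that much.
    """
    best = None
    for (_, prev_loss), (lr, loss) in zip(sweep, sweep[1:]):
        if loss > prev_loss and best is not None:
            break  # the loss has started to diverge; stop searching
        if prev_loss - loss >= min_drop:
            best = lr
    return best


# Invented sweep results, not measurements from the paper.
sweep = [
    (1e-5, 0.700),
    (1e-4, 0.695),
    (1e-3, 0.550),
    (1e-2, 0.400),
    (1e-1, 0.395),
    (1.0, 2.500),
]
head_lr = highest_useful_lr(sweep)   # rate for training the new dense layer
fine_tune_lr = head_lr / 10          # step 4: ten times lower after unfreezing
print(head_lr, fine_tune_lr)
```

In practice the sweep would be rerun after unfreezing (step 5), since the loss landscape of the full network differs from that of the new layer alone.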

Overfitting is one of the critical issues when training a neural network on sample data. If the number of epochs is larger than necessary, the model starts learning patterns that are specific to the training data. The model then gives high accuracy on the training set but fails to generalize, so it performs poorly on a new dataset. To avoid overfitting and to increase the model's generalization capacity, an optimal number of epochs should be used. We started by training our model with 50 epochs. The loss and accuracy on the training set and the validation set were checked with respect to the number of epochs, and the threshold point after which the model starts overfitting was noted. We monitored the loss/accuracy values with the EarlyStopping callback implemented in the Keras library. When the loss is monitored, training is stopped when an increase in the loss is observed; when accuracy is monitored, training is stopped when a decrease in accuracy is observed. After this evaluation, the number of epochs that best suited our use case was ten. The number of epochs can be increased based on data size when training the model in future work. The total number of parameters and the number of trainable parameters are shown in Table III. MobileNet and DenseNet121 are the lightest models, with the fewest total parameters, which makes them well suited to cases where predictions are needed in real time. Training uses a batch size of 64, the 'adam' optimizer, 'accuracy' as the metric, and 'categorical cross-entropy' as the loss function.
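The stopping rule described above can be illustrated with a small plain-Python sketch. It mimics the idea of patience-based early stopping on a monitored loss; it is not the Keras implementation, and the `patience` and `min_delta` parameters and the loss values are assumptions chosen for illustration.

```python
def early_stop(losses, patience=1, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 1-based, for a monitored loss.

    Training stops at the first epoch where the loss has failed to improve
    on the best value by more than `min_delta` for `patience` consecutive
    epochs. If that never happens, stop_epoch is the last epoch.
    """
    best = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best - min_delta:
            best, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(losses), best_epoch


# Invented validation losses, not values reported in the paper.
val_losses = [0.60, 0.41, 0.30, 0.24, 0.21, 0.19,
              0.18, 0.17, 0.165, 0.16, 0.17, 0.18]
stop_epoch, best_epoch = early_stop(val_losses)
print(stop_epoch, best_epoch)
```

Monitoring accuracy instead of loss works the same way with the comparison reversed, so that a decrease counts as a failure to improve.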

The proposed approach utilizes edge computing for data modeling and primary decision-making. Edge computing does not fetch all UAV data: the UAV processes the captured images itself and keeps them in the form of embeddings [56]. Edge computing conserves time and resources while keeping data collection, pre-processing, and analysis on a real-time basis. In the proposed approach, the edge computing system sometimes makes the decision locally, and at other times it has to send the data to fog servers for further detailed processing. The cost of transferring data to fog servers increases as the sensor, IoT, and UAV swarm networks scale up, and large volumes of data require resources and intelligence [57], [58]. Here, edge computing sheds load by performing data aggregation at the initial level and transferring only the necessary data to fog or cloud networks as and when required [59]. The system adds fog computing services to the architecture for the commuter profiling, monitoring, and decision-making processes executed in the primary phase. After that, data analytics helps in smart and intelligent commuter trajectory profiling, monitoring, and decision-making. In the system, multiple drones collect information about a person that differs in attributes, which makes it convenient to build the person's profile. Likewise, the profile of a person with COVID-19 helps in tracing the chain of COVID-19 cases. A high-level conceptual diagram that elaborates the use of the UAV swarm in the proposed work is shown in Figure 8. The UAV platform in Figure 8 refers to the UAVs and to the software and hardware associated with the low-level and high-level control of these vehicles, together with their onboard sensors. The sensors are responsible for perceiving the environment and analyzing the data gathered from the environment and from the other UAVs in the swarm, and the Communication & networking block enables the propagation of information between devices in the network (such as UAVs, ground control stations, or central servers). The Coordination block manages decision-making, such as path planning and task sharing, and acts as a feedback mechanism. The communications between the blocks and the functionality expected from each block depend on the use case [60].

We have used two independent datasets for the validation and verification of the proposed scheme. One dataset is used for the training purpose, and the other is used for the testing purpose to ensure that the proposed model does not overfit the training data. The details of the datasets are as follows.

1) Train dataset: For training, we have used the 'Labeled Faces in the Wild' dataset [54], a dataset of face images intended for the task of unconstrained face recognition. The dataset comprises more than 13,000 images of human faces, each labeled with the name of the person pictured. There is a total of 5760 people in the dataset, and each has at least two distinct photos in the dataset. Each image is centered on a single face, has dimensions of 250×250, and is in RGB format.
2) Test dataset: The dataset contains 100 images of each celebrity and is used for testing the proposed face recognition technique. The dataset was created by web scraping the Google Images pages for each of the celebrities, and it was manually checked to ensure that the celebrity is present in each image.

We have used two datasets for face mask detection, as listed in Table IV. Each dataset is split into a training set and a testing set as given in the table.

After getting the embedding (a 128-dimensional vector) for an image from the Dlib model's output, we input it to five different machine learning classifiers with their default hyper-parameters. The comparison on four celebrities with 100 images each is performed using the five classifiers, as shown in Table V. Of the five classifiers, SVM and MLP show the best results, with 47 and 50 wrong predictions, respectively, out of 400 test images. With 50 of 400 images labeled incorrectly, the accuracy is 87.5%, which is not a good number. The reason is that the input dataset is not clean: some images of a celebrity in the dataset are older or newer than the training photo. Age plays a vital role in facial feature recognition because facial features change with aging, so the images in the database should be updated regularly. To handle such scenarios, we can use an online learning approach.
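The accuracy figures follow directly from the error counts quoted above. A short check of the arithmetic, where the helper name `accuracy` is only illustrative:

```python
def accuracy(total, wrong):
    """Fraction of correct predictions given the number of wrong ones."""
    return (total - wrong) / total


svm_acc = accuracy(400, 47)  # 353 correct out of 400
mlp_acc = accuracy(400, 50)  # 350 correct out of 400
print(f"SVM: {svm_acc:.2%}, MLP: {mlp_acc:.2%}")
```

With 50 errors the result is 350/400 = 87.5%, and with 47 errors it is 353/400 = 88.25%.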

D. Face mask detection results analysis

1) Results evaluation on Dataset 1: Figure 9 shows the training accuracy curves of all seven classifiers. According to the figure, MobileNet and DenseNet start with a training accuracy of over 98% in the first epoch, while the rest are below 95%. At the second epoch, all the classifiers (except ResNet50) reach almost the same accuracy, and as the number of epochs increases, their training accuracy remains practically equal. Until the ninth epoch, the accuracy of ResNet50 is below 95%, and at the tenth epoch it is around 96%. Meanwhile, the rest of the classifiers approach the 100% mark, with InceptionV3 slightly lower. The training accuracy curves show that MobileNet and DenseNet are better than the other classifiers: their accuracy was higher than the others at the start and stayed at the 100% mark for the rest of the epochs, whereas ResNet50 performed the worst in terms of training accuracy. MobileNet and DenseNet, being light models with fewer parameters, will help achieve higher accuracy on a real-time basis because of their fast inference. Figure parameters, less training loss will perform better in real-world scenarios.

2) Results evaluation on Dataset 2: Figure 11 shows the training accuracy curves of all the classifiers on Dataset 2. Until the second epoch, VGG16 and VGG19 have lower training accuracy than the rest of the classifiers (except ResNet50). As the number of epochs increases, all the classifiers (except ResNet50) converge towards the 100% mark. After the tenth epoch, ResNet50 finally reaches a training accuracy of 92%, whereas the rest of the classifiers have almost 100% accuracy, with VGG19 slightly below 100%. MobileNet, DenseNet, InceptionV3, and InceptionResNetV2 show the highest training accuracy among the seven classifiers. The best approach is to choose either MobileNet or DenseNet, because these two models have fewer parameters and hence produce results in less time. Figure 12 shows the training loss curves of all the classifiers on Dataset 2. At the first epoch, all the classifiers have a training loss of more than 0.05. At the second epoch, VGG16 shows the lowest loss, followed by InceptionV3, DenseNet121, MobileNet, VGG19, InceptionResNetV2, and then ResNet50. As the number of epochs increases, the loss of all the models decreases. Even at the tenth epoch, the training loss of ResNet50 has not dropped below 0.05 (not visible in the zoomed graph). The curves of VGG16 and VGG19 show a decline in training loss after every epoch. At the tenth epoch, VGG16 and DenseNet121 show the lowest training loss, followed by VGG19, MobileNet, InceptionResNetV2, InceptionV3, and then ResNet50. The models with the lowest training loss are VGG16, VGG19, and DenseNet, and the best model in terms of both low training loss and fast inference is DenseNet.

Table VI compares all seven state-of-the-art deep learning models on the test sets of the two datasets, Dataset 1 and Dataset 2. On Dataset 1, VGG16, VGG19, MobileNet, and DenseNet show a perfect 100% score on all the classification metrics, and all the models except ResNet50 show 100% precision. On Dataset 2, the highest accuracy is shown by InceptionV3 and InceptionResNetV2. MobileNet, InceptionV3, and InceptionResNetV2 deliver the highest precision, and VGG16 and VGG19 give the highest recall. The comparison based on accuracy on both datasets is shown in Figure 13. A few examples of the correct and incorrect outputs of the face mask classifier are shown in Figure 14.
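For reference, the classification metrics compared above (accuracy, precision, recall) can be computed from predicted and true labels as in the sketch below. Treating 'mask' as the positive class, the label strings, and the toy predictions are assumptions for illustration; they are not results from Table VI.

```python
def binary_metrics(y_true, y_pred, positive="mask"):
    """Accuracy, precision, recall and F1 for a two-class problem.

    `positive` names the class treated as positive (an assumption here).
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels, invented for illustration.
y_true = ["mask", "mask", "mask", "no_mask",
          "no_mask", "no_mask", "mask", "no_mask"]
y_pred = ["mask", "mask", "no_mask", "no_mask",
          "no_mask", "mask", "mask", "no_mask"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A model can therefore reach 100% precision while missing some masked faces, which is why precision and recall are reported separately.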

In the current pandemic, tools that ensure symptom detection in patients are in high demand, because not all proposed solutions perform as well as advertised. In such cases, IoT can help with applications such as crowd monitoring, giving timely suggestions and notifications to registered people, and tracking the causes of disease spread. AI linked with IoT is more compelling for rational decision-making and can assist in combating this pandemic. Presently, numerous fever screening methods are used for the detection of COVID-19 in patients. These screening methods have some drawbacks. High-quality clinical-grade thermometers are also unavailable in the majority of places, so clinicians use oral thermometers, which require operating close to a possibly infected person. The results can also be biased by the temperature of anything the person consumed just before the checkup. A better alternative is the use of thermal cameras that detect fever-like symptoms in highly crowded areas such as clinic entrances, shopping complexes, and warehouses. This paper proposes a fever screening and face recognition scheme using onboard cameras on UAVs to detect COVID-19 patients with fever symptoms. Moreover, our scheme traces the patients based on the facial features collected by the UAV. When a suspicious patient is identified, a real-time alert is sent to the personal mobile app on which the user has registered. With this alert, proper actions can be taken to track and quarantine the patient and to recommend consulting a doctor. The proposed face detection model is evaluated using SVM, KNN, XGB, LR, and MLP models.

The proposed system can also determine whether a person is wearing a mask or not using a deep learning model. The proposed face mask detection model is evaluated on two different face mask datasets using state-of-the-art deep learning models. Hence, our system gives an efficient means to control the spread of the virus. A mass-screening system using AI technology could significantly advance the possibility of conducting high-quality COVID-19 pandemic countermeasures.

Though the proposed approach facilitates fighting COVID-19, it can be further improved by considering the points below.

1) A comparison of manual and UAV-based thermal scanning is required to show the importance of both types of systems in the medical system. For example, a manual system is favored if all medical resources (such as personal protective equipment, surgical gowns, gloves, respiratory protection, eye protection, face shields, etc.) are available in time. Otherwise, a UAV-based system is preferred when such resources are scarce (as in the beginning stages of COVID-19).
2) In the proposed work, UAVs are used for thermal scanning in both real-time and simulation-based environments. In both cases, residential areas are considered for scanning. Here, UAVs are used to scan both people in high-rise buildings and pedestrians in real-time experimentation. However, a swarm of UAVs needs to perform indoor scanning operations if people do not take the initiative themselves.
3) Currently, we are handling the outdoor environment; in future work, we plan to operate at long distances and to include indoor operations. Operating UAVs and collecting data without collisions is a significant issue that needs to be addressed in the future.
4) The proposed face detection model and the face mask detection model are trained on datasets with a small but sufficient number of images. In future work, we will train the models on a diverse dataset gathered from different sources worldwide. This should give better results and make our model more generalized and less likely to overfit.
5) In the current version, we deploy an offline scheme, in which training the model on a new set of images has to be done from scratch. In future work, we will convert it to an online learning scheme, which will be re-trained on only the new set of images. This will save training time.

A survey on how computer vision can response to urgent need to contribute in covid-19 pandemics

Prediction of the price of ethereum blockchain cryptocurrency in an industrial finance system

A trust management scheme to secure mobile information centric networks

Proclamation on declaring a national emergency concerning the novel coronavirus disease (covid-19) outbreak, (mar. 13, 2020)

A constrained framework for contextaware remote e-healthcare (care) services

Role of technology in the era of covid-19 pandemic

Managing computational complexity using surrogate models: a critical review

Ensemble of surrogates and cross-validation for rapid and accurate predictions using small data sets

Computation of the reliable and quickest data path for healthcare services by using service-level agreements and energy constraints

A rule-based method for automated surrogate model selection

Asymptomatic and presymptomatic sars-cov-2 infections in residents of a long-term care skilled nursing facility-king county, washington

Association of chemosensory dysfunction and covid-19 in patients presenting with influenza-like symptoms

Fever, hyperthermia, and the lung: It's all about context and timing

Fever, fever patterns and diseases called 'fever'-a review

Internet of things for current covid-19 and future pandemics: An exploratory study

Infrared imaging analysis for thermal comfort assessment

Thermal imaging systems (infrared thermographic systems / thermal imaging cameras)

Dcnn-ga: A deep neural net architecture for navigation of uav in indoor environment

Bayesian coalition game for the internet of things: an ambient intelligence-based evaluation

Survey of computer vision algorithms and applications for unmanned aerial vehicles

Blockchain-enabled trustworthy group communications in uav networks

New standards for fever screening with thermal imaging systems

Toward a novel design for coronavirus detection and diagnosis system using iot based drone technology

Technical, temporal, and spatial research challenges and opportunities in blockchain-based healthcare: A systematic literature review

Building resilience against covid-19 pandemic using artificial intelligence, machine learning, and iot: A survey of recent progress

Preventing covid-19 spread using information and communication technology

Restructured society and environment: A review on potential technological strategies to control the covid-19 pandemic

Drones for parcel and passenger transportation: A literature review

Journal of special operations medicine: a peer reviewed journal for SOF medical professionals

A drone-based networked system and methods for combating coronavirus disease (covid-19) pandemic

Automated covid-19 emergency response using modern technologies

An autonomous uav system for video monitoring of the quarantine zones

Artificial intelligence-enabled robotic drones for covid-19 outbreak

Blockchain-envisioned softwarized multi-swarming uavs to tackle covid-i9 situations

A drone-based networked system and methods for combating coronavirus disease (covid-19) pandemic

Identifying febrile humans using infrared thermography screening: Possible applications during covid-19 outbreak

Fighting covid-19 with fever screening, face recognition and tracing

The use of ai for thermal emotion recognition: A review of problems and limitations in standard design and data

Infrared imaging for human thermography and breast tumor classification using thermal images

Non-intrusive interpretation of human thermal comfort through analysis of facial infrared thermography

Real-time human skin temperature analysis using thermal image recognition for thermal comfort assessment

Human acute stress detection via integration of physiological signals and thermal imaging

Thermalsense: determining dynamic thermal comfort preferences using thermographic imaging

Using infrared thermography and biosensors to detect thermal discomfort in a building's inhabitants

Federated learning and autonomous uavs for hazardous zone detection and aqi prediction in iot environment

Deep convolutional neural network with transfer learning for detecting pneumonia on chest x-rays

Machine learning is fun! part 4: Modern face recognition with deep learning

Dlib-ml: A machine learning toolkit

Rapid object detection using a boosted cascade of simple features

Histograms of oriented gradients for human detection

Deep residual learning for image recognition

A data-driven approach to cleaning large face datasets

Deep face recognition

Labeled faces in the wild: A database for studying face recognition in unconstrained environments

You only look once: Unified, real-time object detection

Edge computing-based security framework for big data analytics in vanets

Fog data analytics: A taxonomy and process model

Fog computing for smart grid systems in the 5g environment: Challenges and solutions

Optimized big data management across multi-cloud data centers: Software-defined-network-based analysis

Drone networks: Communications, coordination, and sensing

Periocular recognition

Face mask 12k images dataset

ACKNOWLEDGMENT This project was funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, under grant no. GCV19-2-1441. The authors, therefore, acknowledge with thanks DSR for technical and financial support.

Prof. Neeraj Kumar (SMIEEE) (Recipient of 2019, 2020 highly-cited researcher from WoS) is working as a Full Professor in the Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology (Deemed to be University), Patiala (Pb.), India. He is also adjunct professor at Asia University, Taiwan, King Abdul Aziz University, Jeddah, Saudi Arabia. He has published more than 400 technical research papers (DBLP: https://dblp.org/pers/hd/k/Kumar_0001:Neeraj) in top-cited journals and conferences which are cited more than 14016 times from well-known researchers across the globe with current h-index of 65 (Google scholar: https://scholar.google.com/citations?hl=en&user=gL9gR-4AAAAJ ) . He has also edited/authored more than 10 books from Elsevier, Springer, CRC and other well-known publishers. He has guided many research scholars leading to Ph.D. and M.E./M.Tech. His research is supported by funding from various competitive agencies such as-DST, CSIR, UGC, TCS etc. His broad