key: cord-1010287-j00tyuab authors: Sunitha, D.; Patra, Raj Kumar; Babu, N.V.; Suresh, A.; Gupta, Suresh Chand title: Twitter Sentiment Analysis Using Ensemble based Deep Learning Model towards COVID-19 in India and European Countries date: 2022-04-18 journal: Pattern Recognit Lett DOI: 10.1016/j.patrec.2022.04.027 sha: 9c3b152fd85c147475f963f43e4ae8d072fff120 doc_id: 1010287 cord_uid: j00tyuab

As of November 2021, more than 24.80 crore people had been diagnosed with the coronavirus, of whom around 50.20 lakh lost their lives to this infectious disease. Understanding the sentiments that people express on social media (Facebook, Twitter, Instagram, etc.) helps governments in controlling, monitoring, and eradicating the coronavirus. Compared to other social media, Twitter data are indispensable for extracting useful awareness information related to any crisis. In this article, a sentiment analysis model is proposed to analyze real-time tweets related to the coronavirus. Initially, around 3100 tweets from Indian and European people are collected between 23.03.2020 and 01.11.2021. Next, data pre-processing and exploratory investigation are carried out for a better understanding of the collected data. Further, feature extraction is performed using Term Frequency-Inverse Document Frequency (TF-IDF), GloVe, pre-trained Word2Vec, and fastText embeddings. The obtained feature vectors are fed to an ensemble classifier (Gated Recurrent Unit (GRU) and Capsule Neural Network (CapsNet)) that classifies the users' sentiments as anger, sad, joy, and fear. The experimental outcomes show that the proposed model achieves prediction accuracies of 97.28% and 95.20% in classifying Indian and European people's sentiments, respectively.
COVID-19 is a novel viral disease whose first case was observed in China in December 2019. It had infected over 24.80 crore people worldwide by November 2021, causing an estimated 50.20 lakh deaths (Zhai et al. 2020; Mahajan & Mansotra 2021; Arora et al. 2021). Many strategies have been followed to decrease the number of infected people, such as business closures, self-quarantines, travel bans, and social distancing measures, which have significantly transformed social structures all around the world (Gupta et al. 2020). Vaccination has started in several nations to protect people from serious illness due to COVID-19 (Aggrawal et al. 2021). In this pandemic, social media platforms such as Twitter, Instagram, Facebook, and WhatsApp help in gathering insightful information related to COVID-19 (Osakwe et al. 2021; Lamsal 2021; Al-Shaher 2020), including content about medical services, epidemic signs, and the communities affected by COVID-19 outbreaks. Compared to other social media sites, Twitter is effective in sharing informative messages within a length of 280 characters. Active users' tweets carry insightful information about the location and travel history of patients, the recovered, suspected, and confirmed cases, and patient symptoms such as body pain, runny nose, headache, fever, and cold (Kwan & Lim 2021). COVID-19 related tweets are labelled as "informative" tweets, and irrelevant user tweets are labelled as "uninformative" tweets (Rustam et al. 2021; Xue et al. 2020). The objectives of this study are: (i) to automatically find European and Indian people's sentiments expressed on Twitter related to COVID-19, and (ii) to identify the topics most discussed by Twitter users while expressing their emotions about COVID-19 (Afroz et al. 2021; Das & Dutta 2021; Cabezas et al. 2021).
The major contributions of this study are as follows:
• After real-time Twitter data collection, data pre-processing is carried out to eliminate special characters, punctuation, numbers, repeated words, non-English characters, hashtag symbols, and unnecessary spaces, tabs, and newlines from the tweets for better prediction of the users' sentiments.
• Further, an exploratory investigation (keyword trend investigation and topic modelling) is carried out for a better understanding of the collected data. In addition, feature extraction is performed using TF-IDF, GloVe, pre-trained Word2Vec, and fastText embeddings to extract discriminative feature vectors from the pre-processed data.
• The extracted discriminative feature vectors are fed to the ensemble classifier (a combination of GRU and CapsNet), which classifies Indian and European people's sentiments as fear, joy, sad, and anger. The proposed ensemble classification model reduces the vanishing gradient problem through the GRU activation, which is a linear interpolation between the candidate activation and the previous activation.
• The effectiveness of the ensemble based deep learning model is evaluated in terms of f-score, accuracy, recall, Matthews Correlation Coefficient (MCC), and precision. The proposed model achieves better performance than existing models such as the support vector machine and logistic regression (Majumder et al. 2021).

This paper is organized as follows: a few existing articles on the topic "twitter sentiment analysis related to COVID-19" are reviewed in section 2. The technical description and experimental analysis of the proposed ensemble deep learning model are given in sections 3 and 4. The conclusion of this work is given in section 5.

Gupta et al. 2021 developed a new emotion care approach to evaluate multi-modal text data containing real-time COVID-19 tweets.
Initially, the collected tweets were transformed into lower case strings, and then punctuation, special characters (everything except A-Z and a-z), user mentions, retweets, links, and stop-words were eliminated. After data cleaning, tokenization was applied to split sentences into words. Further, a multimodal vector was developed after identifying the frequently used words in the tweets using Term Frequency-Inverse Document Frequency. Finally, the tweets were classified on an eight-scale emotion model: trust, surprise, sadness, joy, fear, anticipation, anger, and disgust. Majumder et al. 2021 collected Twitter data across India from March 2020 to June 2020, converted the data into lower case, and eliminated hyperlinks, punctuation, and retweet abbreviations. Next, a label encoding technique was used to transform the data labels (negative, neutral, and positive tweets) into numeric form (0, 1, and 2) to ease the process of data classification. After lemmatization and TextBlob processing, classification was carried out using a Support Vector Machine (SVM) and logistic regression to classify the emotions of the tweets. Imran et al. 2020 analyzed people's sentiments about the COVID-19 lockdown actions taken by different countries. In that study, a deep Long Short Term Memory (LSTM) network was used to estimate the emotions and sentiment polarities from people's tweets. Naseem et al. 2021a acquired Twitter data from the COVID-Senti dataset and used TextBlob to label the sentiments as neutral, negative, and positive. Further, a data cleaning technique was used to pre-process the collected Twitter data, because raw data are often noisy, informal, short, and unstructured. In addition, improved word vector and hybrid word ranking methods were applied to incorporate the context of tweets and sentiments for Twitter sentiment analysis.
Finally, the sentiment classification was accomplished using different deep and machine learning techniques such as the LSTM network, SVM, decision tree, Naïve Bayes, Convolutional Neural Network (CNN), and random forest. Shamrat et al. 2021 used a supervised K-Nearest Neighbor (KNN) technique to classify people's sentiments about COVID-19 vaccination. Chintalapudi et al. 2021 collected Indian people's tweets from Twitter between 23rd March 2020 and 15th July 2020 for sentiment analysis. Initially, the collected data were labelled as anger, joy, fear, and sad. Then, the Bi-directional Encoder Representation from Transformer (BERT) approach was implemented for text analysis. In the experimental examination, the presented model attained higher prediction accuracy than other classification techniques such as LSTM, SVM, and logistic regression in Twitter sentiment analysis related to the COVID-19 lockdown. Basiri et al. 2021 implemented a new deep learning technique for sentiment analysis of COVID-19 tweets. In that work, people's sentiments were analyzed for eight countries: Canada, England, Australia, Spain, Italy, Iran, China, and the United States. Firstly, tweets were collected from the eight countries between 24.01.2020 and 23.04.2020 using coronavirus related keywords. A novel hybrid model was implemented for sentiment analysis, combining five classifiers: CNN, the fastText model, Naïve Bayes based SVM, Bi-directional Gated Recurrent Network (Bi-GRU), and BERT. The performance of the presented deep learning model was investigated on the Stanford Sentiment140 Twitter dataset, which includes 1,600,000 tweets. Hanschmidt and Kersting 2021 used German language tweets containing "#corona" and "#covid-19" for sentiment analysis, where the Twitter data were collected between 18.03.2020 and 24.04.2020.
In that work, bi-term topic methods were used to analyze people's emotions (anger, sad, anxiety, and positive) on sixteen topics, including infection related concerns, social contact restrictions, and the impact of the pandemic on private and public life. Villavicencio et al. 2021 analyzed Philippine people's sentiments (negative, neutral, and positive polarities) towards COVID-19 vaccines, using a Naïve Bayes classifier. The experimental outcome indicates that 8% of the tweets in the Philippines were negative towards COVID-19 vaccines, 9% were neutral, and the remaining 83% were positive. Malla and Alphonse 2021 used a dataset of 226,668 tweets related to COVID-19 that was collected between December 2019 and May 2020. In that study, a majority voting based ensemble deep learning model was used for sentiment analysis. The presented model obtained better prediction accuracy and f-score on the COVID-19 English labelled tweets dataset. Görmez et al. 2020 developed a stacked ensemble method for sentiment analysis on the Turkish movie and SemEval-2017 datasets. A review of the existing works shows three main concerns with the developed models: an inability to deal with complex sentences, because the models need simple sentiment words for analysis; an inability to perform well in dissimilar domains; and inadequate sentiment analysis performance due to insufficient labelled data. To overcome these issues and to improve Twitter sentiment analysis, a novel ensemble based deep learning model is proposed in this research article.

Generally, sentiment analysis aims at identifying people's attitudes and opinions towards different aspects of events and products from their comments on social media platforms (Budiharto & Meiliana 2018).
Recently, Twitter sentiment analysis has been carried out on different topics such as political events, product reviews, movie reviews, drug reviews, classification of Twitter streams during outbreaks, and numerous other subjects (Ruz et al. 2020; Öztürk & Ayvaz 2018). In the past few decades, Twitter sentiment analysis has gained great attention in the research community, due to advances in machine and deep learning techniques. In this research, European and Indian people's opinions towards COVID-19 are examined during the specified time period, and the experimental outcomes closely reflect these opinions and sentiments. The keywords used for collecting the tweets are quarantine, social distancing, lockdown, coronavirus, corona, Covid-19, corona outbreak, pandemic, stay home, and coronavirus outbreak. The proposed Twitter sentiment analysis framework contains five phases: Twitter data collection, data pre-processing, exploratory analysis, feature extraction, and sentiment prediction. The flowchart of the proposed model is shown in figure 1.

In this research, Indian and European users' tweets are collected separately from Twitter during the COVID-19 lockdown. The generated dataset consists of 3100 tweets (2000 European tweets and 1100 Indian tweets), which are extracted from github.com (https://github.com/gabrielpreda/CoViD-19-tweets). The generated dataset therefore comprises relevant tweets from Indian and European users on the keyword topics listed above, and these tweets are considered for the experimental analysis.
After data collection, every tweet is annotated as "sad", "joy", "fear", or "anger". In this research, the TextBlob tool is used to assign these labels. TextBlob represents the attitude of a sentence by estimating a polarity score between -1 and 1. A sentiment is regarded as "joy" if the polarity score is higher than 0.1. Correspondingly, sentiments are regarded as "fear", "sad", and "anger" if the polarity scores are in the ranges of -0.1 to -0.3, -0.4 to -0.7, and -0.8 to -1, respectively. The polarity score estimation is depicted in equation (1), and the labelling procedure is:

For each collected tweet:
    Estimate the polarity score
    If (polarity score in the range -0.1 to -0.3): Labelled as "fear"
    Else If (polarity score in the range -0.4 to -0.7): Labelled as "sad"
    Else If (polarity score in the range -0.8 to -1): Labelled as "anger"
    Else: Labelled as "joy"
End For
Output: Labelled tweets.

After collecting the tweets, the quality of the raw labelled data is enhanced by performing the following pre-processing operations.
• Eliminate numbers, punctuation, and special characters from the dataset, because they mostly do not improve the prediction performance.
• Eliminate repeated characters in words. For instance, "sooooo boring" is converted to "so boring".
• Eliminate non-English characters, because this study concentrates on the analysis of information in the English language.
• Eliminate hashtag symbols (#china, #lockdown, #Wuhan, etc.), uniform resource locators, and @users from the tweets, because they do not contribute to analyzing the messages.
• Eliminate unnecessary newlines, tabs, and spaces from the tweets.
• Convert emojis into short textual descriptions using the Python emoji library.
The sample pre-processed tweets are presented in table 1.

After pre-processing the data, an exploratory investigation is carried out to obtain a more comprehensive view of the datasets. The exploratory investigation includes two steps: keyword trend investigation and topic modelling. Firstly, the keyword trend investigation is carried out on the pre-processed Twitter data to identify the frequently mentioned words.
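The labelling rule and the cleaning steps described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `clean_tweet` and `label_from_polarity` are hypothetical; the stated polarity ranges leave gaps (for example between -0.3 and -0.4), which are closed here by extending each bin to the next stated boundary; the "#" symbol is stripped while the tag word is kept; and the emoji conversion step is omitted. In practice the polarity score would come from TextBlob, for example `TextBlob(text).sentiment.polarity`.

```python
import re


def clean_tweet(text: str) -> str:
    """Apply the listed cleaning steps to one raw tweet (emoji step omitted)."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)     # drop URLs
    text = re.sub(r"@\w+", " ", text)                       # drop @user mentions
    text = text.replace("#", " ")                           # drop '#' only (assumption)
    text = text.encode("ascii", "ignore").decode("ascii")   # drop non-ASCII characters
    text = re.sub(r"[^A-Za-z\s]", " ", text)                # drop numbers, punctuation, specials
    text = re.sub(r"([A-Za-z])\1{2,}", r"\1", text)         # "sooooo" -> "so"
    return re.sub(r"\s+", " ", text).strip()                # collapse spaces, tabs, newlines


def label_from_polarity(score: float) -> str:
    """Map a polarity score in [-1, 1] to one of the four labels.

    Assumption: the gaps between the stated ranges are closed by
    extending each bin down to the next stated boundary.
    """
    if score <= -0.8:
        return "anger"
    if score <= -0.4:
        return "sad"
    if score <= -0.1:
        return "fear"
    return "joy"  # the "Else" branch of the labelling procedure


if __name__ == "__main__":
    print(clean_tweet("sooooo boring"))
    print(label_from_polarity(-0.5))
```

Keeping the labelling rule separate from the polarity estimator makes the thresholds easy to audit and to change without touching the scoring step.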
European and Indian people commonly talk about social distancing, staying at home, coronavirus cases, the coronavirus pandemic, the covid outbreak, and the crisis due to the coronavirus. Secondly, the topic distribution is obtained using the Latent Dirichlet Allocation (LDA) technique to quantitatively analyze the topics in the generated dataset (Jamal et al. 2020). LDA is an effective topic model that captures topics from the weighted features and then classifies each tweet based on these concepts. The LDA technique models a topic for related words or tweets as Dirichlet distributions, and it describes each tweet with a probability distribution function that is stated in equations (2), (3), and (4). The quantities appearing in these equations are the text review, the Dirichlet distribution, the gamma function, the Dirichlet parameter, the topics, the topic assignment for each text, the document-level topic vectors, the number of tweets, and the observed text. In LDA, the number of topics is fixed at 6, and each topic is represented by a word distribution.

After the exploratory investigation, feature extraction is carried out using word embeddings and vectorization techniques. In this research, the TF-IDF technique is used for text vectorization to extract meaningful feature vectors from the Twitter data. The TF-IDF technique weighs how often a term arises in a tweet against how common the term is across all tweets (Kadhim 2019); the mathematical expressions of the TF-IDF technique are given in equations (5) and (6). Similarly, word embedding is accomplished using GloVe, pre-trained Word2Vec, and fastText embeddings. The word embedding techniques address two issues in Twitter sentiment analysis: (i) they capture the semantic relationship between words in the direction and distance of the vectors, and (ii) they yield dense feature vectors with low dimensionality.
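The TF-IDF weighting described above can be illustrated with a small sketch. It uses one common variant (term frequency normalized by tweet length, and an inverse document frequency of log(N / df)); equations (5) and (6) of the paper may use a different normalization, and the function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter


def tf_idf(tweets):
    """Return one {term: weight} dict per tokenized tweet.

    tf(t, d) = count of t in d / number of tokens in d
    idf(t)   = log(N / number of tweets containing t)   (assumed variant)
    """
    n_tweets = len(tweets)
    # Document frequency: in how many tweets each term occurs at least once.
    doc_freq = Counter(term for tokens in tweets for term in set(tokens))
    weights = []
    for tokens in tweets:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_tweets / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    corpus = [["stay", "home"], ["stay", "safe"]]
    for row in tf_idf(corpus):
        print(row)
```

A term that occurs in every tweet (here "stay") receives a weight of zero under this variant, which is why TF-IDF highlights the words that distinguish one tweet from the rest.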
GloVe does not rely solely on the local context information of words (local statistics); it incorporates word co-occurrence (global statistics) to achieve better word vectors (Rezaeinia et al. 2019; Sharma et al. 2020). The Word2Vec and fastText embeddings use a neural network model to learn word relations from a large text corpus. Once the network is trained, it can detect synonymous words and suggest additional words for a partial tweet. Word2Vec includes two models for learning word embeddings: the continuous bag of words model and the skip-gram model (Church 2017). Additionally, a hybrid model consisting of improved word vectors and hybrid ranking is used to incorporate the context of the tweets and sentiments for better Twitter sentiment analysis (Naseem et al. 2021b).

After feature extraction, Twitter sentiment analysis is carried out using an ensemble classifier that combines the CapsNet and GRU models (Wang et al. 2018). The GRU is an updated version of the recurrent neural network that integrates the forget gate and input gate into an "update gate" and includes an additional gate named the "reset gate". The GRU model has fewer tensor operations and parameters than the LSTM, so it has a shorter prediction time and faster convergence (Cao et al. 2020). Firstly, the GRU modulates the extracted feature information inside the unit without a separate memory cell. In the GRU model, the activation is a linear interpolation between the previous activation and the candidate activation, as defined in equation (7), where $h_t$ indicates the activation of the GRU model, $h_{t-1}$ the previous activation, $t$ the time step, and $\tilde{h}_t$ the present candidate activation, which is defined in equation (8). In addition, the update gate in the GRU model decides how much the unit updates its activation. The update gate is given in equation (9).
Here, $r_t$ represents the reset gate, which is expressed in equation (10), $\sigma$ indicates the sigmoid function, $\tanh$ denotes the hyperbolic tangent function, and $W$ represents the weight parameters. In the GRU model, the Stochastic Gradient Descent (SGD) iterative method is used to optimize the stochastic objective function on the basis of low-order moments. Using the gradient of the objective, the SGD iterative method updates the present weights by a step that is scaled by the learning rate. Further, the reset gate is updated as given in equation (11), whose terms denote the extracted feature information and the gradient of the loss function. The hyper-parameter settings of the GRU model are as follows: decay is 0.9, dropout rate is 0.6, number of epochs is 200, and batch size is 30.

Additionally, the CapsNet is an effective deep neural network that consists of several capsules (groups of neurons) for predicting the users' sentiments (Zhang et al. 2019). In the CapsNet model, every capsule is responsible for finding an individual component of an object, and all the capsules jointly find the overall object structure (Chao et al. 2019). The inputs and outputs of the CapsNet model are feature vectors, where the direction of a feature vector encodes different characteristics (position, size, etc.) of an individual component and the length of the output feature vector represents the existence probability of that component. The predictive vector $\hat{u}_{j|i}$ indicates the belief that encodes the relationship between capsule $j$ in the high level capsules and capsule $i$ in the low level capsules using a linear transformation matrix $W_{ij}$, as expressed in equation (12). The input $s_j$ of high level capsule $j$ is the weighted sum of the vectors $\hat{u}_{j|i}$ over the low level capsules, with the coupling coefficients as weights, and $s_j$ and $v_j$ denote the input and output of capsule $j$.
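Returning to the GRU described earlier in this section, one update step can be sketched as follows. This is an illustrative sketch of the standard GRU formulation, not the authors' code: the parameter names (`Wz`, `Uz`, `Wr`, `Ur`, `W`, `U`) and the function name `gru_step` are assumptions, bias terms are omitted, and the interpolation convention h_t = (1 - z_t) * h_{t-1} + z_t * h~_t is assumed (the paper's equation (7) may attach the update gate to the other term).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gru_step(x, h_prev, p):
    """One GRU time step for input x and previous activation h_prev.

    p is a dict of weight matrices (hypothetical names):
      Wz, Wr, W : hidden x input      Uz, Ur, U : hidden x hidden
    """
    # Update gate: how much of the activation is replaced.
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h_prev))]
    # Reset gate: how much of the previous activation feeds the candidate.
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h_prev))]
    reset_h = [ri * hi for ri, hi in zip(r, h_prev)]
    # Candidate activation.
    h_tilde = [math.tanh(a + b) for a, b in zip(matvec(p["W"], x), matvec(p["U"], reset_h))]
    # Linear interpolation between previous and candidate activation.
    return [(1.0 - zi) * hp + zi * ht for zi, hp, ht in zip(z, h_prev, h_tilde)]


if __name__ == "__main__":
    zeros_in = [[0.0, 0.0], [0.0, 0.0]]
    params = {k: zeros_in for k in ("Wz", "Uz", "Wr", "Ur", "W", "U")}
    print(gru_step([1.0, 2.0], [0.4, -0.2], params))
```

With all weights at zero both gates equal 0.5 and the candidate is zero, so the step simply halves the previous activation, which makes the interpolation easy to check by hand.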
In the CapsNet model, an iterative dynamic routing approach is used to find the coupling coefficient $c_{ij}$, which is expressed in equation (13). Depending on the value of the coupling coefficient, the information of low level capsule $i$ is transmitted to high level capsule $j$, or no information is transmitted between the two capsules. In addition, a non-linear squash function is used in the CapsNet model; it compresses the length of longer feature vectors towards 1 and the length of shorter feature vectors towards 0. The non-linear squash function is given in equation (14). The coupling value becomes large if the predictions of the higher and lower level capsules are consistent, and it becomes small if they are inconsistent. The dynamic routing approach in the CapsNet ensures that the lower level capsules consistently send their prediction feature vectors to the appropriate higher level capsules for effective sentiment analysis. The hyper-parameter settings of the CapsNet model are as follows: number of nodes in the hidden layer is 256, batch size is 148, number of network iterations is 100, learning rate is 0.05, and routing time is 2 seconds.

The performance of the proposed ensemble based deep learning model is validated in a Python environment on a computer with 128 GB random access memory, a 4 TB hard disk, an Intel Core i9 processor, and the Windows 10 operating system. The model performance is evaluated individually for the Indian and European tweets in terms of prediction accuracy, recall, MCC, precision, and f-score. "Precision" is the fraction of positive predictions that truly belong to the positive class, and "recall" is the fraction of all actual positives that are correctly predicted. Similarly, the weighted harmonic mean of the recall and precision values is the "f-score".
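The squash function and the dynamic routing loop described above can be sketched in plain Python. This follows the commonly used routing-by-agreement formulation and is only an illustration under assumptions: the function names, the nested-list layout `u_hat[i][j]` (prediction of low level capsule i for high level capsule j), the small `eps` guard, and the default of two routing iterations are all choices made for this sketch rather than details confirmed by the paper.

```python
import math


def squash(s, eps=1e-9):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    scale = norm_sq / (1.0 + norm_sq) / (math.sqrt(norm_sq) + eps)
    return [scale * x for x in s]


def softmax(logits):
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, iterations=2):
    """Route predictions u_hat[i][j] (vectors) from low capsules i to high capsules j.

    Returns (v, c): output vectors of the high level capsules and the
    coupling coefficients used in the last iteration.
    """
    n_low, n_high, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_high for _ in range(n_low)]  # routing logits
    for _ in range(iterations):
        c = [softmax(row) for row in b]  # coupling coefficients per low capsule
        v = []
        for j in range(n_high):
            # Weighted sum of predictions, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_low)) for d in range(dim)]
            v.append(squash(s_j))
        for i in range(n_low):
            for j in range(n_high):
                # Agreement (dot product) raises the logit for consistent pairs.
                b[i][j] += sum(p * q for p, q in zip(u_hat[i][j], v[j]))
    return v, c


if __name__ == "__main__":
    preds = [[[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [-1.0, 0.0]]]
    outputs, coupling = dynamic_routing(preds)
    print(coupling)
```

In the small example both low level capsules agree on the first high level capsule and disagree on the second, so the coupling towards the first capsule grows after one round of agreement.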
The "prediction accuracy" is the ratio of correctly predicted samples to the overall number of samples. The "MCC" is a more reliable performance measure that delivers a high score only if the prediction obtains good results across all categories in the Twitter sentiment analysis. The mathematical expressions of prediction accuracy, recall, MCC, precision, and f-score are defined in equations (15)-(19), where True Positive (TP) is the number of informative tweets that are predicted correctly, False Positive (FP) is the number of uninformative tweets that are incorrectly predicted as informative, True Negative (TN) is the number of uninformative tweets that are predicted correctly, and False Negative (FN) is the number of informative tweets that are incorrectly predicted as uninformative.

In this scenario, the quantitative study on the European tweets is carried out using the proposed ensemble based deep learning model and the individual classifiers with different feature extraction techniques. Among the 3100 collected tweets, around 2000 tweets were posted by European people, of which 1600 are used for training and 400 for testing the proposed model. In table 2, the performance evaluation is carried out with different feature extraction techniques (GloVe, Word2Vec, fastText embedding, and hybrid features) and different classification techniques (CapsNet, GRU, and the ensemble). As seen in table 2, the combination of the ensemble classifier with hybrid features obtains superior performance in Twitter sentiment analysis in terms of prediction accuracy, recall, MCC, precision, and f-score. Specifically, the ensemble based deep learning model achieves a prediction accuracy of 95.2%, recall of 97.78%, MCC of 97.7%, precision of 98.32%, and f-score of 96.65%, which are superior to the results of the other combinations.
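The five measures defined above from TP, FP, TN, and FN can be computed as in the following sketch. It uses the standard binary definitions, with the f-score taken as the balanced (F1) harmonic mean; the function name `binary_metrics` is hypothetical, and applying it once per sentiment class in a one-vs-rest fashion is an assumption about how the four-class results would be scored.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, f-score, and MCC from confusion counts.

    No guards against zero denominators are included; callers are assumed
    to pass counts for which every denominator is positive.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "mcc": mcc,
    }


if __name__ == "__main__":
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```

Unlike accuracy, the MCC uses all four confusion counts in both numerator and denominator, which is why it stays informative when the classes are imbalanced.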
In addition, the performance of the proposed ensemble based deep learning model is validated with different numbers of cross-validation folds: 3 folds, 5 folds, and 10 folds. As seen in table 3, the proposed model attains its most effective performance with 10 folds in terms of prediction accuracy, recall, MCC, precision, and f-score. Cross-validation protects the proposed model against overfitting when the amount of data is limited. Similarly, the performance of the ensemble based deep learning model with different numbers of folds on the Indian tweets is given in table 5. As seen in table 5, the ensemble based deep learning model obtains its best performance with 10-fold cross-validation. In addition, the standard deviation of the results is better than that of the individual features and classifiers on both the European and Indian datasets.

In recent times, COVID-19 has been the biggest human challenge, and governments and researchers in all nations are trying to decrease the mortality rate of this pervasive disease. In this article, an experiment is conducted to determine the general opinion (sentiment) of people in India and the European countries. Firstly, Indian and European people's tweets are collected between 23.03.2020 and 01.11.2021, and then data pre-processing is carried out to improve the quality of the collected data. Additionally, exploratory investigation and feature extraction are performed to extract discriminative feature vectors, which are finally fed to the ensemble classifier for predicting the users' sentiments. The experimental investigation shows that the proposed ensemble based deep learning model obtains better performance in sentiment prediction than the individual feature extraction techniques and classifiers in terms of MCC, prediction accuracy, recall, precision, and f-score.
In the Twitter sentiment analysis, the proposed ensemble based deep learning model achieves accuracies of 97.28% and 95.20% in classifying Indian and European people's sentiments, respectively. However, the computational complexity of the proposed method is high when the experiment is performed with the large feature length of the GloVe, Word2Vec, and fastText embedding features. Therefore, as a future extension, a novel hybrid optimization technique can be included in the proposed model to select relevant features, further improving the prediction accuracy with limited computational complexity.

Funding: This research received no external funding.

Sentiment Analysis of COVID-19 Nationwide Lockdown effect in India
Psychometric Analysis and Coupling of Emotions Between State Bulletins and Twitter in India During COVID-19 Infodemic
A hybrid deep learning and NLP based system to predict the spread of Covid-19 and unexpected side effects on people
Role of emotion in excessive use of Twitter during COVID-19 imposed lockdown in India
A novel fusion-based deep learning model for sentiment analysis of COVID-19 tweets
Prediction and analysis of Indonesia Presidential election from Twitter using sentiment analysis
Pandemic Using Text Analysis
Prediction of dissolved oxygen in pond culture water based on K-means clustering and gated recurrent unit neural network
Emotion recognition from multiband EEG signals using CapsNet
Sentimental Analysis of COVID-19 Tweets Using Deep Learning Models
Characterizing public emotions and sentiments in COVID-19 environment: A case study of India
Sentiment Analysis of Lockdown in India During COVID-19: A Case Study on Twitter
An emotion care model using multimodal textual analysis on COVID-19
FBSEM: A Novel Feature-Based Stacked Ensemble Method for Sentiment Analysis
Emotions in Covid-19 Twitter discourse following the introduction of social contact restrictions in Central Europe
Cross-cultural polarity and emotion detection using sentiment analysis and deep learning on COVID-19 related tweets
Sentimental Analysis based on hybrid approach of Latent Dirichlet Allocation and Machine Learning for Large-Scale of Imbalanced Twitter Data
Term weighting for feature extraction on Twitter: A comparison between BM25 and TF-IDF
TweetCOVID: A System for Analyzing Public Sentiments and Discussions about COVID-19 via Twitter Activities
Design and analysis of a large-scale COVID-19 tweets dataset
Predicting Geolocation of Tweets: Using Combination of CNN and BiLSTM
Sentiment Analysis of People During Lockdown Period of COVID-19 Using SVM and Logistic Regression Analysis
COVID-19 outbreak: An ensemble pretrained deep learning model for detecting informative tweets
A comprehensive survey on word representation models: From classical to state-of-the-art word representation language models
Covidsenti: A large-scale benchmark Twitter data set for COVID-19 sentiment analysis
Identifying public concerns and reactions during the COVID-19 pandemic on Twitter: A text-mining analysis
Sentiment analysis on Twitter: A text mining approach to the Syrian refugee crisis
Sentiment analysis based on improved pre-trained word embeddings
A performance comparison of supervised machine learning models for Covid-19 tweets sentiment analysis
Sentiment analysis of Twitter data during critical events through Bayesian networks classifiers
Sentimental short sentences classification by using CNN deep learning model with fine tuned Word2Vec
Analyzing the depression and suicidal tendencies of people affected by COVID-19's lockdown using sentiment analysis on social networking websites
Sentiment analysis on Twitter tweets about COVID-19 vaccines using NLP and supervised KNN classification algorithm
Twitter Sentiment Analysis towards COVID-19 Vaccines in the Philippines Using Naïve
Short-term load forecasting with multi-source data using gated recurrent unit neural networks
Public discourse and sentiment during the COVID 19 pandemic: Using Latent Dirichlet Allocation for topic modeling on Twitter
Examine the effects of neighborhood equity on disaster situational awareness: Harness machine learning and geotagged Twitter data
Remote sensing image scene classification using CNN-CapsNet

The authors declare that they have no conflict of interest.