key: cord-209697-bfc4h4b3 authors: Shanthakumar, Swaroop Gowdra; Seetharam, Anand; Ramesh, Arti title: Analyzing Societal Impact of COVID-19: A Study During the Early Days of the Pandemic date: 2020-10-27 journal: nan DOI: nan sha: doc_id: 209697 cord_uid: bfc4h4b3

In this paper, we collect and study Twitter communications to understand the societal impact of COVID-19 in the United States during the early days of the pandemic. With infections soaring rapidly, users took to Twitter asking people to self-isolate and quarantine themselves. Users also demanded closure of schools, bars, and restaurants as well as lockdown of cities and states. We methodically collect tweets by identifying and tracking trending COVID-related hashtags. We first manually group the hashtags into six main categories, namely, 1) General COVID, 2) Quarantine, 3) Panic Buying, 4) School Closures, 5) Lockdowns, and 6) Frustration and Hope, and study the temporal evolution of tweets in these hashtags. We conduct a linguistic analysis of words common to all hashtag groups and specific to each hashtag group and identify the chief concerns of people as the pandemic gripped the nation (e.g., exploring bidets as an alternative to toilet paper). We conduct sentiment analysis and our investigation reveals that people reacted positively to school closures and negatively to the lack of availability of essential goods due to panic buying. We adopt a state-of-the-art semantic role labeling approach to identify the action words and then leverage an LSTM-based dependency parsing model to analyze the context of action words (e.g., the verb deal is accompanied by nouns such as anxiety, stress, and crisis). Finally, we develop a scalable seeded topic modeling approach to automatically categorize and isolate tweets into hashtag groups and experimentally validate that our topic model provides a grouping similar to our manual grouping.
Our study presents a systematic way to construct an aggregated picture of people's response to the pandemic and lays the groundwork for future fine-grained linguistic and behavioral analysis.

COVID-19 (also known as the novel coronavirus) is a truly global pandemic and has affected humans in all countries of the world. While humanity has seen numerous epidemics including a number of deadly ones over the last two decades (e.g., SARS, MERS, Ebola), the grief and disruption that COVID-19 has already inflicted is incomparable. At the time of writing this paper, COVID-19 is still rapidly spreading around the world and projections for the next few months are grim and extremely disconcerting. The lessons learned from COVID-19 will also enable humankind to prevent such epidemics from transforming into global pandemics and to minimize the socio-economic disruption. In this work, our goal is to analyze the societal impact of COVID-19 in the United States of America during its early days, understand the chain of events that occurred during the spread of the infection, and draw meaningful conclusions so that similar mistakes can be avoided in the future. Though Twitter data has previously been shown to be biased [1], Twitter has emerged as a primary medium for people to express their opinions, especially during this time, and our study offers a perspective into the impact as self-disclosed by people in a form that is easily understandable and can be acted upon. We summarize our main contributions below. We collect 530,206 tweets from Twitter between March 14 and March 24, a time period when the virus made its first significant inroads into the US, and quantitatively demonstrate the disruption and distress experienced by the people. We group the hashtags into six main categories, namely 1) General COVID, 2) Quarantine, 3) School Closures, 4) Panic Buying, 5) Lockdowns, and 6) Frustration and Hope, to quantitatively and qualitatively understand the chain of events.
We observe that general COVID and quarantine-related messages remain trending throughout the duration of our study. In comparison, we observe calls for closing schools and universities peaking in the middle of March and then reducing when the closures go into effect (e.g., #closenycschools). We also observe a similar trend with panic buying, with essential items, particularly toilet paper, becoming unavailable in stores (e.g., #panicbuying, #toiletpapercrisis). We conduct a linguistic analysis of the tweets in the different hashtag groups and present the words that are representative of each group. We observe that words such as family, life, health, and death are common across hashtag groups. Additionally, if we consider the School Closures category, for example, we observe that unigrams (e.g., teacher, learn) and bigrams (e.g., home school, kid home) reflect the most discussed issues. We also conduct sentiment analysis to unearth the overall sentiment of the people. Our investigation reveals that people reacted positively to school closures and negatively to the lack of availability of essential goods due to panic buying. We next adopt a state-of-the-art semantic role labeling approach to identify the action words (e.g., fear, test) that are uniquely representative in each hashtag group. These action words help understand people's actions in each group. We leverage an LSTM dependency parsing model to analyze the context of the above-mentioned action words (e.g., the verb deal is accompanied by nouns such as anxiety and stress). Finally, we develop a scalable seeded topic modeling (seeded LDA) approach to automatically categorize tweets into specific topics of interest, especially when the topics are rarer in the dataset. We experimentally validate our seeded LDA model and observe that it provides a grouping similar to our manual grouping.
Our study summarizes the critical public responses surrounding COVID-19, paving the way for future fine-grained linguistic and graph analysis.

In this section, we discuss our methodology for data collection from Twitter to investigate the societal impact of COVID-19 in the United States during its early days. We collect data using the Twitter search API. The results presented in this paper are based on the data collected from March 14 to March 24, 2020. We track the trending COVID-related hashtags every day and collect the tweets in those specific hashtags. We repeat this process to collect a total of 530,206 tweets during this time period. We group the hashtags into six main categories, namely 1) General COVID, 2) Quarantine, 3) School Closures, 4) Panic Buying, 5) Lockdowns, and 6) Frustration and Hope, to quantitatively and qualitatively understand the chain of events. We collect data on a per-day basis for the different hashtags as and when they become trending. Table I shows the number of tweets in each category, while Table II shows the grouping of some of the representative hashtags by category. We observe that the total number of tweets as grouped by hashtags is 664,476, which is higher than the total number of tweets. This is because tweets can contain multiple hashtags and thus the same tweet can be grouped into multiple categories. We present some example tweets in Table III to illustrate the types of communications occurring on Twitter during this period. People also rallied to support workers working hard to keep essential services running. With the beginning of April approaching, many people started to worry about their next month's rent. Due to the data collection limits imposed by Twitter, we are able to collect and analyze only a portion of the tweets. Though we started collecting data as soon as we conceived of this project, we were unable to collect data during the first week of March.
Though we ran our script to collect data as far back as March 8, because of the way Twitter provides data, we obtained a limited number of tweets from March 8 to March 13. Additionally, due to the rapidly evolving situation, it is likely that we have inadvertently missed some important hashtags, despite our best efforts. As is the case with most studies based on Twitter data, we also acknowledge the presence of bias in data collection [1]. Having said that, the goal of this study is to provide a panoramic summarized view of the impact of the pandemic on people's lives and aggregate public opinion as expressed by them. Due to the nature of this study, we are confident that the results presented here help in appreciating the sequence of events that transpired and in better preparing ourselves for possible future waves of COVID-19 or another pandemic.

Example tweet: When are we going to #CancelRent in this state? Hundreds of thousands are filing for unemployment and can't pay rent. Sure, we can't be evicted, but what's preventing companies from coming after us after this is over?

In this section, we present observations and results based on our linguistic analysis of the tweets. We study the popularity and temporal evolution of individual hashtags and hashtag groups. We explore the word-usage (i.e., unigram and bigram) frequencies for each hashtag group to understand the main points of discussion. We then conduct a sentiment analysis to understand the prevailing sentiments in the tweets. We adopt a semantic role labeling approach to identify the action words (i.e., verbs) and then analyze the context in which these action words are used. Finally, we develop a scalable seeded LDA-based topic model to automatically group tweets and validate its effectiveness against our manual grouping. Figure 1a shows the top 20 hashtags observed in our data.
As expected, we see that hashtags corresponding directly to COVID or coronavirus are the most popular hashtags as most communications are centered around them. We observe that hashtags around social isolation, staying at home, and quarantining are also popular. Figure 1b shows the most popular hashtags by date. Similar to Figure 1a, we observe that hashtags related directly to COVID and social distancing trend most on Twitter. The figures and the number of tweets highlight how the pandemic gripped the United States with its rate of spread. We investigate the evolution of the number of tweets in various hashtag groups over time. To calculate the number of tweets in each hashtag group, we count the number of mentions of hashtags in that group across all the tweets. If a tweet contains more than one hashtag, it is counted as part of all the hashtags mentioned in it. As the number of tweets varies significantly across hashtag groups, we plot the groups that have a similar number of tweets together. Similar to Figure 1, we observe from Figure 2a that the total number of tweets in the General COVID and Quarantine categories is relatively high throughout the time period of the study. Interestingly, from Figure 2b, we observe that panic buying and calls for school closures peak around the middle of March.

In this section, we present results from a linguistic word usage analysis across the different hashtag groups. Our goal is to identify the words that are uniquely representative of each group. To accomplish this, first, we identify and present the most commonly used words across all the hashtags. To construct the group of common words across all hashtags, we remove the words that are the same as or similar to the hashtags mentioned in Table II, as those words are redundant and also tend to be high in frequency. We also remove the names of places and governors such as New York, Massachusetts, and Andrew Cuomo.
After filtering out these words, we then rank the words based on their occurrence in multiple groups and their combined frequency across all the groups. We observe words such as family, health, death, life, work, help, thank, need, time, love, and crisis. In Table IV, we present some notable example tweets containing the common words. While one may think that health refers to the virus-related health issues, we notice that many people also refer to mental health in their tweets as a possible consequence of social distancing and anxiety caused by the virus. We also observe the usage of words such as death and crisis to indicate the seriousness of the situation. Supporting workers and showing gratitude toward them is another common tweet pattern that is worth mentioning.

Example tweet: Death. We must act very fast. First, we take care of the health care and emergency workers. Then, we take care of whoever is in charge of keeping Netflix and Hulu running or it's going to get ugly #distancesocializing #coronavirus

Second, we present the most semantically meaningful and uniquely identifying words in each hashtag group. To do this, we remove the common words calculated in the above step from each group. From the list of words that remains after the filtering, we then select the top 10 words. Due to space constraints, we only present results for four hashtag groups. Figure 3 gives us the uniquely identifying and semantically meaningful words in each hashtag group. In the General COVID group, we find words such as impact, response, resource, and doctor. Similarly, for School Closures, we find words such as teacher, schedule, educator, book, and class. The Panic Buying top words mostly reflect the shortages experienced by people, such as roll and tissue (referring to toilet paper), hoard, bidet (as an alternative to toilet paper), wipe, and water. Top words in the Lockdown group include immigration, shelter, safety, court, and petition, signifying the different issues surrounding lockdown.
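The two-step word analysis above (rank common words by how many groups they appear in and by combined frequency, then drop them and keep the top words per group) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names, the `min_groups` threshold, and the toy tokens are hypothetical, and the tokens are assumed to be already lowercased and lemmatized.

```python
from collections import Counter


def common_words(group_tokens, min_groups=2, top_n=10):
    """Rank words by (number of groups containing them, combined frequency).

    min_groups is an assumed threshold; the paper does not state one.
    """
    total = Counter()   # combined frequency across all groups
    spread = Counter()  # number of groups in which the word occurs
    for tokens in group_tokens.values():
        counts = Counter(tokens)
        total.update(counts)
        spread.update(counts.keys())
    candidates = [w for w in total if spread[w] >= min_groups]
    candidates.sort(key=lambda w: (-spread[w], -total[w], w))
    return candidates[:top_n]


def unique_words(group_tokens, common, top_k=10):
    """Remove the common words from each group and keep its top_k words."""
    common = set(common)
    result = {}
    for group, tokens in group_tokens.items():
        counts = Counter(t for t in tokens if t not in common)
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        result[group] = [word for word, _ in ranked[:top_k]]
    return result


# Toy tokens (hypothetical), standing in for the tweets of each hashtag group.
groups = {
    "School Closures": ["teacher", "health", "class", "teacher", "family"],
    "Panic Buying": ["hoard", "health", "roll", "hoard", "family", "bidet"],
    "Lockdowns": ["shelter", "health", "court", "shelter"],
}
common = common_words(groups, min_groups=2, top_n=2)
unique = unique_words(groups, common, top_k=2)
print(common)
print(unique)
```

Ties are broken alphabetically here so that the ranking is deterministic; any other stable tie-breaking rule would serve equally well.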
We analyze words that co-occur to understand the contextual information surrounding the words. Co-occurring bigrams capture pairs of words that frequently co-occur in each group. To do this, we first filter out stop words and perform stemming and lemmatization. We calculate the overall frequency of each word and its frequency within each class, and calculate the bigram association using Pearson's chi-squared independence test, which determines whether pairs of words occur together more often than they would by chance. From the bigrams with the highest collocation statistics, we select the top 10 that are most intuitive for the human reader. Figure 4 shows the top 10 bigrams for each group. Bigrams convey the topics of discussion more clearly than unigrams. Bigrams such as 'toilet paper', 'panic buy', and 'wash hand' clearly articulate the intents of the tweets in the Panic Buying group. Similarly, in the Lockdown group, we see 'stay home', 'work home', and 'minimize spread' emerging as top bigrams capturing what people are talking about in that group.

To understand the sentiment across the different hashtag groups, we perform a comparative sentiment analysis. We use a pre-trained sentiment analysis model [2], which has 95.11% accuracy on the Stanford SST test dataset, and apply it to our dataset. The model, a RoBERTa base model, classifies the data into five sentiment categories: strongly positive, positive, neutral, negative, and strongly negative. We present the results in Figure 5. Since the neutral category is not useful for our analysis, we exclude it and scale the rest of the categories to 100%, normalizing for the number of tweets in each category. We notice that the School Closures group has a significantly higher number of positive tweets that capture the overall positive sentiment around the closure of schools. In contrast, the Panic Buying group has a higher number of negative tweets showing the frustration in relation to panic buying.
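The rescaling step used for the sentiment comparison (drop the neutral class, then express the remaining four classes as shares that sum to 100% within a hashtag group) amounts to the arithmetic below. This is a minimal sketch; the function name and the counts are made up for illustration and are not results from the study.

```python
def non_neutral_shares(label_counts):
    """Drop the neutral class and rescale the remaining classes to 100%."""
    kept = {label: n for label, n in label_counts.items() if label != "neutral"}
    total = sum(kept.values())
    if total == 0:
        # No non-neutral tweets in this group: report zero shares.
        return {label: 0.0 for label in kept}
    return {label: 100.0 * n / total for label, n in kept.items()}


# Hypothetical per-group counts from a five-class sentiment classifier.
toy_counts = {
    "strongly positive": 30,
    "positive": 20,
    "neutral": 50,
    "negative": 40,
    "strongly negative": 10,
}
shares = non_neutral_shares(toy_counts)
print(shares)
```

Because each group is rescaled by its own non-neutral total, groups with very different tweet volumes can be compared on the same percentage axis.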
Overall, we observe more strongly positive tweets than strongly negative tweets in all categories. This is especially interesting in the Quarantine and Frustration and Hope groups, where more tweets show support for quarantine and hopefulness.

We use the AllenNLP BERT-based model [3] to run semantic role labeling and identify the action words (verbs), which capture the actions people are referring to in the tweets. To identify the uniquely representative verbs in each group, we identify all the verbs in each group and use TF-IDF vectorization to remove the verbs that are common across the groups. We then compute the frequency of the remaining verbs in each group. Figure 6 shows the verbs and their frequencies. The results capture the top verbs defining each group. For example, the School Closures group has close as its top verb, which signifies the closing of schools, while learn, read, and teach emphasize the actions corresponding to learning online because of the pandemic. In comparison, some words such as mean and post are challenging to understand without additional context, so in Table V we present example tweets containing these words to show the context in which they are used. All tweets with mean have a similar context, but post is used in two different contexts: one refers to sending a message and the other refers to the post-pandemic period. Along the same lines, verbs in other groups also signify people's actions during the pandemic. The Panic Buying group captures actions such as buy, wash, hoard, and sell.

To further analyze the context in which the action words discussed in Section III-E are used, we analyze the words associated with them using dependency parsing. Dependency parsing breaks down each sentence into linguistic dependency structures organized in the form of a tree. We focus on identifying the nouns that are connected to the action words.
In the dependency parse, the action words/verbs form the root of the parse and the dependencies are in the left and right subtrees. To identify the nouns associated with the verbs, we traverse the dependency parse to the subtree where the action word of interest is present and then extract the corresponding noun. We also analyze the link between the noun and the verb and find that "nsubj" (nominal subject), "pobj" (object of a preposition), and "dobj" (direct object) are the link tags most often connecting nouns to the action words. Figure 7 gives some notable dependency parse subtrees with the action word and the corresponding noun. By decoding the parse structure, we can identify additional contextual information such as the nouns the action words refer to. We use the AllenNLP implementation of a neural model for dependency parsing using biaffine classifiers on top of a bidirectional LSTM [4]. We parse the sentences associated with the top 5 verbs in Figure 6 and find their associated nouns to understand what the action verbs are used to signify. Tables VI, VII, VIII, and IX present the nouns associated with the most prominent action words in each group, respectively. General COVID, being a diverse group, contains a myriad of tweets ranging from offering support to the fear of getting infected. Apart from the verb-noun combinations that we expect to see in the group (such as test virus, confirm case, offer support), the other most notable verb-noun combinations in this group are deal stress, deal anxiety, fear system, and fear safety. In the other groups, the verb-noun combinations narrow down on the specific actions relevant to the group. For instance, in School Closures, the action word close mostly refers to closing the schools for the benefit of students, and the action word offer co-occurs with teaching aids through online sources.
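The noun-extraction step described earlier in this subsection (walk the subtree under an action word and keep the nouns attached through nsubj, dobj, or pobj links) can be sketched on a hand-built parse. This is an illustration only: the study uses the AllenNLP biaffine parser, whereas the token dictionaries, the `nouns_for_verb` helper, and the example sentence below are hypothetical stand-ins for a parser's output.

```python
from collections import defaultdict

# Link tags reported in the text as most relevant to the action words.
TARGET_DEPS = {"nsubj", "dobj", "pobj"}


def nouns_for_verb(tokens, verb):
    """Collect (dependency tag, noun) pairs found in the subtree of `verb`.

    `tokens` is a list of dicts with keys lemma, pos, head (index of the
    parent token; the root points to itself), and dep. This format is an
    assumption made for the sketch, not a parser's native output.
    """
    children = defaultdict(list)
    for i, tok in enumerate(tokens):
        if tok["head"] != i:
            children[tok["head"]].append(i)
    found = []
    for i, tok in enumerate(tokens):
        if tok["pos"] == "VERB" and tok["lemma"] == verb:
            stack = list(children[i])
            while stack:
                j = stack.pop()
                child = tokens[j]
                if child["pos"] == "NOUN" and child["dep"] in TARGET_DEPS:
                    found.append((child["dep"], child["lemma"]))
                # Descend further, e.g. verb -> preposition -> pobj noun.
                stack.extend(children[j])
    return sorted(found)


# Hand-built parse of "People deal with anxiety" (hypothetical example).
parse = [
    {"lemma": "people", "pos": "NOUN", "head": 1, "dep": "nsubj"},
    {"lemma": "deal", "pos": "VERB", "head": 1, "dep": "ROOT"},
    {"lemma": "with", "pos": "ADP", "head": 1, "dep": "prep"},
    {"lemma": "anxiety", "pos": "NOUN", "head": 2, "dep": "pobj"},
]
pairs = nouns_for_verb(parse, "deal")
print(pairs)
```

Note that this simple traversal also descends into embedded clauses, so in longer sentences it may attribute nouns of a nested verb to the outer one; stopping the descent at other verbs would be one way to tighten it.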
In the Panic Buying group, the panic experienced by people is captured by verb-noun pairs such as stop madness, buy paper, and find store. In the Lockdown group, some interesting combinations surface, such as believe information and guess trust, which capture the possible distrust people have in the lockdown measures. Nouns such as insanity also help in capturing people's reaction to the lockdown measures.

In this section, we use Seeded LDA [5] to categorize the tweets and check the closeness of these automatically obtained groups to our manual grouping using the hashtags. As we are specifically interested in isolating the tweets in specific topics of interest rather than the general topics identified by a standard topic model, we leverage this seeded variant of LDA to guide the topic model to discover them. Seeded LDA allows seeding of topics by providing a small set of keywords to guide topic discovery, influencing both the document-topic and the topic-word distributions. The seed words need not be exhaustive, as the model is able to detect other words in the same category via co-occurrence in the dataset. Our goal with seeded LDA is to i) present a way to automatically categorize tweets into specific topics of interest, especially when the topics are rarer in the dataset, ii) passively evaluate the effectiveness of our word analysis thus far, and iii) develop a scalable approach that can be extended to millions of tweets with minimal manual intervention. We develop a Seeded LDA model to categorize tweets into five hashtag groups: i) General COVID, ii) School Closures, iii) Panic Buying, iv) Lockdowns, and v) Quarantine, by seeding each group with seed words from our analysis in Section III-B. We leave out the Frustration and Hope topic due to the inherently polarizing nature of its keywords and the lack of identifying keywords that are unique to the topic. We select the top few words from Figure 3 as seed words for our Seeded LDA model.
Table X gives the seed words for the different COVID categories. We include k unseeded topics in our model to account for messages that do not fall into these topic categories. After experimenting with different values of k and manually evaluating the topics, we find that k = 2 gives us the best separation and categorization. We use α = 0.01 and β = 0.0001 to give us sparse document-topic and topic-word distributions in which fewer topics and words with high values emerge, so that we can classify the tweets to the predominant category. We train the seeded LDA models for 2000 iterations. We first use the document-topic distribution to get the best topic for each tweet. If the best topic of the message is one of the seeded topics that correspond to the categories, we classify the tweet into that category. In the event that a clear best topic does not emerge, we randomly assign the tweet to one of the topics that share the same highest value in the document-topic distribution.

1) Analyzing Effectiveness of Seeded LDA Model: To check how closely the hashtag groups match the seeded LDA groups, we measure the accuracy by comparing the topic assignments obtained from the LDA document-topic distribution against the grouping determined by the hashtags. We do this by calculating the confusion matrix, which gives us four counts (true positives, true negatives, false positives, and false negatives), from which we calculate the accuracy, precision, recall, and F1 scores. The results we obtained are shown in Table XI. This endeavor helps in determining the effectiveness of our word analysis (seeds) and our seeded LDA model. Also, to verify that our model performs best in the groups that we are interested in, we calculate the precision, recall, and F1 scores for the School Closures, Panic Buying, Lockdowns, and Quarantine groups. We exclude the General COVID and Frustration and Hope groups as they are too general, and we are interested in isolating the more specific COVID groups.
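The assignment and evaluation steps above can be sketched as follows: each tweet takes the topic with the highest document-topic probability, ties are broken at random, and the resulting labels are scored one-vs-rest against the hashtag grouping. This is a minimal sketch under assumptions: the helper names, the probabilities, and the gold labels are hypothetical, and each tweet is given a single gold group for simplicity, although in the dataset a tweet can belong to several hashtag groups.

```python
import random

# Five seeded topics plus k = 2 unseeded topics (names of the latter are made up).
TOPICS = ["General COVID", "School Closures", "Panic Buying",
          "Lockdowns", "Quarantine", "unseeded-1", "unseeded-2"]


def assign_topic(doc_topic, rng):
    """Return the index of the most probable topic, breaking ties at random."""
    best = max(doc_topic)
    tied = [k for k, p in enumerate(doc_topic) if p == best]
    return rng.choice(tied)


def one_vs_rest_scores(gold, predicted, label):
    """Confusion-matrix counts for one label, then the derived metrics."""
    tp = fp = fn = tn = 0
    for g, p in zip(gold, predicted):
        if p == label and g == label:
            tp += 1
        elif p == label:
            fp += 1
        elif g == label:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(gold) if gold else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical document-topic rows for four tweets.
doc_topics = [
    [0.10, 0.70, 0.05, 0.05, 0.05, 0.03, 0.02],
    [0.05, 0.10, 0.70, 0.05, 0.05, 0.03, 0.02],
    [0.05, 0.60, 0.20, 0.05, 0.05, 0.03, 0.02],
    [0.05, 0.05, 0.05, 0.05, 0.75, 0.03, 0.02],
]
gold = ["School Closures", "Panic Buying", "Panic Buying", "Quarantine"]

rng = random.Random(0)  # fixed seed so tie-breaking is reproducible
predicted = [TOPICS[assign_topic(row, rng)] for row in doc_topics]
school = one_vs_rest_scores(gold, predicted, "School Closures")
panic = one_vs_rest_scores(gold, predicted, "Panic Buying")
print(predicted)
print(school)
print(panic)
```

In this toy example the third tweet is labeled Panic Buying by its hashtags but assigned to School Closures by the topic model, which lowers the precision of School Closures and the recall of Panic Buying.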
Table XII shows the results for each group. By examining the results, we observe that the manual grouping of the hashtags closely matches the seeded LDA groups. We also note that the seeded LDA model is able to correctly isolate the tweets in rarer groups where there is less data, such as the School Closures group. This shows the effectiveness of our model in analyzing rarer groups in the data. Additionally, from the precision of classification for the Quarantine group, we observe that the number of false positives is low, which adds further credibility to our model.

IV. RELATED WORK

In this section, we outline existing research related to modeling and analyzing Twitter and web data to understand social, political, psychological, and economic impacts of a variety of different events. Due to the recent nature of the outbreak, there is little to no published work on COVID-19. We primarily focus on discussing work that analyzes Twitter communications. Ahmed et al. focus on the conspiracy theories surrounding the novel coronavirus, especially in relationship with 5G [6]. The authors in [7], [8] analyze Twitter communications and discuss the possibility of using bots for propagating misinformation and political conspiracies during the pandemic. In comparison, the authors in [10] conduct infodemiology studies on Twitter communications to understand how information is spreading during this time, while the stigma created by referencing the novel coronavirus as "Chinese virus" is investigated in [9]. Twitter has been used to study political events and related stance [11], [12], human trafficking [13], and public health [14], [15], [16], [17], [18]. Several works perform fine-grained linguistic analysis on social media data [19], [20], [21].

V. DISCUSSION AND CONCLUDING REMARKS

In this paper, we studied Twitter communications in the United States during the early days of the COVID-19 outbreak.
As the disease continued to spread, we observed panic buying as well as calls for closures of schools, bars, and cities, and for social distancing and quarantining. We conducted a linguistic word-usage analysis and identified the most frequently occurring unigrams and bigrams in each group, which give us an idea of the main discussion points. We conducted sentiment analysis to understand the extent of positive and negative sentiments in the tweets. We then performed semantic role labeling to identify the key action words and then obtained the corresponding contextual words using dependency parsing. Finally, we designed a scalable seeded topic modeling approach to automatically identify the key topics in the tweets.

REFERENCES

[1] Discovering, assessing, and mitigating data bias in social media
[2] RoBERTa: A robustly optimized BERT pretraining approach
[3] Simple BERT models for relation extraction and semantic role labeling
[4] Deep biaffine attention for neural dependency parsing
[5] Incorporating lexical priors into topic models
[6] COVID-19 and the 5G conspiracy theory: Social network analysis of Twitter data
[7] Coronavirus goes viral: Quantifying the COVID-19 misinformation epidemic on Twitter
[8] What types of COVID-19 conspiracies are populated by Twitter bots? First Monday
[9] Creating COVID-19 stigma by referencing the novel coronavirus as the "Chinese virus" on Twitter: Quantitative analysis of social media data
[10] Conversations and medical news frames on Twitter: Infodemiological study on COVID-19 in South Korea
[11] All I know about politics is what I read in Twitter: Weakly supervised models for extracting politicians' stances from Twitter
[12] Bumps and bruises: Mining presidential campaign announcements on Twitter
[13] The impact of environmental stressors on human trafficking
[14] Using Twitter to understand the human bowel disease community: Exploratory analysis of key topics
[15] How social media will change public health
[16] Detecting and characterizing mental health related self-disclosure in social media
[17] Predicting depression via social media
[18] Characterizing sleep issues using Twitter
[19] Fine-grained analysis of cyberbullying using weakly-supervised topic models
[20] A socio-linguistic model for cyberbullying detection
[21] Weakly supervised cyberbullying detection using co-trained ensembles of embedding models