Drink bleach or do what now? Covid-HeRA: A dataset for risk-informed health decision making in the presence of COVID19 misinformation
Arkin Dharawat, Ismini Lourentzou, Alex Morales, ChengXiang Zhai
Date: 2020-10-17 (pre-print, work in progress)

Given the wide spread of inaccurate medical advice related to the 2019 coronavirus pandemic (COVID-19), such as fake remedies, treatments and prevention suggestions, misinformation detection has emerged as an open problem of high importance and interest for the NLP community. To combat the potential harm of COVID-19-related misinformation, we release Covid-HeRA, a dataset for health risk assessment of COVID-19-related social media posts. More specifically, we study the severity of each misinformation story, i.e., how harmful a message believed by the audience can be and what types of signals can be used to discover highly malicious fake news and detect refuted claims. We present a detailed analysis, evaluate several simple and advanced classification models, and conclude with an experimental analysis that presents open challenges and future directions.

While an increasing percentage of the population relies on social media platforms for news consumption, the reliability of the information shared remains an open problem. Fake news and other types of misinformation have been widely prevalent in social media, putting audiences at great risk globally. Detecting and mitigating the impact of misinformation is therefore a crucial task that has attracted research interest, with a variety of approaches proposed, from linguistic indicators to deep learning models (Bal et al., 2020). Several research endeavors tackle key issues, such as mitigating label scarcity with additional weak social supervision signals, improving interpretability with attention mechanisms, leveraging network, group and/or user information, etc. (Jin et al., 2016; Ruchansky et al., 2017; Shu et al., 2019; Wang et al., 2020; Lu and Li, 2020). Fake news frequently emerges around certain phenomena and topics, e.g., public health issues, politics, etc. (Allcott et al., 2019; Shin et al., 2018; Bode and Vraga, 2018). Unsurprisingly, the same applies to the current global pandemic, where inaccurate stories surface daily. It is often difficult for users who decide to take action based on health advice found online to understand the consequences and potential risks of following unreliable guidance, especially when all information spread by influential users is perceived as equally credible (Morales et al., 2020). This adds to the worry and anxiety felt by many who are already in a difficult situation (Kleinberg et al., 2020). While much work has focused on identifying health-related misinformation, little attention has been paid to distinguishing the seriousness of misinformation (Fernandez and Alani, 2018). Severity varies greatly across messages: some might be jokes, some might discuss the impact of fake news or refute a claim, others might be highly malicious, and others might simply be inaccurate information with little impact and no resulting harm. The severity of each message can vary depending on its content, e.g., urging users to eat garlic is less severe than urging users to drink bleach. Several news articles related to COVID-19 are posted daily, and these capture weak signals of misinformation severity.
However, identifying the severity level of each misinformation story is a challenging task that has not been previously studied. To help reduce the impact of COVID-19 misinformation on users' potential health-related decisions, we study severity-aware misinformation detection. In contrast to previous works that treat misinformation as a binary classification task, we build a novel health risk assessment misinformation benchmark dataset, Covid-HeRA, that contains social media posts annotated on a finer scale, based on whether the message content a) is real news, b) is inaccurate or misinformation, or c) refutes/rebuts a specific claim or news article. In addition, posts labeled as misinformation are judged based on their potential to impact a user's health, assuming the individual might make decisions based on the advice and suggestions read in the post. In other words, our goal is to produce a label that reflects the level of risk in the presence of inaccurate claims and news, conditioned on the worst-case assumption that the user will follow the advice. We present a data analysis that reveals several key insights about the most prominent unreliable news and evaluate several baselines, as well as state-of-the-art models and variations. We hope to guide research on developing risk-aware misinformation deterrence algorithms. To facilitate future research, the Covid-HeRA data, the data analysis and the baseline models are open-sourced for public use¹.

Health-related misinformation research spans a broad range of disciplines including computer science, social science, journalism, psychology, and so on (Dhoju et al., 2019; Castelo et al., 2019; Fard and Lingeswaran, 2020). While health-related misinformation is only one facet of misinformation research, there has been much work analysing misinformation in different medical domains, such as cancer (Bal et al., 2020; Loeb et al., 2019), orthodontics (Kılınç and Sayar, 2019), sexually transmitted diseases and infections (Zimet et al., 2013; Joshi et al., 2018), autism (Baumer and McGee, 2019), influenza (Culotta, 2010; Signorini et al., 2011), and more recently COVID-19 (Garrett, 2020; Brennen et al., 2020; Cinelli et al., 2020; Cui and Lee, 2020).

Health Misinformation on Social Media
Web and social media data have been used to monitor influenza prevalence and awareness online (Smith et al., 2016; Ji et al., 2013; Huang et al., 2017). Systems such as Google Flu Trends use real-time signals, such as search queries, to detect influenza epidemics (Ginsberg et al., 2009; Preis and Moat, 2014; Santillana et al., 2014; Kandula and Shaman, 2019). However, relying solely on search queries leads to an overestimation of influenza, mainly because there is no distinction between general awareness about the flu and searches for treatment methods (Smith et al., 2016; Klembczyk et al., 2016). Our work focuses on social media, in particular health misinformation on micro-blogging sites such as Twitter. Tomeny et al. (2017) examined geographical and demographic trends in anti-vaccine tweets. They analyzed anti-vaccine tweets with respect to autism spectrum disorder and trained a classifier to predict a binary anti-vaccine label using features such as unigrams, bigrams, word occurrence counts, punctuation, and location. Our work goes beyond such binary classification, as our model further categorizes misinformation on a fine-grained severity scale.
Baumer and McGee (2019) apply topic modeling to an autism spectrum disorder (ASD) blogging community dataset with the goal of understanding the representation, delegation and authority of such a method. In a recent shared task on automatic classification of influenza (flu) vaccine behavior mentions in tweets (Weissenbacher et al., 2018), the top-performing system compared deep learning models, i.e., pre-trained language models with an LSTM classifier, against an ensemble of statistical classifiers with task-specific features, which resulted in comparable performance. An error analysis showed that vaccine hesitancy was conflated with vaccination behavior (Joshi et al., 2018). Huang et al. (2017) examine the geographic and demographic patterns of the flu vaccine in social media. Recent work has also focused on identifying users disseminating misinformation, in the case of cancer treatments (Ghenai and Mejova, 2018), as well as on hybrid approaches combining user-related features with content features (Ruchansky et al., 2017).

With COVID-19 misinformation threatening public health organizations, there have been several calls to action (Chung et al., 2020; Mian and Khan, 2020; Calisher et al., 2020) underscoring the gravity and impact of COVID-19 misinformation (Garrett, 2020). Tasnim et al. (2020) outline several potential strategies to ensure effective communication on COVID-19; among the recommendations is ensuring up-to-date, reliable information by identifying fake news and misinformation. Brennen et al. (2020) analyse the types, sources, and claims of COVID-19 misinformation and show that the majority appear on social media outlets. As the dialog on the pandemic evolves, so does the need for reliable and trustworthy information online (Cuan-Baltazar et al., 2020). Pennycook et al. (2020) show that people tend to believe false claims about COVID-19 and share false information when they do not think critically about the accuracy and veracity of the information. Kouzy et al. (2020) annotated about 600 messages containing hashtags about COVID-19 and discovered that about 25% of the messages contain some form of misinformation and about 17% contain some unverifiable information. Singh et al. (2020) provide a large-scale exploratory analysis of how myths and COVID-19 themes are propagated on Twitter by analysing how users share URL links. Cinelli et al. (2020) cluster word embeddings to identify topics and measure user engagement on several social media platforms. They provide a comparative study of information reproduction and estimate rumor amplification parameters for COVID-19 on these platforms.

The coronavirus pandemic has led to several measures enforced across the globe, from social distancing and shelter-in-place orders to budget cuts and travel bans (Nicola et al., 2020). In addition, news outlets circulate daily advice for the public, with suggestions to help prevent the spread and precautions to keep infection and mortality rates low. Some articles, however, contain fake remedies that reportedly cure or prevent COVID-19, promote false diagnostic procedures, report incorrect news about the virus' properties, or urge the audience to avoid specific foods or treatments, which might make symptoms worse or make the reader more likely to contract the virus². With such information overload, any decision-making procedure based on misinformation has a high likelihood of severely impacting one's health (Ingraham and Tignanelli, 2020).
Therefore, we aim to predict the severity of incorrect information released on social media, as well as to detect posts that refute or rebut unreliable claims and suggestions related to coronavirus misinformation. In the next sections, we first describe our data collection and annotation methodology and present statistics and examples from our dataset (Section 3). Subsequently, in Section 4 we present experimental results with several baseline and state-of-the-art machine learning models. Finally, we conclude with a discussion and possible future extensions (Section 5).

We introduce our data source and annotation strategy, and we present detailed statistics and a data analysis that shows key insights into the most prevalent harmful misinformation online. The goal of creating a new misinformation benchmark dataset is two-fold. First, we want to highlight the importance of understanding the impact of COVID-19 misinformation on health-related decision making and which behavioral aspects are affected by the digital spread of inaccurate, harmful advice. More importantly, we aim to flag unreliable posts based on the potential risk and severity of their statements, so that users stay informed about the consequences of incorrect health advice when making decisions. Thus, we seek to annotate misinformation on a finer scale, based on whether it has the potential to guide the audience towards health-related decisions or behavioral changes with a high risk factor, i.e., a high likelihood of severely impacting one's health. To this end, we frame the task as a multi-class classification problem, where each social media post is categorized as:
a) Real News/Claims, i.e., reliable correct information;
b) Refutes/Rebuts, i.e., refutation or rebuttal of an incorrect statement;
c) Not severe, i.e., misinformation, but unlikely to result in risky behavioral changes or harmful decisions;
d) Possibly severe, i.e., misinformation with possible severe health-related impact; and
e) Highly severe, i.e., misinformation with increased potential risks for any individual following the advice and suggestions expressed in the post content.
These categories enable researchers to study the impact of coronavirus health misinformation at a finer granularity, to develop algorithms that caution the audience about potential risks, and to design systems that present unbiased information, i.e., both the original, potentially unreliable, post and any rebutting claims expressed online. In Table 1, we present example posts for each category and further describe our annotation process in the following paragraphs.

Table 1 (partially recovered): example posts and severity rationales.
- Not severe: [example tweet not recovered] This is misinformation, but behavioral changes are less likely to occur.
- Possibly severe: "Vitamin C Protects Against Coronavirus". Although an individual may decide to take daily doses of vitamin C, it is unlikely to be harmful and potential risks are less significant than for other actionable items.
- Highly severe: "Good News: Coronavirus Destroyed By Chlorine Dioxide _ Kerri Rivera". These tweets either promote specific behavioral changes and fake remedies with increased health risks, or may result in increased exposure for certain socioeconomic groups.
- Refutes/Rebuts: [example tweet not recovered] These types of posts are useful in identifying fake claims, as well as presenting opposing views.

We make use of CoAID, a large-scale healthcare misinformation data collection related to COVID-19, with binary ground-truth labels for news articles and claims, accompanied by associated tweets and user replies (Cui and Lee, 2020).
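As a concrete starting point, the snippet below sketches how the CoAID tweets could be pulled in and the misinformation-labeled subset prepared for severity re-annotation. The file name, column names, and the alternating split between two annotators are illustrative assumptions, not the actual CoAID release layout or the exact annotation pipeline.

```python
# Minimal sketch (assumed file/column names) of preparing CoAID tweets for
# severity re-annotation: keep the misinformation-labeled tweets, shuffle them,
# and distribute them between two annotators.
import pandas as pd

# The five Covid-HeRA categories annotators choose from.
SEVERITY_LABELS = [
    "Real News/Claims",
    "Refutes/Rebuts",
    "Not severe",
    "Possibly severe",
    "Highly severe",
]

# Hypothetical export of CoAID tweets with binary real/fake labels.
tweets = pd.read_csv("coaid_tweets.csv")  # assumed columns: tweet_id, text, label
fake = tweets[tweets["label"] == "fake"].sample(frac=1.0, random_state=42)

# Alternate rows go to annotator A and annotator B.
fake.iloc[0::2].to_csv("batch_annotator_a.csv", index=False)
fake.iloc[1::2].to_csv("batch_annotator_b.csv", index=False)
```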
This dataset provides us with a large amount of reliable Twitter data and alleviates the need for labeling tweets as real or fake. Furthermore, it has the potential to be updated automatically with additional instances, enabling semi-supervised models on Covid-HeRA as future work. To obtain annotations based on our severity categorization, all tweets labeled in CoAID as misinformation are shuffled and distributed to two different annotators. Each annotator is asked to judge whether any decisions or other actionable items can be taken based on the expressed content, and whether those could result in harmful choices, risky behavioral changes or other severe health impacts. Additionally, we asked annotators to flag any post that expresses an opinion or argument against unreliable claims, i.e., refutes or rebuts misinformation (see Figure 1 for a screenshot of our annotation interface). To assess agreement levels, an external validator was asked to annotate a random sample of the labeled tweets. The kappa score between the annotators and the validator was 0.7037, which indicates good agreement on the task (Hunt, 1986). A final round was introduced as an additional step to resolve conflicts on ambiguous instances. The total number of tweets labeled per category, along with the number of unique words, is presented in Table 2.

We first identify the most frequent discriminative terms per category, i.e., terms that appear very often in a category but are infrequent in the remaining categories. We use a document frequency threshold of 0.5% to discard terms that are very common across the whole data collection, for example "COVID19" or "virus", which appear in more than half of the tweets. In Figure 2, we visualize the top-30 terms per category, with each term weighted by its representativeness. Comparing the "Not severe" category with the more severe categories, we see that many of its terms refer to conspiracies about COVID-19, such as "artificially", "labmade" and "bioweapon", which pertain to the conspiracy that COVID-19 is a man-made virus. The top terms for the "Highly severe" category seem to be about treatments and include riskier words such as "risk", "mask", "cure", "vaccines", and "hydroxychloroquine". The top terms for "Refutes/Rebuts" include "myths", "weaponized", "lying", and "antibiotics", as the messages in this category address and debunk conspiracies and misinformation, while the top terms for "Real News/Claims" are "resources", "symptoms", "testing", and "guidance", as these messages are generally informative and provide advice about COVID-19.

We visualize the compactness of each category in Figure 3. We use pre-trained BERT embeddings (Devlin et al., 2018) to measure how close each tweet is to the centroid of its corresponding category, based on its document vector. We hypothesize that compact categories are more likely to be well-formed and thus easier to classify. We compute the skewness and kurtosis of each distance distribution. Each distribution shows positive skewness, i.e., they are right-tailed distributions. The "Possibly severe" category is the most skewed and the "Highly severe" category is the least. The negative kurtosis of both the "Highly severe" and "Refutes/Rebuts" categories shows that these categories have less of a peak and appear more uniform, which is also evident from the flatness of these curves.
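A minimal sketch of this compactness analysis is given below, assuming mean-pooled bert-base-uncased token embeddings as document vectors; the exact model variant and pooling strategy are assumptions, since only the use of pre-trained BERT embeddings is stated above.

```python
# Embed each tweet, measure its distance to the centroid of its category,
# and summarize the distance distribution with skewness and kurtosis.
import numpy as np
import torch
from scipy.stats import skew, kurtosis
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    """Mean-pooled BERT token embeddings as document vectors."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state        # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)          # (B, T, 1)
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy() # (B, H)

def compactness(texts_by_category):
    """texts_by_category: dict mapping category name -> list of tweet texts."""
    for category, texts in texts_by_category.items():
        vectors = embed(texts)
        centroid = vectors.mean(axis=0)
        distances = np.linalg.norm(vectors - centroid, axis=1)
        print(category, skew(distances), kurtosis(distances))
```

For the full dataset the embedding step would be batched, but the per-category computation is otherwise the same.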
The flatter shape of these two distributions may be due to the broader range of topics covered in these categories compared to the rest.

In Figure 4, we analyze the top-10 most frequent hashtags per category, after removing common hashtags such as "#covid_19", "#coronavirus", etc. The length of each bar indicates how frequently the hashtag appears. We find that the "Not severe" category follows a similar pattern to Figure 2, in that the top hashtags pertain more to rumors and conspiracies, such as the claim that the "#pope" tested positive for COVID-19 or that COVID-19 is a "#bioweapon". This may be attributed to the fact that those susceptible to misinformation are less likely to think critically about news sources and thus tend to believe more false claims (Pennycook et al., 2020). Both severe categories focus on remedies, e.g., "#vitaminc", and on vaccination. Interestingly, the top "Refutes/Rebuts" hashtags include terms associated with computation, such as "#dataviz" and "#tableau", as well as a hashtag promoting social distancing, "#stayhome". These hashtags may reflect the many infographics and data visualizations shared on social media, often used as arguments against misinformation. Finally, we present the most common claims and news per category, and their corresponding frequencies, in Table 3.

To showcase the open challenges of risk-based labeling of misinformation, we perform experiments with several baselines and state-of-the-art multi-class classification models. We pre-process tweets to filter out reserved tokens such as RT or retweet, URLs, and mentions. We then split the data into 80% training and 20% testing, keeping the same splits across all models for fair comparison. The algorithms we experiment with are the following:
- Random Forests with bag-of-words (RF-TFIDF) or 100-dimensional pre-trained GloVe embeddings (RF-Glove) as the text representation.
- Support Vector Machine with bag-of-words (SVM-TFIDF) or 100-dimensional pre-trained GloVe embeddings (SVM-Glove).
- Logistic Regression with bag-of-words (LR-TFIDF) or 100-dimensional pre-trained GloVe embeddings (LR-Glove), same as for SVM and RF.
- Bi-directional LSTM (Schuster and Paliwal, 1997) with 100-dimensional pre-trained GloVe embeddings as the initial representation (LSTM).
- Multichannel convolutional neural network with multiple kernel sizes and 100-dimensional pre-trained GloVe embeddings as the initial representation, similar to Kim (2014) (CNN).
- Task-specific BERT fine-tuned on our downstream text classification task, initialized with general-purpose BERT embeddings (Devlin et al., 2018).
- Hierarchical Attention Networks (HAN) with word- and sentence-level attention (Yang et al., 2016).
We additionally test whether incorporating additional sources of information, such as user replies or news articles with related content, can improve predictive performance. To this end, we train dEFEND (Shu et al., 2019), a state-of-the-art fake news detection model that builds upon a Hierarchical Attention Network (Yang et al., 2016) by adding co-attention between two textual sources; in our case, either tweet replies or the corresponding news content.

Our evaluation metrics are accuracy, precision, recall, and F1 score. We train all models with cross-entropy loss, the Adam optimizer, and a minibatch size of 50. In Table 4, we report the average score of three independent trials, i.e., we run each model three times with different seeds. In terms of F1 score, the CNN and LSTM models perform slightly better than the simpler baselines (a minimal sketch of one of the simpler pipelines is shown below).
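As a reference point, here is a minimal sketch of the LR-TFIDF baseline with the pre-processing and 80/20 split described above; the tokenization details and hyperparameters are assumptions rather than the exact settings behind the reported numbers.

```python
# TF-IDF + logistic regression baseline on pre-processed tweets.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

def clean(tweet):
    """Strip retweet markers, URLs and user mentions."""
    tweet = re.sub(r"\bRT\b|\bretweet\b", " ", tweet, flags=re.IGNORECASE)
    tweet = re.sub(r"https?://\S+", " ", tweet)
    tweet = re.sub(r"@\w+", " ", tweet)
    return tweet.strip()

def run_lr_tfidf(texts, labels):
    texts = [clean(t) for t in texts]
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.2, random_state=42, stratify=labels)
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    print(classification_report(y_test, model.predict(X_test), digits=3))
```

Fixing the random seed of the split lets the other classifiers be evaluated on exactly the same train/test partition, which is what makes the comparison fair.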
Surprisingly, incorporating user engagement features, news content, or contextualized pre-trained embeddings did not help³. We note, however, that BERT and dEFEND (co-attending on news content) achieve higher recall, suggesting that ensemble models could further improve performance.

³ We also performed experiments with COVID-Twitter-BERT, a Transformer model pre-trained on 22.5M COVID-19-related Twitter messages (Müller et al., 2020), unfortunately with much lower performance than general-purpose BERT. We leave further analysis of why embeddings fine-tuned on COVID-19 posts were not helpful as future work.

One of the main challenges of health-related misinformation with finer granularity is that some categories might be substantially underrepresented, i.e., risk assessment of misinformation creates a highly imbalanced problem, especially for some topics, which is even more difficult to tackle than misinformation detection. To test this hypothesis, we perform the same experiments in a common binary classification setting. More specifically, we discard all refutation and rebuttal tweets and collapse all tweets labeled as misinformation into a single label, irrespective of severity, essentially reverting to a traditional real vs. fake framework. We evaluate the same set of algorithms and present the results in Table 5. Compared to the fine-grained labels of Covid-HeRA, the binary classification task produces higher performance across all evaluation metrics, highlighting the challenges of our finer categorization setting. Despite being an important task, with key goals such as distinguishing between harmful medical advice on social media and refuted claims, health misinformation risk assessment presents many challenges. Based on the per-class evaluation, we note another challenge, apart from the imbalance discussed above. In Figure 5, we present the confusion matrix for the best-performing model on the severity multi-class classification task (CNN). Tweets labeled as "Not severe" and "Possibly severe" are more likely to be predicted as "Real News/Claims", probably because the true latent semantics of these categories are similar. Further research on handling imbalance as well as integrating auxiliary signals is required. We conclude with future directions in the next section.

In this work, we release Covid-HeRA, a new benchmark dataset for risk-aware health misinformation detection on COVID-19-related social media posts. We describe our data collection and conduct a thorough data analysis and extensive experiments with baseline methods and state-of-the-art text classification and misinformation models. Our experimental results demonstrate the usefulness and the challenges of finer-grained multi-class classification in healthcare misinformation detection. We hope Covid-HeRA will enable researchers to design advanced models for risk scoring of misinformation spread and to develop systems that inform the social media audience about the dangers of following unreliable advice from inaccurate sources. There are several possible future directions. First and foremost, we hope to take into account the substantial imbalance of misinformation compared to the overall number of tweets online. By leveraging advancements in relevant research, we can build custom loss functions or data sampling methods to mitigate the challenges of sparsity in underrepresented categories.
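One concrete example of such a custom loss is inverse-frequency class weighting plugged into the cross-entropy objective of the neural baselines; the weighting scheme below is an illustrative choice, not the one used in the experiments reported above.

```python
# Cross-entropy with weights inversely proportional to class frequency,
# so rare severity categories contribute more to the loss.
from collections import Counter
import torch
import torch.nn as nn

def weighted_ce_loss(train_labels, num_classes):
    """train_labels: iterable of integer class ids in [0, num_classes)."""
    counts = Counter(train_labels)
    total = sum(counts.values())
    weights = torch.tensor(
        [total / (num_classes * counts.get(c, 1)) for c in range(num_classes)],
        dtype=torch.float)
    return nn.CrossEntropyLoss(weight=weights)

# Usage: criterion = weighted_ce_loss(y_train_ids, num_classes=5)
#        loss = criterion(logits, targets)
```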
Additionally, to alleviate the need for large training sets, future research could focus on exploring weak supervision signals and semi-supervised and self-supervised algorithms. Few-shot models could also handle distribution shift and novel classes with fewer examples, e.g., when adding a scale or category to our annotation set. Finally, the task of identifying rebuttal and refutation posts, which present arguments against misinformation spread, is something we aim to tackle in future research, exploring additional linguistic signals and auxiliary tasks, e.g., applying controversy detection algorithms (Lourentzou et al., 2015).

References (titles only)
Trends in the diffusion of misinformation on social media
Analysing the Extent of Misinformation in Cancer Related Tweets
Speaking on Behalf of: Representation, Delegation, and Authority in Computational Text Analysis
See something, say something: Correction of global health misinformation on social media
Types, sources, and claims of COVID-19 misinformation
Statement in support of the scientists, public health professionals, and medical professionals of China combatting COVID-19
A topic-agnostic approach for identifying fake news pages
CT imaging features of 2019 novel coronavirus (2019-nCoV)
The covid-19 social media infodemic
Misinformation of COVID-19 on the Internet: Infodemiology study
CoAID: COVID-19 Healthcare Misinformation Dataset
Towards detecting influenza epidemics by analyzing Twitter messages
BERT: Pre-training of deep bidirectional transformers for language understanding
Differences in health news from reliable and unreliable media
Misinformation Battle Revisited: Counter Strategies from Clinics to Artificial Intelligence
Online misinformation: Challenges and future directions
Covid-19: the medium is the message
Fake cures: user-centric modeling of health misinformation in social media
Detecting influenza epidemics using search engine query data
Examining patterns of influenza vaccination in social media
Percent agreement, Pearson's correlation, and kappa as measures of inter-examiner reliability
Fact versus science fiction: fighting coronavirus disease 2019 requires the wisdom to know the difference
Monitoring public health concerns using twitter sentiment classifications
News verification by exploiting conflicting social viewpoints in microblogs
Shot or not: Comparison of NLP approaches for vaccination behaviour detection
Reappraising the utility of Google Flu Trends
Assessment of reliability of youtube videos on orthodontics
Convolutional Neural Networks for Sentence Classification
Measuring emotions in the covid-19 real world worry dataset
Google flu trends spatial variability validated against emergency department influenza-related visits
Coronavirus goes viral: Quantifying the covid-19 misinformation epidemic on twitter
Dissemination of misinformative and biased information about prostate cancer on youtube
Hotspots of news articles: Joint mining of news text & social media to discover controversial points in news
GCAN: Graph-aware co-attention networks for explainable fake news detection on social media
Coronavirus: the spread of misinformation
CrowdQM: Learning aspect-level user reliability and comment trustworthiness in discussion forums
COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter
Ahmed Al-Jabir, Christos Iosifidis, Maliha Agha, and Riaz Agha. 2020. The socio-economic implications of the coronavirus and COVID-19 pandemic: a review
Fighting COVID-19 misinformation on social media: Experimental evidence for a scalable accuracy nudge intervention
Adaptive nowcasting of influenza outbreaks using Google searches
CSI: A hybrid deep model for fake news detection
What can digital disease detection learn from (an external revision to) Google Flu Trends? American Journal of Preventive Medicine
Bidirectional recurrent neural networks
The diffusion of misinformation on social media: Temporal pattern, message, and source. Computers in Human Behavior
dEFEND: Explainable fake news detection
The use of Twitter to track levels of disease activity and public concern in the US during the influenza A H1N1 pandemic
Emily Vraga, and Yanchen Wang. 2020. A first look at COVID-19 information and misinformation sharing on Twitter
Towards real-time measurement of public epidemic awareness: Monitoring influenza awareness through Twitter
Impact of rumors or misinformation on coronavirus disease (COVID-19) in social media
Geographic and demographic correlates of autism-related anti-vaccine beliefs on Twitter
Bin Zhong, Qiang Deng, and Jing Gao. 2020. Weak supervision for fake news detection via reinforcement learning
Overview of the third social media mining for health (SMM4H) shared tasks at EMNLP
Hierarchical attention networks for document classification
Beliefs, behaviors and HPV vaccine: correcting the myths and the misinformation