key: cord-0127063-goww7m5o authors: Toraman, Çağrı; Şahinuç, Furkan; Yılmaz, Eyüp Halit title: What Happened in Social Media during the 2020 BLM Movement? An Analysis of Deleted and Suspended Users in Twitter date: 2021-09-30 journal: nan DOI: nan sha: 80294818939a27248461ad28a7123c1aac6ea857 doc_id: 127063 cord_uid: goww7m5o

After George Floyd's death in May 2020, the volume of discussion in social media increased dramatically. A series of protests followed this tragic event, called the 2020 BlackLivesMatter (BLM) movement. People participated in the discussion for several reasons, including protest and advocacy, as well as spamming and misinformation spread. Eventually, many user accounts were deleted by their owners or suspended for violating the rules of social media platforms. In this study, we analyze what happened in Twitter before and after the triggering event. We create a novel dataset that includes approximately 500k users sharing 20m tweets, half of whom actively participated in the 2020 BLM discussion, but some of whom were deleted or suspended later. We have the following research questions: What differences exist (i) between the users who did and did not participate in the BLM discussion, and (ii) between old and new users who participated in the discussion? And (iii) why are users deleted and suspended? To find answers, we conduct several experiments supported by statistical tests, including lexical analysis, spamming, negative language, hate speech, and misinformation spread.

Social media is a tool for people discussing and sharing opinions, as well as getting information and news. A recent study shows that 71% of Twitter users get news from this platform [27], which emphasizes the threat of fake news [16] and misinformation spread [12, 34]. After George Floyd's death in May 2020, the volume of discussion in social media increased dramatically. A series of protests followed this tragic event, called the 2020 BlackLivesMatter (BLM) movement. People participated in the discussion for several reasons. Some users advocated against racism and police violence. Some tried to take advantage of the popularity of the event through spamming and misinformation spread. Chaos emerged due to many users participating in the discussion with different motivations.

(Fig. 1: Number of new users in Twitter. May 25, 2020 is the day of the event that triggers the 2020 BLM movement. Act/Del/Sus-Old/New refers to active, deleted, and suspended users who participated in the BLM discussion and joined Twitter before and after May 25, respectively. The control group is the set of users who do not share any BLM-related tweet.)

Eventually, particular accounts are deleted by users or suspended for violating the rules of social media platforms. We analyze the number of new users in Twitter before and after the death of George Floyd in Figure 1, using a novel tweet dataset that we describe in the next section. We notice that the number of new users increased dramatically after the event; however, most of them were either deleted or suspended later. Meanwhile, there is a strong decrease for control users, who do not share any BLM-related tweet. These observations direct us to analyze what happened in social media before and after the triggering event, in terms of deleted and suspended users. We thereby examine eight user types, i.e. active, deleted, suspended, and control users who joined the discussion before and after the event (old and new users), as given in Table 1.
Our study has three research questions: RQ1: What differences exist between the users who did and did not participate in the BLM discussion in Twitter? RQ2: What differences exist between the users who joined the discussion before and after the start of the 2020 BLM movement? RQ3: What are the possible reasons for user deletion and suspension related to the 2020 BLM movement?

We analyze deleted and suspended users in terms of five factors (i.e. lexical analysis, spamming, sentiment analysis, hate speech, and misinformation spread). First, we analyze raw text to get basic insights on lexical text features. Second, there are social spammers [22] who mostly promote irrelevant content, such as abusing hashtags for advertisements and promoting irrelevant URLs. Third, some users have a tendency toward deletion, possibly due to regret over sharing negative sentiments [2, 32]. Fourth, there are haters and trolls who use offensive language and hate speech, mostly against people who have a different demographic background [6, 20]. Lastly, social media is an important tool for the circulation of fake news, as observed during the 2016 US Election [16], and for dis/misinformation spread [39]; for instance, Twitter revealed user accounts connected to a propaganda effort by a Russian government-linked organization during the 2016 US Election [34]. We refer to misinformation as a term including all types of false information (i.e. dis/misinformation, conspiracy, hoax, etc.). Spamming, hate speech, and misinformation spread are considered undesirable and untrustworthy behavior in the remainder of this study.

The contributions of our study are that we (i) create a novel dataset that includes over 500k users sharing approximately 20m tweets, half of whom actively participated in the 2020 BLM discussion, but some of whom were deleted or suspended later; (ii) analyze the 2020 BLM movement on social media with a detailed focus on deleted and suspended users, through a wide perspective in terms of lexical analysis, spamming, sentiment analysis, hate speech, and misinformation spread, with experimental results supported by statistical tests; and (iii) raise awareness for social media platforms and related organizations to pay special attention to new users who join after important events occur; we emphasize the presence of such users for undesirable and untrustworthy behavior.

The structure of this paper is as follows. In the next section, we provide a brief summary of related studies. In Section 3, we explain the details of our dataset. We then report our experimental design and results in Section 4. Lastly, we discuss the main outcomes in Section 5, and then conclude the study.

Spamming behavior promotes undesirable and inappropriate content. Spammers are widely studied in online social networks, including social spammers [22, 8], clickbait [28], spam URLs [9], and social bots [14]. In this study, we consider two types of spammers: those who share similar content multiple times (i.e. social spammers and bots), and those who promote irrelevant content (i.e. spamming URLs and clickbait).

Attacking other people and communities in social media is an important problem that takes different forms, including cyberbullying [1] and hate speech [6, 20, 3, 15]. Deep learning models outperform traditional lexicon-based methods and machine learning models with hand-crafted features in detecting hate speech automatically [10, 11]. In this study, we rely on a recent Transformer language model, RoBERTa [23], which shows state-of-the-art performance for hate speech detection [40].
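As an illustration of this model family, the sketch below scores tweets with a fine-tuned RoBERTa classifier through the Hugging Face transformers pipeline. The public TweetEval checkpoint cardiffnlp/twitter-roberta-base-hate is an assumption standing in for the paper's own fine-tuned models, and the example tweets are invented.

# Minimal sketch: classifying tweets with a fine-tuned RoBERTa model.
# The checkpoint below is an assumption (a public TweetEval hate-speech
# model); the paper fine-tunes RoBERTa for its own tasks [25, 5].
from transformers import pipeline

classifier = pipeline("text-classification",
                      model="cardiffnlp/twitter-roberta-base-hate")

tweets = [
    "Protesters gathered peacefully downtown today.",
    "Everyone deserves equal treatment under the law.",
]
for tweet, prediction in zip(tweets, classifier(tweets)):
    print(f"{prediction['label']} ({prediction['score']:.2f}): {tweet}")

The same pattern applies to the sentiment (polarity) classifier used later in the experiments; only the checkpoint changes.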
Misinformation is an umbrella term that includes all false information in social media [39]. Identification and verification of online information, i.e. fact-checking and rumor detection, is a recent research area [24, 26]. Fighting against fake news is another dimension of misinformation spread [17]. To the best of our knowledge, there is no misinformation study on the case of the 2020 BLM event; we thereby identify misinformation by using a set of fake and rumor topics.

In terms of untrustworthy behavior, there are also efforts to find trustworthy or credible users [37]. In this study, rather than building supervised models to find trustworthy users, we aim to understand the differences between normal and deleted/suspended users in terms of untrustworthy behaviors such as hate speech and misinformation spread.

The factors behind social media suspension have been analyzed in cases other than the 2020 BlackLivesMatter movement [33, 21, 12, 13]. Similarly, there are studies on the reasons why users delete their social media accounts [2, 41, 7]. Instead of understanding factors and reasons, the impact of suspended users has been analyzed when they are removed from online social networks [38]. Supervised prediction models have been proposed for deleted [41], suspended [36, 31], and even troll accounts [19]. However, users who join social media before and after the start of a social movement have not been studied in terms of active, deleted, and suspended user accounts. We fill this gap by providing a comprehensive analysis on a novel dataset regarding the 2020 BLM movement. Our findings can provide a basis for understanding and filtering undesirable and untrustworthy behavior in social media after critical events occur. A study with a similar objective aims to understand users with mental disorders in social media [29]. This is also the first study that examines the dynamics of the 2020 BLM movement to this extent.

We created our dataset in two steps. First, we collected all tweets from the Twitter API's public stream between April 07 and June 15, 2020. From this collection, we identified the set of users who posted tweets about the 2020 BLM movement by using a list of 66 BLM-related keywords, such as #BlackLivesMatter and #BLM. We did not omit tweets in languages other than English as long as they contained the keywords; e.g. #BlackLivesMatter is a global hashtag. In December 2020, we checked the status of the user accounts. Some of the accounts maintained their activity, some were deleted, and the remaining ones were suspended by Twitter. In the second step, we expanded the tweet collection by fetching the archived tweets of all users from the Wayback Machine. To obtain better insight into user activity before and after the start of the 2020 BLM movement, we collected the archived tweets posted two months before and after the event. Our dataset includes 500,234 users with 19,861,390 tweets; 250,242 users participated in the BLM discussion at least once, while the remaining users did not participate in the BLM discussion at all (i.e. the control group). We publish the BLM keywords and the dataset with user and tweet IDs, considering the Twitter API's Agreement and Policy. We acknowledge that deleted and suspended users cannot be retrieved from the Twitter API; however, we share user and tweet IDs to provide transparency and reliability for our study, as well as the possibility of tracking them down via web archive platforms.
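To illustrate the first collection step, here is a minimal sketch of the keyword-based selection of BLM-related users, assuming the collected stream is available as (user_id, text) pairs. Only a few of the 66 keywords are shown, and the function names are our own.

# Minimal sketch of the keyword-based user selection. Only an excerpt
# of the 66 BLM-related keywords is shown here.
BLM_KEYWORDS = {"#blacklivesmatter", "#blm", "george floyd"}

def is_blm_related(text: str) -> bool:
    # Case-insensitive match; non-English tweets are kept as long as
    # they contain a keyword (e.g. the global #BlackLivesMatter tag).
    lowered = text.lower()
    return any(keyword in lowered for keyword in BLM_KEYWORDS)

def blm_users(stream):
    # stream: iterable of (user_id, text) pairs from the public stream.
    users = set()
    for user_id, text in stream:
        if is_blm_related(text):
            users.add(user_id)
    return users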
We give the distribution of users and tweets in Table 2, along with the average number of tweets per user, and the average numbers of hashtags and URLs per tweet. To provide a fair analysis, we collect half of the tweets for Control (non-BLM), whereas the other half is distributed as equally as possible among active, deleted, and suspended users. The average number of tweets per user is 39.70 overall. The average number of hashtags shared by new accounts is higher than that of old accounts. Similarly, new accounts share more URLs than old ones, considering only deleted and suspended users.

In Figure 2, we plot the number of tweets shared per day for each user type. There is a substantial increase in the number of tweets shared by non-control groups after May 25, as expected. We show the distribution of tweets for each user type according to the five most observed languages in Figure 3. The vast majority of the tweets are in English, followed by Spanish, Portuguese, French, and Indonesian. Thai and Japanese are observed for active users as well. The ratio of English is higher for deleted and suspended new users, compared to the others.

We design the following experiments to answer our RQs.

- Hashtags and URLs: We report the most frequent hashtags and URL domains shared in the tweets to observe any abnormal activity. We fact-check the most frequent URL domains in terms of false information by using PolitiFact.
- Lexical Analysis: To analyze raw text features, we report the total number of tweets, as well as tweet length (number of words) and average word length, by using empirical complementary cumulative distribution functions (CCDFs).
- Spamming: Spam behavior is mostly observed when users share duplicate or similar content multiple times [18], and when they exploit popular hashtags to promote irrelevant content, i.e. hashtag hijacking [35]. We detect both issues to find spam tweets. To detect the former, we find near-duplicate tweets by measuring higher than 95% text similarity between tweets, using the cosine similarity with TF-IDF term weighting [30] (see the sketch after this list). To detect the latter, we group popular hashtags into topics, and label a tweet as spam if it contains hashtags from different sets. The topics are the BLM movement, the COVID-19 pandemic, Korean music, Bollywood, Games&Tech, and U.S. politics.
- Sentiment Analysis: Users can express their opinions in either a negative or positive way. We apply RoBERTa [23], fine-tuned for the task of sentiment analysis in terms of polarity classification (positive-negative-neutral) [5], to the tweet contents. We filter out non-English tweets and tweets with fewer than three tokens, excluding hashtags, links, and user mentions. Duplicate tweets due to retweets are removed as well. Preprocessing retains approximately 60% of the tweets for BLM-related users, and 18% for non-BLM users.
- Hate Speech: During the discussion of important events, some users can behave aggressively and even use hate speech towards other individuals or groups of people. We apply RoBERTa [23], fine-tuned for the task of detecting hate speech and offensive language [25], to the tweet contents. Text filtering and cleaning are applied as in sentiment analysis.
- Misinformation Spread: Misinformation spread during the 2020 BLM movement can be examined under a number of topics. Since hashtags are one of the important instruments of misinformation campaigns [4], we find tweets containing a list of hashtags and keywords regarding five misinformation topics: i. Blackout: authorities blocked protesters from communicating via a blackout in Washington, DC. ii. Arrest: a Black teenager was violently arrested by US police. iii. Soros: George Soros funded the protests. iv. ANTIFA: Antifa organized the protests. v. Pederson: St. Paul police officer Jacob Pederson was the provocateur who helped start the looting.
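The following is a minimal sketch of the near-duplicate check from the Spamming experiment, using scikit-learn's TF-IDF vectorizer and cosine similarity. The 95% threshold follows the text above, while the function name and toy tweets are our own; the hashtag-hijacking check is omitted.

# Minimal sketch: flagging near-duplicate tweets via cosine similarity
# with TF-IDF term weighting [30]; pairs above the 0.95 threshold are
# treated as spam candidates.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def near_duplicate_pairs(tweets, threshold=0.95):
    tfidf = TfidfVectorizer().fit_transform(tweets)
    similarities = cosine_similarity(tfidf)
    return [(i, j)
            for i in range(len(tweets))
            for j in range(i + 1, len(tweets))
            if similarities[i, j] > threshold]

tweets = [
    "Huge discount today, click now! #BlackLivesMatter",
    "Huge discount today, click now! #BlackLivesMatter",
    "Justice for George Floyd.",
]
print(near_duplicate_pairs(tweets))  # [(0, 1)]: the repeated promotion

The all-pairs comparison is quadratic in the number of tweets; at the scale of this dataset one would shard by user or time window first, but the similarity computation itself is the same.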
For each user type $t \in T$, where $T = \{\text{Act}, \text{Del}, \text{Sus}, \text{Cont}\} \times \{\text{Old}, \text{New}\}$, we split the tweet set into $k$ independent subsets. In each subset, we measure the factor ratio, the proportion of tweets belonging to a factor, as follows:

$r_{t,f} = n_{t,f} / n_{t}$, where $f \in \{\text{spam}, \text{negative}, \text{hate}, \text{misinfo}\}$,

$n_{t,f}$ is the number of tweets assigned to the factor $f$ in the user type $t$, and $n_t$ is the number of all tweets in the user type $t$. We set $k$ to 100 for the experiments. We determine statistically significant differences between the means of the 100-observation subsets ($\mu_{t,f}$), which follow non-normal distributions, by using the two-sided Mann-Whitney U (MWU) test at the 95% confidence level with Bonferroni correction. To scale the results in the plots, we report the normalized factor ratio for each user type $t$.
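A compact sketch of this procedure is given below, using SciPy's Mann-Whitney U test. The max-based scaling in normalized_ratio is only an assumption for illustration, since the exact normalization formula is not reproduced here; variable and function names are our own.

# Minimal sketch: factor ratios over k subsets, MWU significance
# testing with Bonferroni correction, and an *assumed* normalization.
import numpy as np
from scipy.stats import mannwhitneyu

def factor_ratios(is_factor, k=100, seed=0):
    # is_factor: one boolean per tweet of a user type, True if the
    # tweet is assigned to the factor (spam, negative, hate, misinfo).
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(is_factor)
    # Split into k independent subsets; each subset yields one ratio.
    return np.array([subset.mean()
                     for subset in np.array_split(shuffled, k)])

def significantly_different(ratios_a, ratios_b, n_tests, alpha=0.05):
    # Two-sided MWU test; Bonferroni correction divides alpha by the
    # number of pairwise comparisons (e.g. 0.05/6 gives p < 0.008,
    # and 0.05/4 gives p < 0.0125, matching the thresholds reported).
    _, p_value = mannwhitneyu(ratios_a, ratios_b, alternative="two-sided")
    return p_value < alpha / n_tests

def normalized_ratio(mean_ratios):
    # Assumed scaling for plotting: divide each user type's mean ratio
    # by the maximum across user types (not necessarily the paper's
    # exact normalization).
    return mean_ratios / mean_ratios.max()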
In this section, we report our experimental results to answer the RQs. Due to limited space, we publish the details of the experimental design, results, and statistical tests online.

We report the most frequent hashtags shared in the tweets in Figure 4. #BlackLivesMatter is the most frequent hashtag for each user type, except control users, as expected. Control users (non-BLM) share hashtags from various topics, with no single dominant hashtag (RQ1). Old users share not only BLM-related hashtags but also hashtags from other topics popular during the same time period, such as #COVID19 and #ObamaGate, while new users share mostly BLM-related hashtags (RQ2). New users actively participate in the discussion, since the ratio of hashtags in replies is higher compared to old users, as observed in the bar chart. For old and suspended users, we observe more right-wing political hashtags, such as #WWG1WGA and #ObamaGate, in the bar chart. Suspended users may be more oriented toward misinformation campaigns, compared to active and deleted users (RQ3). The ratio of hashtags in quotes is higher for them as well.

We report the most frequent URL domains in Table 3. Similar to hashtags, we observe mostly right-wing political domains for suspended users. A right-wing political domain, foxnews.com, is shared more by the deleted and suspended users, compared to the active users, who share nytimes.com, washingtonpost.com, and cnn.com. Among deleted users, change.org is consistently observed.

Lexical Analysis Figure 5 shows the empirical CCDF of two lexical features: tweet length (number of words) and average word length (number of characters) per tweet. The means and standard deviations of the tweet and word length distributions are given in Table 4. When the users who participated and did not participate in the BLM discussion are compared (RQ1), BLM users share their thoughts with longer tweets but shorter words, compared to the non-BLM control group (KS-test, Bonferroni adjusted p<0.008). When old and new users are compared pairwise (RQ2), new users have shorter tweets than old users, except for deleted and suspended users (KS-test, Bonferroni adjusted p<0.0125). When user types are compared pairwise (RQ3), active users have longer tweets than deleted and suspended users, but only among old users (KS-test, Bonferroni adjusted p<0.008).

(Fig. 6: Normalized factor ratio for spamming, negative sentiment, hate speech, and misinformation spread of old and new users; best viewed in color. Misinformation is not available for Control, which has non-BLM users.)

We report the normalized factor ratio in terms of spamming, negative sentiment, hate speech, and misinformation in Figure 6. Note that misinformation scores are not available for the Control group, since the topics are about BLM and the Control group has no BLM-related tweets. When BLM and non-BLM users are compared (RQ1), BLM users have more spam, more negative, and more hate speech tweets than non-BLM (control) users (MWU-test, Bonferroni adjusted p<0.008). However, this observation does not hold for the spamming behavior of new users. When old and new users are compared (RQ2), new users have more negative tweets, more hate speech tweets, and more tweets on misinformation topics, compared to old users. These differences are statistically significant (MWU-test, Bonferroni adjusted p<0.0125), except for deleted users in hate speech and misinformation. This observation supports that new users (specifically suspended ones) are more inclined to negative sentiment, hate speech, and misinformation during important social movements, such as BLM, compared to old users. Nevertheless, there is no statistically significant difference between old and new users in the hate speech and misinformation behavior of deleted users, or in the spamming behavior of suspended users. The number of spam tweets decreases for active and deleted users. When user types are compared (RQ3), suspended accounts share more spam, more negative, more hate speech tweets, and more tweets on misinformation topics, compared to active and deleted users, specifically among new users (MWU-test, Bonferroni adjusted p<0.008). Considering old accounts, suspended users have more hate speech and misinformation tweets, but not more negative or spam tweets. We observe many negative tweets from active users, which might indicate that such users participate in order to protest the event. Another observation is that deleted users have more negative and hate speech tweets, compared to active users, specifically for new accounts (MWU-test, Bonferroni adjusted p<0.008).

The main outcomes of this study can be summarized as follows.

- Users who participated in the 2020 BLM discussion have longer tweets with shorter words, more negative tweets, and more undesirable and untrustworthy behavior (i.e. spamming, hate speech, and misinformation), compared to the users who did not participate (RQ1).
- The number of new accounts in Twitter increased significantly after the event that triggered the 2020 BLM movement. New users mostly participated in the BLM discussion, but with shorter tweets; and they are more prone to undesirable and untrustworthy behavior, compared to old users (RQ2). We emphasize the presence of such users for hate speech and misinformation detection. Deleted users are an exception to this observation.
- Suspended users have more undesirable tweets, compared to active and deleted users (RQ3). This observation is consistent with Twitter's Rules and Agreement. However, we still find undesirable tweets from active users, specifically new ones, showing that Twitter's suspension mechanism could miss such users.
We analyze what happened in Twitter before and after the event that triggered the 2020 BLM movement, on a novel dataset with approximately 500k users and 20m tweets, including deleted and suspended users. We report substantial differences between old and new users who participated in the 2020 BLM discussion, and possible reasons for user deletion and suspension. Our experimental results are supported by statistical tests. Our analysis is based on a novel tweet dataset; yet the results can be extended to other social media platforms and different case studies. The main observations that we report for the 2020 BLM event can be compared with similar events to understand how well this analysis generalizes. The features analyzed in this study can be further exploited in predicting user deletion and suspension.

References

[1] Deep learning for detecting cyberbullying across multiple social media platforms
[2] Tweets are forever: A large-scale quantitative analysis of deleted tweets
[3] Hate speech detection is not as easy as you may think: A closer look at model validation
[4] Acting the part: Examining information operations within #BlackLivesMatter discourse
[5] TweetEval: Unified benchmark and comparative evaluation for tweet classification
[6] SemEval-2019 Task 5: Multilingual detection of hate speech against immigrants and women in Twitter
[7] Characterizing deleted tweets and their authors
[8] A framework for unsupervised spam detection in social networking sites
[9] Detecting spam URLs in social media via behavioral analysis
[10] DeepHate: Hate speech detection via multi-faceted text representations
[11] HateBERT: Retraining BERT for abusive language detection in English
[12] On Twitter purge: A retrospective analysis of suspended users
[13] Examining factors associated with Twitter account suspension following the 2020 US presidential election
[14] The rise of social bots
[15] A survey on automatic detection of hate speech in text
[16] Geographic and temporal trends in fake news consumption during the 2016 US presidential election
[17] The battle against online harmful information: The cases of fake news and hate speech
[18] A reminder about spammy behaviour and platform manipulation on Twitter
[19] Still out there: Modeling and identifying Russian troll accounts on Twitter
[20] Locate the hate: Detecting tweets against blacks
[21] A postmortem of suspended Twitter accounts in the 2016 U.S. presidential election
[22] Uncovering social spammers: Social honeypots + machine learning
[23] RoBERTa: A robustly optimized BERT pretraining approach
[24] Detect rumors using time series of social context information on microblogging websites
[25] HateXplain: A benchmark dataset for explainable hate speech detection
[26] The CLEF-2021 CheckThat! lab on detecting check-worthy claims, previously fact-checked claims, and fake news
[27] Pew Research Center: Americans are wary of the role social media sites play in delivering the news
[28] Clickbait detection
[29] Beyond modelling: Understanding mental disorders in online social media
[30] HSpam14: A collection of 14 million tweets for hashtag-oriented spam research
[31] Textual analysis and timely detection of suspended social media accounts
[32] "I read my Twitter the next morning and was astonished": A conversational perspective on Twitter regrets
[33] Suspended accounts in retrospect: An analysis of Twitter spam
[34] Update on Twitter's review of the 2016 U.S. election
[35] Detecting hashtag hijacking from Twitter
[36] Identifying effective signals to predict deleted and suspended accounts on Twitter across languages
[37] A survey on trust evaluation based on machine learning
[38] The fragility of Twitter social networks against suspended users
[39] Misinformation in social media: Definition, manipulation, and detection
[40] SemEval-2020 Task 12: Multilingual offensive language identification in social media
[41] Tweet properly: Analyzing deleted tweets to understand and identify regrettable ones