key: cord-297462-c5hafan8
authors: Tang, Lu; Bie, Bijie; Zhi, Degui
title: Tweeting about measles during stages of an outbreak: A semantic network approach to the framing of an emerging infectious disease
date: 2018-06-19
journal: Am J Infect Control
DOI: 10.1016/j.ajic.2018.05.019
sha: 
doc_id: 297462
cord_uid: c5hafan8

BACKGROUND: The public increasingly uses social media not only to look for information about emerging infectious diseases (EIDs), but also to share opinions, emotions, and coping strategies. Identifying the frames used in social media discussion about EIDs will allow public health agencies to assess public opinions and sentiments. METHOD: This study examined how the public discussed measles during the measles outbreak in the United States during early 2015 that originated in Disneyland Park in Anaheim, CA, through a semantic network analysis of the content of around 1 million tweets using KH Coder. RESULTS: Four frames were identified based on word frequencies and co-occurrence: news update, public health, vaccination, and political. The prominence of each individual frame changed over the course of the pre-crisis, initial, maintenance, and resolution stages of the outbreak. CONCLUSIONS: This study proposed and tested a method for assessing the frames used in social media discussions about EIDs based on the creation, interpretation, and quantification of semantic networks. Public health agencies could use social media outlets, such as Twitter, to assess how the public makes sense of an EID outbreak and to create adaptive messages in communicating with the public during different stages of the crisis.

Emerging infectious diseases (EIDs) present novel and unfamiliar risks to the public. As a new tool for strategic communication during EID outbreaks, social media allows government agencies such as the Centers for Disease Control and Prevention (CDC) to reach wider audiences. 1 As a platform based on user-generated content, social media enables the public to share opinions and sentiments during these outbreaks. One such EID is measles. Measles is a highly contagious and acute illness that can lead to pneumonia, encephalitis, and death. 2 It was declared eliminated in the United States in 2000 as the result of the successful nationwide administration of a 2-dose vaccination (ie, measles, mumps, and rubella vaccine). However, recent years have witnessed the re-emergence of measles outbreaks. 2 Most of these outbreaks were associated with imported cases; at the same time, a decrease in the domestic vaccination rate has made the country increasingly vulnerable during such outbreaks.

Presented here is a semantic network analysis of Twitter content about measles during the measles outbreak that first appeared in California during early 2015. Examining the most frequently used key words and their co-occurrences allows researchers to induce a semantic network that represents the major frames used in large amounts of text. Frames represent the cognitive structure people use in understanding and communicating issues. Through framing, media and individuals choose to highlight certain aspects of the crisis while downplaying other aspects. 3 This study adds to the research on crisis and emergency risk communication by demonstrating that social media users applied different frames to understand the public health crisis associated with a measles outbreak: news update frame, public health frame, vaccination frame, and political frame. In terms of research methodology, this study demonstrates the feasibility of identifying frames through semantic network analysis. Practically, the findings of the study allow public health professionals to understand how social media users make sense of an EID during different stages of the outbreak so that they can develop more effective crisis communication strategies.

Social media plays an essential role in the dissemination of information on EIDs. What social media users post, share, like, and comment on reflects not only what information is available, but also what they consider important. Researchers have used social media data to assess public perceptions, sentiments, and responses toward EID outbreaks such as the 2009 H1N1 outbreak, 4 the 2011 European Escherichia coli outbreak, 5 the 2014 Ebola virus disease outbreak, 6 and the 2013-2014 measles outbreak in the Netherlands. 7 With the exception of Lazard et al, 6 all these studies examined social media content deductively based on pre-existing categories, through either manual coding or text mining based on a manually coded training dataset. Semantic network analysis of social media content represents a new lens through which researchers could inductively investigate how the public thinks and feels during an EID outbreak without the need for a training dataset.

Semantic networks represent the semantic relationships among a set of words. 8 Semantic network analysis is both a theoretical framework and a quantitative text analysis method that uncovers the structure of relationships among words. 9 In a semantic network, word-use frequencies and co-occurrence of the most frequently occurring words represent shared meanings and common perceptions in people's minds. 9 Semantic networks can be used to infer the frames used in texts. Framing is the process by which organizations and individuals choose to report or discuss an event (such as a public health crisis) by selectively highlighting certain aspects and downplaying other aspects of the event. 3 Researchers have explored the frames used by different stakeholders facing a crisis through the examination of semantic networks. David et al 10 examined 6 pre-established frames about population issues in news articles in the Philippines by looking at weighted semantic networks between 1986 and 2000. However, a strength of semantic network analysis is that it allows new and unfamiliar frames to emerge from the data inductively. For instance, Schultz et al 11 studied the associative frames used by public relations professionals and news media in the United States and United Kingdom after the 2010 BP oil spill into the Gulf of Mexico and found that BP framed the oil spill as caused by external factors and downplayed internal factors (such as the company's behaviors), whereas the news media adopted more complicated frames. Van der Meer et al 12 compared the frames used by public relations professionals, news media, and the public in times of crises such as explosions and volcano eruptions by examining the semantic networks and found that the frames used by these groups converged over time. Tian and Stewart 13 compared the semantic networks based on word co-occurrence in CNN and BBC online news reports about severe acute respiratory syndrome and found that although both news outlets used the public health frame, CNN used the economic impact frame, and BBC used the outbreak impact frame. Exploring the frames that Twitter users applied during a measles outbreak, as indicated by semantic networks, can provide insights into how social media users make sense of the crisis and what issues concern them. Hence, the first research question (RQ) was proposed. RQ1: What are the frames used in the Twitter discussion about the 2015 measles outbreak as indicated by semantic networks?

Crisis communication takes different forms in different stages of the crisis. Reynolds and Seeger 14 identified a stage-based model of crisis and emergency risk communication, which includes 5 stages: pre-crisis, initial event, maintenance, resolution, and evaluation. Each of these stages is associated with different tasks for crisis communication. Although this model was proposed as a strategic tool for guiding the crisis communication of government agencies such as the CDC, it nevertheless shows that the public goes through different stages in information processing and sensemaking about public health crises such as a measles outbreak. As a result, they are likely to use different frames to discuss measles on Twitter during each stage of the crisis. Hence, we asked the next RQ:

RQ2: How does the use of different frames change over different stages of the outbreak?

Because the most recent measles outbreak in the United States occurred between December 2014 and April 2015, 2 tweets including the word measles posted between December 1, 2014, and April 30, 2015, were purchased from DiscoverText.com (n = 1,154,156). Using this time frame allowed us to look at the Twitter discussion before, during, and after the outbreak.

First, raw texts were cleaned and transformed into an appropriate format to be mined. 10 Non-English tweets were excluded, resulting in 1,133,656 tweets. For the purpose of semantic network analysis, URLs in tweet texts were deleted. Special characters such as \, ^, and $, as well as user names mentioned after @, were also deleted from tweet texts. Next, stop words were excluded, including conjunctions, auxiliary verbs, and transitive verbs, among others. 12 Plural forms of words were replaced by singular forms, and high-frequency words sharing the same root were combined into single words to facilitate the analysis. 10 Similarly, multiword phrases with the same meaning were combined.
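The cleaning steps above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the stop-word list, the plural-stripping rule, and the names `clean_tweet`, `singularize`, and `KEEP_AS_IS` are all assumptions made for illustration, and the study's actual word lists are not reported.

```python
import re

# Hypothetical, deliberately tiny stop-word list; the study's actual list is not reported.
STOP_WORDS = {"a", "an", "the", "and", "or", "but", "is", "are", "was",
              "were", "be", "to", "of", "in", "on", "for", "has", "have"}

# Words that end in "s" but are not plurals (assumption for this sketch).
KEEP_AS_IS = {"measles"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_WORD_RE = re.compile(r"[^a-z0-9\s]")


def singularize(token):
    """Crude plural stripping: drop a trailing 's' from longer words."""
    if token in KEEP_AS_IS:
        return token
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def clean_tweet(text):
    """Return the list of normalized tokens for one tweet."""
    text = text.lower()
    text = URL_RE.sub(" ", text)       # delete URLs
    text = MENTION_RE.sub(" ", text)   # delete user names mentioned after @
    text = NON_WORD_RE.sub(" ", text)  # delete special characters
    tokens = [singularize(t) for t in text.split()]
    return [t for t in tokens if t not in STOP_WORDS]
```

A real analysis would more likely rely on a lemmatizer and a curated stop-word list; the rule-based version here only shows the order of operations.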

Semantic network analysis was conducted to answer the RQs proposed. First, the content of tweets was analyzed to calculate the frequency of words and determine the most frequently co-occurring word pairs using KH Coder version 2.00f, free software for analyzing text and identifying co-occurrence networks (available for download at https://sourceforge.net/projects/khc/). When the data were loaded into the program, a word frequency table and a word co-occurrence network were generated. Each individual tweet was a unit of analysis, and word pair co-occurrence was defined as the appearance of 2 words in the same tweet.
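The counting step described above, with the tweet as the unit of analysis, can be sketched in a few lines of Python. This is an illustration rather than a description of KH Coder's internals: the function name `count_words_and_pairs` is hypothetical, and counting each word at most once per tweet is an assumption that matches the co-occurrence definition given here.

```python
from collections import Counter
from itertools import combinations


def count_words_and_pairs(tokenized_tweets):
    """Count, per word and per word pair, the number of tweets containing them."""
    word_df = Counter()  # number of tweets containing each word
    pair_df = Counter()  # number of tweets containing both words of a pair
    for tokens in tokenized_tweets:
        unique = sorted(set(tokens))  # each word counts once per tweet
        word_df.update(unique)
        # Sorting makes every pair appear in one canonical (alphabetical) order.
        pair_df.update(combinations(unique, 2))
    return word_df, pair_df
```

Because pairs are stored in alphabetical order, a pair such as ("cdc", "measles") is looked up with the alphabetically earlier word first.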

To explore the dynamic nature of Twitter discussion during an EID outbreak, the duration of the measles outbreak was divided into 4 stages. The pre-crisis stage (stage 1) was between December 1, 2014, and January 4, 2015. The first cases of this outbreak were reported on January 5, which marked the beginning of the outbreak and thus the initial stage (stage 2). 2 The maintenance stage (stage 3) started on January 28, when the number of new cases started to decline, and ended on March 6, when the last case in this outbreak was reported. The resolution stage (stage 4) was between March 7 and April 17, when the CDC announced this outbreak to be officially over. 2 Five days were selected for each stage. To be included in the final semantic network, a word or multiword phrase must have appeared in more than 1% of tweets and be involved in the top-60 edges (connections) of each stage filtered based on the Jaccard coefficient. 15 The Jaccard coefficient 16 is a statistical measure widely used for assessing similarity between objects (Jaccard, 1912). The values of the Jaccard coefficient vary between 0 and 1. In KH Coder, words that appear frequently in the same tweet are considered to be closely associated, and their Jaccard coefficient becomes closer to 1. To facilitate a more nuanced understanding of the network structure during different stages of outbreak progression, an inductive approach was adopted to identify potential frames based on the high-frequency words and co-occurrence links among them. 17 The visualization of the semantic networks was accomplished using Graphviz (http://graphviz.org).
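For two words, the Jaccard coefficient is the number of tweets containing both words divided by the number of tweets containing at least one of them. The following Python sketch applies the two filters described above (the 1% frequency threshold and the top-60 edges). It assumes per-tweet counts are already available as dictionaries, however they were obtained; the function name `jaccard_edges` and its parameter names are hypothetical, and KH Coder's exact tie-breaking is not known to us.

```python
def jaccard_edges(word_df, pair_df, n_tweets, min_share=0.01, top_k=60):
    """Return up to top_k (word_a, word_b, jaccard) edges among frequent words.

    word_df: dict mapping word -> number of tweets containing it
    pair_df: dict mapping (word_a, word_b) -> number of tweets containing both
    """
    # Keep only words that appear in more than min_share of tweets.
    frequent = {w for w, n in word_df.items() if n / n_tweets > min_share}
    edges = []
    for (a, b), both in pair_df.items():
        if a in frequent and b in frequent:
            union = word_df[a] + word_df[b] - both  # tweets with a or b
            edges.append((both / union, a, b))
    edges.sort(reverse=True)  # strongest associations first
    return [(a, b, j) for j, a, b in edges[:top_k]]
```

The returned edge list could then be written out in DOT format for drawing with Graphviz.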

To explore the longitudinal changes in the use of the 4 frames identified, dictionaries containing key words associated with each frame were created based on the semantic networks identified and on the authors' further reading of tweets adopting these frames. Tweets were labeled as containing a frame based on the presence of tags (ie, frame-relevant terms). In other words, each tweet was labeled by a unique set of key words. For example, the groups of words classified as news update frame included California, Disney, Utah, January, visitor, official, and today. The list of words associated with the public health frame included CDC, patient, disease, contract, infectious, and virus, among others. The vaccine frame, which included any mention of vaccine-related issues, was associated with key terms such as vaccine, antivaccine, unvaccinate, unvaccinated, inhaled measles vaccine, safety, immunity, and Jenny McCarthy (an actress often associated with the anti-vaccine movement). Finally, the political frame was associated with terms such as Obama, Republicans, Democrats, immigration, illegal, lawmaker, debate, Governor Christie, and politics. A single tweet can be labeled as using 1 frame, multiple frames, or none of the frames. Next, percentages of tweets using different frames were calculated for the 4 stages identified and χ² tests were run to see whether the differences among stages were statistically significant. Bonferroni correction was conducted to account for the effects of multiple testing.
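The dictionary-based labeling described above can be sketched as follows. The dictionaries contain only the single-word example terms listed in this section; the authors' full dictionaries are not reported, multiword phrases such as "Jenny McCarthy" are left out for simplicity, and the names `FRAME_DICTIONARIES`, `label_frames`, and `frame_shares` are hypothetical.

```python
# Partial dictionaries built from the example terms given in the text (lowercased).
FRAME_DICTIONARIES = {
    "news update": {"california", "disney", "utah", "january",
                    "visitor", "official", "today"},
    "public health": {"cdc", "patient", "disease", "contract",
                      "infectious", "virus"},
    "vaccine": {"vaccine", "antivaccine", "unvaccinate", "unvaccinated",
                "safety", "immunity"},
    "political": {"obama", "republicans", "democrats", "immigration",
                  "illegal", "lawmaker", "debate", "politics"},
}


def label_frames(tokens):
    """Return the set of frames whose dictionary shares a term with the tweet."""
    present = set(tokens)
    return {frame for frame, terms in FRAME_DICTIONARIES.items()
            if present & terms}


def frame_shares(tokenized_tweets):
    """Percentage of tweets using each frame; a tweet may use several or none."""
    counts = {frame: 0 for frame in FRAME_DICTIONARIES}
    for tokens in tokenized_tweets:
        for frame in label_frames(tokens):
            counts[frame] += 1
    n = len(tokenized_tweets) or 1  # avoid division by zero on empty input
    return {frame: 100 * c / n for frame, c in counts.items()}
```

Dictionary terms must be normalized in the same way as the tweet tokens (for example, both lowercased and both singularized or neither), otherwise matches are silently missed.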

Overall, original tweets (n = 567,391) accounted for 50.05% of all tweets collected (n = 1,133,656). As shown in Figure 1, although the numbers of tweets and retweets were similar during the most active part of the outbreak, retweets outnumbered original tweets mostly in early February during the maintenance stage. Figure 2 represents the semantic network of Twitter content about measles during the entire outbreak. Four distinct frames were identified inductively based on the reading of the semantic networks and tweets containing the key words included in these semantic networks: news update frame, public health frame, vaccine frame, and political frame. The titles of these frames were coined by the authors based on the typical message carried in each frame, similar to the practice described in Odlum and Yoon (2015). 18

The news update frame provided news and updates about suspected and confirmed cases of measles before, during, and after the outbreak. Typically, a tweet adopting the news update frame included the number of cases in a geographic location; for example, "California reports 4 more measles cases." Tweets using the news update frame did not typically contain opinions.

Tweets adopting the public health frame conveyed medical information such as symptoms of measles, methods of prevention, and treatment. For instance, the following tweet introduced a new complication of measles: "Eye complications possible with measles warn ophthalmologists." Tweets using the public health frame educated the public about measles and sometimes included behavior recommendations.

The vaccine frame referred to the discussion and debates about the safety and necessity of vaccination. Tweets using this frame were often emotionally charged. An example of a pro-vaccine tweet was: "If I get measles because some nitwit talked my parents into not vaccinating me, somebody's getting their ass kicked." An example of an anti-vaccine tweet was: "Measles vaccine kills 6 kids - media blackout."

The political frame was used in those tweets that presented the causes of and solutions for the measles outbreak in political terms. Some Twitter users blamed the outbreak on the influx of illegal immigrants and called for tighter immigration law and border control. The political frame was also used in tweets debating governmental policies on vaccination and disease prevention. An example of a tweet using the political frame was: "Am I the only one wondering if the surge of illegals in 2014 has anything to do with the measles outbreaks we see now?"

Across the 4 stages of the outbreak, the pre-crisis stage had the fewest original tweets (n = 2,182), whereas the maintenance stage had the most (n = 137,233), followed by the initial stage (n = 13,556) and the resolution stage (n = 8,702). The use of the 4 frames showed different patterns in the 4 stages of the outbreak (see Figure 3). The news update frame appeared to be the most dominant frame during the initial and resolution stages. The public health frame was 1 of the 2 most dominant frames in the pre-crisis stage; however, its use decreased during the initial stage and was lowest during the maintenance stage. The use of the vaccine frame increased from the pre-crisis stage to the initial stage, and the vaccine frame became the most dominant frame during the maintenance stage. The political frame was the least often used frame in all 4 stages of the outbreak and appeared most frequently during the maintenance stage.

A series of χ² tests showed that the use of these frames was significantly different across the 4 stages of the outbreak (see Table 1). Pairwise χ² tests were conducted to further explore the differences among these 4 stages in the use of frames (see Table 2). Specifically, the following 3 pairs of comparisons were performed based on the chronological order of development stages: the pre-crisis stage versus the initial stage, the initial stage versus the maintenance stage, and the maintenance stage versus the resolution stage. All of the pairwise comparisons were significant at the adjusted P < .001 level, except for the use of the political frame in pre-crisis and initial stages.
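A pairwise comparison of this kind reduces to a 2 × 2 table (stage by frame used or not used), for which the χ² test has 1 degree of freedom. The Python sketch below shows one way to compute it together with a Bonferroni adjustment. It assumes a Pearson χ² without continuity correction, which may differ from the software settings the authors used, and the function names are hypothetical.

```python
import math


def chi_square_2x2(used_a, total_a, used_b, total_b):
    """Pearson chi-square (no continuity correction) comparing two stages.

    used_x: tweets in stage x that use the frame; total_x: all tweets in stage x.
    Returns (statistic, p_value). Every row and column total must be nonzero.
    """
    observed = [[used_a, total_a - used_a],
                [used_b, total_b - used_b]]
    grand = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [used_a + used_b, grand - used_a - used_b]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def bonferroni(p_values):
    """Multiply each P value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With 4 frames and 3 chronological stage pairs, one reasonable choice is to pass all 12 pairwise P values to `bonferroni` at once, although the exact family of tests the authors corrected for is not stated.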

Social media allows the assessment of public opinions, sentiments, and responses during an EID outbreak. Our study examines the frames used in Twitter discussion during the 2015 measles outbreak through a semantic network analysis.

This study finds that around half of tweets are retweets. Furthermore, retweets outnumbered original tweets during the early days of the maintenance stage. This stands in contrast with the findings of previous studies. For instance, Liu, Kliman-Silver, and Mislove 19 found that overall around 30% of tweets are retweets. Radzikowski et al 20 studied the tweets about vaccines during the later stages of the 2015 measles outbreak (the maintenance stage in our study) and found around 44% of tweets in their data corpus were retweets. Our data also suggest the highest rate of retweets during this stage, although our data were about measles instead of vaccine. It is possible that the maintenance stage sees more reflection and activism when the immediate threat of the outbreak has been contained and thus is associated with more retweets.

The current study identified 4 major frames that emerged organically from the semantic network based on word frequencies and co-occurrence. Each frame highlighted an important dimension of measles-related discussion on Twitter. Furthermore, different stages of the outbreak witnessed fluctuations in each frame's popularity. During the pre-crisis stage, the news update frame and public health frame were the 2 most dominant frames used, whereas the vaccine frame was rarely used and the political frame was almost never used. During the initial stage, use of the news update and vaccine frames increased, although the use of the public health frame actually went down. During the maintenance stage, the vaccine frame became the most frequently used frame, followed by the news update frame, public health frame, and political frame. As the outbreak drew to an end, the use of frames reverted to the pattern observed in the pre-crisis stage, with news update being the most used frame, followed by the public health frame, vaccine frame, and political frame.

There are 2 possible explanations for the changes in how measles is framed on Twitter. It is possible that the frames on Twitter are influenced by the frames set by the mainstream media. For instance, Lee and Basnyat 21 studied media framing of the 2009 H1N1 outbreak in Singapore and suggested that although traditional media predominantly used the update frame and prevention frame throughout the outbreak, their use of frames diversified to some extent in the later stages of the crisis to include more personal frames and social frames. Our semantic network shows that traditional news media (eg, CNN, CBS, Reuters, and AP) featured prominently in the discussion about the measles outbreak. Future research could examine the intermedia agenda-setting process between traditional news media and social media in covering and framing public health crises associated with EIDs to see whether Twitter frames are influenced by the frames used in the traditional news media or whether the public influences the frames used by traditional news media by voicing concerns and opinions on Twitter. The other possible explanation for the changes in the use of frames is psychological. According to the psychometric paradigm, people's response to a risk is decided by the dread, catastrophic potential, controllability, and familiarity of said risk. 22 It is possible that during the early stages of a measles outbreak, the public is unsure about the seriousness of the risk and the controllability of the outbreak and therefore prefers the update frame and the public health frame. These 2 frames provide information that can help the public assess the scale and severity of the crisis. However, as the outbreak develops, people perceive the risks associated with measles as more familiar and more controllable. As a result, they start to make sense of the crisis by deciding whom they should blame for the outbreak, thereby leading to the increased use of the vaccine frame, which blames parents who refuse to vaccinate their children, as well as the political frame, which blames illegal immigrants or the government for the outbreak. Through surveys, future research could establish the relationship between social media users' risk perception and the frames they apply in discussing EIDs. The high frequency of the political frame during the maintenance stage, including tweets stating that illegal immigrants were responsible for the measles outbreak, also indicates the spread of fake news on Twitter. Recently, Vosoughi, Roy, and Aral 23 found that false news stories spread further, faster, deeper, and more broadly on Twitter than truthful stories. Compared with true stories, false news stories took only one-sixth of the time to reach 1,500 people and were 70% more likely to be retweeted. These findings have particularly important implications for health care professionals during infectious disease emergencies. When rumor or misinformation gains high virality, informing and educating the public becomes very challenging.

Radzikowski et al 20 studied the measles vaccination narrative on Twitter by examining the semantic networks of popular hashtags and retweeting patterns in the aftermath of the outbreak. They found that although official public health agencies such as the CDC and World Health Organization have all entered the social media arena, mainstream media coverage of key health issues still has the power to lead online public participation. Our study confirmed some of the findings of Radzikowski et al 20: mainstream news coverage of health topics played a very important role in leading online audience attention and shaping public debate on social media. In our data, the political frame was most often observed (9.3%) during the maintenance stage, when the vaccine frame also reached its peak (39.5%). Our study also extended Radzikowski et al 20 by lengthening the observation period, looking at Twitter discussion of this outbreak as a whole from a risk communication perspective, and comparing the content patterns in different stages of the outbreak.

This study proposes and tests a method for assessing the frames used in social media discussions about EIDs based on the creation, interpretation, and quantification of semantic networks. This method can be used to study the public response to different EID outbreaks as well as other ongoing public health crises, such as heart disease and obesity. Instead of looking for known frames used in social media, this method allows the identification of unique frames associated with different public health crises.

In terms of practical implications, this study shows that agencies could use social media content on platforms such as Twitter to assess how the public makes sense of an EID outbreak and to create adaptive messages in communicating with the public during different stages of the crisis. For instance, if it is detected that the public tends to use the vaccine frame during a certain stage of the crisis, public health agencies could design and disseminate specific social media messages to spread useful information and combat misconceptions.

This study represents only an early-stage effort at mining content about EIDs on social media. Future studies can take a number of directions. It is possible that individuals, government agencies, nonprofit organizations, and news media might use different frames in tweeting about measles before, during, and after an EID outbreak. Further research should compare these different stakeholders' tweets. In addition, it would be of interest to study the retweeting network about EIDs (as done in Radzikowski et al, 2016). Studying the structure of retweeting networks can shed light on how information and opinion are diffused among Twitter users. It would be useful to compare the retweeting networks during different stages of the outbreak to see if information flows differently during these stages and to explore the roles played by different stakeholders (eg, traditional news media, nonprofit organizations, and corporations) across the stages.

Emerging infectious disease (EID) communication during the 2009 H1N1 influenza outbreak: literature review (2009-2013) of the methodology used for EID communication analysis

Measles - United States

Framing as a theory of media effects

Pandemics in the age of Twitter: content analysis of Tweets during the 2009 H1N1 outbreak

Tweeting during food crises: A psychosocial analysis of threat coping expressions in Spain during the 2011 European EHEC outbreak

Detecting themes of public concern: a text mining analysis of the Centers for Disease Control and Prevention's Ebola live Twitter chat

Disease detection or public opinion reflection? Content analysis of tweets, other social media, and online newspapers during the measles outbreak in The Netherlands in 2013

Theories of communication networks

What constitutes semantic network analysis? A comparison of research and methodologies

News frames of the population issue in the Philippines

Strategic framing in the BP crisis: A semantic network analysis of associative frames

When frames align: the interplay between PR, news media, and the public in times of crisis

Framing the SARS crisis: a computer-assisted text analysis of CNN and BBC online news reports of SARS

Crisis and emergency risk communication as an integrative model

Network analysis of message content

The distribution of the flora in the alpine zone

Lexical shifts, substantive changes, and continuity in State of the Union discourse

What can we learn about the Ebola outbreak from tweets?

The tweets they are a-changin': Evolution of Twitter users and behavior. AAAI

The measles vaccination narrative in Twitter: a quantitative analysis

From press release to news: mapping the framing of the 2009 H1N1 A influenza pandemic

The perception of risk

The spread of true and false news online