authors: Zhang, Yuqi; Guo, Bin; Ding, Yasan; Liu, Jiaqi; Qiu, Chen; Liu, Sicong; Yu, Zhiwen
title: Investigation of the determinants for misinformation correction effectiveness on social media during COVID-19 pandemic
date: 2022-04-05
journal: Inf Process Manag
DOI: 10.1016/j.ipm.2022.102935

The rapid dissemination of misinformation on social media during the COVID-19 pandemic triggers panic and threatens pandemic preparedness and control. Correction is a crucial countermeasure to debunk misperceptions. However, the effective mechanism of correction on social media has not been fully verified. Previous works focus on psychological theories and experimental studies, so the applicability of their conclusions to actual social media is unclear. This study explores the determinants governing the effectiveness of misinformation corrections on social media by combining a data-driven approach with related theories in psychology and communication. Specifically, referring to the Backfire Effect, Source Credibility, and Audience's Role in Dissemination theories, we propose five hypotheses covering seven potential factors (regarding correction content and publishers' influence), e.g., the proportion of original misinformation and warnings of misinformation. We then obtain 1487 significant COVID-19 related corrections posted on Microblog between January 1st, 2020 and April 30th, 2020, and annotate each piece of correction according to the aforementioned factors. Through a comprehensive analysis of the dataset, we demonstrate several promising conclusions.
For example, mentioning excessive original misinformation in corrections does not undermine believability within a short period after reading; warnings of misinformation in a demanding tone make corrections less effective; and the determinants of correction effectiveness vary among different topics of misinformation. Finally, we build a regression model to predict correction effectiveness. These results provide practical suggestions for misinformation correction on social media, and a tool to guide practitioners in revising corrections before publishing to achieve ideal efficacy.

The world has suffered severe attacks from the 'infodemic' on social media during the COVID-19 pandemic, including rumors, stigma, and conspiracy theories (we refer to such inaccurate content collectively as "misinformation" below) (Lederer, 2020; Guo et al., 2020; Luo et al., 2020; Pian et al., 2021). Misinformation spreads fear and stigma among the public and undermines the adoption of reasonable preventions and control policies (Bridgman et al., 2020; Zhou et al., 2021), posing threats to pandemic preparedness and control. Therefore, mitigating the severe impacts of misinformation has aroused widespread attention. Researchers have tried to combat misinformation with methods such as fact-checking, correcting, and debunking (Vraga and Bode, 2020; Burel et al., 2021). Correction on social media is one of the crucial countermeasures for online misinformation; it debunks a false claim or a misperception through posts published by users (Vraga and Bode, 2020). However, the mechanism of misinformation correction on social media is not clear; for example, how to organize the content appropriately and how publishers affect the efficacy of corrections are not well resolved. It is therefore necessary to further explore the intrinsic mechanisms of misinformation correction on social media and guide practitioners to correct misinformation effectively.
Previous works (Nguyen et al., 2013; Budak et al., 2011; He et al., 2012; Tong and Wu, 2018; Saxena et al., 2020) mainly study the effectiveness of misinformation corrections on social media from two perspectives: "truth campaigns" and the cognitive effectiveness of corrections. "Truth campaigns" explore how to promote the dissemination of corrections, mostly based on diffusion cascades and greedy algorithms. Work on the cognitive effectiveness of corrections studies the cognitive effects of corrections on individuals, with the aim of finding effective ways to convey corrections (Lewandowsky et al., 2012; Rich and Zaragoza, 2020; Dai et al., 2021). This paper focuses on the latter perspective. Research on the cognitive effectiveness of misinformation correction mainly draws on psychology and communication theories and experiments. Lewandowsky et al. (2012) summarize the characteristics of an effective debunking post grounded in psychological theories (e.g., a convincing explanation, a concise statement). Moreover, much experimental research has been conducted to examine the factors influencing the effectiveness of corrections, e.g., the timeliness of debunking misinformation (Rich and Zaragoza, 2020), message order (before misinformation or after misinformation), and debiasing messages (included in corrections or not) (Dai et al., 2021). Although these pioneering works study the cognitive effects of misinformation correction, the following constraints remain: first, the data samples are usually small because of the cost of social experiments, so the representativeness of the results can be affected. Second, surveyed or interviewed data are often subjective and can easily be influenced by "observer bias."
Third, the applicability of these works to social media is unclear, because their results are not obtained from the actual data flow on social media. In this paper, we explore the determinants governing the effectiveness of misinformation corrections based on a comprehensive analysis of a COVID-19 online dataset, and propose a regression model that predicts correction effectiveness given a correction post's basic features. Our dataset consists of 1487 corrections on significant COVID-19 related events between January 1st, 2020 and April 30th, 2020, retrieved from Microblog. Furthermore, our work is grounded in theories of psychology and communication (e.g., Backfire Effect, Source Credibility, Disconfirmation Bias) (Lewandowsky et al., 2012; Kim and Dennis, 2018; Nyhan and Reifler, 2010). To the best of our knowledge, our work is the first to explore the cognitive effectiveness of misinformation corrections based on a data-driven approach combined with theories of psychology and communication. Besides, we fill the gap in the lack of quantitative mechanisms to guide effective corrections. According to theories of psychology and communication (i.e., Backfire Effect, Source Credibility, and Audience's Roles in Dissemination) (Ecker et al., 2010, 2011; Johnson and Seifert, 1994; Van den Broek et al., 1999; Kim and Dennis, 2018; Shu et al., 2019), we propose five hypotheses, which posit associations between seven potential factors and the effectiveness of misinformation correction. Afterward, based on COVID-19 related corrections from Microblog, we evaluate the effectiveness of corrections using their social contexts and examine the hypothesized relations.
Through this comprehensive analysis, we obtain five findings: mentioning excessive original misinformation in corrections does not undermine people's belief in the correction within a short period after reading; warnings of misinformation in a demanding tone make corrections worse; concise corrections are more effective; persuasive graphic explanations deserve greater attention; and influential media should take more responsibility. According to our findings, we call for corrections that are short, concise, persuasive, rich in graphics, written in a gentle tone, and published by influential users. In addition, we demonstrate that the effects of the above factors on corrections vary among different topics. Finally, we build a regression model to predict the effectiveness of corrections. The R² (goodness of fit) of the optimal model is up to 0.66 on the training set and up to 0.46 on the testing set. The proposed model captures the key relations among these variables. Our novel contributions are summarized as follows: • As far as we know, our work is the first to study the cognitive effectiveness of misinformation corrections by combining a data-driven approach with related theories of psychology and communication. We propose hypotheses based on theories (i.e., Backfire Effect, Mental Model of Misinformation, Source Credibility, and Audience's Roles in Dissemination) and conduct extensive evaluations on 1487 corrections of influential COVID-19 related events from January 1st, 2020, to April 30th, 2020, on Microblog. • Based upon a comprehensive analysis of the results, we obtain the determinants of the efficacy of corrections and provide explanations for the results according to the "Continued Effect of Misinformation" (Johnson and Seifert, 1994; Wilkes and Leatherbarrow, 1988) and "Disconfirmation Bias" theories.
We also propose five suggestions on good practice for effective corrections on social media. • We build an effective regression model to predict the efficacies of corrections. This model captures the key relations between features and the efficacies of corrections. It can guide practitioners to revise their corrections before publishing and achieve ideal efficacy. The rest of the paper is organized as follows. We introduce related research on misinformation correction in Section 2. Then, we elaborate on the problem in Section 3 and describe the proposed methodology in Section 4. Subsequently, the evaluation results are elucidated in Section 5, the implications are discussed in Section 6, and limitations and future work are discussed in Section 7. Finally, we provide our conclusions in Section 8. Existing works mainly study the effectiveness of misinformation corrections from two perspectives: "truth campaigns" and the cognitive effectiveness of corrections. In this section, we summarize recent works in these two fields. Misinformation intervention strategies can be divided into two categories: influence blocking and truth campaigns. The influence blocking strategy works by blocking critical nodes or edges in the spread of misinformation to minimize its flow (Yang et al., 2018; Wu and Pan, 2017). The truth campaign strategy selects optimal seed nodes to diffuse true information, influencing users' awareness and reducing the proliferation of misinformation (Zareie and Sakellariou, 2021). The latter strategy is related to the focus of this paper, the dissemination of misinformation corrections, so the methods of truth campaigns are elaborated here. In truth campaigns, misinformation and truth both spread in the network. The strategy chooses the top-k originators to diffuse the corrective message and reverse the viewpoints of nodes already affected by false claims.
The ultimate goal is to minimize the number of users adopting the misinformation in the network. Nguyen et al. (2013) simulated the diffusion process of the truth based on the IC or LT diffusion model. They designed a greedy algorithm to select the best set of seed users to start the spread of the corresponding fact, ensuring that at least a given fraction of users in the network can be decontaminated. Yang et al. (2020) set two different thresholds for each individual in the diffusion model: an influence threshold and a decision threshold. The influence threshold took effect when activating a node that had received no information, and the decision threshold was used when convincing a node to switch its perceptions. Tong and Wu (2018) assumed that more than one truth campaign may attempt to intervene in the spread of misinformation; they considered multiple diffusion cascades with different cascade priorities in the dissemination model, aiming to influence the maximum number of users before a given deadline. Saxena et al. (2020) proposed an opinion model that took the fluctuation of users' opinions into account and designed a mitigation solution that selects a subset of debunkers to maximize the number of users who reverse their wrong prior beliefs. Zhang et al. (2021) considered another realistic situation, where both seed nodes and boost nodes exist in the diffusion model; boost nodes are more likely to adopt the corrective message when they receive it. The above studies improve the effects of corrections from the perspective of network structure. Although some of them take users' opinions and realistic conditions into consideration, the associations between factors within corrections and correction effectiveness are not estimated. For example, the content of corrections and the influence of debunkers can significantly affect people's acceptance of misinformation corrections.
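To make the surveyed mechanism concrete, the following is a minimal sketch of greedy seed selection under the Independent Cascade (IC) model mentioned above. The toy graph, activation probability `p`, and Monte Carlo settings are illustrative assumptions, not the exact setup of any cited work.

```python
import random

def simulate_ic(graph, seeds, p=0.1, rng=None):
    """One run of the Independent Cascade model.

    graph: dict mapping node -> list of neighbour nodes.
    Each newly active node gets one chance to activate each
    inactive neighbour with probability p. Returns activated nodes."""
    rng = rng or random.Random(0)
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return active

def expected_spread(graph, seeds, p=0.1, runs=200):
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(42)  # fixed seed for reproducibility
    return sum(len(simulate_ic(graph, seeds, p, rng)) for _ in range(runs)) / runs

def greedy_seeds(graph, k, p=0.1):
    """Greedily pick k seed nodes, each time adding the node
    that most increases the estimated expected spread."""
    seeds = []
    for _ in range(k):
        best = max((n for n in graph if n not in seeds),
                   key=lambda n: expected_spread(graph, seeds + [n], p))
        seeds.append(best)
    return seeds
```

The same skeleton underlies truth-campaign variants: only the activation rule (e.g., thresholds, competing cascades, boost nodes) changes.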
Understanding the mechanisms of correction effectiveness is therefore necessary and helps mitigate the negative impacts of misinformation. Misinformation is usually persistent and difficult to clarify entirely, and it can influence users' perceptions even after corrections, a phenomenon known as the continued influence effect of misinformation (Johnson and Seifert, 1994; Desai and Reimers, 2019; Andrea and Radvansky, 2020). Many studies have confirmed this. For example, Ecker et al. (2011) found that retracting misinformation rarely has the intended effect of eliminating reliance on misperceptions, even when people acknowledge and remember the retraction. The reasons why debunking misinformation is so difficult have been explored in many psychological studies. One possible reason is motivated reasoning: individuals tend to protect their pre-existing attitudes and prefer to receive information they already believed (Jerit and Barabas, 2012; Taber and Lodge, 2006). When people with wrong beliefs encounter a corrective message, they may feel challenged and refuse to accept the facts. In addition, misinformation tends to be sticky and persistent in the brain because of the mental models people build while processing misinformation (Thorson, 2016). Another reason is that corrections can sometimes lead to backfire effects (Lewandowsky et al., 2012), meaning that a misinformation correction fails to achieve the intended effect and instead strengthens the misperception (e.g., the familiarity backfire effect and the overkill backfire effect). During the COVID-19 pandemic, a large amount of misinformation has crowded social media. The persuasion strategies of misinformation-containing posts (Chen et al., 2021) and users' poor ability to discriminate misinformation (Zrnec et al., 2022) fuel the viral dissemination of misinformation, threatening pandemic preparedness and control. Correction is an important intervention against misinformation on social media.
The effects of corrections are vital to orderly epidemic control. Therefore, it is crucial to understand the mechanism by which corrections effectively change individuals' perceptions, termed the cognitive effectiveness of correction. Some efforts have been made to study the effective mechanism of correcting misinformation in psychology and communication. For example, Ecker et al. (2020) found that corrections mentioning misinformation decreased the audience's belief in falsehoods, even for audiences who had not been exposed to the misinformation before the corrections. Ecker et al. (2019) tested the overkill effect and found evidence against it: a correction with a larger number of counterarguments led to as much or more misperception reduction than one with a smaller number of counterarguments. Rich and Zaragoza (2020) assessed the influence of the timeliness of debunking on the efficacy of corrections. They also found in experiments that people's belief in misinformation can increase over the two days after correction. Dai et al. (2021) investigated two factors that could affect the effectiveness of correction, namely message order (before misinformation or after misinformation) and debiasing messages (included in corrections or not). Through online experiments, they revealed that misinformation corrections are most effective when they are conveyed after the misinformation and include debiasing messages. Bautista et al. (2021) interviewed healthcare professionals to build a conceptual model of how these professionals correct health misinformation on social media, to guide effective corrections by authorities. These works on the efficacy of misinformation correction explore how the content and format of a correction influence its effectiveness. Some are theoretical studies, and the others are experimental studies based on surveys or interviews.
This "well-designed" research paradigm is suitable for exploring specific questions deeply. However, because these studies do not capture associations from the real data flow on social media, the universality of their conclusions is unclear. Besides, they are constrained by experimental costs and small samples. In this work, we apply a data-driven approach to explore the factors governing correction effectiveness. The negative influence of misinformation has aroused dramatic attention in society during the COVID-19 pandemic, and many interventions have been implemented, such as fact-checking and correcting. Correction on social media is one of the crucial countermeasures for online misinformation. However, the effective mechanism of misinformation correction on social media is not clear; for example, how effective mainstream corrections are, how to organize the content appropriately, and how the credibility of publishers affects the efficacy of corrections remain unsolved. In this work, we propose hypotheses about the determinants governing the effectiveness of misinformation corrections according to the "continued effects of misinformation" and "backfire effects" theories. These hypotheses include seven factors that are potentially associated with the effectiveness of corrections, i.e., the proportion of original misinformation, length of the post, textual warning of misinformation, graphic warning of misinformation, explanation, graphic explanation, and influence of the publisher. Besides, according to the theory that users play different roles in disseminating misinformation, we are also curious about the roles of users in the spread of corrections. Repetition of a message makes people familiar with the information, so they rarely suspect its veracity. This is because prior knowledge smooths the process of thinking (Schwarz et al., 2007). In other words, repetition of information strengthens familiarity and builds up people's belief in it.
Unfortunately, corrections of misinformation unavoidably mention the misinformation; otherwise, corrections cannot convey the key points that need to be clarified. Consequently, corrections mentioning too much of the original misinformation may strengthen audiences' familiarity with the false claims and thus backfire. Accordingly, we assume: H1: If corrections mention a high proportion of the original misinformation, the effects of the corrections will be undermined. According to the research of Schwarz et al. (2007), information is more likely to be accepted if it is easy to process. For misinformation corrections, one might expect that more evidence means more success. However, it turns out that fewer and simpler arguments are more effective in reducing misperceptions, because too many arguments take great effort to understand, while simple statements are cognitively more attractive. The overkill backfire effect thus means that providing many arguments sometimes cannot reduce wrong beliefs and may even reinforce them. It is necessary to keep corrections concise and easy to process. Considering that the complexity of corrections on Microblog varies (here we take the length of the post as a measurement of complexity), we assume: H2: If corrections are too long, their effects will be greatly reduced. When people hear misinformation, they build a mental model of it. This mental model records the associations among the key elements of a message, e.g., causality and correlation. Once the false information is refuted, some key elements or associations in the mental model can be overthrown, leaving gaps. According to existing research (Ecker et al., 2010, 2011; Johnson and Seifert, 1994; Van den Broek et al., 1999), if corrections do not provide alternative explanations to fill those gaps, people will continue to use the inaccurate information. The reason is that people may be uncomfortable with gaps in their knowledge and prefer a complete model even if it is inaccurate. Moreover, this effect has been verified by experiments.
For example, in an experiment, people read a report stating that a warehouse fire was caused by paint and gas cans, along with explosions (Seifert, 2002; Johnson and Seifert, 1994; Wilkes and Leatherbarrow, 1988). In people's mental model, this could be represented as "paint and gas cans led to explosions" and "explosions caused the fire in the warehouse." People were then told that paint and cans had not been present at the fire; yet, when asked about the cause of the fire, they invoked the paint and cans despite having just acknowledged that these things were not present. When an alternative explanation, that accelerants contributed to the fire, was provided, people were less likely to mention "paint and cans." Therefore, it is crucial to provide alternative explanations to replace the misperceptions. Compelling explanations can address why the misinformation is wrong or why the originator of the misinformation disseminated it. Since graphics are significantly more effective than text in reducing misperceptions (Alda et al., 2012), we hypothesize accordingly: H3: Alternative explanations improve the effectiveness of corrections, especially graphic explanations. Although people usually expect the information they encounter to be valid, misinformation is unavoidable. Explicit warnings appearing before misinformation can induce a temporary state of skepticism, which improves people's ability to discriminate between truth and the original misinformation. Besides, Ecker et al. (2010) investigated whether explicit warnings can reduce the continued influence of misinformation. It turned out that a specific warning giving details about the continued influence effect can reduce misperceptions, and such a warning combined with alternative explanations reduces misperceptions even more effectively.
Correction posts on Microblog usually use textual alerts like "xx is wrong!" as well as graphic warnings. We want to explore whether these kinds of warnings positively affect correction effectiveness. H4: Textual and graphic warnings help promote correction effectiveness. When people encounter a piece of information, they usually leverage their prior knowledge, which affects how they evaluate the trustworthiness of the content (Hovland et al., 1953). When people consider the veracity of a message, one crucial kind of prior knowledge they rely on is the credibility of the source. They prefer to believe information from sources they trust rather than doubtful ones (McCracken, 1989). Kim and Dennis (2018) studied whether the presentation format of fake news affects people's believability judgments. They found that high ratings of the source have a significant and positive effect on believability. Therefore, credible sources make people more likely to accept and believe corrections, thus improving correction effectiveness. In our work, we evaluate the credibility of publishers based on their influence. H5: Influential publishers make corrections more acceptable. During the spread of fake news, individuals play different roles. Persuaders spread fake news with supporting opinions to convince others. Clarifiers propose skeptical and opposing opinions to clarify fake news. Gullible users have a low ability to distinguish between true and false information and are easily persuaded to accept false information (Shu et al., 2019). Analogously, we assume that users in the dissemination of corrections on social media also have various roles, and we thereby ask what roles users play in the spread of corrections. In our method, as Fig. 1 shows, after proposing the hypotheses, we obtain correction posts and their social interactions, characterize the correction posts based on the factors from the hypotheses, and conduct correlation analysis and exploratory analysis on the dataset.
On top of the analysis results, we examine the validity of the hypotheses and build a regression model to predict correction effectiveness. In this section, we first describe our dataset of 1487 correction posts from Microblog. Subsequently, we give details on dataset labeling and the evaluation of the effectiveness of corrections. Finally, we illustrate the implementation of the regression models. We select the Sina Weibo platform (also known as Microblog) as our research target. Sina Weibo is one of the largest social media platforms in China. Users can publish posts and interact with other users, e.g., by commenting on, liking, and retweeting posts. "Weibo Pi Yao" is an official account, operated by the authorities of the Sina Weibo platform, that reports misinformation and publishes corrections. Ethics statement: approval and informed consent are not needed. We only obtain publicly available data that anyone can access, and no content with privacy restrictions is collected. We collect 1487 correction posts addressing influential misinformation on Microblog from January 1st, 2020, to April 30th, 2020. The collection includes the post texts and their metadata, i.e., the unique identity number of the post, the publisher of the post, the publish timestamp, retweet counts, like counts, and comment counts. During data preprocessing, we remove special strings of digital elements (i.e., URLs, emoticons, "@xxx"), while hashtags ("xxx") are preserved because we assume they can convey considerable information. Statistics of the collected data are shown in Table 1. Evaluating correction effectiveness requires analysis of posts' comments. Therefore, we gather the comments of the pre-collected posts; due to the limits of the API, we only obtain the first 50 pages of comments for each post. In total, 70,772 comments are collected. Each piece of data consists of the comment text, the publisher of the comment, the timestamp, and like counts.
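As a rough illustration, the preprocessing step described above (removing URLs and "@xxx" mentions while keeping hashtag text) might look as follows; the function name and regular expressions are our own assumptions, and emoticon handling is omitted for brevity.

```python
import re

def clean_post(text):
    """Hypothetical cleaning routine mirroring the described preprocessing:
    strip URLs and @-mentions, keep hashtag content, normalise whitespace."""
    text = re.sub(r'https?://\S+', '', text)   # remove URLs
    text = re.sub(r'@\S+', '', text)           # remove @-mentions
    text = re.sub(r'#([^#]+)#', r'\1', text)   # keep hashtag text, drop '#' marks
    return re.sub(r'\s+', ' ', text).strip()   # collapse whitespace
```

The hashtag pattern assumes the Weibo-style "#...#" delimiters; other platforms would need a different pattern.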
User information. According to the hypotheses in Section 3, the influence of users is also considered a possible influencing factor, so we collect information about publishers (i.e., the number of followers and influence data from the "influence of yesterday" module in Microblog; details are given in Section 4.2) to evaluate their influence. We obtain data from 535 unique publishers. Besides, to further explore the effect of users on the dissemination of corrections, we gather the follower counts of users interacting with the pre-collected posts; the specific analysis is shown in Section 5. Finally, we collect information about 52,682 retweet users and 51,006 comment users. The misinformation topics, descriptions, and example corrections include the following:
- Example: "On 9th March, it was widely spread online that XX University released a notice informing senior students to return to school for graduation in June. Today, the college claims this message is fabricated. They mention that the return to school will be arranged in the near future."
- Situation of the epidemic: associated with new cases of COVID-19, virus transmission, isolation of potential virus carriers, etc. Example: "On 11th March, someone spread the message 'one person was diagnosed with COVID-19 infection.' We solemnly declare that this news is a rumor. We have reported it to the police, and the originator will take legal responsibility."
- Medical knowledge: common sense about medicine, virus prevention, virus self-testing, etc. Example: "Recently there is news online that taking a sip of water every 15 minutes to keep the throat moist can prevent the virus. According to specialists, there is no relationship between virus infection and dry throat, and drinking too much water may put extra strain on the body."
- Supply and safety: about medical supplies, food safety, and so on. Example: "China is a major producer of masks in the world, and its annual export volume remains stable at more than 70% of its production scale.
China has never issued a ban on the export of masks and their raw materials, and enterprises can carry out trade according to market-oriented principles," said Li Xinggan, director-general of the Department of Foreign Trade of the Ministry of Commerce. According to our hypotheses, we summarize seven potential influencing factors: the proportion of original misinformation, length of the post, textual warning of misinformation, graphic warning of misinformation, explanation, graphic explanation, and influence of the publisher, as listed in Table 2. During data annotation, disagreement among annotators may occur, so we carefully design rules to handle conflicts for each feature. For most features that are easier to identify and agree on (e.g., category, textual warning, graphic warning, graphic explanation, and explanation), we use the majority of annotations as the final label. For features that are harder to agree on (e.g., the proportion of original misinformation and length of the post), we take the average of all annotations as the final label. Publisher's influence evaluation. Although the number of followers is a common metric for evaluating the influence of users, it is not reliable owing to markets for zombie fans. Therefore, we collect user influence data from a module called "influence of yesterday," maintained by Microblog officials. The sentiment analysis of comments is performed with a Python library named SnowNLP, which handles Chinese text conveniently, including text segmentation, part-of-speech tagging, sentiment classification, etc. Table 2 (labeling and evaluation of the collected data) covers the following features:
- Length of the text: the number of characters of the post, excluding special strings (e.g., URLs, emoticons, "@xxx") and punctuation.
- Explanation: whether the post contains an explanation of why the misinformation is wrong or why the originators of the misinformation disseminated it.
"0"-no, "1"-yes, for the posts including multi-corrections, it is annotated as the number of corrections providing explanation divided by the total number of corrections mentioned in the post. Whether contains the explanation in graphic form "0"-no, "1"-yes, for the posts including multi-corrections, it is annotated as the number of corrections providing graphic explanation divided by the total number of corrections mentioned in the post. Whether contains textual warnings before first mentioning the misinformation "0"-no, "1"-yes, for the posts including multi-corrections, it is annotated as the number of corrections containing textual warnings divided by the total number of corrections mentioned in the post. Graphic warnings of misinformation Whether contains warnings in graphic form before first mentioning the misinformation "0"-no, "1"-yes, for the posts including multi-corrections, it is annotated as the number of corrections containing graphic warnings divided by the total number of corrections mentioned in the post. The influence of the publisher of the post Be calculated as the sum of follower counts and other influence statstics, which is presented in Equation 1 The examination of assumptions provides good understandings of effective mechanisms of misinformation correction on social media. However, it still cannot provide qualitative measurements of correction effectiveness. To solve this problem, we develop a data-driven prediction model with respect to the basic features of correction posts. We conduct our prediction in terms of typical machine learning methods, i.e., SVR, KNN, Random Forest, XGBoost. The dataset consists of 1487 correction posts. We randomly split it into the training set and testing set with a ratio of 7:3. The 5-fold validation is executed in the training set to select the model with the optimal parameters. 
The metrics used to evaluate these models are the Mean Absolute Error (MAE) and the Coefficient of Multiple Determination (R²). MAE measures the deviation of the regression models' predictions from the annotations, and R² evaluates how well the independent variables explain the dependent variable in a regression model. Each regressor has been fine-tuned to the optimal parameters shown in Table 3. All models are built with the Python library scikit-learn (sklearn). Subsequently, we make a thorough analysis of the dataset, apply Spearman correlation analysis to verify the associations between the factors and correction effectiveness, and build the regression model to predict the efficacy of corrections. The results are displayed and discussed in Section 5.
In this section, the comprehensive analysis of COVID-19-related corrections and the experiments with the regression model are presented and discussed.
Determinants for Misinformation Correction Effectiveness. This part demonstrates the determinants governing correction efficacy and examines the veracity of the hypotheses. Subsequently, we verify the effects of the factors on the various topics of corrections. Finally, the influence of the users involved in the dissemination of corrections is further explored. The results of the Spearman correlation analysis are shown in Table 4; the details and alternative explanations are discussed below.
Mentioning excessive original misinformation in corrections does not undermine people's belief within a short period after reading. In Table 4, the relation between this factor and correction effectiveness is not statistically significant, so hypothesis H1 is not supported. This reveals that excessive misinformation in a correction post does not reinforce the public's misperception.
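The three quantities used throughout the analysis (MAE, R², and the Spearman rank correlation) are standard and can be computed from their definitions; this is a minimal illustrative implementation, ignoring tie correction in the Spearman formula:

```python
def mae(y_true, y_pred):
    """Mean Absolute Error: average deviation of predictions from annotations."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def spearman(x, y):
    """Spearman rank correlation via 1 - 6*sum(d^2)/(n(n^2-1)); no tie handling."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

print(spearman([1, 2, 3, 4], [4, 3, 2, 1]))   # -> -1.0 (perfectly inverted ranks)
```

Because Spearman correlation depends only on ranks, it is well suited to the mix of binary, proportional, and heavy-tailed features (e.g., publisher influence) in this dataset.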
Since our dataset is based on people's instant reactions, we speculate that people can, at least temporarily, clearly recognize the misinformation after reading corrections. Therefore, excessive original misinformation in correction posts cannot interfere with people's instant judgments of the veracity of information.
Concise corrections are more effective; a length of fewer than 500 words is recommended. Table 4 shows that post length has a negative relation to correction effectiveness, which supports hypothesis H2. Fig. 2(a) visualizes the negative relationship between the length of the post and correction effectiveness on a log scale (the right and top histograms show the distribution of effectiveness on a log scale and the distribution of post length, respectively). Moreover, the lengths of current correction posts are mainly distributed in [0, 400] words, and correction effectiveness drops sharply once the length exceeds 500. Therefore, we recommend keeping posts under 500 words.
Textual explanations seem to fail to improve corrections. H3 argued that explanations of why the misinformation is wrong or how it started to spread, especially graphic ones, could correct people's misperceptions. The results in Table 4 suggest that textual explanations do not significantly affect correction effectiveness, while graphic explanations have a small positive effect, so H3 is only partially supported. This finding runs against the common belief that an alternative explanation makes corrections more plausible. Moreover, graphic explanations are underestimated and rarely used (see Fig. 3); concise and persuasive graphic explanations deserve more attention.
Warnings in a tough tone make corrections worse. Surprisingly, both textual and graphic warnings of misinformation have negative associations with correction effectiveness.
This observation contradicts hypothesis H4, even though it is commonly believed that warnings of false information make people more aware of bias and encourage critical thinking. Nowadays, mainstream media often use eye-catching warnings in correction posts to attract attention (see Fig. 3), such as "xxx is wrong!" or "xxx is not real!". These warnings are demanding in tone and carry strong negative emotions, so they more easily trigger the Disconfirmation Bias (Nyhan and Reifler, 2010), which contributes to the backfire effect of corrections. Disconfirmation Bias means that when people encounter arguments challenging their worldview, their prior beliefs do not change and may even strengthen as they counter-argue. Hence, to avoid this backfire effect, corrections should be presented gently and persuasively rather than in a tough and forceful tone.
The influence of publishers improves corrections significantly. Finally, as hypothesis H5 assumed, the publisher's influence relates positively and significantly to correction effectiveness. Fig. 2(b) presents the combined effects of the influence of the publisher and the length of the post on correction efficacy (the color and size of the data points represent correction effectiveness on a log scale). It shows that, as influence increases, effectiveness shifts toward higher values. Therefore, influential media should take more responsibility for combating misinformation.
Influencing factors of correction effectiveness on different themes. In Section 4.1, we classified corrections into four types based on misinformation topics, namely "control measures in the epidemic," "situation of the epidemic," "medical knowledge," and "supply and safety" (Fig. 4). Subsequently, the effects of the factors on each type were verified by correlation analysis.
The results in Table 5 suggest that the effective factors vary across themes of corrections. For "situation of the epidemic," the effective factors are the same as in the analysis of all corrections (referred to as the "overall result" below). However, for "medical knowledge," the only influential factor is the influence of the publisher (0.391**, 0.000). The reason may be that corrections in "medical knowledge" contain more complex knowledge and are more difficult to understand than those in other themes, so the influencing factors for this type could be more complicated and require further exploration. Finally, for "control measures in the epidemic" and "supply and safety," the textual warnings of misinformation and the influence of the publisher have effects consistent with the overall results, but the length of the post, graphic warnings, and graphic explanations have no influence. Surprisingly, the proportion of original misinformation has negative and significant effects on these two types (see Fig. 4). Therefore, different topics of events call for different optimal formats of corrections.
A further exploration of the users involved in the dissemination of corrections was conducted to identify their impacts on correction efficacy. The correction posts were ranked by effectiveness and divided into two equal parts: the more effective Part A and the less effective Part B. The distribution of social interactions (i.e., "retweet"; "support," meaning leaving supportive comments under the post; and "criticize," the opposite of "support") is presented in Fig. 5(a). The total number of users involved in Part A is approximately 13 times that in Part B, which may be one reason why Part A is more effective. Social media influencers also participate much more in Part A, which may further contribute to its better effectiveness.
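The Part A / Part B comparison above amounts to a median split on effectiveness followed by tallying interactions in each half. A minimal sketch, with hypothetical toy posts and counts (the real dataset holds 1487 annotated posts):

```python
def split_by_effectiveness(posts):
    """Rank posts by correction effectiveness and split into equal halves:
    Part A (more effective) and Part B (less effective)."""
    ranked = sorted(posts, key=lambda p: p["effectiveness"], reverse=True)
    mid = len(ranked) // 2
    return ranked[:mid], ranked[mid:]

def total_interactions(part, kinds=("retweet", "support", "criticize")):
    """Sum the interaction counts of the given kinds across one part."""
    return sum(p.get(k, 0) for p in part for k in kinds)

# toy posts with made-up interaction counts, for illustration only
posts = [
    {"effectiveness": 9.0, "retweet": 300, "support": 40, "criticize": 2},
    {"effectiveness": 7.5, "retweet": 150, "support": 25, "criticize": 1},
    {"effectiveness": 2.0, "retweet": 10, "support": 3, "criticize": 5},
    {"effectiveness": 1.0, "retweet": 5, "support": 1, "criticize": 4},
]
part_a, part_b = split_by_effectiveness(posts)
print(total_interactions(part_a), total_interactions(part_b))  # -> 518 28
```

Comparing per-kind totals (rather than the grand total) is what reveals the retweet-versus-comment pattern among influencers noted next.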
It is also observed that influencers tend to retweet rather than leave comments.
This subsection presents the experimental results of the regression model and validates its sufficiency through feature importance. Based on the above analysis, the category of a correction is associated with its effectiveness. Accordingly, the input of the model contains the category and the significant factors examined in Section 5.1 (i.e., length of the post, graphic explanation, textual warning of misinformation, graphic warning of misinformation, and influence of the publisher). During data processing, we apply a Leave-One-Out encoder to the category variable, which avoids the feature sparsity caused by one-hot encoding; furthermore, the other variables are normalized. We implemented the regression models to predict the efficacy of corrections, as described in Section 4.3. The regression modeling results are presented in Table 6, with each regressor tuned to its optimal parameters. As Table 6 shows, SVR underfits and KNN overfits. The performance of XGBoost and Random Forest is comparable, with MAE and R² far better on both the training and testing sets. We further compare the feature importance of these two best-performing regression models in Table 7. In XGBoost, the influence of the publisher, textual warnings of misinformation, and the category of misinformation clearly have significant impacts on the prediction of correction effectiveness, while the length of the post and graphic explanation contribute relatively less; this captures the majority of associations found in the subsection "Effective influencing factor" of Section 5. The feature importance of Random Forest reflects excessive reliance on the influence of the publisher, which hinders the model from learning the actual relationships among the variables.
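Leave-one-out target encoding, as used above for the category variable, replaces each sample's category with the mean target of all other samples in that category, avoiding both the sparsity of one-hot vectors and direct target leakage. A minimal sketch (the paper presumably uses a library encoder; this stdlib version shows the idea):

```python
from collections import defaultdict

def leave_one_out_encode(categories, targets):
    """Encode each sample's category as the mean target of the *other*
    samples sharing that category; singleton categories fall back to the
    global mean. Targets here would be correction-effectiveness scores."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for c, t in zip(categories, targets):
        sums[c] += t
        counts[c] += 1
    global_mean = sum(targets) / len(targets)
    encoded = []
    for c, t in zip(categories, targets):
        if counts[c] > 1:
            encoded.append((sums[c] - t) / (counts[c] - 1))  # exclude the sample itself
        else:
            encoded.append(global_mean)
    return encoded

cats = ["medical", "medical", "supply", "supply", "supply"]
eff = [2.0, 4.0, 1.0, 3.0, 5.0]
print(leave_one_out_encode(cats, eff))  # -> [4.0, 2.0, 4.0, 3.0, 2.0]
```

Excluding the sample's own target from its encoding is what distinguishes leave-one-out encoding from plain mean (target) encoding and reduces overfitting on small categories.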
XGBoost thus outperforms the other regression models and learns the actual relations between the factors and correction effectiveness.
This study sheds light on improving correction effectiveness on social media in several ways. Firstly, it provides manual labeling and evaluation of 1487 COVID-19-related corrections based on seven features, yielding a dataset for researchers interested in the mechanisms of misinformation correction on social media and filling a gap in the field. Researchers can use this dataset to train advanced machine learning models that automatically predict correction effectiveness, and the features can be expanded as needed to further explore the determinants of correction effectiveness. Secondly, the findings validate related theories, demonstrate the factors associated with correction effectiveness on current social media, and offer insightful, practical suggestions. Compared to previous studies, this study provides evidence for the overkill effect (Schwarz et al., 2007; Lewandowsky et al., 2012; Ecker et al., 2019) and reminds researchers to reconsider the applicability of the familiar backfire effect (Lewandowsky et al., 2012; Ecker et al., 2017, 2020) and the mental model of misinformation (Ecker et al., 2010, 2011) to current social media. It also reminds content publishers to attend to the intrinsic psychological factors that interfere with people's adoption of corrections, rather than focusing on catching eyes, e.g., with strong and explicit warnings. Moreover, social platforms can utilize these findings to promote the credibility of corrections published by influential users: according to our findings, the publisher's influence has a significant impact on correction effectiveness, so it could help if platforms emphasized publishers' influence with ratings.
Thirdly, this study proposes a regression model that predicts correction effectiveness from the basic features of corrections, which can guide practitioners in revising their corrections before publishing and lead to ideal efficacy.
This work also has limitations. First, more information about users could be incorporated, e.g., the content they publish and the posts they interact with; however, these tasks might be challenging due to the availability of data. Second, the factors associated with correction efficacy may change when effectiveness is considered as a temporal sequence. In addition, because our work relies on 1487 correction posts, the dataset could be expanded to reach more universal conclusions.
Based on the analysis of the COVID-19-related dataset, we demonstrate five findings on the good practice of effective corrections. First, mentioning excessive original misinformation in corrections does not undermine people's belief within a short period after reading. Second, concise corrections of fewer than 500 words are more effective. Third, persuasive graphic explanations deserve more attention. Fourth, corrections should be presented gently and persuasively, not in a tough and forceful tone. Fifth, influential media should take more responsibility for combating misinformation. In summary, we call for corrections that are short, concise, persuasive, rich in graphics, gentle in tone, and published by influential users. We also reveal that the effects of the influencing factors vary across topics, so the topic should be taken into account when debunking. Finally, we build the regression model to predict the effectiveness of corrections and show that it learns the actual associations between the basic features of a correction and its effectiveness. This model can guide practitioners in revising their corrections before publishing to achieve effective corrections.
References
- The Debunking Handbook
- Failure to accept retractions: A contribution to the continued influence effect
- Healthcare professionals' acts of correcting health misinformation on social media
- See something, say something: Correction of global health misinformation on social media
- The causes and consequences of COVID-19 misperceptions: Understanding the role of news and social media
- Limiting the spread of misinformation in social networks
- Demographics and topics impact on the co-spread of COVID-19 misinformation and fact-checks on Twitter
- Persuasion strategies of misinformation-containing posts in the social media
- The effects of message order and debiasing information in misinformation correction
- Comparing the use of open and closed questions for web-based measures of the continued-influence effect
- Reminders and repetition of misinformation: Helping or hindering its retraction
- Terrorists brought down the plane! No, actually it was a technical fault: Processing corrections of emotive information
- Refutations of equivocal claims: No evidence for an ironic effect of counterargument number
- Explicit warnings reduce but do not eliminate the continued influence of misinformation
- The effectiveness of short-format refutational fact-checks
- The future of false information detection on social media: New perspectives and trends
- Influence blocking maximization in social networks under the competitive linear threshold model
- Communication and persuasion
- Partisan perceptual bias and the information environment
- Sources of the continued influence effect: When misinformation in memory affects later inferences
- Says who? How news presentation format influences perceived believability and the engagement level of social media users
- UN Chief Antonio Guterres: Misinformation about COVID-19 is the new enemy
- Misinformation and its correction: Continued influence and successful debiasing
- COVID-19 infodemic on Chinese social media: A 4P framework, selective review and research directions
- Who is the celebrity endorser? Cultural foundations of the endorsement process
- Analysis of misinformation containment in online social networks
- When corrections fail: The persistence of political misperceptions
- The causes, impacts and countermeasures of COVID-19 "infodemic": A systematic review using narrative synthesis
- Correcting misinformation in news stories: An investigation of correction timing and correction durability
- Mitigating misinformation in online social network with top-k debunkers and evolving user opinions
- Metacognitive experiences and the intricacies of setting people straight: Implications for debiasing and public information campaigns
- The continued influence of misinformation in memory: What makes a correction effective? In Psychology of Learning and Motivation
- Studying fake news via network analysis: Detection and mitigation
- Temporal influence blocking: Minimizing the effect of misinformation in social networks
- Motivated skepticism in the evaluation of political beliefs
- Belief echoes: The persistent effects of corrected misinformation
- On misinformation containment in online social networks
- The landscape model of reading: Inferences and the online construction of a memory representation. In The Construction of Mental Representations During Reading
- Correction as a solution for health misinformation on social media
- Editing episodic memory following the identification of error
- Scalable influence blocking maximization in social networks under competitive independent cascade models
- Dynamic node immunization for restraint of harmful information diffusion in social networks
- Containment of rumor spread in complex social networks
- Minimizing the spread of misinformation in online social networks: A survey
- Rumor correction maximization problem in social networks
- Characterizing the dissemination of misinformation on social media in health emergencies: An empirical study based on COVID-19
- Users' ability to perceive misinformation: An information quality assessment approach