key: cord-0692301-qn8p7zlq
authors: Tattan‐Birch, Harry; Marsden, John; West, Robert; Gage, Suzanne H.
title: Assessing and addressing collider bias in addiction research: the curious case of smoking and COVID‐19
date: 2020-11-23
journal: Addiction
DOI: 10.1111/add.15348
sha: 56c00b29d462554f24cbedc0bd0bd32a05461367
doc_id: 692301
cord_uid: qn8p7zlq

Addiction researchers tend to have a keen eye for confounder bias. They are often quick to spot why variables that cause both an exposure and an outcome can produce spurious associations. A researcher who is studying possible gateway effects (for example, the likelihood of cocaine use among people with prior exposure to cannabis) will take care to control for confounding variables, such as the personality trait 'novelty seeking'. In this simulated example, novelty seeking is a cause of cannabis use and also a cause of cocaine use (1). Unless controlled for, confounders may suggest a causal association between cannabis and cocaine use where none exists.

However, there is another source of bias that is often neglected: collider bias. Collider bias can be seen as the flip side of confounder bias, but it is much less intuitive. Whereas confounders cause both exposures and outcomes, colliders are caused by both exposures and outcomes (see the directed acyclic graphs in figure 1). Whereas controlling for a confounder removes bias, controlling for a collider can produce it.

So, when do confounder and collider bias occur? Confounder bias occurs when an analysis fails to adequately control for a variable, a 'confounder', that is a cause of both the exposure and the outcome. The effect of this is to distort the association between exposure and outcome. In the above example, novelty seeking (confounder) causes both cannabis use (exposure) and cocaine use (outcome). This means that people who use cannabis have, on average, higher levels of the novelty seeking trait than those who do not.
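
The confounding structure in the gateway example can be sketched as a small simulation. This is an illustrative sketch only, not an analysis from the article: the continuous 'liability' scores, the effect sizes, the sample size and the function names `pearson` and `simulate_confounding` are all assumptions made for the example. Cannabis use is given no effect on cocaine use, so any association between them comes from novelty seeking alone.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def simulate_confounding(n=20000, seed=0):
    """Return (unadjusted r, r adjusted for novelty seeking)."""
    rng = random.Random(seed)
    novelty = [rng.gauss(0, 1) for _ in range(n)]
    # Assumption: novelty seeking raises liability to use both drugs;
    # cannabis has NO effect on cocaine in this simulated world.
    cannabis = [z + rng.gauss(0, 1) for z in novelty]
    cocaine = [z + rng.gauss(0, 1) for z in novelty]

    r_xy = pearson(cannabis, cocaine)
    r_xz = pearson(cannabis, novelty)
    r_yz = pearson(cocaine, novelty)
    # Partial correlation: the cannabis-cocaine association that remains
    # once novelty seeking is held constant.
    r_adjusted = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return r_xy, r_adjusted


if __name__ == "__main__":
    unadjusted, adjusted = simulate_confounding()
    print(f"unadjusted r = {unadjusted:.2f}, adjusted r = {adjusted:.2f}")
```

Under these assumptions the unadjusted correlation should be clearly positive, while the correlation adjusted for novelty seeking should sit close to zero.
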
As high novelty seeking also causes people to use cocaine, there will be a positive association between use of both drugs even if cannabis use does not itself cause subsequent cocaine use. To determine whether cannabis use does cause cocaine use, the researcher must control for novelty seeking by adding it as a covariable in the analysis. This will show whether people who use cannabis are more likely to subsequently use cocaine than those who do not, even when they have the same level of novelty seeking. Controlling for the confounder removes the bias.

By contrast, collider bias occurs when an analysis controls for, stratifies on, or selects its sample based on a variable, a 'collider', that is caused by the exposure and also caused by the outcome (2, 3). This distorts the association between the exposure and outcome. For example, suppose a researcher is interested in testing whether depression (the exposure) is associated with impulsivity (the outcome). Let us assume that, in this simulated example, there is no association between depression and impulsivity in the general population. However, depression and impulsivity both increase the likelihood of a person using opioids, so opioid use is a collider. Controlling for opioid use in the analysis would introduce a negative association between depressive symptoms and impulsivity. It would make it appear that depression causes people to become less impulsive, or that impulsivity causes people to become less depressed.

Collider bias occurs not only when a collider is added as a covariable, but also when the sample is selected (or stratified) based on a collider. This is often called 'selection bias'. This selection process happens frequently in addiction research.
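
The depression and impulsivity example can be sketched in the same way. Again, this is only an illustration under stated assumptions: the standard-normal trait scores, the threshold rule for opioid use and the names `pearson` and `simulate_collider` are invented for the example and do not come from the article. Depression and impulsivity are generated independently, so any association within a stratum of opioid use is produced by conditioning on the collider.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def simulate_collider(n=20000, seed=1):
    """Return correlations in (whole population, opioid users, non-users)."""
    rng = random.Random(seed)
    # Assumption: the two traits are independent in the general population.
    depression = [rng.gauss(0, 1) for _ in range(n)]
    impulsivity = [rng.gauss(0, 1) for _ in range(n)]
    # Assumption: both traits (plus unrelated noise) raise the chance of
    # opioid use, which makes opioid use a collider.
    uses_opioids = [
        d + i + rng.gauss(0, 1) > 1.5
        for d, i in zip(depression, impulsivity)
    ]

    def r_within(flag):
        # Correlation restricted to people whose opioid-use status == flag.
        dep = [d for d, u in zip(depression, uses_opioids) if u == flag]
        imp = [i for i, u in zip(impulsivity, uses_opioids) if u == flag]
        return pearson(dep, imp)

    return pearson(depression, impulsivity), r_within(True), r_within(False)


if __name__ == "__main__":
    r_all, r_users, r_non_users = simulate_collider()
    print(f"population r = {r_all:.2f}")
    print(f"opioid users r = {r_users:.2f}, non-users r = {r_non_users:.2f}")
```

With these settings the population correlation should be near zero, while the correlation within each opioid-use stratum should be negative, most clearly among people who use opioids.
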
Using our previous example, selecting a sample of only people who use opioids would produce a negative association between depressive symptoms and impulsivity, where none existed in the general population (see figure 2). Another way to think about this is the following: if you know someone uses opioids, but they have no depressive symptoms, something else must have caused them to start. Therefore, they may be more likely to be impulsive. Conversely, people who are depressed may use opioids even if they are not very impulsive. Hence the relationship shown in figure 2.

Recently, it has been suggested that collider bias could be a particular problem in research investigating whether smoking may protect against contracting COVID-19 (4). In many countries, people who develop a cough are advised to get tested for COVID-19. However, both smoking and COVID-19 can cause coughing. Smokers who develop a smoking-related cough may seek out a test even when they do not have COVID-19. This would lead smokers to be over-represented among those who test negative for COVID-19, inducing a negative association between smoking and COVID-19 where none really exists. This is because only those who are tested are included in the sample, and being tested is the result of the collider 'having a cough'. Thus, selecting a sample of those who are tested for COVID-19 is equivalent to conditioning on this collider.

Results from a recent systematic review support the above interpretation: compared with people who have never smoked, smokers were more likely to be tested for COVID-19, but less likely to test positive (5). It should be noted that there is some evidence from sources that are unaffected by collider bias, such as seroprevalence studies (6, 7), that smokers have lower risk of contracting COVID-19. Nonetheless, smokers may have
worse outcomes when hospitalised (5), so the overall effect of smoking could be negative even if it protects against infection.

Collider bias has previously been implicated in seemingly protective effects of smoking. For example, a recent paper explored the impact of collider bias in studies examining how smoking affects birth defects (8). The authors found that, in a sample where only live births were included, children of smokers were less likely to exhibit the birth defect anencephaly relative to children of non-smokers. Smoking (the exposure) and anencephaly (the outcome) both reduce the likelihood that a pregnancy will end in a live birth (the collider). So, only including live births in a sample introduces collider bias, which inflates the negative association between smoking and anencephaly. This could lead someone unaware of collider bias to erroneously conclude that smoking protects against anencephaly. The authors also included results from pregnancies that did not result in live births. When they did so, the association weakened towards the null, which lends support to the hypothesis that the association is due to collider bias rather than representing causality.

What should addiction researchers do to address collider bias? Firstly, we must think carefully about the causal relationships between variables in our studies. This can be done by drawing simple causal graphs, such as those in figure 1. Even if we do not present these graphs in our manuscripts, having a better picture of the causal relationships between variables may help us spot confounders and colliders that we would have otherwise missed.

Secondly, we can design studies in a way that mitigates the impact of collider bias. One option is to weight participants to make the sample more representative of the underlying population, with the aim of removing or minimising the impact of biases in the selection of the sample.
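
The weighting idea can be illustrated with a toy version of the smoking and testing example. Everything numerical here is an assumption made for the sketch (the prevalences, the testing probabilities and the names `odds_ratio` and `simulate_weighting`), and none of it is an estimate from the literature. Smoking and infection are generated independently, so the true odds ratio is 1. Each tested person is then weighted by the inverse of their probability of being tested. In a real study that probability is unknown and would itself have to be estimated, which is one limitation of the approach.

```python
import random


def odds_ratio(counts):
    """Odds ratio from a dict mapping (exposed, outcome) to a count."""
    return (counts[(1, 1)] * counts[(0, 0)]) / (counts[(1, 0)] * counts[(0, 1)])


def simulate_weighting(n=200000, seed=3):
    """Return (naive OR among the tested, inverse-probability-weighted OR)."""
    rng = random.Random(seed)
    naive = {(e, o): 0.0 for e in (0, 1) for o in (0, 1)}
    weighted = dict(naive)
    for _ in range(n):
        # Assumption: smoking and infection are independent (true OR = 1).
        smoker = int(rng.random() < 0.20)
        infected = int(rng.random() < 0.10)
        # Assumption: both smoking and infection raise the chance of being
        # tested (for example via a cough), so testing acts as a collider.
        p_tested = 0.05 + 0.15 * smoker + 0.50 * infected
        if rng.random() < p_tested:
            naive[(smoker, infected)] += 1
            # Each tested person stands in for 1 / p_tested people
            # in the underlying population.
            weighted[(smoker, infected)] += 1 / p_tested
    return odds_ratio(naive), odds_ratio(weighted)


if __name__ == "__main__":
    naive_or, weighted_or = simulate_weighting()
    print(f"naive OR = {naive_or:.2f}, weighted OR = {weighted_or:.2f}")
```

Under these assumptions the naive odds ratio among tested people should fall well below 1, wrongly suggesting that smoking is protective, while the weighted odds ratio should be close to 1.
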
We can also use cross-contextual studies, where the same associations are explored in different settings with different underlying sample selection criteria. All of these methods have limitations, but crucially they have different limitations. By triangulating different study designs, a better estimate of the likelihood of various causal associations can be ascertained (9).

Thirdly, once a study has been conducted, we can look for indicators of collider bias. We can examine the demographics of the sample being analysed to identify whether particular groups are over- or under-represented. For example, the over-representation of smokers among those tested for COVID-19, combined with their lower rates of testing positive relative to never smokers, is indicative of collider bias. Similarly, we can use 'negative controls': variables that we have reason to assume should not be associated. If an association is found between these, then the risk of collider bias may be high (10).

Much like with confounding, it is often not possible to be sure that collider bias has been avoided or eliminated. However, with a better understanding of what causes it, and checks and balances to explore whether it could be present, we can be better placed to interpret surprising and implausible findings with appropriate caution and caveat.

1. Cannabis use and other illicit drug use: Testing the cannabis gateway hypothesis
2. Quantifying biases in causal models: Classical confounding vs collider-stratification bias
3. Causal diagrams for epidemiologic research
4. Collider bias undermines our understanding of COVID-19 disease risk and severity. medRxiv
5. The association of smoking status with SARS-CoV-2 infection, hospitalisation and mortality from COVID-19: A living rapid evidence review with Bayesian meta-analyses
6. Antibody prevalence for SARS-CoV-2 in England following first peak of the pandemic: REACT2 study in 100,000 adults. medRxiv
7. Seroprevalence of SARS-CoV-2 among adults in three regions of France following the lockdown and associated risk factors: a multicohort study
8. Quantification of selection bias in studies of risk factors for birth defects among livebirths
9. Triangulation in aetiological epidemiology
10. Causal Inference in Developmental Origins of Health and Disease (DOHaD) Research

Figure 1: Directed acyclic graph showing causal relationships between exposures, outcomes, and A) confounders and B) colliders

HTB holds a studentship that is funded by Public Health England (558585/180737).

This article is protected by copyright. All rights reserved.