title: Machine learning algorithms for suicide risk: a premature arms race?
author: Lennon, Jack C
date: 2020-10-01
journal: Gen Psychiatr
DOI: 10.1136/gpsych-2020-100269

Machine learning (ML) techniques 1 are becoming a major area of study in psychiatry, as ML possesses the theoretical capacity to draw conclusions from a broad range of data once the system has been trained through a process of trial and error. Specifically, and arguably most importantly, ML can do so more quickly and accurately than clinicians. 2 Several studies have demonstrated accuracy through the use of ML in various populations and geographic locations, further perpetuating the perceived need to incorporate ML into ongoing studies. However, there are many considerations that must be accounted for when discussing ML in the context of data collection and clinical implementation, and most of them have done little to thwart the ongoing pursuit of this type of research. Given the promises and overall potential of ML techniques, suicide is one global epidemic that could benefit greatly from its use, given repeated failed attempts at prevention. 3 First, however, how one defines and views suicide will determine one's perceptions of ML's ability to be used globally in present times. Second, the limitations of ML in the context of suicide are critical to understanding its utility in terms of both operation and timeliness. The use of ML techniques in suicide research is potentially premature, and thus risks allocating funding prematurely to a proposed solution to psychiatric translational issues. Torous and Walker 4 reported that ML can serve as a practical tool to augment current knowledge and assist in overcoming translational issues. However, there are several considerations specific to suicide that are neither discussed nor given commensurate attention.
While ML need not be a panacea to be of value, the underlying concern is that ML holds greater potential as a secondary measure to large-scale prospective studies than it does as a current psychiatric tool. Views on suicide require a substantial paradigm shift, 5 much as depression is in desperate need of reconsideration due to its biological and clinical heterogeneity. 6 Based on current diagnostic criteria, suicide is not viewed as a distinct disorder or trajectory, despite vast literature supporting differences between those who are depressed, those who ideate and those who die by suicide. 7 Instead, suicide is viewed as a cause of death: the result of brain dysfunction that may or may not have included depression. If this is the initial premise of one's syllogism, initial and ongoing conditions will hold less value than determining an ultimate outcome through ML, and ML could be as effective as it is in other disciplines, such as determining immune response or tumour growth. If a different initial premise is presented, such that suicide is its own discrete trajectory, 5 rather than a cause of death, differing from all disorders that either share its risk factors or serve as risk factors for it, then one can understand why ML requires significantly more data: the system needs to learn about initial conditions in order to speculate about conclusions. Studies that have developed risk profiles through ML continue to report that we are facing understudied risk factors 8 but, more importantly, we do not have the large, prospective datasets with which to use the risk factors of which we are aware.

RELEVANT LIMITATIONS OF MACHINE LEARNING

ML's limitations include, among others, external validity: generalisation to populations outside the specific dataset on which the algorithm was developed. Increased heterogeneity serves as a barrier to accuracy in ML, even within a given population, requiring large sample sizes. 9
While it could be claimed that each hospital system across the globe could develop its own ML algorithm to accurately predict who is at clinically significant risk of suicide, this would require a significant amount of funding and time and, further, would be bound to the conditions of the time at which the study is conducted. For example, ML algorithms developed during the coronavirus disease 2019 (COVID-19) pandemic will likely reflect only this time period and become obsolete over time. During such a period, grief and bereavement may bear significantly more weight than what we currently know about suicide decedents would suggest. Risk factors will likely remain the same during COVID-19, but the rapid and successive nature of personal and vicarious traumas will result in overleveraged data that are limited by more than the population: time itself will limit the utility of these procedures.

To speak of prematurity is not to suggest that research on suicide is premature; suicide is an overwhelming global health concern that is in desperate need of novel research and solutions. This is potentially what makes ML so convincing and appealing to researchers across disciplines. ML ostensibly allows for accuracy in determining who is at risk of suicide; how could this be a problem? It is not, in itself. The concern is that we are embarking on what may very well be the future of psychiatric decision-making with insufficient data. In doing so, we are expending more funding and a greater number of hours on developing ML algorithms that will become outdated. They will become outdated because current systems are not being taught with sufficient data points, which include many of the suicide risk factors we already know. We lack a dataset that incorporates all necessary components of suicide and that would therefore yield consistently viable ML algorithms.
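The external-validity concern above can be made concrete with a toy simulation. Everything here is invented for illustration (the single synthetic "risk marker", the two cohorts, the threshold classifier); it is a minimal sketch of covariate shift, not a model of any real suicide risk algorithm. A classifier fitted to one site's data performs well there and degrades when the same marker is distributed differently at a second site:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_cohort(n, mean_pos, mean_neg):
    """Simulate a one-feature cohort: half positive cases, half negative."""
    X = np.concatenate([rng.normal(mean_pos, 1.0, n // 2),
                        rng.normal(mean_neg, 1.0, n // 2)])
    y = np.concatenate([np.ones(n // 2), np.zeros(n // 2)])
    return X, y

# "Site A": the cohort on which the algorithm is developed.
X_train, y_train = make_cohort(2000, mean_pos=2.0, mean_neg=0.0)

# Simplest possible classifier: a threshold midway between the class means.
threshold = (X_train[y_train == 1].mean() + X_train[y_train == 0].mean()) / 2

def accuracy(X, y, t):
    return float(((X > t) == y.astype(bool)).mean())

acc_internal = accuracy(X_train, y_train, threshold)

# "Site B": the same construct, but the marker is distributed differently
# (covariate shift), so the learned threshold no longer separates classes.
X_ext, y_ext = make_cohort(2000, mean_pos=0.5, mean_neg=-1.5)
acc_external = accuracy(X_ext, y_ext, threshold)

print(f"internal accuracy: {acc_internal:.2f}")
print(f"external accuracy: {acc_external:.2f}")
```

The in-sample accuracy is high while the external accuracy drops substantially, despite both cohorts containing the same classes, which is precisely why site-specific algorithms do not transfer without revalidation.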
We continue to grapple with the ethical considerations of metadata, as well as the contents of said metadata, whether they are housed within electronic health records or elsewhere. These types of data and studies are necessary to consider one of the oft-forgotten and undervalued factors in suicide risk assessment: temporality. ML, including deep learning methods, must incorporate the timing of events, including the order of risk and protective factors in a given patient's life. The same holds for predicting or assessing recidivism risk or any other prospective claim. 10 The simple but extensive combination of events will not yield the same predictive results as algorithms that account for temporality, an area that remains understudied. 11 Thus, we simply lack the information necessary for adequate learning to occur within an artificial intelligence network.

ML techniques are likely the future of psychiatry and psychology, particularly in acute settings where difficult and time-sensitive decisions must be made. While efforts to develop ML algorithms that focus on specific populations within specific settings are both impressive and laudable, ML holds greater potential than the localised efforts in which it is currently being used. It appears that the mere sight of a light at the end of the tunnel, promising substantial improvements in reducing suicide deaths, blinds us to the rough terrain that must be traversed in the interim. As with an internet search, there must be data through which to search. In the case of suicide, whether the data originate from registries, electronic health records or other databases, these sources invariably lack what is necessary to be generalisable and sufficiently comprehensive in terms of relevant variables. Until we gather these data and make them available for inclusion in these critical algorithms, ML will make predictions but never reach its full potential.
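The temporality point above, that a simple combination of events is not equivalent to an ordered sequence of events, can be sketched minimally. The patient timelines and event labels below are hypothetical, invented purely for demonstration; they do not come from any real coding system or dataset:

```python
from collections import Counter

# Two hypothetical patient timelines containing the SAME events in different order.
patient_a = ["job_loss", "hospital_discharge", "therapy_started"]
patient_b = ["therapy_started", "job_loss", "hospital_discharge"]

# A "bag of events" representation cannot tell the two histories apart:
bag_a, bag_b = Counter(patient_a), Counter(patient_b)
print(bag_a == bag_b)   # True: temporality is lost

def ordered_pairs(events):
    """Encode which event preceded which -- a minimal temporal feature."""
    return {(a, b) for i, a in enumerate(events) for b in events[i + 1:]}

# A sequence-aware representation preserves ordering, so a model could in
# principle learn that a stressor occurring AFTER the last protective event
# carries a different weight than one occurring before it:
print(ordered_pairs(patient_a) == ordered_pairs(patient_b))  # False
```

An algorithm trained only on the unordered representation would assign these two patients identical risk, whatever the clinical difference between losing one's job after starting therapy and before it.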
Even more worrisome is the potential for researchers to misuse resources through premature efforts or, worse, to move past ML as a viable suicide prevention strategy in the future simply because it was never offered the opportunity to do fully what it is meant to do.

Contributors: JCL is responsible for the concept, drafting and revisions of this original submission.

Funding: JCL is supported by the Alfred Adler Scholarship.

Competing interests: None declared.

Provenance and peer review: Not commissioned; externally peer reviewed.

Open access: This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made are indicated, and the use is non-commercial.

Author biography (excerpt): Through these experiences, he has become particularly interested in the clinical utility of neuropsychological assessment in early detection and prediction of disease onset, including the extension of neuropsychology and deep learning approaches to suicide risk assessment in neurological and neuropsychiatric populations.
REFERENCES

1. Machine learning methods in psychiatry: a brief introduction.
2. Machine learning approaches for clinical psychology and psychiatry.
3. Trends in US suicide deaths.
4. Leveraging digital health and machine learning toward reducing suicide: from panacea to practical tool.
5. Toward a distinct mental disorder: suicidal behavior.
6. Heterogeneity in 10-year course trajectories of moderate to severe major depressive disorder: a Danish national register-based study.
7. Animal models to improve our understanding and treatment of suicidal behavior.
8. Prediction of sex-specific suicide risk using machine learning and single-payer health care registry data from Denmark.
9. Machine learning methods for developing precision treatment rules with observational data.
10. Using algorithms to address trade-offs inherent in predicting recidivism.
11. Action-informed artificial intelligence: matching the algorithm to the problem.