key: cord-0281324-f68msmxt authors: Rovetta, A. title: Joinpoint Regression to Determine the Impact of COVID-19 on Mortality in Europe: A Longitudinal Analysis From 2000 to 2020 in 27 Countries date: 2022-01-21 journal: nan DOI: 10.1101/2022.01.19.22269576 sha: dc52e5c9d21755180f52f2d552ce1761a0bf505b doc_id: 281324 cord_uid: f68msmxt
The novel coronavirus disease 2019 (COVID-19) represented the most extensive health emergency in human history. However, to date, there is still considerable uncertainty about the exact death toll of the pandemic. In particular, the number of official deaths could be vastly underestimated. Despite this, many conspiracy theorists speculate that COVID-19 is not a dangerous disease. Therefore, in this manuscript, we use joinpoint regression analysis to estimate the impact of COVID-19 in 27 European countries by comparing annual mortality trends from 2000 to 2020. Furthermore, we present the evidence in a form accessible to a non-expert audience. In conclusion, our results estimate that COVID-19 increased the overall mortality in Europe by 10% (P < .001). In 16 out of 27 countries (59.3%), the excess mortality ranged from 7.4% to 18.5% (all P < .007). Comparison of the observed mortality distribution with the null counterfactual showed that the mortality increase was highly significant across Europe, even considering only the nations with minor 2020 increases (highest P = .014, lowest mean = 2.4).
Coronavirus disease 2019 (COVID-19) is an infectious disease transmissible from human to human that reached pandemic scale in early March 2020, causing more than 5.5 million official deaths worldwide in less than two years [1]. The damage caused by this new coronavirus has been catastrophic, but the harms in the counterfactual scenario without non-pharmaceutical measures, therapies, and vaccines would have been even more dramatic [2, 3].
Furthermore, the official number of COVID-19 deaths is plausibly an underestimate of the real figure, given the poor testing capabilities in the initial phase [4, 5]. For this reason, scientists have begun comparing mortality statistics from previous years with current ones to highlight the actual epidemiological impact of COVID-19. Nevertheless, such a comparison is far from simple. Specifically, real time series can present critical issues such as trends, seasonalities, and level shifts. Therefore, direct comparisons between pandemic data and averages of previous years can be improper or even misleading. A further problem is trend estimation: determining where one trend ends and the next begins requires a large number of iterations to find the statistically best fit. In this regard, the Division of Cancer Control and Population Sciences of the National Institutes of Health has developed free software, called Joinpoint, to search for the best linear subtrends within a time series [6]. In this paper, Joinpoint was used to compare the annual mortality rates of 27 European countries before and after COVID-19 (from 2000 to 2020). The purpose of the manuscript is to provide epidemiologically relevant data and conclusive proof of the danger of COVID-19. Indeed, various conspiracy hypotheses have argued that a significant number of patients died from other causes, even if they tested positive for the novel coronavirus [7]. Since risk perception is strongly influenced by how information and images are presented, a simple and intuitive figure has been developed to show the results to the lay public [8]. Finally, although more detailed surveys (e.g., stratified by age groups and periods) have been conducted, a more straightforward approach can provide clearer evidence and require fewer assumptions, reducing the likelihood of interpretative errors.
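To make the trend-estimation problem concrete, the following is a minimal sketch of a one-joinpoint search. It is not the algorithm implemented in the Joinpoint software: it simply tries each admissible year as a candidate breakpoint, fits a continuous two-segment linear model by least squares, and keeps the candidate with the smallest residual sum of squares. The function name `fit_one_joinpoint`, the `min_pts` parameter, and the synthetic series are illustrative assumptions and do not come from the analyzed data; choosing the number of joinpoints, which the actual software also handles, is not attempted here.

```python
import numpy as np


def fit_one_joinpoint(years, rates, min_pts=3):
    """Grid search for one joinpoint in a continuous piecewise-linear fit.

    Hypothetical helper for illustration only.
    Returns (joinpoint_year, coefficients, residual_sum_of_squares).
    """
    x = np.asarray(years, dtype=float)
    y = np.asarray(rates, dtype=float)
    best = None
    # Candidate joinpoints are observed years; each segment keeps >= min_pts points
    # (the joinpoint itself is shared by both segments).
    for k in range(min_pts - 1, len(x) - min_pts + 1):
        tau = x[k]
        # Columns: intercept, baseline slope, change in slope after the joinpoint.
        X = np.column_stack([
            np.ones_like(x),
            x - x[0],
            np.maximum(x - tau, 0.0),
        ])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ coef) ** 2))
        if best is None or rss < best[2]:
            best = (tau, coef, rss)
    return best


# Synthetic, noise-free series (assumed for illustration): the rate declines
# until 2010 and then rises.
years = np.arange(2000, 2020)
rates = np.where(years <= 2010,
                 10.0 - 0.1 * (years - 2000),
                 9.0 + 0.2 * (years - 2010))

tau, coef, rss = fit_one_joinpoint(years, rates)
print("estimated joinpoint:", tau)
print("intercept, slope, slope change:", coef.round(3))
```

On real data, the residual sum of squares varies with noise, so the selected year should be read as an estimate and checked graphically, as is done for the last subtrend in this paper.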
Annual mortality data from 2000 to 2020 for all European countries were collected from the website "The World Bank" and downloaded in ".xls" format [9]. Mortality for 2020 was obtained from the "Eurostat" website [10]. The time series were plotted for an initial check for normality and for the absence of marked outliers and heteroskedasticity. After that, the "Joinpoint" software, provided by the "Division of Cancer Control & Population Sciences" of the National Institutes of Health (NIH), was adopted to break the time series into linear subtrends [6]. The last subtrend found was then analyzed and confirmed by a graph check and a linear regression analysis performed with the "XLSTAT" tool for Microsoft Excel v.2112 [11]. In particular, the tool automatically performs quantitative checks of the linear regression assumptions. Finally, the residual between the model's prediction for 2020 and the observed 2020 value was calculated for each time series. Grubbs' test was applied to verify whether the 2020 residual fell outside the distribution of previous residuals [12]. Welch's t-test was used to compare two mortality distributions: the observed one and the counterfactual centered at 0. The shape of the two distributions was assumed to be the same.
Joinpoint regression. Most of the settings were left at their defaults. The changed settings are specified below: Type of variable = Crude Rate (Death rate), Log transformation = No {y=xb}, Independent Variable = Year. In some cases, denoted with *, we forced the model to introduce at least one joinpoint to better fit the last values of the time series (** was used for two joinpoints). All the graphs of these analyses are reported as supplementary material to allow the reader an independent evaluation [13]. The acronym JPR-i indicates a model with i joinpoints. The absence of a trend is indicated with NT.
Linear regression.
Ordinary least squares linear regression from the XLSTAT package was used to model the annual mortality trend from the last joinpoint through 2019. The standard assumptions of the model (i.e., normality of residuals, absence of outliers, and homoscedasticity) were automatically verified by Shapiro-Wilk and F-tests. The residual of 2020 is denoted by r_20, and the distribution of residuals from the last joinpoint up to 2019 by δ. The entire analysis is available as a supplementary Excel file to allow the reader an independent evaluation [13]. P-values. P-values were used as graded measures of the strength of evidence against the null hypothesis. Therefore, we did not adopt dichotomous significance thresholds. However, we divided the degrees of significance into low (P>.300), medium (.100