key: cord-0707464-iwovntth
authors: Abou Ghayda, Ramy; Lee, Keum Hwa; Han, Young Joo; Ryu, Seohyun; Hong, Sung Hwi; Yoon, Sojung; Jeong, Gwang Hun; Lee, Jinhee; Lee, Jun Young; Yang, Jae Won; Effenberger, Maria; Eisenhut, Michael; Kronbichler, Andreas; Solmi, Marco; Li, Han; Jacob, Louis; Koyanagi, Ai; Radua, Joaquim; Shinc, Jae Il; Smith, Lee
title: Estimation of global case fatality rate of coronavirus disease 2019 (COVID-19) using meta-analyses: Comparison between calendar date and days since the outbreak of the first confirmed case
date: 2020-09-01
journal: Int J Infect Dis
DOI: 10.1016/j.ijid.2020.08.065
sha: 6fbd939c0e9b1deb3d077fc874659643965b0c14
doc_id: 707464
cord_uid: iwovntth

OBJECTIVE: Since the outbreak of the coronavirus disease 2019 (COVID-19) in December of 2019 in China, the estimation of the pandemic’s case fatality rate (CFR) has been the focus and interest of many stakeholders. In this manuscript, we show that the method of using the cumulative CFR is static and does not reflect the trend according to the daily change per unit of time.
METHODS: A proportion meta-analysis was carried out on CFR in every country reporting COVID-19 cases. Based on the results, we performed a meta-analysis for global COVID-19 CFR. Each analysis was performed on two different calculations of CFR: according to calendar date and according to days since the outbreak of the first confirmed case. We thus explored an innovative and original calculation of CFR concurrently based on the date of the first confirmed case as well as on a daily basis.
RESULTS: For the first time, we showed, using meta-analyses, that the CFR estimates according to calendar date and according to days since the outbreak of the first confirmed case were different.
CONCLUSION: We propose that CFR according to days since the outbreak of the first confirmed case might be a better predictor of the current CFR of COVID-19 and its kinetics.
Since the outbreak of the coronavirus disease 2019 in December of 2019 in China, COVID-19 has spread worldwide (WHO, 2020a; Zhu et al., 2020). As of August 19th, 2020, 21,938,207 confirmed cases with 775,582 deaths were reported across the 216 affected countries, territories, or areas (WHO, 2020b). Among other clinical and epidemiologic features of the virus, estimating the mortality of this pandemic is crucial. The case fatality rate (CFR) is defined as the number of deaths with COVID-19 divided by the number of confirmed COVID-19 cases. CFR has been developed to understand the mortality and epidemiological features of emerging infectious diseases (Porta et al., 2008; Battegay et al., 2020), such as Severe Acute Respiratory Syndrome-coronavirus (SARS, CFR 9.6% on a global scale) (Donnelly et al., 2003) and Middle East Respiratory Syndrome-coronavirus (MERS, CFR 34.5%) (Fisman et al., 2014). To date, there have been many attempts to estimate the underlying "true CFR" of COVID-19 (Baud et al., 2020; Wilson et al., 2020; Rajgor et al., 2020; Kim et al., 2020; Spychalski et al., 2020; Lipsitch et al., 2020). However, these CFR estimates are not without limitations. They need to be treated with extreme caution because each region of the world is experiencing a different stage of the pandemic. In addition, CFR is contingent on many other factors, including the extent of case detection and the efficiency of testing, local health and pandemic response policies, and the condition and inclusiveness of the existing health systems. Failing to consider these factors and simply dividing the cumulative deaths with COVID-19 by the cumulative confirmed cases from the latest available global statistics will inevitably distort the CFR at each stage of COVID-19 in an unknown direction, and will fail to reveal the true dynamics of the CFR of the disease.
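As a concrete illustration of the cumulative CFR defined above, the following minimal sketch applies the definition to the WHO totals quoted for August 19th, 2020. The function name `case_fatality_rate` is a hypothetical helper introduced only for this example.

```python
def case_fatality_rate(deaths: int, confirmed: int) -> float:
    """Cumulative CFR in percent: deaths divided by confirmed cases, times 100."""
    if confirmed <= 0:
        raise ValueError("confirmed must be a positive count")
    return 100.0 * deaths / confirmed


# WHO totals quoted in the text for August 19th, 2020.
global_cfr = case_fatality_rate(deaths=775_582, confirmed=21_938_207)
print(f"Cumulative global CFR: {global_cfr:.2f}%")
```

This single number, roughly 3.5%, is exactly the kind of static summary that the rest of the paper argues is insufficient, since it hides how the ratio changes over time and between countries.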
In addition, several previously published papers (Yang et al., 2020; Öztoprak et al., 2020) suggested that models using CFR should be based on the cumulative confirmed cases and deaths with a simple linear regression analysis. However, this method of using the cumulative number is static and does not reflect the trend according to the daily change per unit of time. Additionally, it prevents an exact estimation of CFR because the number of confirmed cases and the onset time of the first case vary by country, and even between regions of the same country. Therefore, to get as close as possible to a real estimate, we calculated the CFR of each country concurrently based on the date of the first confirmed case and on a daily basis. Proportion meta-analyses were performed to obtain the average CFR for each day, from the date of the first confirmed case to the present, stratified by country. We thus present unique CFR dynamics obtained by correcting and theoretically circumventing the bias created by the fact that each country is facing a different stage of the pandemic. This approach to the CFR provides a new insight that lays the foundation for a proper analysis of CFR. One caveat that we acknowledge is that many potentially positive cases that were not tested might act as confounding variables, skewing our results in a specific direction. At this point, it is impossible to account for the totality of COVID-19 cases (tested and not tested), and this calculation is outside the scope of this study.

(Higgins et al., 2003). An I² value below 50% represented low or moderate heterogeneity, while I² > 50% represented high heterogeneity (Higgins et al., 2003). For graphing the patterns of CFR in all countries, RStudio version 1.3.1073 was used. We used a weighted average to ensure the precision of the overall estimates, i.e. having the smallest possible variance and standard deviation.
In our study, we used inverse variance weights in order to give less weight to the noisier and less relevant data. The weight for each estimate was proportional to the inverse of its variance, ensuring that larger countries with a high number of infections and deaths have more weight than countries with few cases and noisy data.

Figure 1A and 1B present the following data over time: the fixed- and random-effect model results of the meta-analysis, the pooled estimate, and the number of total cases included in each analysis. Comparing the figures that show the time trend of CFR stratified by the two methods, the CFRs calculated by each sorting method visibly followed different trends over time: Figure 1A (CFR stratified by calendar date) vs. Figure 1B (CFR stratified by days since the first confirmed case). In both figures, results from the random- and fixed-effect models were initially almost identical; after they diverged, however, the fixed-effect estimates were similar to the pooled estimates while the random-effect estimates were smaller. One possible explanation for the random CFR estimates being lower than the fixed estimates is that, in the fixed-effect model, less weight is given to countries with a small number of confirmed cases than to countries with a high number of cases. For example, the United States has 5,141,207 confirmed cases and 164,537 deaths compared to South Korea, with 14,714 confirmed cases and 305 deaths. Similar to Figure 1A, Figure 1B shows the initial phase (phase 1), phase 2 in which pooled and fixed estimates increase rapidly and the random estimates increase slowly, phase 3A in which pooled and fixed estimates remain high, and phase 3B in which the pooled and fixed estimates gradually decrease. There is also "the unreliable phase", in which countries that were enrolled later drop out as the number of days increases, making the estimates difficult to interpret.
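The inverse-variance weighting and the gap between fixed- and random-effect estimates described above can be illustrated with a small sketch. It is not the authors' pipeline: it assumes untransformed proportions with binomial variance p(1 - p)/n and a DerSimonian-Laird estimate of the between-country variance, neither of which is stated in the text, and the function name `pool_proportions` is hypothetical. The input uses the United States and South Korea counts quoted above.

```python
def pool_proportions(deaths, cases):
    """Pool per-country CFRs with inverse-variance weights.

    Returns (fixed, random, i2): the fixed-effect estimate, a
    DerSimonian-Laird random-effects estimate (an assumption of this
    sketch), and the I^2 heterogeneity statistic in percent.
    Requires 0 < deaths < cases for every country so variances are positive.
    """
    k = len(cases)
    p = [d / n for d, n in zip(deaths, cases)]
    # Binomial variance of a raw proportion (assumed; no transformation applied).
    var = [pi * (1 - pi) / n for pi, n in zip(p, cases)]

    # Fixed-effect model: weight is the inverse of the within-country variance.
    w = [1 / v for v in var]
    fixed = sum(wi * pi for wi, pi in zip(w, p)) / sum(w)

    # Cochran's Q and I^2 (Higgins et al., 2003).
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, p))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird between-country variance tau^2.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects model: tau^2 is added to every variance, flattening the weights.
    w_re = [1 / (v + tau2) for v in var]
    random = sum(wi * pi for wi, pi in zip(w_re, p)) / sum(w_re)
    return fixed, random, i2


# Counts quoted in the text: United States, then South Korea.
fixed, random, i2 = pool_proportions(deaths=[164_537, 305], cases=[5_141_207, 14_714])
print(f"fixed={fixed:.4f} random={random:.4f} I2={i2:.1f}%")
```

With these two countries the fixed-effect estimate stays close to the United States CFR, because its weight is much larger, while the random-effects estimate weights the two countries almost equally and lands lower. This matches the direction of the divergence reported for Figure 1.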
It should be noted that during phases 3A and 3B, the random estimates remain constant, slightly below 3.0%. When we analyzed these data in May 2020, we could not see trends after phase 3A. At that time, the pooled and fixed estimates were quite similar to each other, while the random estimates were far apart from both. After the data were updated to August 12, 2020, a long phase 3B appeared, and the pooled and fixed estimates at the end of phase 3B moved closer to the random estimates, which had been constant for a long period of time, narrowing the gap considerably. In this analysis, I² is more than 70% over the entire period from day 1 to day 174, suggesting that the random estimate is more reliable.

calculated CFR based on incidence and mortality data

We set the cut-off to 100 confirmed COVID-19 cases to reasonably reduce the noise in our statistical analyses. To examine the distribution of fixed and random weights, we focused on the data from April 24, 2020, when the difference between the fixed and random effects was the largest. In the fixed estimate, the weight increases proportionally with the number of confirmed cases (Figure 3A). In the random meta-analysis, on the other hand, the weight was 0.6% for all countries or territories with 1,981 or more confirmed COVID-19 cases (Figure 3B). Namely, a higher weight is given to countries with a large number of confirmed cases in the fixed estimate than in the random estimate, and a lower weight is given to countries with a low number of cases in the fixed estimate (Figure 3A and 3B). Extrapolating from our previous results, we chose March 20, 2020 as a cutoff date, because this date would guarantee a relative homogeneity of the data analyzed. We observed that most countries would have entered the observed "second phase" of the pandemic.
That is, following March 20, 2020, the fixed and random meta-analyses diverge for all countries, and their weight adjustment would guarantee consistency of the observed outcomes.

We identified 4 distinct phases based on our results. In Figure 1A, phase 1 contains data from January 15 to March 15, 2020, phase 2 includes data from March 16 to April 25, and phase 3 runs from April 26 to August 12. In phase 1, all CFRs ranged between 1% and 3.4%. However, from March 16 to April 25 (phase 2), both fixed and pooled CFRs increased rapidly, from 3.3% to 6.6% for the fixed-effect CFR and from 3.4% to 7.3% for the pooled CFR. From April 25th, with 2,730,521 confirmed patients, to May 16th, the fixed and pooled CFRs remained at 6% and 7%, respectively (phase 3). Interestingly, in phase 3, the CFR starts to decrease even though the number of confirmed patients per date continues to rise after a total of 2,730,521 was reached. As our results demonstrated, Figure 2A did not show a pattern similar to Figure 1A. In phase 2 of Figure 1A, we observed a rise in the pooled and fixed models, whereas the random model does not increase steeply. This trend was not observed in Figure 2A. This further supports our hypothesis that CFR is a dynamic indicator and should not be analyzed solely using the traditional equation based primarily on the cumulative number of patients. The trends in Figure 2A are established using the number of confirmed cases according to calendar date. Therefore, its trend is a better representation of the established healthcare systems, the testing ability and socioeconomic factors of the respective countries. Figure 2B shows a similar trend with the characteristic phases 1 to 3, parallel to those in Figure 1. It showed an explosive increase from Day 1 to Day 45, when the number of confirmed patients per day reached 2,564,432 (phase 1).
As the number of confirmed patients increased to 4,648,514 on Day 66, the fixed CFR remained at 5.4% and the pooled CFR remained at 6.2%, despite the rapid increase in the number of confirmed patients. Comparing Figure 1A and 1B, we found that both pooled and fixed CFRs increased by approximately 1 percentage point after adjusting the CFR standard to the days since the first confirmed case (7.09% to 8.20% and 6.40% to 7.40%, respectively). Therefore, the CFR in the plateau phase was approximately 1 percentage point higher in the meta-analyses by days since the first confirmed patient than in the meta-analyses by date. This might be explained by the "noise" in the data from the early days of the epidemic in each country. An analogous comparison of Figure 2A and 2B revealed a similar increase of approximately 1 percentage point in phase 3, the plateau phase, for CFR by days since the first confirmed patient compared to the meta-analyses by date. An additional phase emerged in Figure 1B Figure 1A, see Appendix). The heterogeneity that we observed is directly related to the stage of the pandemic each country was experiencing at the time we analyzed the data. That is, at the start of the pandemic, confirmed cases and associated deaths were reported almost exclusively from China. This is why the heterogeneity was at 0%, as expected. As the pandemic unfolded and many more countries started experiencing it, confirmed cases and deaths began to be reported from countries around the globe, in addition to those from China. This was manifested by an expected increase in the study heterogeneity.

Based on Figure 1, we also investigated the relationship between CFR and the number of confirmed COVID-19 patients (Figure 2). Figure 2A was devised using the number of patients according to the calendar date rather than the cumulative number of patients.
This figure revealed that CFRs correlated linearly with the number of confirmed cases: the greater the number of confirmed cases, the higher the CFR. On the other hand, when the number of patients was adjusted by days since the first confirmed case, as shown in Figure 2B, CFR increases, as in Figure 1B, until the number of confirmed patients per day reaches 1.0 million cases. Following this phase, CFR then rapidly increases between 1.0 million and 1.5 million cases. After 2.0 million cases, a plateau pattern continues. In Figure 2B, the blurry dots represent CFRs in the "unreliable phase." In this phase, CFR decreases when the number of confirmed patients falls below 2 million. The unreliable phase could potentially represent a new phase, the decreasing phase. The model according to calendar date (Figure 2A) may have underestimated the CFR, possibly because countries were in different stages, and thus phases, of the disease.

The estimation of the COVID-19 pandemic's CFR has been the focus and interest of many stakeholders, as it plays a key role in understanding this pandemic and guides appropriate responses and efficient mitigation strategies. We propose that CFR is not a fixed, static rate. Rather, it is dynamic, constantly fluctuating with time, location, and population, as confirmed in Figure 1A and 1B. In this context, it is important to view CFR as a function of time, rather than presenting CFR as a single and absolute value. Stratifying CFR by days since the first confirmed case is a novel and innovative attempt to uncover the dynamics of CFR as the epidemic unfolds. We believe that the CFR simply stratified by calendar date does not reflect the true epidemic situation of each country. Our analysis revealed a CFR trend consisting of four distinctive phases.
Based on our results, we cautiously propose that the slope of the epidemic model proceeds through the following four phases: phase 1 or initial phase, phase 2 or rapid increase phase, phase 3 or plateau phase, and phase 4 or decreasing phase. Based on this statistical trend, it is estimated that the global pandemic will slow down once all countries reach phase 3, and that the situation can improve once the end of that phase is reached. However, as mentioned above and based on the data of 100 days analyzed so far, the world may remain in phase 3 as of May 2020 for an undetermined amount of time, and CFR may not yet have reached phase 4. It may take a considerable amount of time to enter this final phase.

The method of calculating CFR must be treated with caution, and its limitations acknowledged. To represent the CFR accurately, the numerator and the denominator should be composed of patients infected at the same time as those who died. To overcome this limitation, Baud et al. (2020) and Wilson et al. (2020) proposed a time delay-adjusted CFR to correct for the delay between confirmation and death. They adjusted the denominator of CFR to the number of confirmed cases 13-14 days before the measured date, in order to count the confirmed cases infected concurrently with those who died. Based on these articles, researchers at Oxford University used their global COVID-19 CFR model according to the date since the outbreak in Jan 2020 (CDC, 2020). However, Oxford's calculation is also flawed, since 13 to 14 days before the date of test confirmation is not necessarily the date when a subject was infected (Spychalski et al., 2020). Moreover, there are cases that test positive even after recovery. Additionally, the stretching and overwhelming of healthcare systems creates a delay between testing and receiving the results, and thus in confirming the case.
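For readers unfamiliar with the time delay-adjusted CFR discussed above, the following minimal sketch shows the general idea of dividing cumulative deaths on a given day by the cumulative confirmed cases from a fixed number of days earlier. It illustrates the concept only and is not the exact procedure of Baud et al. or Wilson et al.; the function name `lag_adjusted_cfr` and the toy series are hypothetical, and a 2-day lag is used in the example only to keep the series short.

```python
def lag_adjusted_cfr(cum_cases, cum_deaths, lag_days=14):
    """Return a per-day list of lag-adjusted CFRs in percent.

    For day t, the denominator is the cumulative case count at day
    t - lag_days. Days without a valid denominator yield None.
    """
    adjusted = []
    for t, deaths in enumerate(cum_deaths):
        if t < lag_days or cum_cases[t - lag_days] == 0:
            adjusted.append(None)  # no cases were confirmed lag_days earlier
        else:
            adjusted.append(100.0 * deaths / cum_cases[t - lag_days])
    return adjusted


# Hypothetical cumulative series for one country (not real data).
cases = [100, 200, 400, 800]
deaths = [0, 2, 6, 16]

conventional = [100.0 * d / c for d, c in zip(deaths, cases)]
adjusted = lag_adjusted_cfr(cases, deaths, lag_days=2)
print(conventional)
print(adjusted)
```

While case counts are still growing, the lagged denominator is smaller than the current one, so the adjusted CFR is higher than the conventional CFR. The result depends heavily on the chosen lag, which is the source of the unknown bias discussed in the text.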
As this time delay-adjusted CFR introduces an unknown bias into the estimate (Spychalski et al., 2020; Lipsitch et al., 2020), we used the conventional method to calculate CFR. Moreover, the numerator of the CFR is the number of deaths with COVID-19. We should be aware that this number is imperfect and may include deaths not directly caused by COVID-19, such as deaths from fatal comorbid diseases. This may lead to an overestimation of the number relative to its true value.

In the present study, we observed unusually exaggerated estimates from our meta-analyses in the early phase (phase 1) of the epidemic, both in CFR based on the calendar date and in CFR based on days since the first confirmed case. This is thought to be a statistical bias, as many groups and countries with small numbers were included. The studies included in the early phase of the epidemic are mostly data in which deaths occurred sporadically in very small groups. Such a data distribution may have severely exaggerated the meta-analysis results. Therefore, we believe that we should aim for a more standardized and homogeneous analysis of the numbers. One method would be to observe the results from the time when the number of confirmed cases in each country has reached a certain level.

Other previous studies have performed meta-analyses of observational COVID-19 studies and reported the pooled incidence of mortality. Zhao et al. found a pooled CFR of 3.1% after analyzing 30 studies with 53,000 patients (Zhao et al., 2020). Similarly, with our meta-analysis of CFR calculated from the first confirmed case, we set new standards for observing CFR and suggest four phases of the epidemic pattern. From the results, the overall estimated CFR in this pandemic is expected to be 2.9% to 3.0% in random estimates (Figure 1B). Because the pandemic is still in progress, however, future studies and discussions are needed to reach a consensus on the definition of each phase.
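The re-indexing of each country's series by days since its first confirmed case, which underlies the second stratification used in this study, can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function name `align_by_outbreak_day`, the `threshold` parameter (1 for the first confirmed case, or a higher level such as the one suggested above), and the two toy countries are all hypothetical.

```python
from collections import defaultdict


def align_by_outbreak_day(series, threshold=1):
    """Regroup calendar-indexed country series by days since outbreak.

    series maps country -> list of (cumulative_cases, cumulative_deaths),
    one entry per calendar day, with all countries sharing the same calendar.
    Day 1 for a country is its first calendar day with
    cumulative_cases >= threshold. Returns day_number -> list of
    (country, cumulative_cases, cumulative_deaths).
    """
    aligned = defaultdict(list)
    for country, daily in series.items():
        start = next(
            (i for i, (cases, _) in enumerate(daily) if cases >= threshold),
            None,
        )
        if start is None:
            continue  # this country never reached the threshold
        for day, (cases, deaths) in enumerate(daily[start:], start=1):
            aligned[day].append((country, cases, deaths))
    return dict(aligned)


# Hypothetical cumulative counts over four shared calendar days (not real data).
toy = {
    "A": [(0, 0), (1, 0), (5, 0), (20, 1)],
    "B": [(0, 0), (0, 0), (0, 0), (3, 0)],
}
by_day = align_by_outbreak_day(toy)
for day in sorted(by_day):
    print(day, by_day[day])
```

In this toy example both countries contribute to day 1, but only country A contributes to days 2 and 3. The same thinning at high day numbers is what produces the "unreliable phase", in which countries enrolled later have not yet accumulated enough days to be included.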
It would also be interesting to explore the relation between CFR and the number of tests performed. More specifically, it would be of great added value to explore whether a higher number of tests and greater test availability are associated with a lower CFR. When the CFR is estimated by day since the first confirmed case, the estimates could be more representative of "the true kinetics" of Hopkins University dash board. Because of time constraints, we did not perform a comparison between these different data sources in our manuscript. However, future COVID-19 related projects should aim at doing so to ensure the utmost accuracy and validity of the information.

This report highlights that the CFR is not a fixed value; rather, it is a dynamic value. Therefore, we strongly urge caution when dealing with CFR values, especially in an ongoing epidemic. We originally showed that estimates of the global CFR of COVID-19, using meta-analyses of CFR, according to the calendar date and according to days since the outbreak of the first confirmed case were different. We propose that CFR

Real estimates of mortality following COVID-19 infection. The Lancet Infectious Diseases
2019-novel Coronavirus (2019-nCoV): estimating the case fatality rate - a word of caution
Epidemiological Determinants of Spread of Causal Agent of Severe Acute Respiratory Syndrome in Hong Kong
Estimation of MERS-Coronavirus Reproductive Number and Case Fatality Rate for the Spring 2014 Saudi Arabia Outbreak: Insights from Publicly Available Data
Estimating case fatality rates of COVID-19
Measuring inconsistency in meta-analyses
Estimating case fatality rates of COVID-19
Medicine TKAoI. Version
Case Fatality Rate estimation of COVID-19 for European Countries: Turkey's Current Scenario Amidst a Global Pandemic
A dictionary of epidemiology
The many estimates of the COVID-19 case fatality rate. 2020. The Lancet Infectious Diseases
Virus Spread Pushes Italian Hospitals Toward Breaking Point
Estimating case fatality rates of COVID-19. The Lancet Infectious Diseases
The Centre for Evidence-Based Medicine. Global Covid-19 Case Fatality Rates - Updated 15th May
Novel Coronavirus related illness (COVID-19)
Case-Fatality Risk Estimates for COVID-19 Calculated by Using a Lag Time for Fatality. Emerging Infectious Diseases
World Health Organization. Coronavirus disease (COVID-19) outbreak Situation Report-150
Early estimation of the case fatality rate of COVID-19 in mainland China: a data-driven analysis
Incidence, clinical characteristics and prognostic factor of patients with COVID-19: A systematic review and meta-analysis

The authors declare no conflict of interest directly applicable to this research.