key: cord-0939691-89n2n0in authors: Cetin, S.; Ulgen, A.; Li, W.; Sivgin, H. title: Approximate Reciprocal Relationship Between Two Cause-Specific Hazard Ratios in COVID-19 Data With Mutually Exclusive Events date: 2021-04-27 journal: nan DOI: 10.1101/2021.04.22.21255955 sha: cabeb7ae7760f870764d122795ae987beadb22cf doc_id: 939691 cord_uid: 89n2n0in COVID-19 survival data presents a special situation where not only the time-to-event period is short, but also the two events or outcome types, death and release from hospital, are mutually exclusive, leading to two cause-specific hazard ratios (csHR_d and csHR_r). The eventual mortality/release outcome can also be analyzed by logistic regression to obtain odds-ratio (OR). We have the following three empirical observations concerning csHR_d, csHR_r and OR: (1) The magnitude of OR is an upper limit of the csHR_d: |log(OR)| >= |log(csHR_d)|. This relationship between OR and HR might be understood from the definition of the two quantities; (2) csHR_d and csHR_r point in opposite directions: log(csHR_d) log(csHR_r) < 0; This relation is a direct consequence of the nature of the two events; and (3) there is a tendency for a reciprocal relation between csHR_d and csHR_r: csHR_d ~ 1/csHR_r. Though an approximate reciprocal trend between the two hazard ratios is in indication that the same factor causing faster death also lead to slow recovery by a similar mechanism, and vice versa, a quantitative relation between csHR_d and csHR_r in this context is not obvious. These resutls may help future analyses of COVID-19 data, in particular if the deceased samples are lacking. Survival analysis studies the longitudinal event data. Regression in survival analysis investigates whether a factor contributes to the hazard (rate of risks) of the event under study. The hazard ratio (HR) is the ratio of two hazards, one with the factor taking the at-risk value (e.g. smoking) and another without (e.g. not-smoking). Since hazard and HR describes the instantaneous risk, or rate of event occurrence (e.g. death from a specific disease), it is a different concept from the life-long risk of having that event (Sutradhar and Austin, 2018) . As a result, regression in survival analysis (e.g. Cox regression) is different from the static caseversus-control regression analysis (e.g. logistic regression). Take the following two statements as example: "smoking makes lung cancer patients die faster" and "smoking makes a person more likely to die from lung caner than a non-smoker"; the first would be a conclusion from a survival analysis, whereas the second from a case-control type of analysis. The COVID-19 pandemic since 2020 (Huang et al., 2020; Xu et al., 2020) provides a unique longitudinal event data. First of all, a COVID-19 patient admitted to a hospital sees his/her outcome relatively quickly: either the patient survived or not in a matter of days. As a result, there are very few right-censored data where the outcome is still unknown at the time of data collection. Of course, there exist chronic or long COVID-19 survivors who are not completely cured (Carfi et al., 2020; Rubin, 2020) , but they are unlikely to die from COVID-19 in the future. The second feature of COVID-19 longitudinal event data is that the two events, death and release from the hospital, are not the traditional "competing risk events" (Austin et al., 2016; Austin and Fine, 2017) . Although not strictly defined as such, competing risks events are often two unfavorable events with one occurring before the occurrence of another. In COVID-19 data, the event of being released from hospital is a favorite event, and a description of them in principle should not be connected to words like "risk" or "hazard". A higher HR for the event of releasing from hospital implies a faster recover, thus a factor that contributes to this higher HR actually provide protection. Also, in the COVID-19 data, the event of death and the event of released from hospital are mutually exclusive. Although organ transplant and death can be mutually exclusive when the organ involved is (e.g.) heart, transplant of many other types of organ is an event that preceed, and may have an impact on, death. Even in the case of heart transplant, survival after the operation is not completely guaranteed. Regardless of these detail, factors affecting transplant timing are basically external, whereas those affecting mortality without a transplant are mostly internal. Our COVID-19 death/release mutually exclusive event pair does not have a correspondence in death/organ-transplant pairs. As discussed in (Pintilie, 2007; Austin et al., 2016) , given the the event time (T, k) where T is the time to event (or to censoring), k=1,2 for two event types (we may understand that k = 0 means right-censored), there are two ways to define hazard function: (1) cause-specific hazard function, and (2) subdistribution hazard function (or Fine-Grey model), where K for event type (1 or 2), and conditioning T ≥ t means the function is defined before the occurring of the event, and the added conditioning (T < t) ∩ (K = k) means event of alternative type may be allowed to occur before. In the first approach (cause-specific), all alternative events are converted to right-censored (i.e., k = 0). Because the event of dead and release are mutually exclusive, we cannot use subdistribution hazard function. We can indeed study the two cause-specific hazard functions or HRs, one for time-to-death (treating release event as right-censored) and another for time-to-release (then treating death event as right-censored): where x is the independent binary factor (for continuous factor, the definition is similar, with two hazards evaluated at x-levels differing by 1), and the time dependency is supposedly canceled. The question we ask: what is the relationship between the two cause-specific hazard ratios, csHR k=death and csHR k=rel ? Intuitively, if a factor value leads to faster death, the same 3 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 27, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 factor may lead to slow recovery, and vice versa. In other words, a larger csHR k=death may imply a smaller csHR k=rel . Our working hypothesis is that csHR k=death = 1/csHR k=rel , which is also hypothesized to be equal to the odds-ratio from a logistic regression analysis. We will use a survival data of n = 450 COVID-19 test our working hypothesis. Previously we asked the question on whether the two survival analyses and one logistic regression can all identify a risk factor (Cetin et al., 2021) . Here we are asking more quantitative questions concerning the three analyses. The COVID-19 patients dataset (n=450) used in this study was collected from the Tokat State Hospital. Electronic medical records, including patient demographics, clinical manifestation, comorbidities, laboratory tests results were collected. According to the severity and outcomes of the patients, they were divided into two groups: deceased/nonsurviving, released/survived severe groups. Clinical and laboratory data were comparable among the two groups. Ministry of Health permission and the ethics committee of Tokat GaziOsmanPasa University Ethics committee permission was obtained with the number 83116987/360 on March 4, 2021. Besides age and gender, we selected these 18 laboratory testing measurements at the time of hospital admission in the survival analysis n value is the number of samples with measurement value): (from the complete blood count panel) white blood cell (WBC) count (n=432), neutrophil (NEU) count (n=383), lymphocyte (LYM) count (n=382), hemoglobin (HGB) (n=432), platelet (PLT) (n=432), mean corpuscle volume (MCV) (n=432), mean platelet volume (MPV) (n=430); (from metabolic panel) glucose (n=449), alanine adinotransferase (ALT) (n=435), aspartate aminotrasferase (AST) (n=429), blood urea nitrogen (urea or BUN) (n=436), creatine (n=436), calcium (n=393), potassium (n=433), sodium (n=435); (others) ferritin-1 (FER1) (n=407), d-dimer (n=390), and lactate dehydrogenase (LDH) (n=316). The NLR (neutrophil/lymphocyte ratio) and PLR (platelet/lymphocyte ratio) are derived quantities. There are other test measurements, either only available for fewer number of patients, or due to other reasons, that are not included in the analysis. The following variables seem to follow a normal distribution after the log transformation better than the original values: ferritin-1, d-dimer, glucose, ALT, AST, urea, creatine, LDH, WBC, NEU, and perhaps LYM and PLT. The alternative use of Cox regression for cause-specific hazard is applied: where h cs k (t) is the cause-specific hazard function defined in Eq.1, h cs k,base (t) is an undefined baseline hazard function. The left-hand side of Eq.4 contains time t whereas the right-hand side does not, which is the proportional hazard assumption, that the time-dependent part is cancelled. {x i } is a list of factors. The single-variate Cox regression limits index i to only one variable. In practice, cause-specific Cox regression is carried out by labeling event=2 as right-censored event=0 when focusing on event=1, and vice versa when focusing on event=2. Overview of the survival analysis and logistic regression analysis results: Large number of analysis run results are included in Tables 1,2,3. Table 2 contains factors in a metabolic panel, Table 3 are factors in a complete blood count panel, and Table 1 lists the rest. The first column is the result from single-factor Cox regression survival analysis using death as event of interest and release as right-censored event. The second column is the Cox regression results by switching the two events. The listed results include the cause-specific hazard ratio (csHR) and its 95% confident interval (CI), and the p-value for testing csHR equal to 1. The third column is the logistic regression result comparing the dead and release samples, where the results include odds-ratio (OR) and its 95% CI, and p-value for testing OR=1. We also run the same group of analysis on log-transformed factor values, if that factor better follows a normal distribution after log transformation. Then, we run the same set of analysis, 5 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 27, 2021. ; when the factor value is continuous, for binarized factors with optimally chosen threshold values. Both these runs will be discussed in detail later. The reason for running large number of analyses is not for "fishing expedition" in order to have a better chance to find statistically significant results, but to test the robustness of the results. Therefore, it is not necessary to do a multiple testing correction. We mark those p-values that are smaller the 0.005 in Tables 1,2,3. The reason to use 0.005 instead of 0.01 or 0.05 is explained in (Ioannidis, 2018; Colquhoun, 2017) , and the practice of always adding a level when using the word "significant" is proposed in (Wasserstein et al., 2019); see also ). Strikingly, almost all factors significantly (at 0.005 level) influence the rate of event of COVID-19 patients, and are significantly different between the deceased and survived group. Relationship between HR d and HR r for continuous factors: Another striking observation is the relationship between the two cause-specific hazard ratios. Denote csHR d for csHR of the event of death from COVID-19 and csHR r for csHR of the event of COVID-19 patient releasing from hospital. it can be easily seen from Table 1 ,2,3 that if csHR d > 1, the corresponding csHR r < 1, and vice versa. A simple mathematical expression of this fact is: The only exception is the factor of gender. But the 95%CI is so large to have both < 1 and > 1 values, it should not really be considered as an exception. The opposite direction of csHR d and csHR r is understandable: a risk factor for faster death in a deceased patient would also make a surviving patient recover longer. Table 1 ,2,3 seem to also show that the larger csHR d , and smaller csHR r , and vice versa. In order to check if there is a numerical relationship between csHR d and csHR r , we plot 1/HR r as function of csHR d in Fig.1 . The line csHR d × csHR r = 1 is marked by the slope=1 line in 6 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 27, 2021. ; https://doi.org/10. 1101 Relationship between hazard (rate) ratio and odds (risk) ratio: As emphasized in (Sutradhar and Austin, 2018), survival analysis estimates the relative rates (of risk) whereas case-control type of analysis such as logistic regression estimates the relative (static or cumulative) risks. Table 1 ,2,3 show that OR seems to have a larger magnitude than csHR d . Fig.2 shows y=OR as a function of x= csHR d . Unlike Fig.1 where dots are scattered near the slope=1 line, in Fig.2 , dots systematically deviates from the diagonal line. In fact, if a factor is a risk (OR > 1 and csHR d > 1), we have OR > csHR d , and if a factor is a protection (OR < 1 and csHR d < 1), then OR < csHR d . We can summarize these into a working hypothesis: Eq.7 might be proved in a simple approximation as follows: the risk function F (t) is known to be related to hazard rate function h(t): Therefore, the odds-ratio is (the subscript 1,2 refers to two states in a binary variable or a two numerical level of a continuous factor with one unit difference, and not refers to the two competing-risks): In the proportional hazard assumption, One approach in getting the approximation in Eq.10 is the Taylor expansion assuming small Relationship between csHR d and csHR r for binarized continuous factors: The HR for continuous factors measures the ratio of two hazard rates when the unit of the factor increases by one. Therefore, when one unit change is negligible compared to the possible range, HR can be very close to 1. In order to see the true impact of a continuous variable, we discretize 7 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 27, 2021. ; https://doi.org/10.1101/2021.04.22.21255955 doi: medRxiv preprint continuous factors into levels. Although one can choose three levels for below-normal, normal, and above-normal, the normal range of a factor may not be universally accepted. We use binary levels (higher and lower than a threshold) where the threshold value is a compromise between two selections: the first threshold is chosen to maximize the Youden index (Youden, 1950) , which is simply the sum of sensitivity and specificity (minus 1). The second threshold is chosen by providing the population prevalence of cases (which is set at 10%), which in turn gives weight to samples in the dataset. Both thresholds are obtained from the medcalc.org program. The final threshold is a geometric mean of the Youden-based threshold and that after considering the 10% population case prevalence. The resulting threshold values for all factor (except for age, neutrophil/lymphocyte ratio, platelet/lymphocyte ratio, where the thresholds are more intuitively selected) are given in Tables 1,2, 3. The resulting csHR d and csHR r have larger magnitude than the corresponding continuous value, because the "unit change" is much larger. In order to study the numerical relationship between csHR d and csHR r for binarized factors, we examine all possible threshold values and plot csHR d (red) and 1/csHR d (blue) as a function of threshold value, for 18 test measurements, in Fig.3 . The 95% confidence interval (CI) of csHR d or 1/csHR r is marked with dash vertical lines. If the discretized factor is not significant at a corresponding threshold, red dots turn pink and blue dots turn light-blue. We also mark normal ranges of blood tests from two different sources (as grey horizontal lines) and the threshold used in Table 1 (as downward arrow in grey). We consistently found the 95% CI of csHR d and 1/csHR r overlap with each other at the chosen (optimal) threshold value. In other words, when a reasonable threshold value is used to convert a continuous factor to a binary factor, and running two survival analyses results in a roughly reciprocal relationship between the two cause-specific hazard ratios. Relationship between csHR d and OR for binarized continuous factors: Similar to Fig.2 where we show the scatter plot for cause-specific hazard ratio for time-to-death (xaxis) and logistic regression odds-ratio (y-axis), Fig.4 shows the similar scatter plot for the discretized factors. The 95% CI for both are shown by horizontal and vertical segments. All points in the first quadrant are above the diagonal line, and those in the third quadrant below the diagonal line. Therefore, | log(OR)| ≥ | log(csHR d )| is true for all binarized factors. 8 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 27, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 Effect of log-transformation of factor values: Remember the meaning of HR for continuous variable is the ratio of two hazards evaluated at two factor values differing by one unit: x 1 = c, x 2 = c + 1. The dependence on c is supposed to be averaged out. If the factor is log-transformed, the two evaluation points are log(x 1 ) = c ′ , log(x 2 ) = c ′ + 1, or x 2 is x 1 multipled by a constant 2.718. Not only this one-unit change is much larger, but also, the change in the original scale x 2 − x 1 = e c ′ +1 − e c ′ = 1.718 · e c ′ depends on c ′ . Table 1 ,2,3 show that csHR d or csHR r are dramatically larger (in magnitude) than the un-logged factors. The effect of log-transformation on p-value is unclear, though in our examples, the test becomes more significant after the factor being log-transformed. The concept of harzard ratio has been cautioned in (Hernán, 2010) . In particular, the instantaneous incident rate for an event to occur may depend on the time, and HR is only an average over the potential time dependence. Our attempt to binarize a continuous factor or log-transform a factor illustrates a similar problem. If HR not only depends on the one-unitchange step, but also depends on which level this step is made, and depends on whether the step is in additive scale or multiplicative scale, then the average may not capture the whole spectrum of the behavior of hazard in its full range. The COVID-19 time-to-event data is unique not only the two events, mortality and discharge, are mutually exclusive, but also a susceptible factor might be behind faster death and slower release at the same time. If we consider heart transplant and other open-heart operations as events that may have saved a patient's life, thus are mutually exclusive with mortality, we can not say that the factors causing a longer waiting time until operation are the same ones causing a faster death without these operation. For this reason, we have the basis for the reciprocal cause-specific hazard (csHR) ratios hypothesis which can only be examined in data similar to COVID-19, but not in other survival data just because two events are mutually exclusive. The possible links between the two csHR's have at least two consequences. The first is on hospitalization stay time. If a patient has certain condition (e.g., high glucose or hyper-9 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 27, 2021. ; https://doi.org/10.1101/2021.04.22.21255955 doi: medRxiv preprint glycemia), the larger-than-1 csHR d implies that the patient has a higher death-rate than those with normal glucose level; on the other hand, the smaller-than-1 csHR r value indicates the patients should survive longer. Therefore, whether hyperglycemia increases the hospital stay time or not depends on whether the patient survives or not. The hospital stay time is of interest because of the number of hospital beds is limited, and there is a need for bed management (Roimi et al., 2021) . If a factor/condition causes the severity of a COVID-19 patient, intuitively we would conclude that patients with the condition will stay in hospital longer. In reality, if the disease is too severe, the patient will stay in hospital shorter, because the patient succumbed to death faster. Among the deceased patients, we would expect the co-existence of short and long stays, while less diverging in stay time for discharged patients. Indeed, though mean of log(stay time) between the deceased and released groups is not significantly different (t-test p-value= 0.15, though Wilcoxon test p-value is 0.00034), the variances are very different (F-test p-value= 1.1E-8). The second consequence that two csHRs might be related is that if we focus on timeto-release events, we could collect much more samples simply because more patients being recovered/released than deceased. In a sense, this strategy examines which factor delays the recovering time in surviving patients. Larger sample sizes would help to detect more subtle causing factors. This strategy will become more relevant if life-saving drugs for COVID-19 are developed and nobody or almost nobody die from the disease. Even in that future event, we still have surviving patients in our possession for a survival analysis. The fact that OR > csHR d if OR > 1 (and OR < csHR d if OR < 1) for both continuous factors and their discretized version, seem to be a consequence of the definition of the two quantities. Although one may use this result to obtain an upper limit of csHR d , the result in Table 1 ,2,3 seems to indicate that the bound is not tight. In that case, if OR ≫ csHR d , OR will not be very useful in estimating the csHR d value. As discussed thoroughly in the literature that we can not always assume csHR (unlike subdistribution HR) is in the same direction as OR (Lau et al., 2009; Austin et al., 2016) . In other words, csHR > 1 does not universally imply OR > 1. Individual csHR also can not determine the cumulative incidence function caused by multiple risks (Latouche et al., 10 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 27, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 2013). However, our results in Tables 1,2,3 show that OR and csHR d are always in the same direction (both larger than 1, or, both smaller than 1), indicating the difference between a theoretical possibility and the reality. Drawing cumulative incidence function is also not a goal in our analysis. Considering all these, we consider the use of csHR better fitted for COVID-19 death/release survival data, than the subdistribution HR. In fact, we doubt subdistribution HR can be applied to this situation at all, because of the exclusive nature of the two events. In conclusion, we draw attention to the connection between the two types of mutually exclusive events, mortality and discharge, in COVID-19 survival data. We also made three observations from COVID-19 data: the opposite direction between the two csHRs: log(HR d ) · log(HR r ) < 0, approximately reciprocal link between them HR d · HR r ≈ 1, and odds-ratio as an upper limit of HR d : | log(csHR d ) | ≤ | log(OR) |. WL acknowledges the support from Robert S Boas Center for Genomics and Human Genetics, and thanks Yaning Yang for discussions. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 27, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. 13 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) Table 1 : Results from two cause-specific survival analyses and logistic regression analysis with a single factor: gender, age, FER1, d-dimer, and LDH. The csHR d (95%CI) is the cause-specific hazard-ratio for time-to-death event and its 95% confidence interval; pv d is the corresponding p-value; csHR r and pv r are hazard ratio and p-value for time-to-release event; OR(95% CI) and pv(LR) are odds-ratio (and its 95% confidence interval) and p-value from logistic regression. P -values smaller than 0.005 are shown in boldface. All similar results for log-transformed factor value and discretized (binarized) factors are also shown, where the threshold used to discretization are given in the first column. 14 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) 15 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 27, 2021. ; https://doi.org/10.1101 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) Figure 1 : The x-axis is the cause-specific hazard ratio csHR d for death event, and y-axis is reciprocal of the hazard ratio for the release event (1/HR r ), for the 18 blood test measurements as well as age and gender. The diagonal line indicates the exact relationship csHR d =1/csHR r . A factor is in red if its p-values (for testing csHR d and csHR r =1) are both smaller than 0.005; in blue if both p-values are larger than 0.005; and light-blue if one of the two p-values is smaller than 0.005. Because there are many factors having HR close to 1, the right subplot presents a close-up near csHR d =csHR r =1. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) Figure 2 : The x-axis is the cause-specific hazard ratio csHR d for death event, and the y-axis the odds-ratio (OR) from logistic regression, for 20 factors. The right subplot presents a close-up near csHR d =OR=1. 18 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. If the discretized factor's csHR is not significant (at 0.01 level) by the survival analysis, its color turns from red (blue) to pink(light-blue). The threshold used in Table 1 is shown as a downward arrow. The two horizontal lines represent the normal range of these blood test results (from two different sources), and horizontal dash line is csHR=1. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 27, 2021. HRd_binarized OR_binarized Figure 4 : Similar to Fig.2 , but for discretized/binarized factors: the x-axis is the cause-specific hazard ratio csHR d for death event, and the y-axis the odds-ratio (OR) from logistic regression. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 27, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 Practical recommendations for reporting FineGray model analyses for competing risk data Introduction to the analysis of survival data in the presence of competing risks Gemelli Against COVID-19 Post-Acute Care Study Group (2020), Persistent symptoms in patients after acute COVID-19 Survival analyses of COVID-19 patients in a Turkish cohort: comparison between using time to death and time to release The reproducibility of research and the misinterpretation of p-values Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China, Lancet The proposal to lower P value thresholds to .005 A competing risks analysis should report results on all cause-specific hazards and cumulative incidence functions Competing risk regression models for epidemiologic data Beyong standard pipeline and p < 0.05 in pathway enrichment analyses The hazards of hazard ratios Analysing and interpreting competing risk data Development and validation of a machine learning model predicting illness trajectory and hospital utilization of COVID-19 patients: A nationwide study