key: cord-0770210-ro31czwk
authors: Sansone, Vito; Tovoli, Francesco; Casadei-Gardini, Andrea; Di Costanzo, Giovan Giuseppe; Magini, Giulia; Sacco, Rodolfo; Pressiani, Tiziana; Trevisani, Franco; Rimini, Margherita; Tortora, Raffaella; Nardi, Elena; Ielasi, Luca; Piscaglia, Fabio; Granito, Alessandro
title: Comparison of Prognostic Scores in Patients With Hepatocellular Carcinoma Treated With Sorafenib
date: 2021-01-14
journal: Clin Transl Gastroenterol
DOI: 10.14309/ctg.0000000000000286
sha: eaf1973e59002772480af148f54590d0d6831fe0
doc_id: 770210
cord_uid: ro31czwk

Prognostic classifications for patients treated with sorafenib for hepatocellular carcinoma (HCC) facilitate stratification in trials and inform clinical decision making. Recently, 3 different prognostic models (hepatoma arterial-embolization prognosis [HAP] score, sorafenib advanced HCC prognosis [SAP] score, and Prediction Of Survival in Advanced Sorafenib-treated HCC [PROSASH]-II) have been proposed specifically for patients treated with sorafenib. This study aimed to compare the prognostic performance of these scores.

METHODS: We analyzed a large prospective database gathering data of 552 patients treated with sorafenib from 7 Italian centers. The performance of the HAP, SAP, and PROSASH-II models was compared with that of generic HCC prognostic models (including the Barcelona Clinic for Liver Cancer and Italian Liver Cancer staging systems, albumin-bilirubin grade, and Child-Pugh score) to verify whether they could provide additional information.

RESULTS: The PROSASH-II model improved discrimination (C-index 0.62) compared with existing prognostic scores (C-index ≤0.59). Its stratification significantly discriminated patients, with a median overall survival of 21.5, 15.3, 9.3, and 6.0 months for risk groups 1, 2, 3, and 4, respectively. The HAP and SAP scores were also validated but performed more poorly than PROSASH-II.

DISCUSSION: Although suboptimal, PROSASH-II is the most effective prognostic classification model among the available scores in a large Italian population of patients treated with sorafenib.

Sorafenib is a multitarget tyrosine kinase inhibitor (TKI) used as frontline systemic treatment for patients with unresectable hepatocellular carcinoma (HCC) not amenable to locoregional procedures (1). Single-agent treatment with sorafenib has been an effective strategy for the management of advanced HCC since 2007, and the recent approvals of the newer TKIs, lenvatinib (2), regorafenib (3), and cabozantinib (4), have further expanded treatment options (5). Most recently, a combination of the immune checkpoint inhibitor atezolizumab plus the anti-vascular endothelial growth factor monoclonal antibody bevacizumab outperformed sorafenib (6), but the role of sorafenib and other TKIs is far from being exhausted. Patients with contraindications to the immune oncology drugs (including liver transplant recipients and patients with systemic autoimmune conditions) are still poised to be treated with TKIs. At the same time, pharmacoeconomic issues and logistical problems in organizing frequent intravenous infusions might slow down the diffusion of the new regimen, especially in the period immediately after the coronavirus disease 2019 emergency. Even more importantly, many different TKIs (including sorafenib itself) are being tested in combination with immune oncology drugs (7), with very encouraging preliminary results (8).

The identification of the patients who could benefit most from sorafenib is one of the most daunting tasks because the cost-effectiveness of sorafenib and other TKIs has been questioned (9-13). Recently, different prognostic scores specifically designed for sorafenib-treated HCC have been proposed. Edeline et al. 
found that the hepatoma arterial-embolization prognosis (HAP) score, previously created to assess the prognosis of patients treated with transarterial chemoembolization, also provided useful information in patients treated with sorafenib (14). In the same study, the authors refined the HAP score and created the new sorafenib advanced HCC prognosis (SAP) score. Most recently, Labeur et al. proposed an elaborate prediction model (Prediction Of Survival in Advanced Sorafenib-treated HCC [PROSASH]) and its simplified version (PROSASH-II), containing only variables easy to acquire in everyday clinical practice (15, 16). The HAP, SAP, and PROSASH-II scores seemingly refined the prognostic information deriving from the Barcelona Clinic for Liver Cancer (BCLC) classification and outperformed other prognostic scores such as the albumin-bilirubin (ALBI) grade (17). However, no external independent validation of HAP, SAP, and PROSASH-II is available so far. In this study, we used a nationwide multicenter dataset of patients treated with sorafenib to verify whether the HAP, SAP, and PROSASH-II models improve the prediction of survival in comparison with other widely adopted HCC prognostic scores.

This study was performed using medical records from the Archives of Patients with hEpatocellular carcinoma treated with Sorafenib (ARPES) database. This prospective database was created in 2010 to collect data acquired in a real-life scenario of patients treated with sorafenib, to identify clinical, laboratory, and imaging predictors of response to the drug. This database includes consecutive patients treated with sorafenib in 6 different Italian Centers (Sant'Orsola-Malpighi Hospital, Bologna; Istituto Scientifico Romagnolo per lo Studio e la Cura dei Tumori IRCCS, Meldola; Cardarelli Hospital, Naples; Papa Giovanni XXIII Hospital, Bergamo; Azienda Ospedaliero-Universitaria Pisana, Pisa; Humanitas Clinical and Research Center, Milan). 
Data were entered every 3-6 months starting from January 2010 into electronic data files by coinvestigators from each center and were checked at the data management center for internal consistency. For this study, we considered patients who were prescribed sorafenib from January 2010 to December 2018. The starting date coincided with the creation of the database and, therefore, with the possibility of obtaining prospective data from all the study centers. The closing date was chosen to allow an adequate follow-up of patients. The closing time for the last follow-up was December 31, 2019.

The following data were available for each patient at the time of the first sorafenib prescription: parameters entailing the residual liver function according to the Child-Pugh score, tumor staging according to the BCLC classification, baseline α-fetoprotein (AFP) value, performance status according to the Eastern Cooperative Oncology Group Performance Status, and the size of the main tumor nodule.

All patients were prescribed sorafenib at an initial dose of 400 mg twice a day. Dose modifications (including dose reductions and discontinuation) were performed in cases of intolerable adverse effects. Sorafenib was continued until: (i) radiological (according to the modified RECIST criteria, as recommended by EASL guidelines) (1, 18, 19) and clinical progression (for patients eligible for second-line clinical trials, radiological progression alone was sufficient for discontinuation); (ii) unacceptable toxicity; or (iii) deterioration of liver function. The median duration of sorafenib treatment was 4.7 months (interquartile range [IQR] 2.3-10.3 months), and the median dose of sorafenib was 474 mg (IQR 400-700 mg).

The HAP, SAP, and PROSASH-II scores were calculated according to the authors' original description. In brief, the HAP score was calculated according to the following criteria: largest tumor nodule . 
The sorafenib-specific HAP, SAP, and PROSASH-II scores were compared with other generic prognostic models for HCC such as the BCLC system (21), the Italian Liver Cancer (ITA.LI.CA) staging system (22), the Child-Pugh classification (23), and the ALBI grade (17). The ALBI score was calculated as follows: ALBI = (log10 bilirubin × 0.66) + (albumin × −0.085), where bilirubin is expressed in μmol/L and albumin in g/L, and categorized as ALBI grade 1 (≤−2.60), ALBI grade 2 (>−2.60 and ≤−1.39), and ALBI grade 3 (>−1.39) (17). We did not include the Cancer Liver Italian Program score (24) or the Okuda staging system (25) because tumor volumetry was not systematically performed in the real-life clinical practice of the enrolling centers. In addition, the TNM staging system (26) was not considered because of the difficulties in correctly classifying the porta hepatis lymph nodes in nonsurgical cases with chronic liver diseases. For the same reason, scores including TNM as a variable, such as the Chinese University Prognostic Index (27) and the Japanese Integrated Staging score (24), were not assessed.

The study protocol was reviewed and approved by the local Ethics Committees. All patients gave their written informed consent. The study was conducted according to the ethical guidelines of the 1975 Declaration of Helsinki.

Continuous variables are expressed as median and IQR. Categorical variables are expressed as frequencies. Group comparisons were performed with the Mann-Whitney test. Categorical variables were evaluated using the 2-tailed Fisher test. Overall survival (OS) was measured from the starting date of sorafenib until the date of death, the last visit, or the end of the follow-up period. Survival curves were estimated using the product-limit method of Kaplan-Meier. The role of stratification factors was analyzed with log-rank tests. 
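The ALBI calculation described above can be sketched in Python as follows. This is only a minimal illustration, assuming bilirubin in μmol/L, albumin in g/L, and the grade cut-offs of −2.60 and −1.39 from the original ALBI publication (17). The function names and the example laboratory values are hypothetical and are not taken from the study database.

```python
import math


def albi_score(bilirubin_umol_l: float, albumin_g_l: float) -> float:
    """Linear ALBI score: (log10 bilirubin x 0.66) + (albumin x -0.085).

    Assumes bilirubin in umol/L and albumin in g/L.
    """
    if bilirubin_umol_l <= 0:
        raise ValueError("bilirubin must be positive to take log10")
    return 0.66 * math.log10(bilirubin_umol_l) - 0.085 * albumin_g_l


def albi_grade(score: float) -> int:
    """Map the linear ALBI score to grade 1, 2, or 3."""
    if score <= -2.60:
        return 1
    if score <= -1.39:
        return 2
    return 3


# Hypothetical patient: bilirubin 20 umol/L, albumin 35 g/L
example = albi_score(20, 35)
print(round(example, 2), albi_grade(example))
```

Keeping the linear score and the grade as separate functions mirrors the distinction between a linear predictor and its risk-group categorization, which is relevant because categorization can discard prognostic information.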
To define the predictors of OS, we used a time-dependent covariates survival approach including statistically significant clinical variables (P < 0.05) from the univariate Cox analysis. For each prognostic model, the utility and discriminative performances were quantified using the Akaike Information Criterion (AIC) and Harrell C-index. A lower AIC indicates better goodness of fit, whereas a higher Harrell C-index indicates that a larger proportion of patient pairs is ranked concordantly by predicted and observed survival. Some prognostic models consisted of a linear predictor with a risk group categorization, which can lead to loss of information (e.g., ALBI score and ALBI grade 1, 2, and 3). To assess the difference, we tested the performance of these models both as linear predictors and as risk groups. Statistical analysis was performed using SPSS Statistics for Windows (version 24.0; IBM) and STATA/SE 14.1 (StataCorp).

(95% confidence interval 10.7-13.6), with a median follow-up of 10.6 months (95% confidence interval 9.7-11.6). Most patients were classified as BCLC-C stage (61.6%), Child-Pugh A (93.3%), ALBI grade 2 (73.6%), and ITA.LI.CA 4 (60.9%). The majority of patients were classified as SAP A (45.1%) or SAP B (46.6%), whereas SAP C patients were fewer (8.3%). The most balanced distribution of patients occurred across the HAP classes (Figure 1).

The median OS was 19.3 and 9.9 months in patients belonging to the intermediate and advanced BCLC stage, respectively (P < 0.01). Child-Pugh B patients had a significantly lower survival than Child-Pugh A patients (6.9 vs 18.9 months, P < 0.001). Patients classified as ITA.LI.CA quartile 2 had a better median OS than those classified as quartile 3 and quartile 4 (22.5 vs 15.6 and 10.1 months, respectively; P < 0.001). According to the ALBI grade, the median OS was 16.6, 11.3, and 6.0 months for grades 1, 2, and 3, respectively (P = 0.039). 
Moreover, the SAP score was able to successfully stratify patients, with a median OS of 16.8, 11.1, and 5.5 months in SAP A, SAP B, and SAP C, respectively (P < 0.001). In the case of the HAP score, although the omnibus log-rank test was still significant across classes taken together (from A to D: 19.2, 11.6, 12.6, and 6.3 months, respectively, P < 0.001), the OS of HAP B and C classes did not differ significantly. Finally, the PROSASH-II score identified the following median OS for its risk groups: risk class 1, 21.5 months; risk class 2, 15.3 months; risk class 3, 9.3 months; and risk class 4, 6.0 months (Figure 2). The hazard ratios for each class of every single score are summarized in Table 2.

The C-index scores ranged from a minimum of 0.53 to a maximum of 0.64 (Table 3). The ALBI model had the lowest values both as linear score (0.55) and after categorization (0.53). The BCLC classification (0.57), SAP score (0.58), and HAP score (0.59) performed better than ALBI. The PROSASH-II system had the highest C-index values both as linear score (0.64) and after categorization (0.62). The analysis of the AIC largely confirmed the ranking provided by the C-index, with the only difference being a slightly better performance of the SAP score (4,802) in comparison with the HAP score (4,807).

With this study, we provided an independent and external validation of the 3 prognostic scores specifically proposed for patients treated with sorafenib. We found that the HAP, SAP, and PROSASH-II scores have prognostic abilities, with the most recently proposed PROSASH-II showing the best performance with a C-index score of 0.62. However, it should be considered that none of these scores had satisfactory performances because, from a merely statistical point of view, only C-indexes >0.70 are usually considered indicative of a good model (28), and in this study, even the strongest model was far from reaching such a threshold. 
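As a rough illustration of the two metrics used to compare the scores, the Python sketch below computes Harrell's C-index by brute-force pair counting and the AIC from a model log-likelihood. The function names and the toy data are hypothetical and are not taken from the study, which used SPSS and STATA. Pairs with tied survival times are simply treated as non-comparable here, which is a simplifying assumption.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float],
                    events: Sequence[int],
                    risk_scores: Sequence[float]) -> float:
    """Proportion of comparable patient pairs ranked concordantly.

    A pair (i, j) is comparable when patient i died (event == 1) strictly
    before patient j's observed time. It is concordant when patient i also
    has the higher predicted risk; tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Toy data: survival in months, death indicator, risk class (higher = worse)
months = [5, 10, 15, 20]
died = [1, 1, 0, 1]
risk_class = [4, 3, 2, 1]
print(harrell_c_index(months, died, risk_class))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why C-indexes between 0.53 and 0.64 indicate only modest discrimination.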
To explain the unsatisfactory performance of the "sorafenib-dedicated" prognostic models, some considerations have to be made. First, all of these scores consider only pretreatment parameters, and we know that the radiological and biochemical (AFP) responses to the treatment greatly influence survival (29-32). Moreover, the development of sorafenib-related dermatological events is associated with a more favorable prognosis (33-35). The occurrence of such key events cannot be anticipated when sorafenib is started, and therefore, the current scores have to rely only on parameters that should be considered "prognostic" rather than "predictive."

Better scores could be achieved with the identification of predictive biomarkers. Since the licensing of sorafenib, the search for actual predictors has been felt to be of paramount importance. Early evidence seemed to suggest that specific polymorphisms of the Ang-2 genes might predict the subsequent development of dermatological adverse effects (and, therefore, a better survival) (36), but unfortunately these findings were not confirmed. As a matter of fact, to date, no effective predictive biomarker has been identified. Thus, imperfect C-indexes are to be expected.

On the other hand, the imperfect current scenario should not lead us to disregard the information provided by the existing scores, and we have to take into account that the PROSASH-II model outperformed the other sorafenib-dedicated scores (HAP and SAP). The combination of tumor-related parameters, liver function, and performance status was the pivotal element favoring the PROSASH-II model. In addition, PROSASH-II was designed to capture more detailed tumor-related information. In fact, although the HAP and SAP scores are concerned only with tumor size and AFP, the PROSASH-II model also considers macrovascular invasion and extrahepatic spread as separate predictors of survival. 
In the brivanib trial, an imbalance between the study and control groups occurred because of the combined "macrovascular invasion and/or extrahepatic spread" stratification factor and contributed to the study failure (37-39). Moreover, real-life studies on sorafenib showed that both extrahepatic spread and macrovascular invasion separately contribute to the risk of death (33, 40).

Indeed, the PROSASH-II score was not intended to replace the BCLC system because the latter can provide both prognostic and therapeutic implications for unselected patients with HCC. However, the prognostic value of BCLC staging for patients treated with the same modality is poor. As such, the PROSASH-II model should be seen as a tool to refine the prognostic information deriving from the BCLC system for patients treated with sorafenib. This approach can provide some valuable benefit both in clinical practice and for future studies. In clinical practice, the information derived from the stratification can help in giving more precise information to the patients and in searching for therapeutic alternatives. For instance, patients in the PROSASH-II highest risk class (and in the highest HAP and SAP classes) had a median OS of around 6 months, even lower than that of Child-Pugh B patients (41). Sorafenib can bring limited benefit to this population, in which the therapeutic decision should be discussed on a case-by-case basis. Conversely, patients in the lowest risk classes who are borderline candidates for transarterial procedures could benefit more from a systemic treatment than from locoregional therapies that could worsen their liver function (jeopardizing the possibility of receiving any further treatment) (42).

Our study has some limitations. First, we were not able to test all of the prognostic scores for HCC. 
Prognostic models such as the Cancer Liver Italian Program score and Okuda were not included because some of their variables were not present in our database. However, it should be considered that such variables (including volumetry) are not commonly available in clinical practice, and therefore, these scores are not universally used. Second, some patients received postsorafenib treatments or were included in clinical trials testing new second-line therapies. However, the more recently approved second-line treatments for advanced HCC most likely did not have a major impact on our results, because the included patients were treated with sorafenib before the Food and Drug Administration/European Medicines Agency approval of these treatments and the landmark trials of these agents had strict inclusion criteria.

Despite its limitations, we think that our multicenter prospective study represents a reliable assessment of the validity of sorafenib-dedicated prognostic scores, with potential implications for the future management of these patients. Although not reaching the classical threshold of 0.70 indicating a good prognostic ability, the PROSASH-II score still represents an improvement in comparison with the preexisting scores. Thus, its use might be considered in clinical practice to refine information about patients with HCC, providing both a risk group stratification and an individualized survival prediction. This can help to tailor the treatment of HCC on an individual basis in daily practice and to improve the design of future studies on the systemic therapy of HCC, taking into account the great differences in life expectancy of the potential candidates.

1. EASL clinical practice guidelines: Management of hepatocellular carcinoma
2. Lenvatinib versus sorafenib in first-line treatment of patients with unresectable hepatocellular carcinoma: A randomised phase 3 non-inferiority trial
3. Regorafenib for patients with hepatocellular carcinoma who progressed on sorafenib treatment (RESORCE): A randomised, double-blind, placebo-controlled, phase 3 trial
4. Cabozantinib in patients with advanced and progressing hepatocellular carcinoma
5. Review article: New therapeutic interventions for advanced hepatocellular carcinoma
6. LBA3-IMbrave150: Efficacy and safety results from a ph III study evaluating atezolizumab + bevacizumab (bev) vs sorafenib (Sor) as first treatment (tx) for patients (pts) with unresectable hepatocellular carcinoma (HCC)
7. Immunotherapy for hepatocellular carcinoma: A review of potential new drugs based on ongoing clinical studies as of 2019
8. Nivolumab (NIVO) + ipilimumab (IPI) + cabozantinib (CABO) combination therapy in patients (pts) with advanced hepatocellular carcinoma (aHCC): Results from CheckMate 040
9. Cost-effectiveness of sorafenib versus best supportive care in advanced hepatocellular carcinoma in Egypt
10. Cost effectiveness of regorafenib as second-line therapy for patients with advanced hepatocellular carcinoma
11. Regorafenib treatment for patients with hepatocellular carcinoma who progressed on sorafenib: A cost-effectiveness analysis
12. Cabozantinib for patients with advanced hepatocellular carcinoma: A cost-effectiveness analysis
13. Cost-effectiveness of cabozantinib in the second-line treatment of advanced hepatocellular carcinoma
14. Prognostic scores for sorafenib-treated hepatocellular carcinoma patients: A new application for the hepatoma arterial embolisation prognostic score
15. Using prognostic and predictive clinical features to make personalised survival prediction in advanced hepatocellular carcinoma patients undergoing sorafenib treatment
16. Improved survival prediction and comparison of prognostic models for patients with hepatocellular carcinoma treated with sorafenib
17. Assessment of liver function in patients with hepatocellular carcinoma: A new evidence-based approach-the ALBI grade
18. Modified RECIST (mRECIST) assessment for hepatocellular carcinoma
19. EASL-EORTC clinical practice guidelines: Management of hepatocellular carcinoma. European Organisation for Research and Treatment of Cancer
20. A simple prognostic scoring system for patients receiving transarterial embolisation for hepatocellular cancer
21. Clinical management of hepatocellular carcinoma. Conclusions of the Barcelona-2000 EASL conference. European Association for the Study of the Liver
22. Development and validation of a new prognostic system for patients with hepatocellular carcinoma
23. Surgery and portal hypertension
24. Prognostic staging system for hepatocellular carcinoma (CLIP score): Its value and limitations, and a proposal for a new staging system
25. Natural history of hepatocellular carcinoma and prognosis in relation to treatment. Study of 850 patients
26. AJCC Cancer Staging Manual
27. Construction of the Chinese University Prognostic Index for hepatocellular carcinoma and comparison with the TNM staging system, the Okuda staging system, and the Cancer of the Liver Italian Program staging system: A study based on 926 patients
28. Evaluating the yield of medical tests
29. Sorafenib in advanced hepatocellular carcinoma
30. Design and endpoints of clinical trials in hepatocellular carcinoma
31. Inter-operator variability and source of errors in tumour response assessment for hepatocellular carcinoma treated with sorafenib
32. Usefulness of alpha-fetoprotein response in patients treated with sorafenib for advanced hepatocellular carcinoma
33. Early dermatologic adverse events predict better outcome in HCC patients treated with sorafenib
34. Systematic review with meta-analysis: The critical role of dermatological events in patients with hepatocellular carcinoma treated with sorafenib
35. On-target sorafenib toxicity predicts improved survival in hepatocellular carcinoma: A multi-centre, prospective study
36. Ang-2 polymorphisms in relation to outcome in advanced HCC patients receiving sorafenib
37. Brivanib in patients with advanced hepatocellular carcinoma who were intolerant to sorafenib or for whom sorafenib failed: Results from the randomized phase III BRISK-PS study
38. Systemic therapy in HCC: Lessons from brivanib
39. Prognostic significance of adverse events in patients with hepatocellular carcinoma treated with sorafenib
40. Predictors of survival in patients with advanced hepatocellular carcinoma who permanently discontinued sorafenib
41. Non-transplant therapies for patients with hepatocellular carcinoma and Child-Pugh-Turcotte class B cirrhosis
42. Management of adverse events with tailored sorafenib dosing prolongs survival of hepatocellular carcinoma patients

Open Access This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CC BY-NC-ND), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal.