A new scoring system for predicting short‐term outcomes in Chinese patients with critically‐ill acute decompensated heart failure

Background: Acute decompensated heart failure (ADHF) accounts for millions of emergency department (ED) visits and is associated with high in-hospital mortality. The aim of this study was to develop and validate a multiparametric score for critically-ill ADHF patients. Methods: In this single-center, retrospective study, a total of 1268 ADHF patients in China were enrolled and divided into derivation (n = 1014) and validation (n = 254) cohorts. The primary endpoint was any in-hospital death, cardiac arrest or utilization of mechanical support devices. A logistic regression model was used to identify risk factors and build the new scoring system. The points assigned to each parameter were determined according to its β coefficient. Discrimination was validated internally using the C statistic, and calibration was evaluated by the Hosmer-Lemeshow goodness-of-fit test. Results: We constructed a predictive score based on six significant risk factors [systolic blood pressure (SBP), white blood cell (WBC) count, hematocrit (HCT), total bilirubin (TBIL), estimated glomerular filtration rate (eGFR) and NT-proBNP]. This new model was computed as (1 × SBP < 90 mmHg) + (2 × WBC > 9.2 × 10⁹/L) + (1 × HCT ≤ 0.407) + (2 × TBIL > 34.2 μmol/L) + (2 × eGFR < 15 ml/min/1.73 m²) + (1 × NT-proBNP ≥ 10728.9 ng/ml). The C statistic for the new score was 0.758 (95% CI 0.667–0.838), higher than that of the APACHE II, AHEAD and ADHERE scores. It also demonstrated good calibration for detecting high-risk patients in the validation cohort (χ² = 6.681, p = 0.463). Conclusions: The new score including SBP, WBC, HCT, TBIL, eGFR and NT-proBNP might be used to predict the short-term prognosis of Chinese critically-ill ADHF patients. Supplementary Information: The online version contains supplementary material available at 10.1186/s12872-021-02041-2.

months [2][3][4]. According to the latest American College of Cardiology (ACC) guidelines, the initial evaluation of the clinical trajectory of ADHF is important. Identifying a high-risk status at admission may help to allocate limited hospital resources and to discuss appropriate goals of care [5]. Therefore, accurate and timely assessment of severity and risk can benefit ADHF patients [6].
Several risk stratification systems have been published previously, but they have several limitations. First, few focused on a contemporary intensive care unit (ICU) population with ADHF. Second, the existing risk assessment tools for inpatients with ADHF are often complex and are uniformly underutilized. Third, with new plasma biomarkers emerging and the wide application of bedside echocardiography [7], existing scoring systems need to be updated by reassessing all the risk factors. Finally, a recent study demonstrated that clinical care risk scores established to predict prognosis in unselected ICU patients performed poorly in cardiac ICU (CICU) patients with ADHF, emphasizing the urgent need to develop improved tools for risk stratification among critically-ill ADHF patients [8]. The aim of this study was to develop and validate a novel clinical scoring model to predict short-term adverse events in a Chinese population of critically-ill ADHF patients and to compare it with existing systems, such as the Acute Physiology and Chronic Health Evaluation (APACHE) system [9], AHEAD score [10] and ACUTE HF score [11].

Study population
Clinical data were collected from 1268 patients with ADHF who were admitted to the ICU from the emergency department at Fuwai Hospital between January 2014 and December 2018. All participants met the most recent European guidelines for the diagnosis of acute heart failure [12]. Critically-ill ADHF was defined as exacerbation of chronic HF (CHF) with New York Heart Association (NYHA) III/IV symptoms sufficient to warrant admission to intensive care. Patients with a known diagnosis of malignancy and those requiring dialysis treatment were excluded. Patients with ST-segment elevation myocardial infarction or non-ST-segment elevation myocardial infarction were also excluded, because the TIMI score is established and extensively used in these patients and because reperfusion treatment itself plays an important role in prognosis. However, patients with comorbid coronary heart disease and CHF who were hospitalized for exacerbation of CHF without indications for reperfusion therapy were included in this study. All data were retrospectively obtained from Fuwai Hospital electronic medical records. The study was approved by the Ethics Committee of Fuwai Hospital and was conducted in accordance with the Declaration of Helsinki.

Data collection and endpoints
For each patient, baseline information on ED admission was obtained by reviewing the medical records, including demographic data, baseline health status, Glasgow coma scale (GCS), body mass index (BMI), vital signs and comorbidities. The primary diagnosis was regarded as the etiology of ADHF even if several pathologies might exist simultaneously. The definition of cardiogenic shock (CS) was consistent with ICU practical guidance [13]. The presence of atrial fibrillation (AF) and bundle branch block (BBB) was assessed with 12-lead electrocardiography, and pleural effusion was determined by chest X-ray. Left ventricular ejection fraction (LVEF) and estimated pulmonary arterial systolic pressure (PASP) were assessed by echocardiography (General Electric, USA). The participant's worst values of blood laboratory tests during the initial 24 h after emergency admission were recorded, including arterial pH, PaO2, actual bicarbonate (AB), lactate concentration, serum sodium, serum potassium, white blood cell (WBC) count, hemoglobin (Hb) concentration, hematocrit, international normalized ratio (INR), D-dimer concentration, total bilirubin, serum creatinine, serum uric acid (SUA), high-sensitivity troponin I (hs-TNI) and N-terminal pro-B-type natriuretic peptide (NT-proBNP). Estimated glomerular filtration rate was calculated using the Chinese version of the MDRD equation [14].
The main outcome of this analysis was a composite endpoint defined as: (1) in-hospital mortality; (2) in-hospital cardiac arrest; (3) utilization of mechanical support devices during the ICU stay, which included intra-aortic balloon pumps (IABP) and extracorporeal membrane oxygenation (ECMO). Patients transferred from other hospitals who had already received mechanical circulatory support before the ED visit were not included in the analysis. We also collected information about patients who were listed for heart transplantation (HTx).

Statistical analysis
For patients' background data, categorical variables were expressed as frequencies (percentages), and continuous variables were expressed as means ± standard deviations or medians with quartiles depending on their normality. Normality was assessed using the Shapiro-Wilk W-test.
Participants were divided into derivation (January 2014 to April 2018, n = 1014) and validation (May 2018 to December 2018, n = 254) cohorts according to the order of admission to the ED. Comparison of the baseline data indicated that the distribution of age and the occurrence of the endpoint agreed well between the two cohorts, but the validation cohort had marginally more female patients, more patients with AF and a higher NT-proBNP concentration. Thresholds commonly used in clinical practice were adopted to categorize heart rate (HR), respiratory rate (RR), AB and PaO2, whereas age, pH and hs-TNI were treated as continuous variables. Participants were divided into groups based on the optimal cut-off values of lactate level, serum sodium, WBC, HCT, TBIL, SUA, D-dimer and INR, each determined by receiver-operating characteristic (ROC) curve analysis. Patients were defined as underweight by BMI < 18.5 kg/m², normal by 18.5 kg/m² ≤ BMI < 24 kg/m², overweight by BMI ≥ 24 kg/m² and obese by BMI ≥ 30 kg/m². Serum potassium < 3.5 mmol/L was defined as hypokalemia and potassium > 5.5 mmol/L as hyperkalemia. The cut-off levels for anemia were hemoglobin < 130 g/L in men and < 120 g/L in women, whereas those for NT-proBNP were determined by quartiles. PASP > 30 mmHg was recorded as increased pulmonary artery pressure. The thresholds for eGFR were in accordance with the Kidney Disease Outcomes Quality Initiative guidelines, which classify participants into five stages (eGFR ≥ 90, 60 ≤ eGFR < 90, 30 ≤ eGFR < 60, 15 ≤ eGFR < 30 and eGFR < 15 ml/min/1.73 m²). Three subgroups based on LVEF were identified: HF with reduced ejection fraction (HFrEF, LVEF < 40%), HF with mid-range ejection fraction (HFmrEF, LVEF 40-49%) and HF with preserved ejection fraction (HFpEF, LVEF ≥ 50%).
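The categorical thresholds described above can be expressed as simple lookup functions. The following Python sketch is illustrative only: the function names are hypothetical, the stage numbers 1 to 5 are assumed to run from the highest eGFR band to the lowest (the text names only eGFR < 15 as stage 5), and LVEF values between 49% and 50% are assumed to fall into the HFmrEF group.

```python
def ckd_stage(egfr: float) -> int:
    """Map eGFR (ml/min/1.73 m^2) to one of the five bands listed in the text.

    Assumption: bands are numbered 1 (eGFR >= 90) to 5 (eGFR < 15).
    """
    if egfr >= 90:
        return 1
    if egfr >= 60:
        return 2
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return 5


def lvef_subgroup(lvef_percent: float) -> str:
    """Classify LVEF (%) as HFrEF (< 40), HFmrEF (40-49) or HFpEF (>= 50).

    Assumption: non-integer values between 49 and 50 count as HFmrEF.
    """
    if lvef_percent < 40:
        return "HFrEF"
    if lvef_percent < 50:
        return "HFmrEF"
    return "HFpEF"


def potassium_status(potassium_mmol_per_l: float) -> str:
    """Hypokalemia below 3.5 mmol/L, hyperkalemia above 5.5 mmol/L."""
    if potassium_mmol_per_l < 3.5:
        return "hypokalemia"
    if potassium_mmol_per_l > 5.5:
        return "hyperkalemia"
    return "normal"


def is_anemic(hemoglobin_g_per_l: float, male: bool) -> bool:
    """Anemia cut-off: < 130 g/L in men, < 120 g/L in women."""
    return hemoglobin_g_per_l < (130 if male else 120)
```

For example, a patient with an eGFR of 12 ml/min/1.73 m² and an LVEF of 35% would be placed in the lowest eGFR band and the HFrEF subgroup.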
The predictive power of patients' characteristics for short-term adverse outcomes was assessed using univariate logistic regression and described by odds ratios (ORs) and their 95% confidence intervals. The statistically significant predictors identified by univariate analysis were then entered into a multivariate logistic regression model with a forward stepwise selection algorithm. Using a method of β-coefficient-based weights similar to that used for the Framingham risk score [15], the weight assigned to each predictor was determined according to its β coefficient in the multivariate logistic regression model to develop a novel scoring system. Subsequently, to test the prognostic power of the new score, ROC analysis was performed in both the derivation and validation groups. The discriminative capacity of the new score was quantified with the C statistic, while calibration was evaluated graphically and with the Hosmer-Lemeshow goodness-of-fit test.
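To make the notion of discrimination concrete, the C statistic can be read as the probability that a randomly chosen patient who experienced the endpoint has a higher score than a randomly chosen patient who did not, with ties counted as one half. The Python sketch below computes it by direct pairwise comparison. It is a minimal illustration, not the SPSS procedure used in this study, and the function name is hypothetical.

```python
from itertools import product


def c_statistic(scores, events):
    """Concordance (area under the ROC curve) by pairwise comparison.

    scores: risk score for each patient.
    events: truthy if the patient experienced the endpoint, falsy otherwise.
    Ties between an event and a non-event patient count as 0.5.
    """
    with_event = [s for s, e in zip(scores, events) if e]
    without_event = [s for s, e in zip(scores, events) if not e]
    if not with_event or not without_event:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for s_event, s_none in product(with_event, without_event):
        if s_event > s_none:
            concordant += 1.0
        elif s_event == s_none:
            concordant += 0.5
    return concordant / (len(with_event) * len(without_event))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. The pairwise loop is quadratic in cohort size, which is acceptable for cohorts of the size studied here.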
The software package SPSS version 25.0 (IBM Corporation, New York, NY, USA) was utilized for statistical analysis. All statistical tests were 2-tailed, with a p value < 0.05 considered statistically significant. Graphs were generated using the software GraphPad Prism 8.0.

Baseline characteristics
The baseline characteristics of the derivation and validation cohorts with critically-ill ADHF are summarized in Table 1. Sex, age distribution and risk of adverse outcomes were comparable between the two groups, without significant differences. Of the 1268 patients enrolled, 873 were male with a median age of 58 (± 17) years, among whom the elderly accounted for 17.9%. The three leading causes of ADHF in these Chinese patients were cardiomyopathy (34.5%), ischemic heart disease (30.4%) and valvular disease (17%). CS occurred in 89 patients on admission, and 49% were in NYHA class IV on admission. The proportion of HFrEF was 62.3%, with 13.9% for HFmrEF and 23.8% for HFpEF. Coexisting atrial fibrillation was observed in 35.6% of patients and pleural effusion was identified in 31.9% of the participants.
During hospitalization, the primary endpoint occurred in 181 patients (14.3%), including 117 deaths (9.2%). Heart transplantation was performed in 3.5% of the patients. The median total hospitalization time was 13 (9-18) days.

Logistic regression and model establishment
Univariate analysis was performed in the derivation cohort using the univariate logistic regression model and included the following 30 clinical parameters: age, elderly, sex, BMI, GCS, temperature, SBP, heart rate, RR, arterial pH, PaO2, AB, lactic acid, serum sodium, potassium, WBC, Hb, HCT, TBIL, SUA, eGFR, D-dimer, INR, NT-proBNP, hs-TNI, LVEF, PASP, existence of AF, pleural effusion and BBB. All variables except age, elderly, sex, temperature, RR, arterial pH, PaO2, hs-TNI, LVEF, AF and BBB were found to be significantly associated with the incidence of short-term adverse outcomes.
Based on the results of univariate analysis, a forward stepwise method was applied to the 19 variables that were significantly associated with short-term outcomes. Low SBP, high WBC count, low HCT, elevated TBIL, elevated NT-proBNP and coexisting stage 5 chronic kidney disease (CKD) were identified as independent predictors. Using these six risk factors and taking the respective β coefficients into account, we assigned points to each parameter, which led to a new prognostic stratification system. Because the weight associated with HCT was the lowest, we assigned low HCT 1 point, divided all weights by a factor of 1.07 and then rounded them to the nearest integer. The novel scoring system was as follows: (1 × SBP < 90 mmHg) + (2 × WBC > 9.2 × 10⁹/L) + (1 × HCT ≤ 0.407) + (2 × TBIL > 34.2 μmol/L) + (2 × eGFR < 15 ml/min/1.73 m²) + (1 × NT-proBNP ≥ 10728.9 ng/ml). The univariate and multivariate logistic analysis results are listed in Table 2.
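The scoring rule reported in this study (six thresholds with 1 or 2 points each) and the scaling step (divide each β coefficient by 1.07, then round) can be sketched in Python as follows. The function and parameter names are hypothetical, the rounding is assumed to be half-up for positive coefficients, and the β coefficients themselves are reported in Table 2 and are not reproduced here.

```python
import math


def points_from_beta(beta: float, reference: float = 1.07) -> int:
    """Scale a positive β coefficient by the reference weight and round.

    Assumption: "nearest integer" means round-half-up; Python's built-in
    round() rounds halves to the even neighbor, so floor(x + 0.5) is used.
    """
    return math.floor(beta / reference + 0.5)


def adhf_score(
    sbp_mmhg: float,
    wbc_1e9_per_l: float,
    hct_fraction: float,
    tbil_umol_per_l: float,
    egfr_ml_min_1_73m2: float,
    nt_probnp: float,
) -> int:
    """Sum the points of the six-item score (range 0 to 9).

    nt_probnp is taken in the same unit as the published cut-off (10728.9).
    """
    points = 0
    points += 1 if sbp_mmhg < 90 else 0             # low systolic blood pressure
    points += 2 if wbc_1e9_per_l > 9.2 else 0       # elevated white blood cell count
    points += 1 if hct_fraction <= 0.407 else 0     # low hematocrit
    points += 2 if tbil_umol_per_l > 34.2 else 0    # elevated total bilirubin
    points += 2 if egfr_ml_min_1_73m2 < 15 else 0   # stage 5 CKD
    points += 1 if nt_probnp >= 10728.9 else 0      # highest NT-proBNP quartile
    return points
```

As a usage example, a patient with SBP 85 mmHg, WBC 11.0 × 10⁹/L, HCT 0.45, TBIL 20 μmol/L, eGFR 50 ml/min/1.73 m² and NT-proBNP 5000 would receive 1 + 2 = 3 points.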

Discrimination and calibration of the new score
In the derivation cohort, the C statistic of the new scoring system was 0.794 (95% CI 0.753-0.836, p < 0.001). In the validation cohort, because few patients had scores of 5 or higher, we combined them into one group for subsequent analysis. The incidence of adverse outcomes increased from 0% for a score of 0 to 7.5%, 8%, 20.4%, 10% and 45.7% for scores of 1, 2, 3, 4, and 5 points or higher, respectively. The scores of the validation cohort and the incidence of primary endpoint events are shown in Fig. 1. Additional clinical baseline data for each score are presented in Additional file 1.
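The per-score incidence described above, with scores of 5 or higher pooled into one group, amounts to a simple tabulation. The Python sketch below shows one way to compute it. The function name and the `pool_from` parameter are hypothetical, and no patient-level data from this study are reproduced.

```python
from collections import defaultdict


def incidence_by_score(scores, events, pool_from=5):
    """Return {score group: event rate}, pooling scores >= pool_from.

    scores: integer score for each patient.
    events: truthy if the composite endpoint occurred, falsy otherwise.
    """
    totals = defaultdict(int)
    with_event = defaultdict(int)
    for score, event in zip(scores, events):
        group = min(score, pool_from)  # 5, 6, 7, ... all map to group 5
        totals[group] += 1
        if event:
            with_event[group] += 1
    return {group: with_event[group] / totals[group] for group in sorted(totals)}
```

Pooling sparse high-score groups in this way avoids unstable incidence estimates that would otherwise rest on only a handful of patients.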
Using receiver operating characteristic analysis, C statistics were calculated to compare the discriminative power of the new score with that of other established systems. The C statistic for our new score was 0.758 (95% CI 0.677-0.838, p < 0.001), whereas it was 0.598 (95% CI 0.496-0.700, p = 0.058) for APACHE II, 0.631 (95% CI 0.529-0.733, p = 0.011) for the ADHERE risk tree [4] and 0.540 (95% CI 0.442-0.638, p = 0.439) for the AHEAD score, demonstrating that our system had better predictive power for short-term outcomes in critically-ill ADHF patients. The comparison of these four scores is shown in Fig. 2. The calibration of the system was evaluated with the Hosmer-Lemeshow goodness-of-fit test. In the validation cohort, the new scoring system demonstrated good calibration (χ² = 6.681, p = 0.463) for detecting high-risk ADHF patients admitted to the ED. The calibration plots are shown in Fig. 3. The Mantel-Haenszel test and Pearson correlation test showed a significantly positive relationship between the score and endpoints (Additional file 1: Table S2).
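For readers unfamiliar with the Hosmer-Lemeshow test, the statistic compares observed and expected event counts within groups of patients ordered by predicted risk. The Python sketch below computes only the χ² statistic. It assumes that a predicted probability is available for each patient (how these were derived from the score is not stated here), it splits patients into equal-sized groups by rank, and the function name is hypothetical. The p value would additionally require a χ² distribution function from a statistics library.

```python
def hosmer_lemeshow_statistic(probabilities, events, groups=10):
    """Hosmer-Lemeshow chi-square statistic over rank-based risk groups.

    probabilities: predicted event probability for each patient (0 to 1).
    events: 1 if the endpoint occurred, 0 otherwise.
    Groups whose expected count is 0 or equal to the group size are skipped,
    because their variance term would be zero.
    """
    ordered = sorted(zip(probabilities, events))
    n = len(ordered)
    statistic = 0.0
    for g in range(groups):
        chunk = ordered[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(e for _, e in chunk)
        variance = expected * (1 - expected / size)
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic
```

A small statistic with a non-significant p value, as reported for the validation cohort, indicates no detectable departure of predicted from observed risk.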
Furthermore, we attempted to predict the occurrence of heart transplantation with the new system. The C statistics for our system, APACHE II, AHEAD and ADHERE

Discussion
In the present study of Chinese patients in a single cardiovascular center ICU setting, we developed and validated a predictive model based on physical examination and laboratory testing within 24 h after ED admission. We found that six parameters were significantly associated with poor short-term outcomes: low systolic blood pressure (SBP < 90 mmHg); elevated white blood cell count (WBC > 9.2 × 10⁹/L); low hematocrit (HCT ≤ 0.407); abnormal liver function (TBIL > 34.2 µmol/L); elevated NT-proBNP (≥ 10728.9 ng/ml); and stage 5 CKD (eGFR < 15 ml/min/1.73 m²). In comparison, several commonly used existing tools did not exhibit an adequate ability to predict in-hospital outcomes. The new risk score might aid in the identification of ADHF patients in China at risk of in-hospital death, cardiac arrest or use of mechanical support devices.

The predictive elements for ADHF
Previous studies have shown that multiple related risk factors can effectively predict adverse outcomes of AHF. The clinical importance of SBP is recognized in prognostic scores such as ADHERE [2], AHFI [16] and GWTG-HF [17]. Gheorghiade et al. reported that a systolic pressure under 120 mmHg at the time of admission was associated with a poor prognosis compared with a systolic pressure over 120 mmHg [18].
In addition, renal dysfunction is a well-known strong prognostic parameter of ADHF. A retrospective study of 104,794 AHF patients demonstrated that abnormal eGFR on admission was a significant predictor of mortality and readmission risk [19]. In keeping with the fact that nearly all established systems took renal function into account [7], we used eGFR instead of serum creatinine as the indicator, scoring 2 points if it was less than 15 ml/min/1.73 m² in both sexes. According to the China heart failure (China-HF) registry, elevated total bilirubin is an independent predictor of adjusted in-hospital mortality [20]. Samsky et al. also reported that an increase in total bilirubin was closely related to 30-day and 180-day mortality and HF rehospitalization. Elevated WBC count is the most common abnormality in AHF. Previous studies showed that WBC reflected the systemic inflammatory response [22,23], sympathetic overactivity [24] and a physiological reaction to metabolic acidosis [25]. In the present study, it also emerged as one of the most important determinants of short-term prognosis. Anemia is a frequent comorbidity in AHF patients. Existing evidence has suggested that the development of anemia is correlated with increased mortality and higher hospitalization rates irrespective of age, gender or NYHA functional class in AHF [26]. Although most previous studies used hemoglobin concentration as an indicator of anemia, our study employed HCT because of its better prognostic performance in ADHF. Few studies have clarified the difference between Hb and HCT in AHF settings, but we speculate that volume overload or hemodilution might explain why HCT performed better than Hb [27].
Fig. 1 Prevalence of the different scores and incidence of adverse events. Blue bars represent the number of patients for each score. The orange line represents the incidence rates according to the new score
Plasma NT-proBNP is another well-known strong predictor in ADHF, and a meta-analysis of ADHF patients has confirmed that NT-proBNP is an independent predictor of both all-cause and cardiovascular mortality despite different cut points, time intervals and prognostic models [28].
Although current studies have employed different cut-off values for NT-proBNP, we used quartiles for multivariate analysis, which showed that only the highest quartile of plasma NT-proBNP concentration was related to poorer in-hospital outcomes, scoring 1 point in the new system. In addition to the six independent risk factors, some clinical indicators are given great importance in clinical practice or are included in other scores. Recently, Zymliński et al. reported that, in a study of 237 AHF patients without overt evidence of peripheral hypoperfusion, blood lactate on admission was associated with markers of organ dysfunction and a worse prognosis [25]. They also found that lactic acid was a comprehensive index affected by HR, WBC, liver function and big endothelin-1. This might explain why lactate was not an independent risk factor when multiple parameters were taken into account. As for LVEF, growing understanding of HFpEF has shown that HFpEF patients have a similar or even worse prognosis compared with HFrEF patients [29]. In another study of 343 AHF patients, Uriel et al. found that LVEF was not correlated with outcomes, suggesting cautious interpretation when applying LVEF to evaluate AHF patients [30].

The unique potential value of the new score
To improve the prognosis for ADHF, it is crucial to identify high-risk patients as a first step. Several risk stratification systems have been published for AHF previously, such as the Acute Physiology and Chronic Health Evaluation (APACHE) system [9], AHEAD score [10], ADHERE, American Heart Association Get With the Guidelines-Heart Failure (GWTG-HF) [17], ACUTE HF score [11] and AHFI [16]. In our study, we chose APACHE II, ADHERE and AHEAD as comparators and found that our score had better predictive capability in Chinese ADHF patients. Several factors may explain this. Firstly, the study population in our analysis was critically-ill ADHF patients admitted to the ICU, who had more complex comorbidities and more severe symptoms. Secondly, these three clinical predictive models for AHF were derived and externally validated in North American or European patients, and their performance might vary substantially across different world regions. A recent study indicated that region-specific recalibrations were needed for AHF scoring systems [31]. Additionally, APACHE II was published in 1985, ADHERE in 2005 and AHEAD in 2016. With the development and application of multiple new diagnostic techniques and emerging plasma biomarkers, some clinical indicators such as NT-proBNP, hs-TNI and D-dimer should be re-evaluated.
In view of the urgent and special nature of ED cases, we paid more attention to objective laboratory tests than to past medical history. On the one hand, disease histories are often collected through self-reporting, which may lead to omissions given the urgency of ED presentations. On the other hand, Table 1 showed that elderly patients accounted for a large proportion, and they might experience memory loss or disturbance of consciousness. These reasons made collecting past medical history accurately and completely a difficult task at the ED visit. Therefore, we did not include predictive scores that rely heavily on medical history, such as GWTG-HF, AHFI and OPTIMIZE-HF, in our analysis.
There are several noteworthy features of the present investigation. Because the exclusion criteria did not cover LVEF, the study was carried out in a cohort of ADHF patients containing not only HFrEF but also HFpEF and HFmrEF, which are often ignored in other studies. It also offered a relatively comprehensive system for evaluating in-hospital outcomes for critically-ill ADHF patients, owing to the complete analysis of clinical, biochemical, electrocardiographic and echocardiographic parameters. Considering the incompleteness and limited availability of past medical history in practical ED situations, we did not emphasize past diseases in building the new scoring system. Also, we used logistic regression instead of regression tree analysis, and hence constructed a quantifiable tool to reach better predictive accuracy. Moreover, the final model, consisting of six easy-to-obtain indexes with a simple calculation method, was relatively convenient for identifying high-risk populations and may help to determine whether an ADHF patient admitted to the ICU should be closely monitored and managed. For these reasons, our new score might represent a practical and efficient approach to the critically-ill patients commonly hospitalized for ADHF in China.

New score and heart transplantation
Although this new system showed satisfactory predictive power for the composite endpoint, it could not accurately predict HTx. First, candidacy for HTx was assessed carefully at Fuwai Hospital. Elderly and frail patients with ADHF who failed optimal medical management and mechanical circulatory support often suffered from malnutrition, immune dysfunction and multiple organ failure, and were therefore unsuitable for surgery. It is thus understandable that the score did not parallel the decision to proceed to HTx. Secondly, selection for HTx was associated not only with HF conditions but also with economic conditions, social support and psychological condition (Additional file 1: Table S3).

Study limitations
This study presents a potential model for triaging emergency department ADHF patients for the intensive care unit. However, it had several limitations. First, our database consisted of a cohort of patients from a single cardiovascular hospital, and the study population included only Chinese patients. The participants evaluated were limited to patients admitted to the ICU, and ADHF patients who were admitted to other wards were not enrolled. Although an internal validation was performed by bootstrapping techniques in the same population, the results should be carefully interpreted when applied to external validation studies. Second, the composite endpoint of our study was in-hospital death, cardiac arrest or clinical application of mechanical support devices. The six selected parameters demonstrated good ability to distinguish patients at high risk of short-term adverse events. Owing to the lack of follow-up after discharge, the ability of our scoring system to predict post-discharge and long-term prognosis remains uncertain. Third, the individual clinical data were collected at the time of admission without accounting for the effects of pre-hospital management, such as the widely used inotropic drugs for ADHF, which may influence admission blood pressure and heart rate. In addition, there were some potentially significant clinical parameters we did not collect, such as systemic inflammation as measured by CRP or PCT, frailty index and consciousness score. Further studies will be needed to evaluate these factors for the next scoring system.

Conclusions
Existing predictive systems did not demonstrate sufficient ability to predict the incidence of short-term adverse events in critically-ill ADHF patients in the Chinese population. Our new scoring system including SBP, white blood cell count, hematocrit, total bilirubin, estimated glomerular filtration rate and NT-proBNP might provide a practical tool for daily risk stratification of ADHF patients, irrespective of etiology.