Time-to-event prediction analysis of patients with chronic heart failure comorbid with atrial fibrillation: a LightGBM model

Background Chronic heart failure (CHF) comorbid with atrial fibrillation (AF) is a serious threat to human health and has become a major clinical burden. This prospective cohort study was performed to design a risk stratification system based on the light gradient boosting machine (LightGBM) model to accurately predict the 1- to 3-year all-cause mortality of patients with CHF comorbid with AF.
Methods Electronic medical records of patients hospitalized with CHF comorbid with AF from January 2014 to April 2019 were collected. The data set was randomly divided into a training set and a test set at a 3:1 ratio. In the training set, the synthetic minority over-sampling technique (SMOTE) algorithm and fivefold cross validation were used for LightGBM model training; model performance was evaluated on the test set and compared with that of a logistic regression model. Survival was presented with Kaplan–Meier curves and compared by the log-rank test, and hazard ratios were calculated with a Cox proportional hazards model.
Results Among the 1796 included patients, the 1-, 2-, and 3-year cumulative mortality rates were 7.74%, 10.63%, and 12.43%, respectively. The LightGBM model showed better predictive performance than the logistic regression model: its area under the receiver operating characteristic curve for 1-, 2-, and 3-year all-cause mortality was 0.718 (95% CI, 0.710–0.727), 0.744 (95% CI, 0.737–0.751), and 0.757 (95% CI, 0.751–0.763), respectively, and the net reclassification index was 0.062 (95% CI, 0.044–0.079), 0.154 (95% CI, 0.138–0.172), and 0.148 (95% CI, 0.133–0.164), respectively. The differences between the two models were statistically significant (P < 0.05). Patients in the high-risk group had a significantly higher hazard of death than those in the low-risk group (hazard ratios: 12.68, 13.13, and 14.82 for 1, 2, and 3 years, respectively; P < 0.05).
Conclusion Risk stratification based on the LightGBM model showed better discriminative ability than the traditional model in predicting 1- to 3-year all-cause mortality of patients with CHF comorbid with AF. Individual patients’ prognosis could also be obtained, and the subgroup of patients with a higher risk of mortality could be identified. The system can help clinicians identify and manage high- and low-risk patients and carry out more targeted intervention measures to realize precision medicine and the optimal allocation of health care resources.


Introduction
Chronic heart failure (CHF) refers to a syndrome of ventricular filling or contraction disorders caused by damage to the structure and/or function of the heart under the influence of various pathogenic factors, leading to a series of complex clinical symptoms. In developed countries, patients with heart failure constitute about 1% to 2% of all adults, and this proportion increases to > 10% of adults aged > 70 years [1]. The global prevalence of heart failure is estimated to exceed 37.7 million. In the United States, the total medical cost of patients with heart failure was US$20.9 billion in 2012 and is expected to increase to US$53.1 billion by 2030 [2]. The high prevalence rate and poor prognosis of heart failure seriously affect patients' physical and mental health and quality of life, and heart failure has become a global public health problem that threatens human health.
Atrial fibrillation (AF) is the most common arrhythmia in heart failure. AF increases the risk of thromboembolism (especially stroke) and may damage cardiac function, leading to deterioration of high-frequency symptoms. In the Framingham Heart Study, patients with heart failure comorbid with atrial fibrillation have a higher risk of mortality than those with only one disease [3]. The combination of AF and CHF is a major clinical burden because of the common pathophysiology, common risk factors, mutual causality, and poor prognosis of these concomitant diseases.
Accurate risk prediction can promote patient classification, assist clinicians in understanding individual patients' disease risk, and preserve medical resources for patients with potential life-threatening needs in emergency care, thus delaying disease progression and improving the prognosis. However, the existing risk prediction models of heart failure have some shortcomings. First, traditional risk prediction models are based on the assumption that a linear relationship exists between variables and outcomes, which often limits their ability to model complex relationships. Second, the performance of the risk scores is still limited. For example, in the long-term heart failure registry of the European Society of Cardiology, the Meta-Analysis Global Group in Chronic Heart Failure risk score overestimated mortality while the Seattle Heart Failure Model underestimated mortality [4]; this limits their clinical application. Therefore, more accurate prognostic tools are needed. Machine learning models can overcome the limitations of traditional survival prediction models, deal with high-dimensional interactions and nonlinear relationships between variables, improve prediction ability, and show better performance in personalized outcome prediction [5]. They have been effectively used in heart disease research [6,7], including the prediction of hospital readmission and mortality. However, few prognostic studies have focused on the outcome of CHF comorbid with AF. Therefore, the goal of the present study was to identify the risk factors for all-cause mortality in patients with CHF comorbid with AF and to design and evaluate a LightGBM-based risk stratification model to predict 1- to 3-year all-cause mortality based on the patient's baseline parameters at admission.
Such a model can help clinicians identify and manage high- and low-risk patients and carry out more targeted intervention measures to realize precision medicine and the optimal allocation of health care resources.

Data sources and study population
This is a prospective cohort study that involved patients who were hospitalized in the First Hospital of Shanxi Medical University and Shanxi Cardiovascular Hospital from January 2014 to April 2019 and diagnosed with CHF comorbid with AF. Patients were selected in strict accordance with the inclusion and exclusion criteria, and all patients provided written informed consent.
The inclusion criteria were an age of ≥ 18 years; typical symptoms (e.g., exertional or paroxysmal dyspnea, fatigue, or loss of appetite) or signs (e.g., edema of both lower extremities, rales in the lungs, or positive signs of hepatic jugular venous reflux) of CHF; New York Heart Association (NYHA) class of II to IV; current treatment with heart failure drugs or other treatment measures; and a history of AF or diagnosis of AF through clinical examination, standard electrocardiogram, and single-lead portable electrocardiogram monitoring.
The exclusion criteria were acute cardiovascular events in the past 2 months, concurrent mental illness, inability to understand or complete the questionnaire because of speech or intellectual impairment, and refusal to participate in the study.

Data collection and predictor variables
According to the content of case records and heart failure guidelines [8], our group developed the chronic heart failure case report form (CHF-CRF) to collect patients' information. The CHF-CRF included demographics (age, sex, family history, and other parameters); vital signs (blood pressure, body temperature, heart rate, and respiratory rate); causes of CHF [e.g., coronary heart disease (CHD) and old myocardial infarction (OMI)]; CHF comorbidities [chronic obstructive pulmonary disease (COPD), diabetes, atrial fibrillation, renal insufficiency, and other conditions]; symptoms and signs; and laboratory test results, including blood cell analysis, blood glucose, blood lipids, liver and kidney function, potassium, sodium, chloride, B-type natriuretic peptide (BNP), and N-terminal pro-B-type natriuretic peptide (NT-proBNP). Echocardiography was recorded along with standard and tissue Doppler imaging. The left ventricular ejection fraction (LVEF) was quantified by Simpson's method. QRS duration was measured manually from the limb leads of a standard 12-lead ECG (25 mm/s). Drug therapy, percutaneous coronary intervention (PCI), coronary artery bypass grafting (CABG), and other treatment information were also recorded on the CHF-CRF.

Outcomes
The patients were followed up at 1, 3, 6, and 12 months after discharge and annually thereafter. Patients with less than 3 years of follow-up were excluded. The outcome was all-cause mortality within 1, 2, and 3 years, including death from heart failure, cardiovascular causes, and other causes. Death information came from two sources: regular follow-up of the patients by study personnel, and queries of the cause-of-death registration and reporting information system of Shanxi Province based on the patient's ID number.

Data pre-processing
To make full use of the clinical information, we imputed missing data before variable screening: missing continuous variables were imputed with the median, and missing categorical variables were imputed with the mode. BNP, coronary CT, and coronary angiography results were excluded to avoid the influence of variables with a high proportion of missing values on the prediction performance of the model. The estimated glomerular filtration rate was calculated with the CKD-EPI equation using cystatin C [9].
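The imputation rule described above can be illustrated with a minimal sketch. The record layout and the column names (`age`, `nyha`) are hypothetical and are not taken from the CHF-CRF; the sketch only shows median filling for continuous variables and mode filling for categorical variables.

```python
from statistics import median, mode

def impute(rows, continuous, categorical):
    """Fill None values: median for continuous columns, mode for categorical ones."""
    fill = {}
    for col in continuous:
        fill[col] = median(r[col] for r in rows if r[col] is not None)
    for col in categorical:
        fill[col] = mode(r[col] for r in rows if r[col] is not None)
    # Build new records so the input rows are left unchanged.
    return [
        {k: (fill[k] if v is None and k in fill else v) for k, v in r.items()}
        for r in rows
    ]

# Hypothetical toy records; the column names are illustrative only.
patients = [
    {"age": 70, "nyha": "III"},
    {"age": None, "nyha": "II"},
    {"age": 80, "nyha": None},
    {"age": 64, "nyha": "III"},
]
completed = impute(patients, continuous=["age"], categorical=["nyha"])
```

In this sketch the fill values are computed from whatever rows are passed in; whether they should be derived from the full cohort or from the training set only is a design choice that the sketch does not settle.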

Machine Learning Modeling Approach: LightGBM
To address the long training times of traditional boosting algorithms on large data sets, Ke et al. [10] proposed two novel techniques: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). GOSS retains all data instances with large gradients and randomly samples instances with small gradients, thereby reducing the amount of computation and improving speed and memory use. EFB bundles mutually exclusive features into a single feature to reduce the feature dimension. LightGBM is a gradient boosting decision tree algorithm that incorporates GOSS and EFB.
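The GOSS sampling step can be sketched in a few lines of plain Python. This is an illustration of the idea only, not LightGBM's internal implementation. The function name `goss_sample` and the rates `a = 0.2` and `b = 0.1` are assumptions chosen for the example; the reweighting factor (1 - a) / b for the sampled small-gradient instances follows our reading of the description by Ke et al. [10].

```python
import random

def goss_sample(gradients, a=0.2, b=0.1, seed=0):
    """Return (index, weight) pairs selected by Gradient-based One-Side Sampling.

    a: fraction of instances with the largest |gradient| that are always kept.
    b: fraction of all instances drawn at random from the remaining ones.
    """
    rng = random.Random(seed)
    n = len(gradients)
    # Rank instances by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top_k = int(a * n)
    rand_k = int(b * n)
    large = order[:top_k]
    small = rng.sample(order[top_k:], rand_k)
    # Up-weight the sampled small-gradient instances so that the overall
    # gradient statistics stay approximately unbiased.
    amplify = (1 - a) / b
    return [(i, 1.0) for i in large] + [(i, amplify) for i in small]

# Illustrative gradients for ten instances.
grads = [0.9, -0.8, 0.05, 0.02, -0.01, 0.03, 0.04, -0.02, 0.01, 0.06]
sample = goss_sample(grads, a=0.2, b=0.1)
```

With ten instances, the two largest-gradient instances are kept with weight 1 and one of the remaining eight is drawn at random with a larger weight, so only three instances enter the split-gain computation.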

Model development and performance evaluation
The data set was randomly divided into a training set and a test set at a 3:1 ratio. This process was repeated 100 times to ensure the stability of the model. In the training set, the SMOTE algorithm was used to balance the outcome classes, and fivefold cross validation was used for LightGBM model training; prediction performance was then evaluated on the test set and compared with that of a logistic regression model.
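The split-then-oversample order described above can be sketched as follows. This is a simplified, standard-library illustration of the SMOTE idea (interpolating between a minority-class sample and one of its nearest minority-class neighbours) and is not the implementation used in the study. The helper names, the toy data, and the neighbour count `k = 2` are assumptions; model fitting and cross validation are omitted.

```python
import random

def split_3_to_1(rows, seed):
    """Randomly split rows into a training set (3/4) and a test set (1/4)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = len(shuffled) * 3 // 4
    return shuffled[:cut], shuffled[cut:]

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority-class neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((x - y) ** 2 for x, y in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(x + gap * (y - x) for x, y in zip(base, neighbour)))
    return synthetic

# Hypothetical toy data: (features, label), with label 1 as the minority class.
data = [((float(i), float(i % 5)), 1 if i % 4 == 0 else 0) for i in range(40)]
train, test = split_3_to_1(data, seed=1)

# Oversample the minority class in the training set only; the test set is untouched.
minority = [x for x, y in train if y == 1]
majority = [x for x, y in train if y == 0]
new_points = smote(minority, n_new=len(majority) - len(minority))
balanced = train + [(x, 1) for x in new_points]
```

Oversampling after the split keeps synthetic records out of the test set, so the reported test performance reflects the original class distribution. Repeating the procedure with different seeds corresponds to the repeated random splits.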
The area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, and F-measure were calculated to quantify the model's discriminative ability in each year. Calibration of the model was evaluated by the Brier score, which is defined as the mean squared difference between the observed outcomes and the predictions. The Hosmer-Lemeshow goodness-of-fit test of the model was visualized by calibration curve plots. The net reclassification index was used to quantify the degree of improvement in the prediction ability of the LightGBM algorithm compared with the logistic regression model.
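The metrics named above can be written out explicitly in a short sketch. The functions and toy predictions below are illustrative assumptions. In particular, the net reclassification index is shown in its two-category form at a single cut-off, which may differ from the variant used in the study.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between observed outcomes (0/1) and predictions."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

def auc(y_true, p_pred):
    """Probability that a random event is ranked above a random non-event."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(y_true, p_pred, cutoff):
    """Sensitivity and specificity when p >= cutoff is classified as an event."""
    tp = sum(1 for y, p in zip(y_true, p_pred) if y == 1 and p >= cutoff)
    fn = sum(1 for y, p in zip(y_true, p_pred) if y == 1 and p < cutoff)
    tn = sum(1 for y, p in zip(y_true, p_pred) if y == 0 and p < cutoff)
    fp = sum(1 for y, p in zip(y_true, p_pred) if y == 0 and p >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def categorical_nri(y_true, p_old, p_new, cutoff):
    """Two-category net reclassification index of the new model vs the old one."""
    up_e = down_e = up_n = down_n = 0
    for y, old, new in zip(y_true, p_old, p_new):
        moved_up = old < cutoff <= new
        moved_down = new < cutoff <= old
        if y == 1:
            up_e += moved_up
            down_e += moved_down
        else:
            up_n += moved_up
            down_n += moved_down
    n_events = sum(y_true)
    n_nonevents = len(y_true) - n_events
    return (up_e - down_e) / n_events + (down_n - up_n) / n_nonevents

# Illustrative outcomes and predicted probabilities from two hypothetical models.
y = [1, 1, 1, 0, 0, 0, 0, 0]
p_old = [0.6, 0.4, 0.3, 0.7, 0.2, 0.1, 0.4, 0.3]
p_new = [0.8, 0.6, 0.4, 0.3, 0.2, 0.1, 0.6, 0.2]

brier_new = brier_score(y, p_new)
auc_new = auc(y, p_new)
sens_new, spec_new = sensitivity_specificity(y, p_new, cutoff=0.5)
nri = categorical_nri(y, p_old, p_new, cutoff=0.5)
```

A positive net reclassification index means that, on balance, the new model moves events into the higher-risk category and non-events into the lower-risk category more often than the reverse.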

Risk groups
One of the data splits was randomly selected for training and testing of the model, and the receiver operating characteristic curves were plotted. Using the maximal Youden's index as the best cut-off value, the 1-, 2-, and 3-year probabilities of death predicted by the LightGBM model were divided into high-risk and low-risk groups.
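The cut-off selection can be sketched as a search over candidate thresholds for the one that maximizes Youden's index, J = sensitivity + specificity - 1. The function name and the toy data are assumptions for illustration, and patients with a predicted probability at or above the cut-off are treated as high risk.

```python
def youden_cutoff(y_true, p_pred):
    """Return (cutoff, J) maximizing Youden's index J = sensitivity + specificity - 1."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    best_cutoff, best_j = None, -1.0
    # Every distinct predicted probability is a candidate cut-off.
    for cutoff in sorted(set(p_pred)):
        tp = sum(1 for y, p in zip(y_true, p_pred) if y == 1 and p >= cutoff)
        tn = sum(1 for y, p in zip(y_true, p_pred) if y == 0 and p < cutoff)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Illustrative outcomes (1 = death) and predicted probabilities of death.
y = [0, 0, 0, 0, 1, 0, 1, 1]
p = [0.1, 0.2, 0.3, 0.4, 0.45, 0.5, 0.7, 0.9]
cutoff, j = youden_cutoff(y, p)
high_risk = [prob >= cutoff for prob in p]
```

In this toy example the selected cut-off is 0.45, which captures all three deaths while misclassifying one survivor as high risk.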

Statistical analysis
Continuous variables are presented as median (interquartile range), and categorical variables are presented as number (percentage). To determine the factors related to all-cause mortality, the recursive feature elimination method was used for feature selection. The selected continuous variables and categorical variables were analyzed with the Mann-Whitney U test and chi-square test, respectively.
The DeLong test was used to compare the AUC between models, and P < 0.05 was considered statistically significant. Survival was presented with Kaplan-Meier curves and compared by the log-rank test, and hazard ratios were calculated with a Cox proportional hazards model.
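The Kaplan-Meier estimate underlying the survival curves can be illustrated with a minimal sketch. The follow-up times and event indicators below are invented for illustration; the log-rank test and the Cox model are not covered here and would in practice come from a dedicated survival analysis package.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times:  follow-up time for each patient.
    events: 1 if the patient died at that time, 0 if censored.
    """
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve

# Illustrative follow-up in months; 1 = death, 0 = censored.
times = [6, 12, 12, 18, 24, 30, 36, 36]
events = [1, 1, 0, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
```

For these eight hypothetical patients the estimated survival falls to 0.875 at 6 months, 0.75 at 12 months, 0.6 at 18 months, and 0.4 at 30 months. Censored patients leave the risk set without lowering the curve.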

Sensitivity analysis
Sensitivity analyses were performed using different subgroups, including heart failure type [heart failure with a reduced left ventricular ejection fraction (LVEF), midrange LVEF, or preserved LVEF], sex, and age (≤ 74 or ≥ 75 years). All statistical analyses were performed using Python 3.7.

Baseline characteristics of patients
The baseline characteristics of patients with CHF comorbid with AF are shown in Table 1. In total, 1796 patients were included in this study. The median age of all patients in the entire cohort was 73 (64-80) years, and 63.42% were male. The most common comorbidity was hypertension (62.97%), followed by diabetes (28.56%). The 1-, 2-, and 3-year cumulative mortality rates of patients with CHF comorbid with AF were 7.74%, 10.63%, and 12.43%, respectively.

Predictor variable
The recursive feature elimination method based on the random forest model was used for feature screening. As shown in Table 2, the main predictors of all-cause mortality were older age; a higher white blood cell count (WBC), red blood cell distribution width (RDW), aspartate aminotransferase (AST) level, total bilirubin (TBIL) level, alkaline phosphatase (ALP) level, blood urea nitrogen (BUN) level, uric acid level, N-terminal pro-brain natriuretic peptide (NT-proBNP) level, and NYHA class; a lower body mass index (BMI), diastolic blood pressure (DBP), hemoglobin level, albumin level, cystatin C-based estimated glomerular filtration rate (CyscGFR), and left ventricular ejection fraction (LVEF); a wider QRS complex; comorbid COPD and diabetes; and not taking beta-blockers or angiotensin-converting enzyme inhibitors (ACEIs)/angiotensin receptor blockers (ARBs).

Model prediction performance
Compared with logistic regression, the LightGBM model exhibited higher discrimination and a lower Brier score in the 1-, 2-, and 3-year follow-up of the test cohort (Table 3). The calibration curve plots indicated that the LightGBM model was generally well calibrated, with intercepts closer to 0 and slopes closer to 1, while logistic regression showed poor calibration (Fig. 2). The classification improvement relative to the logistic regression model was calculated for each year. The net reclassification index of the LightGBM model in the 1-, 2-, and 3-year follow-up was 0.062 (95% CI, 0.044-0.079, P < 0.05), 0.154 (95% CI, 0.138-0.172, P < 0.05), and 0.148 (95% CI, 0.133-0.164, P < 0.05), respectively, suggesting that the mortality prediction ability of the LightGBM model was better than that of logistic regression.

Feature importance
The feature importance for all-cause mortality is shown in Fig. 3. The importance of each feature was quantified by the number of times the feature was used to split in the model, and a higher value of feature importance was associated with a greater contribution to the risk prediction of the model. The 11 most important features, in descending order, were NT-proBNP level, COPD, albumin level, TBIL level, CyscGFR level, DBP, NYHA class, beta-blockers, AST level, age, and LVEF.

LightGBM model-based risk stratification
The probabilities of death predicted by the LightGBM model were divided into high-risk and low-risk groups, using the maximal Youden's index as the optimal cut-off value (0.492, 0.498, and 0.497 for 1-, 2-, and 3-year mortality, respectively). At these cut-offs, the sensitivity and specificity of model prediction were 0.738 and 0.843, 0.776 and 0.815, and 0.815 and 0.795, respectively.
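For reference, the Youden's index implied by each reported sensitivity and specificity pair can be computed directly from its definition. The subscripts 1, 2, and 3 are introduced here only to denote the follow-up year.

```latex
\begin{align*}
J &= \text{sensitivity} + \text{specificity} - 1 \\
J_{1} &= 0.738 + 0.843 - 1 = 0.581 \\
J_{2} &= 0.776 + 0.815 - 1 = 0.591 \\
J_{3} &= 0.815 + 0.795 - 1 = 0.610
\end{align*}
```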
As shown by the Kaplan-Meier curve, log-rank test, and Cox proportional hazards model, there were significant differences in the distribution of death events between the two groups in all follow-up years (Fig. 4, Table 4). Patients in the high-risk group had a significantly higher hazard of death than those in the low-risk group, with hazard ratios of 12.68, 13.13, and 14.82 at 1, 2, and 3 years, respectively.

Results of the sensitivity analysis
The models in each subgroup performed well, and the predictive performance between the sexes was similar. For 1-year mortality, however, the discrimination was lower for patients aged ≥ 75 years. For example, the model had a discrimination of 0.693 for patients aged ≥ 75 years and 0.761 for patients aged ≤ 74 years (Table 5).

Discussion
In this study, we designed and evaluated a risk stratification system based on the LightGBM model to predict 1- to 3-year all-cause mortality in patients with CHF comorbid with AF. The risk stratification system showed moderate predictive performance with an average AUC of 0.740. CHF and AF are causes and effects of each other. Damage to the cardiac structure or function, abnormal activation of neurohumoral mechanisms, and remodeling of ion channels in patients with CHF can lead to myocardial remodeling, enlarge the atrium, change the electrical activity characteristics of atrial myocytes, and [...] existence of AF is related to the poor prognosis of CHF [11]. However, identifying the risk factors of adverse prognosis of CHF comorbid with AF and taking effective control and treatment measures will help to reduce the incidence of adverse events such as death. Consideration of the relationship between multiple clinical variables of a single patient and mortality is a great challenge for clinicians, and it is often easy to ignore the potential relationship between variables.
Table 2 Predictor variables of all-cause mortality in the model. Data are presented as median (interquartile range) or n (%). BMI: body mass index, DBP: diastolic blood pressure, WBC: white blood cell, RDW: red blood cell distribution width, ALT: alanine aminotransferase, AST: aspartate aminotransferase, TBIL: total bilirubin, BUN: blood urea nitrogen, ALP: alkaline phosphatase, CyscGFR: estimated glomerular filtration rate calculated using cystatin C, NT-proBNP: N-terminal pro-brain natriuretic peptide, LVEF: left ventricular ejection fraction, NYHA: New York Heart Association, COPD: chronic obstructive pulmonary disease, ACEI: angiotensin-converting enzyme inhibitor, ARB: angiotensin receptor blocker
Machine learning (ML) methods can handle complex interactions and nonlinear relationships between predictors, allowing the selection of unknown variables and the best predictive subset of the model through continuous iteration. As an emerging algorithm in machine learning, the LightGBM algorithm overcomes the limitations of traditional boosting algorithms. It has the following advantages: first, faster training speed, higher efficiency, and better accuracy; second, lower memory consumption and the ability to process large-scale data; and third, support for parallel, distributed, and GPU learning. Experiments show that LightGBM can speed up the training process of a traditional gradient boosting decision tree (GBDT) by more than 20 times while achieving almost the same accuracy, and that it can be significantly better than the extreme gradient boosting (XGBoost) and stochastic gradient boosting (SGB) algorithms in computing speed and memory consumption [10]. Therefore, we chose LightGBM for this research.
We screened the most important predictors of all-cause mortality in the cohort of this study. According to the feature importance ranking, older age; a higher NT-proBNP level, NYHA class, AST level, and TBIL level; a lower DBP, albumin level, CyscGFR, and LVEF; comorbid COPD; and not taking beta-blockers made relatively large contributions to prediction of the risk of death in patients with CHF comorbid with AF.
Our study found that NT-proBNP is an important predictor of prognosis in patients with CHF comorbid with AF. The NT-proBNP level is positively correlated with the severity of heart failure; is closely related to NYHA class, end-diastolic pressure, and the degree of hemodynamic disturbance; and can be used as an effective means of prognostic evaluation [12,13]. Abnormal liver enzymes often appear in patients with heart failure, with a prevalence of 30-60% [14]. Increased venous congestion and impaired hemodynamics are common causes of abnormal liver enzymes in patients with heart failure. Abnormal liver function may lead to increased fluid overload due to hypoalbuminemia and a low-osmolality state, which may lead to deterioration of heart failure. This may explain why liver function indicators were powerful predictors of prognosis in patients with CHF comorbid with AF in our study, which is consistent with other studies [15-17].
Renal dysfunction is a common complication of CHF; the pathophysiology of cardiorenal syndrome is closely related to decreased cardiac output and increased central venous pressure. About 40% of hospitalized patients with heart failure showed elevated serum creatinine and a decreased glomerular filtration rate (GFR) [18]. Cystatin C is considered to be a more sensitive blood marker of renal function than creatinine and is less strongly affected by muscle mass, age, sex, or race. The CyscGFR is closer to the directly measured glomerular filtration rate and has better prognostic value [19]. Renal function parameters were found to be predictors of adverse events in patients with CHF comorbid with AF in the present study, consistent with previous reports [20-23].
Diabetes (28.12%) and COPD (24.05%) were common comorbidities and predictors of poor prognosis in the present study. COPD and AF have common risk factors and therefore often coexist. COPD greatly limits the survival of patients. A previous study found that patients with concurrent AF and COPD have higher cardiovascular mortality and all-cause mortality [24]. The prevalence of diabetes is 12-44% in patients with heart failure, depending on the severity of heart failure and whether the left ventricular ejection fraction is reduced. Diabetes is a powerful independent predictor of death in patients with advanced heart failure [25]. Type 2 diabetes can cause inflammation of adipose tissue, and the resulting systemic inflammation can lead to the expansion of epicardial adipose tissue and proinflammatory transformation. In one study, patients with multiple non-cardiovascular comorbidities had a higher risk of competitive death [26]. The white blood cell count is elevated in patients with AF and CHF, and the increase of inflammatory markers in patients with cardiovascular disease (especially heart failure) can be considered a factor for a poor prognosis [27]. There is substantial evidence that systemic inflammation is present in patients with COPD [28]. Taken together, these findings may explain to some extent the high mortality in patients with COPD or diabetes. The RDW is the coefficient of variation of the red blood cell volume and reflects the heterogeneity of the red blood cell volume. It is a proven predictor of adverse outcomes of heart failure [29]. In the present study, patients with a higher BMI had a lower prognostic risk. This seems to reflect the obesity paradox but indirectly indicates that the BMI is not an independent predictor of prognosis in patients with CHF comorbid with AF.
Lower DBP is associated with an increased risk of adverse cardiovascular events in patients with heart failure with a preserved LVEF [30,31], and our study revealed a similar relationship in patients with CHF comorbid with AF. Oral beta-blocker therapy is helpful to control the heart rate. In Chinese elderly patients with heart failure, admission without beta-blocker therapy is a specific independent risk factor for readmission or death within 1 year [32]. Previous studies have also shown a moderate association between the use of ACEI/ARB and lower mortality [33].
Based on the above predictive variables, we conducted subgroup analyses and found that the LightGBM model had lower discrimination for patients aged ≥ 75 years. We speculate that patients aged ≥ 75 years have more complicated conditions and more complications and are more likely to experience complex clinical emergencies. A prediction model constructed from routine examination indexes therefore cannot achieve good prediction results in this group; we found the same result in another machine learning study [34].
Compared with previous reports, the innovations of this study are the use of patients with CHF comorbid with AF as the target population; the use of LightGBM as a new machine learning prediction model; the inclusion of a variety of non-cardiac clinical variables such as COPD, diabetes, and liver and kidney function in the model; and the fact that the patients' clinical variables were easy to obtain. Compared with the traditional risk prediction model, the LightGBM model performs better in predicting all-cause mortality in patients with CHF comorbid with AF. It can risk-stratify individuals and identify patients with a high risk of death during the whole follow-up period. Patients in the high-risk group had a significantly higher hazard of death than those in the low-risk group, with hazard ratios of 12.68, 13.13, and 14.82 at 1, 2, and 3 years, respectively. Clinicians can carry out active intervention programs for high-risk patients and controllable variables to improve patients' quality of life and reduce mortality. In addition, we performed 100 random splits of the data set to ensure that the prediction results of the model were more robust.

Limitations
Our study has two main limitations. First, we did not perform external validation; the external accuracy of the model needs to be confirmed by further research. Second, we did not include genetic biomarkers (such as microRNA); thus, the clinical data and biomarkers of patients with CHF should be combined in future studies to establish a new predictive model with comprehensive patient information to improve prognostic risk assessment.

Conclusions
Using patients' routine clinical variables, we designed and evaluated a risk stratification system based on the LightGBM model to effectively predict all-cause mortality in patients with CHF comorbid with AF and identify subgroups of patients with a high risk of death. The system can help clinicians identify and manage high- and low-risk patients and carry out more targeted intervention measures to realize precision medicine and the optimal allocation of health care resources.