In this study, we have shown that an RF model integrating clinical variables and the CAC score achieves better prognostic performance than traditional LR models for obstructive CAD on CCTA. In addition, our comprehensive RF model showed high concordance between predicted and observed risk. The CAC score was the most important variable in the RF model, followed by age, fasting glucose level, plasma homocysteine level, and neutrophil count.
Obstructive CAD is the most common etiology of atypical chest pain, and it significantly increases mortality and healthcare expenditure. Many noninvasive approaches have been developed to predict the occurrence of CAD, such as CCTA, CACS, and cardiac magnetic resonance angiography [3, 4]. Nevertheless, the performance of many existing models is limited in the presence of obstructive CAD [19, 20]. Moreover, the discriminative ability of some models has declined over time in more than one external population [6]. Therefore, there is an urgent need for better predictive models for obstructive CAD in individuals with atypical chest pain.
ML makes data-driven predictions by learning from a training set and then performing prediction tasks on an independent set [21]. Compared with other ML algorithms, such as neural network (NNET) and support vector machine (SVM), RF does not require feature selection in advance and is resistant to over-fitting [16, 22]. Compared with traditional LR models, RF, a classic ML algorithm, can account for non-linear and higher-dimensional relationships among a multitude of variables, which could potentially lead to an improved explanatory model. Consistent with this, we found that although the LR models containing the CAC score had moderate predictive power, their calibration was poor, as indicated by the Hosmer–Lemeshow test. In contrast, the RF model showed better predictive performance for obstructive CAD. Additionally, RF models have shown performance equal to or better than that of humans in medical tasks such as diagnosis, decision-making, and risk prediction in cardiology [16]. Our findings support the view that an RF model based on all available information and the CAC score can more accurately identify high-risk individuals and improve the clinical use of CAC scanning in risk assessment and in guiding management decisions [11, 23, 24, 25].
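The calibration check referred to above can be illustrated with a minimal sketch of the Hosmer–Lemeshow statistic, which groups subjects by predicted risk and compares observed with expected event counts in each group. The function names, the default of 10 equal-sized risk groups, and the synthetic inputs are assumptions made for illustration and are not the configuration used in this study. The p-value helper uses the closed-form chi-square survival function, which is valid only for an even number of degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Chi-square survival function P(X > x); closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this helper only supports positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum_{i=0}^{df/2 - 1} (x/2)^i / i!, built up term by term.
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(y_true, y_prob, n_groups=10):
    """Return (statistic, df, p_value) for binary outcomes and predicted risks."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    # Sort by predicted risk only (stable sort keeps tied subjects in input order).
    pairs = sorted(zip(y_prob, y_true), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue
        n_g = len(group)
        expected = sum(p for p, _ in group)   # expected number of events
        observed = sum(y for _, y in group)   # observed number of events
        denom = expected * (1.0 - expected / n_g)
        if denom > 0:  # skip groups whose predicted risks are all 0 or all 1
            stat += (observed - expected) ** 2 / denom
    df = n_groups - 2
    return stat, df, chi2_sf_even_df(stat, df)


# Synthetic example: every subject is assigned a risk of 0.9, but only half
# of them have the event, so the model is clearly miscalibrated.
stat, df, p = hosmer_lemeshow([0, 1] * 20, [0.9] * 40, n_groups=4)
print(f"HL statistic = {stat:.2f}, df = {df}, p = {p:.3g}")
```

A small p-value indicates a significant gap between predicted and observed risk, that is, poor calibration. The test is sensitive to the number of groups and to sample size, so it is usually read together with a calibration curve.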
In terms of variable importance, and consistent with previous studies, the CAC score ranked above traditional cardiovascular risk factors such as age, sex, smoking, diabetes mellitus, and hyperlipidemia. The CAC score measured by non-contrast cardiac-gated computed tomography (CT) provides an evaluation of the global burden of coronary atherosclerosis. Furthermore, the CAC score provides long-term, independent prognostic information on the clinical risk of cardiovascular disease (CVD) and CAD events [7, 8, 9]. Therefore, accurate detection and assessment of coronary calcification can aid clinical decision-making. Recent research has shown that deep learning techniques can precisely estimate coronary artery calcification from CT angiography images, irrespective of image quality and calcification [15, 26]. This might have a significant impact on future clinical applications.
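How such a ranking of variables can be obtained is sketched below. The text does not state which importance measure was used, so the sketch uses permutation importance, a model-agnostic alternative: each variable is shuffled in turn and the resulting drop in AUC is recorded. The `predict` callable, the feature names, and the rows are hypothetical and synthetic; they do not reproduce the study's model or data.

```python
import random


def auc(y_true, scores):
    """Rank-based AUC: share of (event, non-event) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    # A tie between an event and a non-event counts as half a correct pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, rows, y_true, n_repeats=20, seed=0):
    """Return (baseline AUC, {feature: mean AUC drop when it is shuffled})."""
    rng = random.Random(seed)
    baseline = auc(y_true, [predict(r) for r in rows])
    importances = {}
    for name in rows[0]:
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            # Replace one feature with its shuffled values; keep the rest.
            permuted = [{**r, name: v} for r, v in zip(rows, shuffled)]
            permuted_auc = auc(y_true, [predict(r) for r in permuted])
            drops.append(baseline - permuted_auc)
        importances[name] = sum(drops) / n_repeats
    return baseline, importances


# Synthetic rows: "cac_score" drives the outcome, "noise" is unrelated.
cac_values = [0, 10, 50, 120, 300, 400, 0, 5, 250, 800]
rows = [{"cac_score": c, "noise": k % 3} for k, c in enumerate(cac_values)]
outcome = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1]

# Hypothetical stand-in for a fitted model: risk increases with the CAC score.
baseline, importances = permutation_importance(
    lambda r: r["cac_score"], rows, outcome
)
for name, drop in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean AUC drop = {drop:.3f}")
```

A variable whose shuffling leaves the AUC unchanged contributes nothing to discrimination in that model. Because the measure is computed on predictions alone, it can be applied to RF and LR models alike, which makes rankings comparable across model types.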
A good model should be judged not only by its diagnostic effectiveness but also by its repeatability, noninvasiveness, and simplicity. Other CT variables, such as the total number of calcified coronary lesions, plaque density, and the presence of thoracic aorta calcification, have been shown to increase the predictive power of CAC for CVD events but were not included in the present prediction model [27, 28]. Nevertheless, the prediction results of the RF model in our investigation were similar to those of the previously reported Extreme Gradient Boosting (XGBoost) model [11]. Additionally, our preliminary experiments showed that the RF model had better calibration than XGBoost. The RF model's insensitivity to missing values and its ability to handle high-dimensional data make it easier to generalize in clinical practice. Finally, current guidelines recommend that CACS be used to guide preventive therapies in asymptomatic individuals at intermediate risk for CVD events [29, 30]. Given the above, patients classified as lower risk by the RF model may not require further testing, such as CCTA or coronary angiography.
Several limitations of the present study should be noted. First, the present investigation lacked external validation in an independent cohort, which is planned for subsequent analysis. Second, severe calcification may lead to overestimation of percent stenosis on CCTA; hence, more than 50% stenosis on CCTA may not accurately represent > 50% stenosis as evaluated by coronary angiography. Third, further studies with long follow-up times are needed to assess the long-term predictive role of the CAC score. Fourth, all screened subjects were from China; thus, the predictive model may not be suitable for other ethnic groups. Finally, we did not use multiple ML algorithms in this research, although the RF model has shown better predictive ability in previous studies.