A 10- and 15-year performance analysis of ESC/EAS and ACC/AHA cardiovascular risk scores in a Southern European cohort

Background A key strategy for the primary prevention of cardiovascular disease (CVD) is the use of risk prediction algorithms. We aimed to investigate the predictive ability of SCORE (Systematic COronary Risk Estimation) and PCE (Pooled Cohort Equations) systems for atherosclerotic CVD (ASCVD) risk in Portugal, a low CVD risk country, at the 10-year landmark and at a longer, 15-year follow-up. Methods The SCORE and PCE 10-year risk estimates were calculated for 455 and 448 patients, respectively. Discrimination was assessed by Harrell’s C-statistic. Calibration was analyzed by standardized incidence ratios (SIR). Results During the 10-year follow-up, 7 fatal ASCVD events (the SCORE outcome) and 32 any ASCVD events (the PCE outcome) occurred. The SCORE system showed good discrimination (C-statistic 0.83), while the PCE showed poor discrimination (C-statistic 0.62). Calibration was similar for both systems, according to SIR: SCORE, 0.3 (95% CI 0.1–0.7); PCE, 0.5 (95% CI 0.4–0.7). Globally, both 10-year fatal ASCVD risk and any ASCVD risk were overestimated in the overall population and men. However, the risk was underestimated by both systems in women. Despite an overestimation of 15-year fatal ASCVD by SCORE, the 15-year any ASCVD observed incidence was 1.8 times the 10-year incidence among men and 1.4 times among women. This acceleration of CVD risk was more relevant in the lowest classes of ASCVD risk. Conclusion In this prospective, contemporary, Portuguese cohort, the SCORE had better discriminatory power and similar calibration compared to PCE. However, both risk scores underestimated 10-year ASCVD risk in women.


Background
Cardiovascular disease (CVD) remains the leading cause of morbidity and mortality in developed countries, despite consistent improvement in outcomes [1]. Modifiable risk factors are the major causes of atherosclerotic CVD (ASCVD), accounting for 90% of cardiovascular risk [2,3]. A key strategy in the primary prevention of ASCVD is the use of risk prediction algorithms, as the prophylactic interventions should target the people who should benefit from them most [1,4]. However, it is debatable which is the best algorithm for ASCVD risk estimation.
In 2003, the European Society of Cardiology (ESC) jointly with the European Atherosclerosis Society (EAS) published the SCORE (Systematic COronary Risk Estimation): a 10-year risk estimation system for ACSVD death [5]. The SCORE Project pooled data from 12 European prospective cohort studies from eleven countries, including Southern European countries such as Spain and Italy, but not Portugal [5]. External validation has been performed for the SCORE model in several Western countries and in the Asian population [6][7][8][9][10]. However, calibration of the risk charts according to cardiovascular mortality levels of each country has been proposed for a more precise prediction of risk estimates [5]. This is of utmost importance in countries not represented in the derivation cohorts, as Portugal.
In parallel, the American College of Cardiology (ACC) and American Heart Association (AHA) developed the Pooled Cohort Equations (PCE) that estimate a composite endpoint of 10-year ASCVD risk, instead of the SCORE-estimated 10-year fatal ASCVD risk [11]. However, these equations were all derived from North American cohorts [11], which limit their applicability to other populations. Further, the PCE have been controversial because of reports that they substantially overestimate risk in both American and European populations [12][13][14][15][16].
Besides the different prediction models available to estimate the ASCVD risk, the decision thresholds recommended for drug therapy are different for, respectively, fatal ASCVD (SCORE) and any ASCVD (PCE) [17]. Such conflicting recommendations may create confusion among physicians, potentially reflecting uncertainty about the external validity of different algorithms under different settings [18], namely for low-risk countries, like those in Southern Europe [19,20]. In Portugal, a country with one of the lowest ASCVD event rates in Europe [20], neither risk systems have been validated and doubts concerning their clinical applicability and predictive accuracy in this particular population are raised.
In this study, we aimed to determine the 10-and 15year incidence of ASCVD events in a primary prevention cohort of a Southern European country. In addition, we aimed to compare the calibration and discrimination of both SCORE and PCE systems in this cohort at the predefined 10-year landmark and at a longer, 15-year follow-up, in order to estimate their external validity for clinical practice.

Study design and population
We conducted a prospective single-center study, including 663 patients consecutively referred by Primary Care physicians to Centro Hospitalar e Universitário de Coimbra's Lipidology Clinic from 1994 to 2007. All patients were followed until December 2017. Among recruited participants, we selected those aged 40-79 years and with no previous history of heart disease and stroke at the screening visit. Patients with familial hypercholesterolemia or with missing information at the baseline examination were excluded, resulting in 455 individuals available for this study.

Assessment of risk factors
Baseline, demographic and clinical variables were collected, including age at referral, sex, hypertension, diabetes and smoking status. Hypertension was defined as systolic blood pressure ≥ 140 mmHg and/or diastolic blood pressure ≥ 90 mmHg, or current antihypertensive treatment. Diabetes was defined as fasting glucose ≥126 mg.dL − 1 or the use of hypoglycemic drugs. Smoking status was dichotomized as current versus past/never smokers. On the first appointment, several baseline laboratory variables and a complete lipid profile on fasting blood samples were collected.

Risk classification definitions
The predictors used to estimate risk for a first fatal or any ASCVD event included age, sex, smoking status, total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), systolic blood pressure, diabetes, and antihypertensive treatment. SCORE and PCE systems vary in the age ranges to which they apply. SCORE is recommended for use in the 40-to 65-year age range [1]. In contrast, the age range in which PCE is validated is 40-79 years [11]. Thus, we compared the performance of both risk systems in individuals aged 40-79 years. Baseline SCORE (using the model for low-risk countries) [5] and PCE [11] predicted risks were calculated for each patient. To comply with real-life clinical use of the SCORE, the absolute fatal ASCVD risk for those aged 66-79 years was corresponding to the risk at age 65 [21]. The cohort was stratified in four groups based on the SCORE risk categories: low (SCORE < 1%), moderate (SCORE 1-5%), high (SCORE 5-10% or diabetes without organ damage or major CVD risk factor), and very high (SCORE > 10% or diabetes with organ damage or major CVD risk factor) [1]. Major CVD risk factors were defined by markedly elevated single risk factor, namely TC > 310 mg.dL − 1 and blood pressure ≥ 180/110 mmHg [1]. Regarding the PCE risk, the cohort was also stratified in four risk categories: low (ASCVD risk < 5%), borderline (ASCVD risk 5-7.5%), intermediate (ASCVD risk 7.5-20%), and high (ASCVD risk > 20%) [4].

Outcome variables
In SCORE, the predicted outcome is fatal ASCVD, comprising the occurrence of death from coronary artery disease, stroke, hypertension, heart failure, peripheral artery disease or aortic disease [5]. The endpoint defined by PCE is a first ASCVD event, including nonfatal myocardial infarction, ASCVD death, and stroke [11]. The events among the study cohort were identified through record linkage with the national health registry and Health Data Platform. We ascertained nonfatal myocardial infarction or stroke events, defined according to the Cardiovascular and Stroke Endpoint Definitions for Clinical Trials [22].
Cases of fatal ASCVD [22] were ascertained from the cause of death listed on death certificates.

Statistical analysis
A total of 455 patients had complete information on the components of SCORE and 448 had complete information on the components of PCE. The median (interquartile range (IQR)) follow-up time was 15 (11)(12)(13)(14)(15)(16)(17) years. All patients were followed up for at least 10 years and 55.4% (n = 252) were followed for 15 years regarding fatal ASCVD and 48.9% (n = 219) regarding any ASCVD.
Continuous variables were expressed as mean ± standard deviation (SD). Median and IQR were used if the distribution was not normal, assessed by the use of the Kolmogorov-Smirnov test. The Student's t-test for normal variables and the Mann-Whitney test for nonnormal variables were used for comparisons among groups (male vs female). Categorical variables were presented as percentages and were compared using chisquare or Fisher's exact test.
The predictive ability of the SCORE and PCE systems for the Southern European population was evaluated based on discrimination and calibration of the models. The discriminative power was compared with Harrell's Cstatistics, which takes into account the timing of events. A calibration analysis was conducted by using standardized incidence ratios (SIR), comparing the 10-year predicted event rate to the observed rate at 10-year follow-up. As all patients completed the 10-year follow-up, 10-year observed event rate was the actual rate. Regarding the calibration analysis at the 15-year follow-up, the Kaplan-Meier method was utilized to estimate the 15-year observed risk for each group. Both rates were plotted against the predicted risk, estimated as mean 10-year risk score in each group [23]. Participants were censored at the time of the first occurrence of ASCVD, death, time of the last follow-up, or at 15 years of follow-up.
Concerning head-to-head comparison, an intersection of the two risk systems specific subgroups was created (n = 448). The discrimination was evaluated by Harrell's C-statistic for each risk system-specific outcome. The pairwise differences among the C-statistics corresponding to the risk systems were calculated, and, to compensate for multiple testing, bootstrapping was used to obtain 99% confidence intervals (CI) [23].

Results
The mean age ± SD of the study sample of 455 individuals was 57.8 ± 9.6 years and 61.8% were male. The overall baseline characteristics were similar between genders, except that women had significantly higher levels of low-density lipoprotein cholesterol (LDL-C) and a larger proportion of statin therapy ( Table 1).
The average 10-year risk estimates by each of the risk systems were as follows: the mean risk for ASCVD mortality was 4.7% (5.6% in men and 3.1% in women) according to SCORE and the mean ASCVD risk was 13.5% (16.2% in men and 9.0% in women) according to PCE, globally indicating a population with moderate ASCVD risk (Table 1).

Discrimination
We assessed the ability of SCORE and PCE to discriminate between patients who developed events defined by SCORE (fatal ASCVD) and PCE (any ASCVD) and those who did not, after calculating the 10-year risk for each patient. During the 10-year follow-up, 7 (1.6%) SCOREspecific events and 32 (7.0%) PCE-specific events occurred ( Table 1). The C-statistic corresponding to the model with risk score as the only covariate were 0.83 and 0.62 for the SCORE-specific and PCE-specific outcomes, respectively (Table 2).
In the head-to-head comparison, while the SCORE displayed the highest C-statistics for fatal ASCVD, the values were similar considering any ASCVD as the outcome (Fig. 1). No statistically significant differences were observed between the predictive ability of the two risk systems (Table 3).
Calibration Score While 7 fatal events occurred during the 10-year followup period, SCORE predicted 21.4 events, which means there were 66% more expected events than observed (Fig. 2).
Discordance between observed and expected cases was found throughout the risk continuum, with a predominance of overestimation in the overall population and men by all of the risk strata at 10-year and 15-year follow-up. Importantly, SCORE underestimated risk in women, particularly in those at high risk (5 to 10%) (Fig. 2).

PCE
Globally, the PCE predicted 47% more any ASCVD events (Fig. 2). Actually, the PCE overestimated 10-year risk when the predicted risk was 5% or higher. Both in the overall population and men, when the predicted risk was low (< 5%), the risk was slightly underestimated. Regarding women, PCE underestimated risk in low (< 5%) and intermediate (7.5-20%) risk strata. In the 219 individuals with 15 years of follow-up, an acceleration of the risk was observed, surpassing the 10-year predicted risk in low (< 5%) and borderline (5-7.5%) risk strata in men and low (< 5%) and intermediate (7.5-20%) risk strata in women (Fig. 2).

Discussion
In our cohort of Portuguese patients, the evaluation of PCE and SCORE risk prediction algorithms revealed that both performed at an acceptable level. The SCORE system performed better than the PCE system in discriminating their respective endpoints. Although calibration was similar, both risk systems markedly overestimated the risk.
While both risk systems address the same practical question, which individuals benefit from interventive measures, its equations differ significantly. The SCORE includes as predictors age, sex, smoking, systolic blood pressure, and TC; it is traditionally applicable in individuals from 40 to 65 years, and the predicted outcome is 10-year fatal ASCVD,    Table 2 Harrell's C-statistic SCORE and PCE in predicting fatal ASCVD events and any ASCVD events in a Portuguese cohort aged 40-79 years in primary prevention, respectively The predictive ability of the risk scores was assessed by Harrell's C-statistic for risk system-specific outcomes ASCVD Atherosclerotic cardiovascular disease, PCE Pooled Cohort Equations, SCORE Systematic COronary Risk Estimation disregarding non-fatal events [5]. Besides the predictors included in SCORE, PCE also incorporates ethnicity, HDL-C, antihypertensive use, and diabetes. Furthermore, PCE has a larger age range (40-79 years) and predicts the 10-year risk of fatal and non-fatal ASCVD [11].
Existing studies regarding the SCORE risk prediction system generally reported good discrimination, consistent with our findings [5,8,9,24]. Considering the SCORE system, our study reported a favorable Cstatistic of 0.83. The discrimination performance of the SCORE risk to predict 10-year ASCVD mortality was evaluated in the SCORE project and the C-statistic values ranged from 0.71 to 0.84 [5]. In the global population, we observed a systematic risk overestimation that has also been previously reported in low-risk populations, as in Spain [8,9], and high-risk populations [24].
The external validation studies that evaluate the PCE system have generated controversy around the predictive accuracy of these equations in contemporary cohorts [25]. The PCE models discriminated risk reasonably (Cstatistic 0.62) in our cohort, but performing worse in comparison to other studies [11,15,16]. The original PCE C-statistic ranged from 0.71 to 0.82 [11] and 0.74 to 0.79 in contemporary cohorts [16]. Similar to the SCORE, we observed a risk overestimation in our population, with an exception in predicting ASCVD risk in low-risk strata (< 5%). Previous literature established that the PCE generally overestimate risk among modern European and American populations [13,15,16,23,26]. Although there are no reports of underestimation of ASCVD risk by the PCE, it has been widely recognized that a large number of people with ASCVD risk prediction < 7.5% will paradoxically experience ASCVD events [27], reinforcing the value of additional risk markers to reclassify those patients [4,27].
Accurate calibration of the risk prediction models is crucial for the success of preventive measures. Although both models overestimated the risk, when observed risks were taken into account, the accuracy of risk prediction in women was less impressive. Actually, the risk was underestimated by the PCE risk system in the low-risk (< 5%) and intermediate-risk (7.5-20%) strata. Underestimation of ASCVD risk in the female gender has only been described once previously in a Korean cohort [28]. Concerning the SCORE system, the interaction between sex and events should be analyzed as an exploratory and hypothesis-generating finding, due to the very low number of 10-year fatal ASCVD events (n = 7), especially in men (n = 2). This low ASCVDmortality rate is consistent with what has been found in a Spanish cohort of 608 non-diabetic patients on  Table 3 Head to head comparison of the risk scores according to Harrell's C-statistic in a Portuguese cohort aged 40-79 years in primary prevention (n = 448) The predictive ability of the risk scores was assessed by Harrell's C-statistic for each risk system-specific outcome. The pairwise differences among the Cstatistics corresponding to the risk systems were calculated, and, to compensate for multiple testing, bootstrapping was used to obtain 99% confidence intervals (CI) ASCVD Atherosclerotic cardiovascular disease, CI Confidence interval, PCE Pooled Cohort Equations, SCORE Systematic COronary Risk Estimation primary prevention with a similar moderate 10-year risk of fatal ASCVD (2.1%), where only 9 ASCVD deaths were reported (1.5%) [8]. Nevertheless, women's risk was underestimated in high-risk women (5-10%). An underestimation of the risk by the SCORE system in women has already been reported in Australian women below 50 years [6], and in a Malaysian population [7]. Conversely, in another Southern European cohort from Italy, the 10-year fatal ASCVD risk was properly predicted by SCORE [29]. However, the low-risk (< 1%) women from this cohort had a 20-year rate of any ASCVD of 3.7, and 40% of ASCVD events occurred in this class, which signals potential problems with the calibration of this equation in women [29]. The causes of the unexpected findings regarding the underprediction of fatal or any ASCVD risk in women and the very low number of ASCVD deaths in men are Fig. 2 Calibration accuracy evaluated by comparing observed and predicted events and standardised incidence ratios in a Portuguese cohort aged 40-79 years in primary prevention. A calibration analysis was conducted by using SIR, comparing the 10-year predicted event rate to the observed rate at 10-and 15-year follow-up. As all patients completed the 10-year follow-up, 10-year observed event rate was the actual rate. 15year observed events were Kaplan-Meier adjusted. SIR, standardised incidence ratios, SCORE: Systematic COronary Risk Evaluation; PCE: pooled cohort equations not clear. Although women from our cohort had higher levels of LDL-C, they were also treated more commonly with statins. In the past, numerous studies have shown excess mortality in women after myocardial infarction [30]. In addition, women suffering an acute coronary syndrome are treated less intensively than men [31], and, even on primary prevention, it has been observed a lower use of lipid-lowering drugs in women [32]. Independently of the cause of the excess risk of ASCVD events in women, our results suggest that at least in our population, the risk stratification systems might need to be better calibrated among women, in order to reduce the proportion of women miscategorized and to avoid CVD events by implementing appropriate preventive measures [5].
Lastly, the SCORE overestimated the 15-year risk in the overall population, probably reflecting the decline of CVD mortality in the last decades [33]. Conversely, regarding PCE, we observed an important increase in any ASCVD events at 15-year follow-up. The observed 15year any ASCVD incidence was 1.8 times the 10-year incidence among men and 1.4 times among women. This acceleration of ASCVD risk is even more relevant in the low-and borderline-risk strata in men and the low-and intermediate-risk strata in women, as the observed 15year risk surpasses the 10-year predicted risk. A divergence between short-time and long-term risk has already been described in two Italian studies [29,34]. Among the relatively large group with low and intermediate short-term risk, differences in risk factor burden seem to translate into marked differences in the incidence of ASCVD events in the remaining lifespan [35]. The added value of long-term risk estimation becomes even more relevant in women and younger men, more prone to be classified as being at low short-term risk [36].
It is essential to evaluate the applicability of risk models to each population, as risk scores may perform worse in a different setting from the one they were originally derived [37]. It is well-known that the incidence of CVD events is declining, and Portugal is no exception. Despite the high prevalence of hypertension and other chronic diseases, Portugal has one of the best indicators for cardiovascular mortality [20], namely age-and riskstandardized mortality rates concerning coronary artery disease and cerebrovascular disease [19]. This problem is well-recognized and the literature recommends that scores should be recalibrated if incidence rates differ substantially in the new settings [5,13], as we have previously discussed for women. Additionally, newer statistical methods could considerably improve the accuracy of ASCVD risk estimates [16]. Another, more straightforward option to improve its predictive ability would be the incorporation of additional risk variables such as coronary artery calcium score and high-sensitivity C-reactive protein [4,27] or consider the lifetime risk for CVD in addition to short-time (10-year) risk [4].
We acknowledge several limitations of our study. We have a limited sample size, representative of the Center region of Portugal, one of the regions with the lowest rate of CVD events in our country [20]. This may explain partially the overestimation of the risk by both systems, and extrapolation of our results to other populations should be done cautiously. In addition, only half of the patients had a complete 15-year follow-up, as many patients were included in this cohort after 2003. Also, we cannot overlook that two-thirds of our cohort was under statin therapy; however, the majority of the patients were under lowand moderate-intensity statins as they were recruited before the era of high-intensity statins recommendation and are representative of a real-world population seen in everyday practice. Interestingly, in a sensitivity analysis of the Copenhagen General Population Study, including statin users at baseline (7% of the population) did not explain the observed mismatch between the predicted and observed event rates [21]. Finally, we were also not able to adjust for prescription of other preventive medication during follow-up. This might also have contributed to some of the observed mismatch. A strength of our study is that the results originate from the largest contemporary Portuguese cohort evaluated so far, with no patient lost to follow-up.

Conclusions
In this prospective, contemporary, Portuguese cohort, SCORE had better discriminatory power compared with PCE. Overall, both systems overestimated ASCVD risk at the 10-year landmark. However, among women, ASCVD risk was underestimated with both risk systems. Lastly, we also observed an acceleration of the cardiovascular risk at 15-year follow-up, particularly at the lowest risk classes. These findings may have the potential to improve risk stratification and, ultimately, treatment allocation in a primary prevention setting.