Common genetic variants do not predict recurrent events in coronary heart disease patients

Background It is unclear whether genetic variants identified from single nucleotide polymorphisms (SNPs) strongly associated with coronary heart disease (CHD) in genome-wide association studies (GWAS), or a genetic risk score (GRS) derived from them, can help stratify risk of recurrent events in patients with CHD.
Methods Study subjects were enrolled at the close-out of the LIPID randomised controlled trial of pravastatin vs placebo. Entry to the trial had required a history of acute coronary syndrome 3–36 months previously, and patients were in the trial for a mean of 36 months. Patients who consented to a blood sample were genotyped with a custom designed array chip with SNPs chosen from known CHD-associated loci identified in previous GWAS. We evaluated outcomes in these patients over the following 10 years.
Results Over the 10-year follow-up of the cohort of 4932 patients, 1558 deaths, 898 cardiovascular deaths, 727 CHD deaths and 375 cancer deaths occurred. There were no significant associations between individual SNPs and outcomes before or after adjustment for confounding variables and for multiple testing. A previously validated 27 SNP GRS derived from SNPs with the strongest associations with CHD also did not show any independent association with recurrent major cardiovascular events.
Conclusions Genetic variants based on individual single nucleotide polymorphisms strongly associated with coronary heart disease in genome wide association studies or an abbreviated genetic risk score derived from them did not help risk profiling in this well-characterised cohort with 10-year follow-up. Other approaches will be needed to incorporate genetic profiling into clinically relevant stratification of long-term risk of recurrent events in CHD patients.
Supplementary Information The online version contains supplementary material available at 10.1186/s12872-022-02520-0.


Introduction
Genome-wide association studies (GWAS) have identified large numbers of loci reliably associated with prevalent CHD [1]. Genetic risk scores (GRS) derived from SNPs which had the strongest associations with CHD improved the prediction of first events [2], but attempts to use a GRS for genetic profiling to predict recurrent events in established CHD have yielded conflicting findings. This may be because they have been statistically underpowered with short-term periods of follow-up [3]. While polygenic risk scores (PRS) including thousands of SNPs have been shown to assist with risk stratification, their value in clinical application remains uncertain [4]. Therefore, in the present study, we tested whether individual SNPs or an abbreviated, previously validated GRS derived from SNPs known to be strongly associated with CHD could be applied clinically to predict long-term cardiovascular outcomes.

The LIPID genetic cohort
The LIPID (Long-term Intervention with Pravastatin in Ischaemic Disease) trial (Trial Registration ACTRN12616000535471) comparing pravastatin 40 mg per day with placebo was conducted between 1992 and 1998 in 9014 patients. Entry to the trial required a history of acute coronary syndrome (ACS) 3-36 months previously, and patients were in the trial for a mean of 36 months. The results have been reported previously [5]. The cohort was followed for a total of 10 years from the close-out of the randomised controlled trial (RCT) [6,7].
The LIPID Genetic cohort is described in Table 1. It included those patients alive at the end of the RCT in 1998, who had also given consent for collection of a blood sample and had high-quality DNA extracted. Whole blood was not available to enable DNA extraction from samples obtained at the time of patient randomisation. The LIPID Genetic cohort totalled 4932 patients. All fatal events were analysed in the 10 years of follow-up between 1997 and 2006, and all coronary events (fatal and non-fatal) in the first two years of cohort follow-up.

DNA extraction
DNA was extracted from whole blood samples from consenting patients at their close-out visit and stored at −80 °C. The reasons for exclusion from the Genetic cohort were death during the trial (n = 1132) and either lack of consent for DNA extraction at close-out or DNA samples not of suitable quality for analysis (n = 2950).

Exploration and selection of SNPs
A literature review was undertaken using English language reports in PubMed to select SNPs for further exploration and was based on (1) SNPs with a significance of p < 5 × 10^−8 in published GWAS reports of cardiovascular disease and (2) SNPs from known atherothrombotic pathways and other pathways related to rhythm and conduction disturbances, left ventricular dysfunction/cardiac failure and statin responsiveness. A custom designed Illumina GoldenGate array of 384 SNPs with minor allele frequency (MAF) > 1% in European populations was used in this study (see Additional file 1: Table S1).

Exploration of a previously derived GRS
We explored the predictive value of a GRS derived by Mega et al. [8]. This required an additional five SNPs to be included after amplification using TaqMan probe assays. In our testing of this GRS, we created a score for each patient by summing the number of risk alleles for each SNP weighted by the log of the ORs used by Mega et al. [8]. We created an unadjusted model and also used the same baseline variables as quoted in Mega et al. for multivariable adjustment.
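The weighted sum described above can be illustrated with a minimal Python sketch. The SNP identifiers and odds ratios below are hypothetical placeholders chosen for illustration; they are not the values published by Mega et al. [8], and the function name is our own.

```python
import math

# Hypothetical SNP identifiers and odds ratios, for illustration only.
# These are NOT the values published by Mega et al.
ODDS_RATIOS = {"snp_a": 1.20, "snp_b": 1.10, "snp_c": 1.05}


def genetic_risk_score(risk_allele_counts, odds_ratios):
    """Sum the risk-allele counts (0, 1 or 2), each weighted by ln(OR)."""
    score = 0.0
    for snp, odds_ratio in odds_ratios.items():
        count = risk_allele_counts[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk-allele count must be 0, 1 or 2")
        score += count * math.log(odds_ratio)
    return score


# One hypothetical patient: two risk alleles at snp_a, one at snp_b, none at snp_c.
patient = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
patient_score = genetic_risk_score(patient, ODDS_RATIOS)
print(round(patient_score, 4))
```

Weighting by the natural log of the odds ratio means that each additional risk allele contributes additively on the log-odds scale, so SNPs with larger published effects move the score more.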

Genotype quality control
Variants were excluded if they had a call rate < 95%, deviated substantially from Hardy-Weinberg equilibrium (p < 10^−6) [9], or had a MAF of less than 1%. After quality control procedures, a total of 338 variants were available for analysis in all 4932 individuals.
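As a rough illustration of these three exclusion criteria, the following Python sketch applies them to the genotype counts of a single variant. The text does not state which Hardy-Weinberg test was used; a one-degree-of-freedom chi-square test is assumed here, and the function names and example counts are hypothetical.

```python
import math


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg proportions (assumed test)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic variant: no deviation can be tested
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(n_aa, n_ab, n_bb, n_samples):
    """Apply the call-rate, MAF and Hardy-Weinberg thresholds from the text."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / n_samples
    p = (2 * n_aa + n_ab) / (2 * called)
    maf = min(p, 1 - p)
    return (
        call_rate >= 0.95
        and maf >= 0.01
        and hwe_p_value(n_aa, n_ab, n_bb) >= 1e-6
    )


# Hypothetical genotype counts for one variant among 4932 samples.
print(passes_qc(2500, 2000, 400, 4932))
```

In practice these filters are usually applied with dedicated genetics software; the sketch only makes the thresholds concrete.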

Statistical analysis
Table 1 Numbers of patients who were randomised, survived to end of the LIPID Trial, and were included in the Genetic Cohort study (n = 4932) and events that occurred in the Genetic Cohort (shown in bold in the Table)
Associations between SNPs and outcomes were assessed individually using adjusted proportional hazards regression models. The choice of potential covariates was based on our previous analyses which stratified risk for fatal as well as non-fatal outcomes [10]. SNPs that remained significant using a cut-off of p < 0.01 were reported. The Bonferroni method of adjusting for multiple comparisons suggested a cut-off of 0.0001. As no SNPs met the predetermined cut-off of p < 0.0001, the less conservative cut-off of 0.01 was used for retention of SNPs in the models. The SNPs that were independent predictors for each cause of death in the LIPID data were used to create a risk score for each patient, applying the log of the hazard ratio from a model adjusted for clinical risk factors. The resulting scores were divided into quintiles and then the three middle groups were combined into one group. For the validation of the Mega et al. model, this procedure was repeated for CHD death using the SNPs and odds ratios previously reported in their manuscript [8].
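The multiple-testing cut-off and the quintile-based grouping in the statistical analysis can be sketched in Python as follows. A family-wise alpha of 0.05 is an assumption (the text reports only the rounded cut-off of 0.0001), and the function names and example scores are hypothetical.

```python
def bonferroni_cutoff(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


def risk_groups(scores):
    """Label each score by quintile rank, pooling the three middle quintiles."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for rank, idx in enumerate(order):
        quintile = rank * 5 // n  # 0 (lowest) to 4 (highest)
        if quintile == 0:
            labels[idx] = "low"
        elif quintile == 4:
            labels[idx] = "high"
        else:
            labels[idx] = "moderate"
    return labels


# Assumed family-wise alpha of 0.05 across the 338 variants that passed QC.
cutoff = bonferroni_cutoff(0.05, 338)

# Ten hypothetical risk scores: two fall in each quintile.
example_scores = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 1.0]
example_labels = risk_groups(example_scores)
print(cutoff, example_labels)
```

With these assumptions the per-test threshold is about 1.5 × 10^−4, of the same order as the 0.0001 cut-off quoted above. Tied scores are split by their order in the input, which is a simplification.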

Ethics
The LIPID trial was approved by the ethics committee at each participating site. All patients gave written informed consent for cohort follow-up, either in the clinic or remotely. The long-term follow-up of patients in the LIPID cohort study was approved by the University of Sydney Human Research Ethics Committee (Reference No: 01-2002/2454). The LIPID Genetic cohort study was approved by the Human Research Ethics Committee of Sir Charles Gairdner Hospital, Perth (HREC No 2011-060).

Patient involvement
Patients were not involved in the study design. Patient involvement in the study occurred at the time of informed consent, supervised by the Human Research Ethics Committee of each site.

Characteristics of the study population
The baseline characteristics of the patients in the LIPID Genetic cohort (n = 4932) at the time of entry into the LIPID trial are summarised in Additional file 1: Table S2 and were similar to those of the full LIPID trial cohort [5–7]. Table 2 shows cause-specific deaths in the 10-year follow-up of the LIPID Genetic cohort. There was a total of 1558 deaths, of which 898 were cardiovascular, including 727 related to CHD, and 375 due to cancer.

Association of individual SNPs with 10-year fatal outcomes in the LIPID Genetic cohort
Table 2 Risk stratification for cause-specific deaths over 10 years derived from the single nucleotide polymorphisms (SNPs) with the highest and lowest hazard ratios (HR) with an association stronger than the predetermined threshold of p < 0.01, unadjusted for baseline risks or multiple testing
* Hazard ratios discovered from the data, not the odds ratios previously published in GWAS reports. Unadjusted for baseline risks or for multiple testing
The associations (unadjusted for baseline variables or multiple testing) of fatal outcomes over 10 years from the end of the double-blind phase of the RCT with the individual SNPs are presented in Table 2. After adjustment for baseline variables, and after further correction for multiple SNP testing, there were no statistically significant associations of individual SNPs with subsequent deaths. When testing for internal validation, the risk score for each patient was based on the hazard ratios discovered from our own data. In this internal validation, we found highly significant stratification of risk, including for deaths from cancer, which had an adjusted hazard ratio of 3.8 (95% confidence interval 2.59–5.57, p = 7.82 × 10^−12) between the lowest and highest quintile (Table 3). The pattern for each of the 10-year fatal outcomes examined is displayed graphically in Fig. 1.
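As a rough consistency check on the cancer-death estimate, a two-sided p-value can be back-calculated from the reported hazard ratio and confidence interval. The sketch below assumes a Wald-type interval that is symmetric on the log scale, which the text does not state, and it uses the rounded published values, so only the order of magnitude should be expected to agree with the reported p-value. The function name is hypothetical.

```python
import math


def wald_p_from_ci(hazard_ratio, lower, upper, z_crit=1.959964):
    """Back-calculate z and a two-sided p-value from a 95% CI on the log scale."""
    log_hr = math.log(hazard_ratio)
    # Standard error implied by the width of the interval on the log scale.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided tail probability of a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Reported values for cancer death: HR 3.8, 95% CI 2.59 to 5.57.
z_cancer, p_cancer = wald_p_from_ci(3.8, 2.59, 5.57)
print(z_cancer, p_cancer)
```

Under these assumptions the implied p-value is of the order of 10^−11 to 10^−12, in line with the value reported in the text.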

Prediction of 10-year risk of fatal CHD from a previously-derived GRS
The GRS described by Mega et al. [8], both unadjusted and after adjustment for the baseline risk factors listed in that publication, showed no statistically significant stratification for CHD death over 10 years. The results for CHD death are shown in Fig. 2. Before and after adjustment for baseline variables, the categories of risk based on each individual's GRS did not distinguish between high (top quintile), moderate (middle 3 quintiles) and low (bottom quintile) risk of CHD death over 10 years (Table 4). The variables adjusted for are described in that table.

Two-year non-fatal outcomes
We also examined the value of the same SNPs for predicting non-fatal as well as fatal outcomes in the two years of open label follow-up, using a composite of CHD events (CHD death, non-fatal myocardial infarction, unstable angina, coronary artery bypass grafting and percutaneous coronary revascularisation) as the principal outcome measure, as described by Mega et al. [8] (Table 5).

Reclassification
When the 27 SNPs were added to a model with the baseline risk factors used by Mega et al. [8], there was moderate improvement in the net reclassification index (NRI), with a value of 0.097, and a very minor increase in the C-statistic from the receiver operating characteristic (ROC) curve, from 0.69 to 0.70. When we added a history of prior CHD or prior myocardial infarction (none, one or multiple) to the baseline model, with and without the SNPs, there were NRIs of 0.064 and 0.067, respectively (Table 5), and there was no significant change in the C-statistic.
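For readers less familiar with these two measures, the Python sketch below computes a C-statistic and a category-free (continuous) NRI on a toy data set. It treats the outcome as binary and uses the continuous form of the NRI; the text does not state which NRI variant or which software was used, so both choices are assumptions, and all names and numbers are hypothetical.

```python
def c_statistic(risks, events):
    """Share of event/non-event pairs in which the event has the higher risk."""
    concordant = 0.0
    pairs = 0
    for risk_i, event_i in zip(risks, events):
        if not event_i:
            continue
        for risk_j, event_j in zip(risks, events):
            if event_j:
                continue
            pairs += 1
            if risk_i > risk_j:
                concordant += 1.0
            elif risk_i == risk_j:
                concordant += 0.5  # ties count as half-concordant
    return concordant / pairs


def continuous_nri(old_risks, new_risks, events):
    """Category-free NRI: net upward moves in events plus net downward moves in non-events."""
    up_e = down_e = up_n = down_n = n_e = n_n = 0
    for old_r, new_r, event in zip(old_risks, new_risks, events):
        if event:
            n_e += 1
            up_e += new_r > old_r
            down_e += new_r < old_r
        else:
            n_n += 1
            up_n += new_r > old_r
            down_n += new_r < old_r
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Toy data: predicted risks from a baseline model and from a model with added markers.
toy_events = [1, 1, 0, 0]
toy_old = [0.6, 0.4, 0.5, 0.3]
toy_new = [0.7, 0.5, 0.4, 0.3]
print(c_statistic(toy_old, toy_events), c_statistic(toy_new, toy_events))
print(continuous_nri(toy_old, toy_new, toy_events))
```

The toy values are chosen so that the added markers help, which is the opposite of what was observed in the cohort; they serve only to show how the two quantities are computed. A time-to-event analysis would normally use a survival-based concordance measure rather than this binary simplification.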

Discussion
Our results do not support the hypothesis that individual SNPs strongly associated with prevalent or incident CHD on GWAS, or a previously validated 27-SNP GRS based on these SNPs [8], can predict long-term outcomes of patients with CHD who have had an ACS in the past.

The role of individual SNPs
In the early GWAS reports of associations of SNPs with CHD, p values < 10^−10 were found for multiple SNPs, particularly the rs1333049 variant at the 9p21 locus [11]. We have significantly extended these previous observations by selecting a large number of other SNPs that have shown statistically strong associations with CHD or CHD pathways in previous GWAS reports. Our conclusion from this first part of our study is that selecting individual SNPs from strong associations with CHD in GWAS did not improve prediction of long-term cardiovascular risk in established CHD.
Fig. 1 Plots of all-cause, coronary, cardiovascular, and cancer deaths over 10 years based on hazard ratios of risk for patients with high (top quintile), moderate (middle 3 quintiles) and low (lowest quintile) risk. Risk stratification derived from associations of SNPs with statistically significant hazard ratios with outcomes on unadjusted analyses
Fig. 2 Plot of coronary heart disease death over 10 years using the genetic risk score derived by Mega et al. [8]. High risk = top quintile, moderate risk = middle three quintiles, low risk = bottom quintile. Unadjusted for baseline variables

The role of a previously validated GRS based on 27 SNPs
We next tested a previously validated GRS for its potential for clinical application to improve risk prediction in known CHD patients [8]. In the internal validation, the risk score for each patient was derived from the hazard ratios within the data set, so it was expected that ranking of the risk scores would correlate strongly with cardiovascular risk, and this was indeed the case, even after adjustment for clinical variables. The strong correlation with cancer deaths was unexpected. However, a more stringent test of the role of a genetic influence on outcome is to test whether an externally derived GRS is predictive. GRSs ranging from 19 to 300 SNPs [12] and, more recently, PRS of 50,000 [13] to 6 million SNPs [14] have been evaluated for their value in identifying risk of incident CHD. The larger panels have been shown to be superior to smaller-scale scores in predicting events in people at high risk of incident CHD, but recent reports show only a modest improvement in prediction over clinical predictors [15].
GRSs developed for prediction of recurrent events in known CHD patients have been tested in cohorts that were smaller than the present study, less well characterised to enable full adjustment for confounding, or followed for a shorter time, and external validation has been infrequent [16]. Modest associations with recurrent events have been shown, but none have demonstrated clear-cut improvement in risk prediction in patients with established vascular disease [1, 16–23]. These are summarised in Table 6.
Because of the inherent appeal of a clinically applicable GRS with a limited number of SNPs, we chose to evaluate the 27-SNP GRS which had been derived by Mega et al. [8] and which has been externally validated for predicting incident CHD [24]. This GRS was derived from a large number of patients with established CHD, including nearly 5000 who were in the CARE [25] and PROVE-IT [26] clinical trials of statin therapy. When the Mega et al. [8] score was applied to the LIPID cohort, it did not show any genetic contribution to prediction of recurrent CHD events or of fatal outcomes, even before adjustment for clinical determinants of risk. We conclude from this second part of our study that there is a low likelihood of identifying CHD patients at high risk of recurrent events based on GRSs composed of an abbreviated SNP panel.

Limitations
There are several limitations of this study. The analyses are subject to recruitment bias, which would be relevant if we were studying early survival after ACS but is less relevant for a longer-term study of a defined CHD cohort whose ACS was years distant. We chose only one previously described GRS for validation. The GRS derived by Mega et al. [8] was the most relevant score for testing against outcomes in our cohort as it included patients with a similar clinical profile, although the endpoints in the Mega GRS were only for the duration of the clinical trials. The Mega et al. database included primary prediction studies but also included 17,000 person years of follow-up in secondary prediction studies.
The reasons why this study of genetic polymorphisms of individual genes did not reveal an effect on the risk of recurrent events in patients with CHD remain unclear. Firstly, the sample size in this study may have been too small to detect a genetic effect on outcomes, but with a total of over 1500 deaths, this seems unlikely. Secondly, many of the clinical variables used for adjustment in the statistical models are themselves subject to genetic influence, but it is striking that the lack of prediction by genetic variants was observed even before adjustment for clinical predictors. Finally, CHD, particularly when an ACS has occurred in the past, may simply be too complex a condition for genetics to influence survival.
It is important to recognise that our data do not exclude a genetic influence on survival in CHD patients. However, it is clear from these analyses that clinically applicable genetic profiling with single SNPs or a SNP-derived GRS with a limited number of highly selected SNPs did not add precision to the prediction of recurrent major CHD outcomes. It is conceivable that a polygenic risk score (PRS) with many thousands of SNPs will demonstrate a genetic influence on outcomes, but it remains to be established whether a PRS will have a clinically applicable role in enhancing the precision of recurrent event prediction beyond clinical markers of risk [27]. Further studies to clarify the genetic contribution to risk in established CHD will require the pooling of data from large numbers of individual cohorts [28], recognition of the limitations of GWAS [29] and possibly an omnigenic approach with exploration of regulatory genes undetected on GWAS [30].

Conclusion
In this large cohort of patients with CHD who had an ACS in the past, individual SNPs strongly associated with prevalent or incident CHD on GWAS, and a previously validated 27-SNP GRS based on these SNPs, did not predict long-term outcomes.