Analysis of 61 SNPs from the CAD specific genomic loci reveals unique set of SNPs as significant markers in the Southern Indian population of Hyderabad
BMC Cardiovascular Disorders volume 22, Article number: 148 (2022)
The present study is a part of the major project on coronary artery disease (CAD) carried out at Indian Statistical Institute, Hyderabad to investigate the pattern of association of SNPs selected from the CAD specific genomic loci. The study is expected to portray the genetic susceptibility profile of CAD specifically in the Southern Indian population of Hyderabad.
The study was conducted in a cohort of 830 subjects comprising 350 CAD cases and 480 controls from Hyderabad. A prioritized set of 61 SNPs selected from the NHGRI GWAS catalogue was genotyped using the Fluidigm Nanofluidic SNP Genotyping System, and appropriate statistical analyses were used in interpreting the results.
After data pruning, 45 SNPs qualified for the association analysis, of which four were found to be highly significantly associated with increased risk for CAD even after Bonferroni correction for multiple testing (p < 0.001). These results were also replicated in random subsets of the pooled cohort (70, 50 and 30%), suggesting internal consistency. The ROC analysis of the risk scores of the significant SNPs yielded a highly significant area under curve (AUC = 0.749; p < 0.0001), implying predictive utility of these risk variants.
The rs10455872 of LP(A) gene in particular showed profound risk for CAD (OR 35.9; CI 16.7–77.2) in this regional Indian population. The other significant SNP associations observed with respect to the pooled CAD cohort and in different anatomical and phenotypic severity categories reflected on the role of genetic heterogeneity in the clinical heterogeneity of CAD. The SNP rs7582720 of WDR12 gene, albeit not individually associated with CAD, was found to be conferring significant risk through epistatic interaction with two SNPs (rs6589566, rs1263163 in ZPR1, APOA5-APOA4 genes) of the 11q23.3 region.
Coronary artery disease (CAD) is a multifactorial disorder involving both genetic and environmental factors and is characterized by genetic as well as etiologic heterogeneity. Hence, identifying the causative factors for CAD development and/or progression is always challenging. Candidate gene approaches have identified approximately 300 genes, belonging to a wide range of metabolic pathways, as associated with CAD. Genome wide association studies (GWAS) earlier revealed significant association of 32 chromosomal loci [1, 2], whereas a recent study identified 64 novel genetic loci in the CAD genetic architecture. Many replication studies have examined the association of variants at these GWAS loci with CAD in specific populations, and meta-analyses have identified the causal genes for CAD development. Despite the lack of consistency in the association patterns of these genes/loci across populations, these studies suggested the 11q23.3 and 9p21.3 chromosomal regions as the most replicated CAD associated loci across the globe. The candidate gene and GWAS studies revealed that the 11q23.3 Apolipoprotein gene cluster region is associated specifically with lipoprotein metabolism which, if defective, plays an important role in the process of atherosclerosis, the primary event in CAD development [4, 5]. Moreover, the 9p21.3 and 11q23.3 loci were shown to have pleiotropic effects, harbouring statistically significant Single Nucleotide Polymorphisms (SNPs) associated with various complex disorders including CAD. Given the prominent association of these two loci with CAD, as part of the major project, we have earlier investigated the patterns of association of 95 SNPs of the 11q23.3 [7, 8] and 35 SNPs of the 9p21.3 chromosomal regions in the Southern Indian population of Hyderabad, almost saturating these two genomic regions.
We observed rs7865618 of the 9p21.3 locus and 12 SNPs from the 11q23.3 region to be significantly associated with CAD. The results suggested unique patterns of association of different SNPs of these loci with different anatomical and phenotypic categories of CAD, as well as epistatic interactions between SNPs of the same and/or different chromosomal regions [7,8,9]. Given the significant findings of our previous studies, the present study in the same cohort focused on the analysis of 61 more SNPs reported with greater than genome wide threshold p value in 3 metabolic syndrome GWAS selected from the National Human Genome Research Institute (NHGRI) GWAS catalogue. We present here the results on the pattern of association of these 61 SNPs with the pathogenesis of CAD, and also on the epistatic interactions between these SNPs and the previously analyzed SNPs of the 11q23.3 and 9p21.3 regions. The study might help in a comprehensive portrayal of the genetic susceptibility profile of the population of Hyderabad for this important and life-threatening disease.
The population of Hyderabad is a conglomeration of people from different parts of the undivided state of Andhra Pradesh, and the mother tongue of most of its population is Telugu, one of the four Dravidian languages. It is pertinent to note that, despite the subdivision of the Telugu population into a number of traditionally endogamous castes and sub castes, Reddy et al. observed genetic differentiation among the populations of Andhra Pradesh to be very low and insignificant. The Markov chain Monte Carlo analysis of population structure, which implements a model based clustering method for grouping individuals into populations [11, 12], did not reveal any unique population clusters, suggesting a high degree of genetic homogeneity. This would preclude the possibility of an effect of substructure in the study cohort.
The study protocol was approved by the Indian Statistical Institute (ISI) Review Committee for Protection of Research Risks to Humans and all experiments were performed in accordance with the relevant guidelines and regulations. Written informed consent was obtained from all the participants as per the guidelines.
Study design and population
For this case–control study, a total of 1024 subjects comprising 508 CAD cases and 516 controls, broadly representing the populations of the undivided Andhra Pradesh, were included. The CAD cases were recruited from the CARE hospitals, Hyderabad after their evaluation by interventional cardiologists. Patients with characteristic symptoms of stable/unstable angina pectoris along with varying degrees (generally > 40%) of stenosis in at least one of the major coronary arteries, as determined through angiogram, were included. Cases with monogenic diseases, valvular heart disease, cardiomyopathy, renal disease, acute and chronic viral or bacterial infections, asthma, tumors or connective tissue diseases, other vascular diseases and familial cases of CAD were excluded from the study. The baseline characteristics of the CAD patients recruited for the present study were furnished in a previous publication.
The control samples were collected from Hyderabad and its vicinity broadly representing similar ethnic composition, socioeconomic backgrounds as that of the cases and aged above 45 years. The individuals with characteristic features of any of the above-mentioned exclusion criteria and/or with positive family history for the same were not included as part of the controls.
Data and blood sample collection
The epidemiological and clinical data pertaining to the individuals, who participated in the study, were obtained through personal interviews using a detailed questionnaire and from the hospital records. About 5 ml of peripheral fasting blood sample was collected from each of the subjects by certified medical lab technicians.
DNA isolation and genotyping
All the blood samples were used for isolation of DNA using the phenol chloroform method. The quality and quantity of the isolated DNA were determined with the Thermo Scientific Varioskan™ Flash Multimode Reader using the Quant-iT™ PicoGreen® dsDNA Assay Kit. Quantification of the samples was done at Sandor Life sciences, a medical laboratory in Hyderabad. Genotyping was performed at the same laboratory using the Fluidigm Nanofluidic SNP Genotyping System for the prioritized set of 61 SNPs, which were selected from the CAD associated SNPs reported with greater than genome wide threshold p value in 3 metabolic syndrome GWAS in the NHGRI GWAS catalogue. Integrated fluidic circuit (IFC) chips were utilized for genotyping; these were thermal cycled and the end-point fluorescent values were measured on the Biomark™ system. Final sample wise genotype calls were obtained using Fluidigm SNP Genotyping Analysis software. All the 61 SNPs used for the present study were characterized for genomic localization, function or nearby gene information (Additional file 2: Table S1). A genotype call rate of ≥ 99% was achieved for all the SNPs in 350 cases and 480 controls, which were used for the association analyses.
The anatomic and phenotypic severity categorization of CAD cases
The CAD cases were categorized into four ‘anatomical’ subtypes: (1) cases with 40–70% stenosis and symptomatic for CAD with characteristic atherosclerotic lesions as ‘insignificant’ disease, (2) cases with > 70% stenosis in any one of the major coronary blood vessels as ‘Single Vessel Disease’ (SVD), (3) cases with > 70% stenosis in two major coronary blood vessels as ‘Double Vessel Disease’ (DVD) and (4) cases with > 70% stenosis in three major coronary blood vessels as ‘Triple Vessel Disease’ (TVD). Based on the phenotypic severity, the CAD cases were also categorized into three broad conditions: (1) those with characteristic symptoms of stable or unstable ‘angina’, (2) those with symptoms of ‘Acute Coronary Syndrome’ (ACS) and (3) those with reported ‘Myocardial Infarction’ (MI). However, we could not retrieve the information needed to assign some of the case samples to the above subcategories; hence, the total number of CAD cases used for the anatomical and phenotypic severity categories differs from that of the pooled CAD cases.
The pooled sample of CAD case and control groups and each of the anatomical and phenotypic severity categories were subjected to pertinent statistical analyses. Data pruning and logistic regression analysis with and without covariates (age and sex) were done for the 61 SNPs using PLINK software version 1.07. After data pruning, only 45 of the 61 SNPs, which showed minor allele frequency > 1% and conformed to Hardy–Weinberg Equilibrium (HWE), qualified for further statistical analyses. The significance threshold for association was set at p = 0.05 for a single SNP and at the Bonferroni-corrected level for multiple testing (p = α/m, where α = 0.05 and m = number of hypotheses or SNPs). Inter group (anatomical and phenotypic) differences were also analysed using PLINK after appropriate categorization of data into these groups as mentioned above. The post-hoc power of the study was calculated using G*Power software (vs 184.108.40.206). The ‘SNPassoc’ package of R was used for genotypic association analyses by considering different genetic models: co-dominant, over-dominant, dominant, recessive and log-additive. The model with a significant p value and the lowest AIC (Akaike Information Criterion) was selected as the best fit for the respective SNP. Linkage disequilibrium and haplotype analyses were done using HAPLOVIEW (version 4.2). The cumulative risk scores were obtained using SPSS (version 25, IBM) software.
Pair-wise SNP-SNP interactions among the 61 SNPs of the present study, and between each of these SNPs and the SNPs of the 11q23.3 and 9p21.3 regions studied earlier by us, were analyzed using PLINK software version 1.07. The multiple SNP interaction analysis was done with a non-parametric approach using GMDR (version 0.9), where tenfold cross-validation with 2, 3, 4 and 5 way interactions was used to detect the gene–gene interactions. The significant interactions were selected based on the testing balance accuracy and minimal prediction error. The cumulative risk score for each individual was calculated based on the number of significant SNPs. To determine the predictive potential of the risk variants for CAD, logistic regression analysis of the risk categories was performed, the receiver operating characteristic (ROC) curve was constructed using SPSS (version 25), and the area under curve (AUC), which reflects the prognostic potential of the risk variants, was determined.
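The pruning and multiple-testing rules described above can be illustrated with a minimal Python sketch. It is not the PLINK implementation: it uses a one-degree-of-freedom chi-square test for HWE, and the function names, the example genotype counts and the HWE cut-off of 0.001 are assumptions made for illustration only (the text does not state the HWE threshold that was applied).

```python
import math

def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_b = (2 * n_bb + n_ab) / total_alleles
    return min(freq_b, 1 - freq_b)

def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P value of the 1-df chi-square goodness-of-fit test for HWE."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(counts, maf_min=0.01, hwe_p_min=0.001):
    """Keep a SNP if MAF > maf_min and it does not deviate from HWE.

    hwe_p_min is an assumed cut-off, not a value taken from the study.
    """
    return maf(*counts) > maf_min and hwe_chi2_p(*counts) >= hwe_p_min

def bonferroni_threshold(alpha, m):
    """Per-test significance level after Bonferroni correction."""
    return alpha / m

# With alpha = 0.05 and the 45 SNPs retained after pruning:
threshold = bonferroni_threshold(0.05, 45)  # about 0.0011
```

With m = 45, the corrected threshold is approximately 0.0011, which is in line with the p < 0.001 level reported for the four highly significant SNPs.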
Allelic association of the SNPs with CAD in the pooled cohort
The minor allele and genotype frequencies of the 61 SNPs for the CAD cases and controls are given in Additional file 3: Table S2. After data pruning using PLINK software, 16 of the 61 SNPs were excluded either because of minor allele frequency < 0.01 or deviation from Hardy–Weinberg equilibrium, and the remaining 45 SNPs were subjected to further analyses. The logistic regression analyses of the allelic data in the pooled cohort revealed that five of the 45 SNPs were significantly associated with CAD (p < 0.05), four of which remained highly significant even after Bonferroni correction for multiple testing (Table 1). The odds ratios of the associated SNPs indicated that the minor alleles of four SNPs (G-rs10455872, C-rs6725887, T-rs782590, T-rs173539) significantly increased the risk for CAD, with highly elevated frequencies among CAD cases. The fifth SNP (rs9818870) was marginally significant, did not remain significant after correction for multiple testing, and was protective in nature, with a higher minor allele frequency in controls. The odds ratios of the risk associated SNPs increased in the order rs782590 (SMEK1), rs173539 (HERPUD1-CETP), rs6725887 (WDR12) and rs10455872 (LPA), and these SNPs were also implicated in the increasing severity of CAD. However, the genes in which the associated SNPs are located appear to have diverse functions, suggesting a role of these SNP variants in contributing to the genetic heterogeneity of CAD. These genes encode serine/threonine protein phosphatase, homocysteine-inducible endoplasmic reticulum stress-inducible ubiquitin-like domain member 1 and cholesteryl ester transfer protein, WD repeat domain 12 (a ribosome biogenesis protein) and lipoprotein(a), respectively. The association of all five SNPs remained significant even after adjusting for age and sex.
In order to check the internal consistency of our results and to validate the association of the SNPs that were significant in the pooled cohort, we analysed 30%, 50% and 70% random subsets of our case and control cohorts. For the four highly significant risk conferring SNPs, the pattern of allelic association was quite similar to that of the total cohort even after correction for multiple testing, except for rs782590 in the 30% subset (Additional file 4: Table S3). In addition, the post-hoc power of the study was calculated using G*Power for the four significant risk conferring SNPs with respect to the observed odds, taking the log of the odds ratio as the effect size and the total sample size as 830. The SNPs rs6725887, rs782590 and rs173539, with significant odds ratios of 8.58, 1.51 and 2.86 (Table 1), yielded statistical power (1 − β error probability) of 100%, 99.9% and 100% respectively. The SNP rs10455872, with odds ratio 60.0, was out of range with respect to log of odds (1.778), since the range of effect size considered in the G*Power software is 0–0.999. The high odds observed for this SNP could be attributed to the highly elevated minor allele frequency among cases and deviation from HWE, which might suggest positive selection of the allele in the disease group. Hence, we used an online post-hoc power calculator (https://clincalc.com/stats/Power.aspx) to compute the power from the proportion of the minor allele and the sample sizes of cases and controls, which gave a power of 100%. These results suggested internal consistency and replicability of the association of the significant SNPs in our study.
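To make the allelic odds ratios concrete, the following Python sketch computes an unadjusted odds ratio for the minor allele from a 2 × 2 table of allele counts, with a Woolf (log-based) 95% confidence interval. For a single binary predictor this equals the odds ratio from an unadjusted logistic regression, but it does not reproduce the age- and sex-adjusted PLINK models. The function name and the allele counts are hypothetical and are not taken from Table 1.

```python
import math

def allelic_odds_ratio(case_minor, case_major, ctrl_minor, ctrl_major, z=1.96):
    """Odds ratio of the minor allele (cases vs controls) with a Woolf CI."""
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    # Standard error of ln(OR) from the four cell counts
    se_log_or = math.sqrt(
        1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 350 cases (700 alleles) and 480 controls (960 alleles)
or_value, ci_low, ci_high = allelic_odds_ratio(140, 560, 96, 864)
```

An odds ratio above 1 with a confidence interval excluding 1 indicates that the minor allele is more frequent among cases, which is the pattern described for the four risk conferring SNPs.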
Allelic association of the SNPs with the anatomical categories of CAD
The results of logistic regression analyses of the 45 SNPs suggested altogether nine SNPs to be significantly associated (p < 0.05) with at least one of the four anatomical categories. The minor allele frequencies of the associated SNPs along with the respective odds ratios are presented in Table 2. The four SNPs which were shown to increase the risk for CAD under allelic association analysis of the total sample were all found to be significantly associated with the SVD and DVD categories. Except for rs782590, the remaining three SNPs were significantly associated with increased risk for TVD. Two SNPs (rs10455872, rs6725887) remained significant in all categories after correction for multiple testing, whereas rs782590 and rs173539 were significant only in the DVD and SVD categories respectively. In addition to the association of these four SNPs with anatomical categories of CAD, the minor allele frequency of rs247617 in the CETP gene was found to be significantly elevated in the SVD (0.333) and TVD (0.356) categories of CAD compared to controls (0.259), with p values of 0.036 and 0.019 respectively. Two additional SNPs, rs2107595 (of TWIST1 gene) and rs3127599 (of LPAL2 gene), were shown to confer protection (Odds Ratio (OR): 0.57) and risk (OR: 1.46) respectively for SVD, while both the additional SNPs associated with TVD (rs9940128 of FTO gene and rs1083096 of MTNR1B gene) were shown to reduce the risk. The genes to which these additional SNPs belong have diverse functional roles, indicating possible genetic heterogeneity in the manifestation of different anatomical categories of CAD.
Allelic association of the SNPs with the phenotypic severity categories of CAD
The results of allelic association of SNPs with the three phenotypic severity categories of CAD suggested eight SNPs to be significantly (p < 0.05) associated with at least one of the three phenotypic categories (Table 3). The four SNPs that were significantly associated with CAD in the pooled sample and with the anatomical categories showed significant association with the phenotypic categories as well. While rs10455872 and rs782590 were significantly associated with all three phenotypic categories of CAD, the association of rs6725887 was observed in the angina and ACS categories, and that of rs173539 in the ACS and MI categories only. Additionally, a significant increase in risk for the angina, ACS and MI categories was observed with reference to rs174546, rs1122608 and rs4846922 respectively. The rs7767084 was the only SNP shown to reduce the risk for angina. Except for the four common SNPs which showed significant allelic association in the pooled cohort and in the anatomical and phenotypic categories, the additional SNPs that appeared in the phenotypic categories were completely different from those observed in the anatomical categories of CAD. These results might help in identifying the genetic determinants of the clinical heterogeneity of CAD. The additional risk conferring SNPs of the angina, ACS and MI categories are located in the FADS1, SMARCA4 and GALNT2 genes respectively, which encode proteins with diverse functional roles such as lipid metabolism, chromatin remodelling and oligosaccharide biosynthesis, suggesting that the unique set of SNPs associated with the phenotypic severity categories of CAD belongs to diverse functional pathways. However, a specifically designed study with relatively larger samples for the subcategories of CAD might be required in order to confirm these findings.
Genotypic association of the SNPs with CAD
The results of logistic regression analysis of the genotypes of the five significant SNPs (Table 4) suggested that the SNP rs10455872, which had a highly significant allelic association, showed a 48.5-fold (95% CI 22.3–105) increased risk for CAD in the presence of the heterozygote AG (p < 0.0001) under the co-dominant model. This result might suggest a significant functional role of the SNP in contributing to CAD pathology. Additionally, both the heterozygous and homozygous variant genotypes of rs6725887 and rs782590 showed significant risk for CAD under the log additive model with highly significant p values. However, the other SNP, rs173539, which showed significant allelic association, failed to show genotypic association with CAD. The protective role conferred by the SNP rs9818870 at the allelic association level did not turn out to be significant at the genotypic level. The lack of association of these SNPs at the genotype level could be because of the extremely low frequency of the variant allele (T) among both cases and controls.
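The genetic models compared in the genotypic analysis differ only in how the three genotypes are coded as predictors. The Python sketch below shows one conventional coding by minor-allele count, together with the AIC formula used to rank fitted models. It is an illustration under the usual textbook definitions, not the SNPassoc code; the function names are hypothetical and the log-likelihoods would have to come from a separately fitted logistic regression.

```python
def encode_genotype(n_minor, model):
    """Map a minor-allele count (0, 1 or 2) to the predictor of a genetic model."""
    if n_minor not in (0, 1, 2):
        raise ValueError("n_minor must be 0, 1 or 2")
    if model == "log-additive":
        return n_minor                      # one unit of risk per minor allele
    if model == "dominant":
        return int(n_minor >= 1)            # carriers vs non-carriers
    if model == "recessive":
        return int(n_minor == 2)            # minor homozygotes vs the rest
    if model == "over-dominant":
        return int(n_minor == 1)            # heterozygotes vs both homozygotes
    if model == "co-dominant":
        # Two indicator variables: heterozygote and minor homozygote
        return (int(n_minor == 1), int(n_minor == 2))
    raise ValueError(f"unknown model: {model}")

def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a better fit."""
    return 2 * n_params - 2 * log_likelihood
```

Under the co-dominant coding the heterozygote has its own coefficient, which is why a separate odds ratio can be reported for the AG genotype of rs10455872.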
Linkage disequilibrium (LD) and SNP-SNP interactions among the SNPs
The GWAS SNPs selected for analysis in the present study were traced to the sub chromosomal loci of 17 different chromosomes, which indicated the presence of more than one SNP on some chromosomes. The LD analysis identified 15 SNPs in 105 pair wise combinations (Additional file 1: Fig. S1, Figure legends). Overall, a disrupted LD pattern was seen, with only 5 SNP pairs showing r² > 0.8, and none of the haplotypes were found to be significantly associated with CAD. Furthermore, neither the pair wise nor the multiple SNP interaction analysis (generalized multifactor dimensionality reduction, GMDR) of these SNPs yielded any significant epistatic interactions associated with CAD among them.
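The pair count and the LD measure mentioned above can be checked with a short Python sketch. It assumes that haplotype and allele frequencies are already known; HAPLOVIEW estimates them from unphased genotypes, a step not reproduced here. The function name and the example frequencies are hypothetical.

```python
from math import comb

def ld_r2(p_ab, p_a, p_b):
    """Squared correlation r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# 15 SNPs give 15 choose 2 pair wise combinations
n_pairs = comb(15, 2)

# Hypothetical example: the two alleles always travel together
r2_complete = ld_r2(0.5, 0.5, 0.5)
```

Here `n_pairs` evaluates to 105, matching the number of pair wise combinations reported, and the hypothetical example of complete LD gives an r² of 1.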
Cumulative risk score analysis for CAD associated SNP variants
The cumulative risk score analysis involved computation of the weighted mean proportion of the risk alleles of the five significant SNPs, scoring 2 for two risk alleles, 1 for one risk allele and 0 for no risk alleles, with the relative log odds ratios of the different SNPs as weights. The cumulative risk score was obtained for each individual by multiplying this value by 5, the number of significantly associated SNPs. The individuals were grouped into 4 risk categories with increasing risk scores. Odds ratios and Z-scores were calculated by taking risk category 1 as the reference, and the results are presented in Table 5. The frequency distribution of CAD cases and controls in the different risk categories showed an increasing trend in the frequency of cases relative to controls, specifically for the high-risk score categories 3 and 4 with risk scores in the range of 1.10–2.09 and 2.10–6.09 respectively. Furthermore, an increasing trend in the odds ratios and Z-scores was also apparent with increasing risk categories. The ROC curve (Fig. 1) yielded an area under curve (AUC) of 0.749 (95% CI 0.713–0.785) at p < 0.0001, which was statistically highly significant, indicating possible predictive utility of the associated SNPs in CAD pathology.
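The scoring and the ROC summary can be sketched in Python as follows. The exact weighting scheme used in SPSS is not fully specified in the text, so `weighted_risk_score` is only one possible reading (a weighted mean of risk-allele counts with positive weights, scaled by the number of SNPs) and should be treated as an assumption. The `auc` function uses the standard rank-based definition: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. All names and example values are hypothetical.

```python
def weighted_risk_score(risk_allele_counts, weights, n_snps=None):
    """Weighted mean of risk-allele counts (0, 1, 2) scaled by the SNP count.

    Assumed interpretation; weights are taken to be positive, for example
    log odds ratios of risk-increasing alleles.
    """
    if len(risk_allele_counts) != len(weights):
        raise ValueError("one weight is needed per SNP")
    if n_snps is None:
        n_snps = len(weights)
    weighted_sum = sum(w * g for w, g in zip(weights, risk_allele_counts))
    return n_snps * weighted_sum / sum(weights)

def auc(case_scores, control_scores):
    """Area under the ROC curve by pair wise comparison of scores."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical individual with three SNPs and illustrative weights
score = weighted_risk_score([2, 1, 0], [1.0, 1.0, 2.0])
```

An AUC of 0.5 corresponds to no discrimination between cases and controls and 1.0 to perfect separation, so the reported value of 0.749 lies between these two extremes.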
Epistatic interaction of the 61 SNPs with the SNPs of 11q23.3 and 9p21.3 regions
The process of atherosclerosis results from disruption of the lipid metabolism and cell proliferation pathways, as was evident from the significant association of a few SNP variants at the 11q23.3 and 9p21.3 loci respectively. While there were no significant pair wise interactions among the SNPs of the present study, it was of interest to test whether the interaction of any of these SNPs with those in the 11q23.3 and/or 9p21.3 regions plays a significant role in the manifestation of CAD. The pair wise interaction analysis suggested that rs7582720 of the WDR12 gene on chromosome 2, albeit not individually associated with CAD, conferred significant risk through epistatic interaction with two SNPs (rs6589566, rs1263163) of the 11q23.3 region, which are located in the ZPR1 gene and the intergenic region of the APOA5-APOA4 genes respectively (Table 6).
The case–control studies conducted so far among the Indian populations could help in identifying the CAD associated candidate genes which belonged to the most replicated GWAS loci such as 9p21.3 and 11q23.3 [14,15,16]. However, given the complex nature of the disease and the large number of candidate genes associated with CAD worldwide, these studies were not adequate to characterize the genetic susceptibility profile of either the local or regional populations. Further, the Indian studies could validate only a few conventional polymorphisms of the genes located in these two chromosomal regions. Given the distinct nature of association of the SNP variants of 11q23.3 and 9p21.3 regions with CAD in our earlier studies of the same cohort involving relatively large number of SNPs [7,8,9], the association of variants from the other genomic loci with functional significance to CAD were evaluated in the present study and observed unique pattern of association of the SNPs in this Southern Indian population of Hyderabad as compared to the other populations from India and elsewhere [1, 14].
We found significant association of four of the 45 SNPs in elevating risk for CAD, of which rs10455872 (A > G) was highly significant with a much higher odds ratio (Table 1). This SNP was also consistently associated with high risk in different phenotype specific cohorts based on anatomical and clinical categories, including TVD, ACS and MI. Significant association of this SNP was also validated before in other studies, wherein it was shown to be associated with acute myocardial infarction, coronary lesions in Brazilian patients submitted to coronary angiography and calcific aortic valve disease in Bulgaria. Lipoprotein(a) [Lp(a)] is a low-density lipoprotein (LDL) bound to apo(a), a plasma apolipoprotein [20, 21], and was shown to be involved in lipoprotein metabolism. Lp(a) was considered a candidate gene for CAD, and a high concentration of this protein is a documented risk factor [22, 23]. Even though Indians are known to have a unique pattern of dyslipidemia, usually characterized by low levels of LDL cholesterol with predominantly atherogenic and small-dense LDLs [24, 25], a study from the north Indian population did not show association of this SNP with CAD. However, a recent study revealed that the variant G allele of the SNP rs10455872 was associated with increased Lp(a) protein levels and aortic valve calcification. This SNP is located in intron 25 of the lipoprotein Lp(a) gene on chromosome 6 (Additional file 2: Table S1). Since the SNP was found to be significantly associated with CAD in our study, we tried to understand the effect of the variant allele (G) on the protein structure or function through in silico analysis. Alternative Splice Site Predictor (ASSP) analysis indicated that the presence of the G allele might result in the formation of a cryptic donor splice site and/or an alternative isoform (http://wangcomputing.com/assp/). Hence, it could be suggested that the presence of the variant allele G of rs10455872 in the intronic region of the Lp(a) gene might affect splicing and the rate of gene expression, resulting in defective lipoprotein metabolism and subsequent development of CAD.
The second highly significant SNP associated with CAD was rs6725887, with an odds ratio of 8.58. This SNP is located in the intronic region of the WDR12 (WD repeat domain 12) gene on chromosome 2, which encodes a ribosome biogenesis protein. Other studies did not find association of this SNP with CAD risk factors such as hypertension, lipid traits and atherosclerosis [28, 29]. Further, meta-analysis of this SNP did not show significant association with CAD in the sub-Asian populations. In contrast, rs9818870, located in the MRAS (muscle RAS oncogene homolog) gene on chromosome 3 (3q22.3), which appeared to decrease the risk for CAD in the present study, was found to be the most replicated SNP identified in GWAS susceptible CAD loci. Although there were no other validating studies to show the association of the SNPs rs782590 and rs173539 with CAD, the present study found significant association of these SNPs with increased risk for CAD. Even though the functional role of these SNPs has not yet been characterized, it could be suggested that the proteins coded by the genes in which these SNPs are located might have a significant effect on the process of development of CAD.
The four significant CAD associated risk SNPs in the pooled cohort are located in different genes whose protein products have a primary role in lipid metabolism (LP(a)) and in subsequent events such as cell division and proliferation (WDR12, SMEK1, HERPUD1-CETP) in atherosclerosis and the development of CAD. The distinct patterns of association of the SNPs with anatomical and phenotypic categories of CAD suggested a possible role of various SNPs in contributing to the genetic and etiologic heterogeneity of CAD, although larger sample sizes for the subcategories of CAD would have provided sufficient statistical power and a greater degree of confidence in the inference. Furthermore, the pair wise SNP associations observed between one of the SNPs of the present study (rs7582720 of the WDR12 gene on chromosome 2) and three SNPs of the 11q23.3 region (belonging to the ZPR1, APOA5-APOA4 and BUD13 genes) also suggested unique epistatic interactions of SNPs in CAD pathology. The variants of the 11q23.3 chromosomal region which showed epistatic interactions in the present study are located in apolipoprotein encoding or regulating genes and hence might be involved in defective cholesterol homeostasis, thereby resulting in increased levels of oxidised low-density lipoproteins (LDLs). Subsequently, the WDR12 gene on chromosome 2 harbouring the SNP rs7582720, which encodes a ribosome biogenesis protein involved in cell proliferation, might play a role in the later events of atherosclerosis. Therefore, the complex nature of the CAD phenotype might be the outcome of interactions between different genomic loci. However, in vitro studies on the expression levels of these genes might help in validating this hypothesis.
Although candidate genes and GWAS loci for CAD were found to be replicated in the present study and in few other population-based studies, the SNP variants showing association were found to be different for different populations [2, 31] suggesting the population specific association patterns of GWAS or candidate gene variants with CAD development and/or progression. Interestingly, the current study which attempted to comprehensively explore the pattern of association of variants by considering large number of SNPs across the genomic regions also yielded a different set of SNPs in the population of Hyderabad, India which was evident by the association of rs10455872 of LP(A) gene that showed profound risk for CAD in this regional Indian population. Perhaps, this can be a new and significant observation of our study. Earlier studies on the Indian population did not find the association of this SNP with CAD and although the GWAS for CAD identified LP(A) as one of the candidate genes, it revealed the association of a different SNP (rs3798220) of this gene with CAD. The other four SNPs associated with CAD were also unique for the current study population which were either not studied or not shown to be associated before in the Indian population. Overall, the present study observed that the variants associated with CAD in the population of Hyderabad are unique and with high discriminative power suggesting the possibility of their utility as predictors of the risk for CAD.
The present study could help in identifying unique set of SNPs significantly associated with CAD specifically in the Southern Indian population of Hyderabad. The highly significant odds ratios and their magnitudes observed in the present study suggested that the SNPs rs10455872 and rs6725887 might have independent role, but not in additive fashion, as disease causing variants in CAD which demands further studies and analyses of these SNPs in different populations and by excluding the confounding nature of the effects of associated parameters such as lipid profiles to confirm its independent effect. The distinct pattern of association of different SNPs observed with respect to anatomic and phenotypic categories of CAD also suggested genetic and etiologic heterogeneity of CAD and warrants that CAD patients may be screened for these SNPs in order to explore the genetic susceptibility profile behind the clinical heterogeneity of CAD. However, given the relatively smaller sample sizes for the sub-phenotypes, the present study can be considered exploratory to establish a population wide SNP association pattern for CAD by using specifically designed large scale studies among different populations of India, both local and regional.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Abbreviations
GWAS: Genome wide association studies
SNPs: Single nucleotide polymorphisms
CAD: Coronary artery disease
NHGRI: National Human Genome Research Institute
ROC: Receiver operating characteristics
AUC: Area under curve
SVD: Single vessel disease
DVD: Double vessel disease
TVD: Triple vessel disease
ACS: Acute coronary syndrome
HWE: Hardy Weinberg equilibrium
MAF: Minor allele frequency
GMDR: Generalized multifactor dimensionality reduction
SMEK1: Serine/threonine protein phosphatase/suppressor of mek1
HERPUD1: Homocysteine-inducible, endoplasmic reticulum stress-inducible, ubiquitin-like domain member 1
CETP: Cholesteryl ester transfer protein
WDR12: WD repeat domain 12
ZPR1: Zinc finger protein 1
APOA5-APOA4: Apolipoprotein A5-apolipoprotein A4
BUD13: Bud homolog 13
MRAS: Muscle RAS oncogene homolog
LDL: Low density lipoprotein
Maouche S, Schunkert H. Strategies beyond genome-wide association studies for atherosclerosis. Arterioscler Thromb Vasc Biol. 2012;32:170–81.
Pranavchand R, Reddy BM. Current status of understanding of the genetic etiology of coronary heart disease. J Postgrad Med. 2013;59:1.
van der Harst P, Verweij N. The identification of 64 novel genetic loci provides an expanded view on the genetic architecture of coronary artery disease. Circ Res. 2017;122:433–43.
Braun TR, Been LF, Singhal A, Worsham J, Ralhan S, Wander GS. A replication study of GWAS-derived lipid genes in Asian Indians: the chromosomal region 11q23.3 harbors loci contributing to triglycerides. PLoS ONE. 2012;7(5):e37056.
Willer CJ, Schmidt EM, Sengupta S, Peloso GM, Gustafsson S, Kanoni S. Discovery and refinement of loci associated with lipid levels. Nat Genet. 2013;45:1–24.
Jeemon P, Pettigrew K, Sainsbury C, Prabhakaran D, Padmanabhan S. Implications of discoveries from genome-wide association studies in current cardiovascular practice. World J Cardiol. 2011;3:230–47.
Pranav Chand R, Kumar AS, Anuj K, Vishnupriya S, Mohan Reddy B. Distinct patterns of association of variants at 11q23.3 chromosomal region with coronary artery disease and dyslipidemia in the population of Andhra Pradesh, India. PLoS ONE. 2016;11(6):e0153720.
Rayabarapu P, Arramraju SK, Battini MR. Genetic determinants of clinical heterogeneity of the coronary artery disease in the population of Hyderabad, India. Hum Genomics. 2017;11:3.
Gorre M, Rayabarapu P, Irgam K, Reddy BS, Reddy BM. The SNP rs7865618 of 9p21.3 chromosomal region emerges as the most promising marker of the pathogenic process of coronary artery disease in the Southern Indian population. Sci Rep. 2020. https://doi.org/10.1038/s41598-020-77080-4.
Reddy BM, Naidu VM, Madhavi VK, Thangaraj LK, Kumar V, Langstieh BT. Microsatellite diversity in Andhra Pradesh, India: genetic stratification versus social stratification. Hum Biol. 2005;77:803–23.
Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155:945–59.
Falush D, Stephens M, Pritchard JK. Inferences of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics. 2003;164:1567–87.
Sambrook J, Fritsch EF, Maniatis T. Molecular Cloning: A Laboratory Manual. New York: Cold Spring Harbor Laboratory Press; 1989.
Kumar J, Yumnam S, Basu T, Ghosh A, Garg G, Karthikeyan G. Association of polymorphisms in 9p21 region with CAD in North Indian population: Replication of SNPs identified through GWAS. Clin Genet. 2011;79:588–93.
Ashok KM, Emmanuel C, Dhandapany PS, Rani DS, SaiBabu R, Cherian KM. Haplotypes on 9p21 modify the risk for coronary artery disease among Indians. DNA Cell Biol. 2011;30:105–10.
Pranavchand R, Reddy BM. Genomics era and complex disorders: Implications of GWAS with special reference to coronary artery disease, type 2 diabetes mellitus, and cancers. J Postgrad Med. 2016;62:188–98.
Werner K, Jakob CM, Matthias S, Hannah W, Johannes K, Albert S. Two rare variants explain association with acute myocardial infarction in an extended genomic region including the apolipoprotein(A) gene. Ann Hum Genet. 2012. https://doi.org/10.1111/j.1469-1809.2012.00739.x.
Paulo CJLS, Carolina TB, Pedro AL, José EK, Alexandre CP. LPA rs10455872 polymorphism is associated with coronary lesions in Brazilian patients submitted to coronary angiography. Lipids Health Dis. 2014;13:74.
Tomova V, Alexandrova M, Atanasova M, Rashev T, Tzekova M. Polymorphism rs10455872 at the lipoprotein(A) gene locus enhances the risk of aortic valve disease. J Cardiol Cardiovasc Ther. 2018;9(4):555766.
McLean JW, Tomlinson JE, Kuang WJ, Eaton DL, Chen EY, Fless GM. cDNA sequence of human apolipoprotein(a) is homologous to plasminogen. Nature. 1987;330:132–7.
Guevara J, Knapp RD, Honda S, Northup SR, Morrisett JD. A structural assessment of the apo(a) protein of human lipoprotein(a). Proteins. 1992;12:188–99.
Danesh J, Collins R, Peto R. Lipoprotein(a) and coronary heart disease. Meta-analysis of prospective studies. Circulation. 2000;102:1082–5.
Nordestgaard BG, Chapman MJ, Ray K, Borén J, Andreotti F, Watts GF. Lipoprotein(a) as a cardiovascular risk factor: current status. Eur Heart J. 2010;31(23):2844–53.
Sekhri T, Kanwar RS, Wilfred R, Chugh P, Chhillar M, Aggarwal R. Prevalence of risk factors for coronary artery disease in an urban Indian population. BMJ Open. 2014;4:e005346–e005346.
Joshi SR, Anjana RM, Deepa M, Pradeepa R, Bhansali A, Dhandania VK. Prevalence of dyslipidemia in urban and rural India: The ICMR–INDIAB Study. PLoS ONE. 2014;9(5):e96808.
Kashyap S, Kumar S, Agarwal V, Misra DP, Rai MK, Kapoor A. The association of polymorphic variants, rs2267788, rs1333049 and rs2383207 with coronary artery disease, its severity and presentation in North Indian population. Gene. 2018;30(648):89–96.
Guillermo CS, Jose MF, Shamar LF, Margarita TT, Carlos PR, Gilberto VA. The rs10455872-G allele of the LPA gene is associated with high lipoprotein(a) levels and increased aortic valve calcium in a Mexican adult population. Genet Mol Biol. 2019;42(3):519–25.
Robert R, Alexandre FRS. Genes and coronary artery disease. Where are we? J Am Coll Cardiol. 2012;60(18).
Michel Z, Isaac S, Carla LG, Sergi SB, de Eric G, Roman A. Association between coronary artery disease genetic variants and subclinical atherosclerosis: An association study and meta-analysis. Rev Esp Cardiol. 2015;68(10):869–77.
Yi H, Rajkumar D, Xuling C, Ling W, Chiea-Chuen K, Xueling S. Genome-wide association study identifies a missense variant at APOA5 for coronary artery disease in Multi-Ethnic Cohorts from Southeast Asia. Sci Rep. 2017;7:17921.
Pranav Chand R, Reddy BM. Genomics era and the complex disorders: Implications of GWAS with special reference to coronary heart disease, type 2 diabetes mellitus and cancers. J Postgrad Med. 2017;62:188–98.
The present study is a part of the major project on CAD entitled “Identification of Susceptible Genetic Polymorphisms Associated with Coronary Artery Disease in the Southern Indian Population of Hyderabad” carried out by the corresponding author at the Indian Statistical Institute (ISI), Hyderabad during 2011-2016. BMR is thankful to the Director General, ICMR, for awarding him the Emeritus Medical Scientist (EMS) position during which this manuscript was generated, and for granting SRF position to Mrs. I. Kumuda, to work in the EMS project and subsequently to work for her Ph.D. under his supervision; to the Director(s) of the Indian Statistical Institute (ISI), Kolkata, for financial and logistics support at different stages of this project of which the present work is an extension, and Head, Department of Genetics and administration of Osmania University for logistic support during the tenure of Emeritus Scientist position.
This study was funded by the Indian Statistical Institute up to the genotyping stage, which was completed while BMR was Professor at the Indian Statistical Institute, Hyderabad.
Ethics approval and consent to participate
The study protocol was approved by the ‘Indian Statistical Institute Review Committee’ and the study was performed in accordance with the relevant guidelines and regulations for ‘Protection of Research Risks to Humans’ (Declaration of Helsinki). Written informed consent was obtained from all the participants as per the guidelines.
Consent for publication
The authors declare that they have no competing interests.
Battini Mohan Reddy: Sample collection, DNA isolation and genotyping of the SNPs for this study accomplished when BMR was working as Professor of Indian Statistical Institute, and PR was an SRF associated with BMR as a Ph.D student. BMC Cardiovascular Disorders, 2021.
Linkage disequilibrium plot of GWAS SNPs. In the LD plot, each square/block displays the magnitude of LD, in terms of the D’ value, for a pair of markers. The strength of LD between markers is indicated by the colour intensity of the box. LD is shown on a scale of 0–100, corresponding to D’ values of 0–1. A D’ value < 0.30 indicates low LD, D’ of 0.50–0.70 indicates moderate LD, and D’ > 0.70 indicates high LD between the markers. Red boxes indicate high LD between markers, and the boxes in a block represent haplotype combinations.
Chromosomal, gene locations and functions of the 61 SNPs selected for the present study in Coronary Artery Disease
Minor allele and Genotype frequencies of 61 SNPs in CAD cases and controls
Subset analysis (30%, 50%, 70%) of CAD cases and controls to check for the internal consistency and replicability of significant SNPs.
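The D’ measure used in the linkage disequilibrium plot described above can be illustrated with a short Python sketch. Assuming phased haplotype frequencies are available for two biallelic markers, D’ is the disequilibrium coefficient D scaled by its maximum attainable value given the allele frequencies. The function name `d_prime` and the frequencies in the example are hypothetical; this is not the software used to produce the plot.

```python
def d_prime(p_a: float, p_b: float, p_ab: float) -> float:
    """Return |D'| for two biallelic markers.

    p_a  : frequency of allele A at the first marker
    p_b  : frequency of allele B at the second marker
    p_ab : frequency of the A-B haplotype
    """
    for name, value in (("p_a", p_a), ("p_b", p_b), ("p_ab", p_ab)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")

    # Raw disequilibrium: observed haplotype frequency minus the
    # frequency expected if the two markers were independent.
    d = p_ab - p_a * p_b

    # Largest |D| attainable for these allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    if d_max == 0:
        raise ValueError("D' is undefined when a marker is monomorphic")
    return abs(d) / d_max


# Hypothetical frequencies, for illustration only (not study data).
value = d_prime(p_a=0.4, p_b=0.3, p_ab=0.2)
print(f"D' = {value:.2f} (plotted as {100 * value:.0f} on a 0-100 scale)")
```

In practice, haplotype frequencies are usually estimated from unphased genotypes before D’ is computed; that estimation step is outside the scope of this sketch.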
About this article
Cite this article
Gorre, M., Rayabarapu, P., Battini, S.R. et al. Analysis of 61 SNPs from the CAD specific genomic loci reveals unique set of SNPs as significant markers in the Southern Indian population of Hyderabad. BMC Cardiovasc Disord 22, 148 (2022). https://doi.org/10.1186/s12872-022-02562-4