The effect of ALDH2 rs671 gene mutation on clustering of cardiovascular risk factors in a big data study of Chinese population: associations differ between the sexes

Background The ALDH2 rs671 genetic polymorphism has been linked with cardiovascular diseases (CVDs), but comprehensive epidemiological studies are lacking. An observational, retrospective big data study was carried out to evaluate the associations between this polymorphism and clustering cardiovascular risk factors (CRFs) in a Chinese population. Methods A total of 13,101 individuals (8431 males and 4670 females) were enrolled. Genetic polymorphism was assessed using gene mutation detection kits, coupled with an automatic fluorescent analyzer. Other data were obtained from the records of the Department of Health Care at Peking Union Medical College Hospital. Results Comparing common clinical measurements, including BMI, SBP, DBP, ALT, AST, γ-GT, TBil, Cr, Glu, TC, TG, and HDL-C, among individuals with the GG, GA, and AA genotypes of ALDH2 rs671, we found significant differences in males (all p < 0.001), but not in females. For males, the frequencies of hypertension, diabetes, and obesity were significantly higher for GG than for GA or AA (all p < 0.05). However, there was no significant difference for dyslipidemia, and no significant associations were observed for any of these frequencies in females. The prevalence of individuals with 1–4 CRFs was significantly higher among GG males than among those carrying GA or AA, and fewer GG males had no CRFs (all p < 0.05). Conclusion Polymorphisms of ALDH2 rs671 are associated with clustering CRFs, especially hypertension and diabetes, in males but not in females. These associations are likely mediated by alcohol intake, which is also associated with this gene.

clustering CRFs is greater than the effect of single CRFs on the same individual [2].
Alcohol is one of the most widely used recreational substances worldwide, and its intake is a leading risk factor for global disease burden, including CVDs [4][5][6][7]. Despite general recognition that alcohol intake has a negative effect on health, it has been estimated that the average ethanol consumption of a person aged more than 15 years is approximately 19.7 mL per day [8]. Other data suggest that global adult per-capita consumption will increase from 6.5 L (95% CI: 6.0-6.9 L) in 2017 to 7.6 L (95% CI: 6.5-10.2 L) by 2030 [9].
As an essential bioactivating enzyme, ALDH2 can degrade acetaldehyde to nontoxic acetic acid. It is encoded by the ALDH2 gene, which is commonly polymorphic in East Asian populations [5]. It has been reported that as many as 30-50% of East Asians carry an inactive form of ALDH2 (rs671) resulting from a single G-to-A transition that replaces glutamate with lysine at position 504 and drastically reduces the carrier's capacity to metabolize alcohol [10][11][12]. The frequency of the A allele was reported to be 0.21 in China [13].
ALDH2 activation has also been found to be associated with improved mitochondrial function and the remodeling of ventricular function [14,15], and many studies have reported an association between ALDH2 and CVDs [1,2,5,13,15,16]. The most important known feature of the myocardial cardio-protective role of ALDH2 is the clearance of toxic aldehydes such as 4-hydroxynonenal and its adducts, which can be induced by acute oxidative stress upon cardiac ischemia or reperfusion [17][18][19]. Activation of ALDH2 may slow down the progression of atherosclerosis via attenuation of endoplasmic reticulum stress and apoptosis in smooth muscle cells [16].
Genetic association studies have recently shown that the ALDH2 rs671 polymorphism is a significant risk factor for hypertension, diabetes, and coronary heart disease in Asian people [20,21]. Although a number of studies and analyses have focused on the association between ALDH2 and single CRFs such as hypertension, diabetes, obesity, and dyslipidemia [20,21], the association has not been clearly defined. Thus, detailed studies focused on the association between ALDH2 and clustering CRFs are needed. Meanwhile, interest in routine annual physical examinations is increasing in China, which has resulted in more data on the health status of the population. Using data from hospital and laboratory information systems is both cost-effective and efficient.
Therefore, this retrospective study, which is based on clinical big data, aimed to (1) evaluate the distribution of ALDH2 rs671 genotypes, (2) evaluate the prevalence of single and clustering CRFs in China, and (3) explore the association between ALDH2 rs671 genotypes and CRFs.

Data collection
The study included 13,101 patients aged ≥19 years. Data including demographic information, common biochemical analytes, and medical history from November 2013 to October 2018 were obtained from the hospital information system (HIS) and laboratory information system (LIS) of the Department of Health Care at Peking Union Medical College Hospital (PUMCH). Duplicate measurements were identified by a unique identification code, and only the first record for each person was retained.

Laboratory measurement
Genomic DNA was extracted from whole peripheral blood using DNA extraction kits (Tianlong Technology Co. LTD, Xi'an, China), and rs671 polymorphism status was determined by an ALDH2 gene mutation detection kit, coupled with an automatic fluorescent analyzer (Beijing market gene technology Co. LTD, Beijing, China). Height, weight, and blood pressure were measured by well-trained nurses and doctors, and body mass index (BMI) was calculated as weight (kg) divided by height (m) squared. Common biochemical analytes including albumin (Alb), alanine aminotransferase (ALT), aspartate aminotransferase (AST), γ-glutamyl transpeptidase (γ-GT), total bilirubin (TBil), creatinine (Cr), glucose (Glu), total cholesterol (TC), triglyceride (TG), high density lipoprotein cholesterol (HDL-C), and low density lipoprotein cholesterol (LDL-C) were measured by a Roche C8000 automatic analyzer (Roche C8000, Basel, Switzerland) with corresponding reagents, calibrators, and quality control materials. All records including quality control and external quality assessment during this period were reviewed and deemed sound.

Statistical analysis
Excel 2010 (Microsoft Inc., USA), SPSS 20.0 software (SPSS Inc., Chicago, IL, USA), and GraphPad Prism for Windows (GraphPad Software, San Diego, CA) were used for our statistical analyses. The Mann-Whitney U or Kruskal-Wallis tests were used to compare measurements among groups, and prevalence was compared using the Chi-square test. Multivariate logistic regression analysis was used to correct for covariates and calculate the odds ratios (ORs), with 95% confidence intervals (CIs), of genotype associations with CRFs. The results were considered statistically significant when the two-sided p-value was < 0.05.
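As an illustration of the prevalence comparison described above, the following is a minimal sketch of a Pearson Chi-square test on a 2 × 3 table (CRF present/absent by GG, GA, and AA genotype). It is not the SPSS procedure used in the study; the function name `chi_square_2x3` and the counts are hypothetical and chosen only for demonstration. With two degrees of freedom, the Chi-square survival function reduces to exp(-x/2), so no statistics library is assumed.

```python
import math


def chi_square_2x3(table):
    """Pearson chi-square test of independence for a 2x3 table (df = 2).

    `table` is a list of two rows, each holding three observed counts.
    Returns the test statistic and its p-value.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    # For exactly 2 degrees of freedom, P(X > x) = exp(-x / 2).
    p_value = math.exp(-statistic / 2)
    return statistic, p_value


# Hypothetical counts (NOT taken from the study's tables).
hypothetical_counts = [
    [300, 100, 10],  # with the risk factor: GG, GA, AA
    [400, 200, 30],  # without the risk factor: GG, GA, AA
]

statistic, p_value = chi_square_2x3(hypothetical_counts)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

The closed-form p-value holds only for a 2 × 3 (or 3 × 2) table; other table shapes would need a general Chi-square distribution function from a statistics package.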

Basic characteristics of the studied population
The baseline demographic and clinical characteristics of the studied individuals, divided by ALDH2 polymorphism and sex, are shown in Table 1. In total, 13,101 individuals (8431 males and 4670 females) were included. The mean age was 49 ± 9 years, and the mean BMI was 24.8 ± 3.8 kg/m². There was no difference in age by ALDH2 polymorphism in either males or females. However, common clinical measurements including BMI, SBP, DBP, ALT, AST, γ-GT, TBil, Cr, Glu, TC, TG, and HDL-C were significantly different in males (all p < 0.001), though not in females.

ALDH2 rs671 genotype frequency by sex and age
The distribution of ALDH2 rs671 gene polymorphism among different years (from 2013 to 2018) did not show significant differences (p = 0.946). As Fig. 1 and Supplemental Table 1 show, the frequencies of the ALDH2 rs671 genotypes GG, GA, and AA in the total population were 67.9, 29.4, and 2.7%, respectively. These frequencies did not differ significantly by sex. Although there was no significant difference in the overall age distribution of the different genotypes in either males or females, the frequency of AA in individuals aged ≥65 years was lower than in other age groups in both males and females, with the opposite distribution in evidence for GG. Also, the frequency of GA in those aged between 19 and 29 years was higher than in other age groups, and the frequency of GG was significantly lower (Supplemental Table 1).

Fig. 1 The frequency of ALDH2 rs671 genotype by sex
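The allele frequencies implied by the genotype frequencies reported above can be worked out by simple allele counting: each AA individual carries two A alleles and each GA individual carries one. The short sketch below does this arithmetic; the function name `allele_frequencies` is an illustrative choice, not part of the study's analysis.

```python
def allele_frequencies(gg, ga, aa):
    """Return (G allele frequency, A allele frequency).

    Inputs may be genotype counts or percentages; only their ratios matter.
    """
    total = gg + ga + aa
    # AA contributes two A alleles per person, GA contributes one,
    # which is equivalent to freq(AA) + freq(GA) / 2.
    a_freq = (aa + ga / 2) / total
    return 1 - a_freq, a_freq


# Genotype percentages reported for the total population (GG, GA, AA).
g_freq, a_freq = allele_frequencies(67.9, 29.4, 2.7)
print(f"G allele: {g_freq:.3f}, A allele: {a_freq:.3f}")
```

With the reported percentages, the A allele frequency works out to 2.7% + 29.4%/2, or roughly 17.4%.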

Prevalence of CRFs by rs671 genotype
The frequencies of CRFs associated with different rs671 genotypes by sex are shown in Table 2. For males, the frequencies of hypertension, diabetes, and obesity were significantly higher for GG than for GA or AA. However, there was no significant difference in the prevalence of dyslipidemia among the three rs671 genotypes. For females, there was no statistically significant difference in the prevalence of hypertension, diabetes, obesity, or dyslipidemia among the rs671 genotypes (all p > 0.05).

Prevalence of clustering CRFs by rs671 polymorphism
Non-CRF individuals were defined as those who did not have hypertension, obesity, diabetes, or dyslipidemia. The frequencies of non-CRF individuals were 15.3, 40.9, and 24.4% in males, females, and the total population, respectively. The respective frequencies of individuals with one, two, three, and four CRFs were 36.9, 32.4, 12.8, and 2.5% in males, and 40.0, 14.7, 3.9, and 0.6% in females. The major cluster of CRFs comprised hypertension, diabetes, obesity, and dyslipidemia. The frequencies of clustering CRFs by rs671 genotype and sex are shown in Table 3. The sex-stratified frequencies of clustered CRFs among the rs671 genotypes were significantly different in males, but not in females. In males, the frequencies of individuals with two, three, and four CRFs were significantly higher in those with GG than in those with GA or AA, while among males with no CRFs, the frequency of GG was significantly lower than that of GA or AA. However, there was no significant association between the frequencies of clustering CRFs and ALDH2 genotype in females.
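To make the clustering definition concrete, the sketch below counts how many of the four CRFs each individual has and tabulates the percentage of individuals with zero to four CRFs. The record layout (a dictionary of boolean flags) and the function names are assumptions made for illustration; the example records are hypothetical and do not come from the study data.

```python
from collections import Counter

# The four cardiovascular risk factors considered in the clustering analysis.
CRFS = ("hypertension", "diabetes", "obesity", "dyslipidemia")


def crf_count(record):
    """Number of CRFs present in one individual's record (0 to 4)."""
    return sum(1 for name in CRFS if record.get(name, False))


def cluster_distribution(records):
    """Percentage of individuals with 0, 1, 2, 3, and 4 CRFs."""
    counts = Counter(crf_count(record) for record in records)
    n = len(records)
    return {k: 100 * counts.get(k, 0) / n for k in range(len(CRFS) + 1)}


# Hypothetical individuals (NOT study data).
hypothetical_records = [
    {},  # no CRFs
    {"hypertension": True},
    {"hypertension": True, "diabetes": True},
    {"hypertension": True, "diabetes": True, "obesity": True, "dyslipidemia": True},
]

for k, pct in cluster_distribution(hypothetical_records).items():
    print(f"{k} CRFs: {pct:.1f}%")
```

Applying the same tabulation separately within each sex and genotype group would yield the kind of stratified frequencies summarized in Table 3.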

Multivariate logistic regression analysis
Multivariate logistic regression analysis results are shown in Table 4. This analysis estimated the OR with 95% CI for each variable, while adjusting for age and other risk factors. Compared with GG, males with GA and AA were less likely to have hypertension (GA: OR = 0.77, 95% CI: 0.69-0.85; AA: OR = 0.56, 95% CI: 0.41-0.75). Males with GA were also less likely to have diabetes than those with GG (OR = 0.73, 95% CI: 0.62-0.87). There were no differences in overweight or dyslipidemia among male populations with GG, GA, and AA. For females, there was no significant difference among genotypes in hypertension, diabetes, obesity, or dyslipidemia. In males, though not in females, the proportions of GA and AA decreased with increasing numbers of CRFs.
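For readers less familiar with how an OR and its 95% CI are read, the following sketch computes an unadjusted OR from a 2 × 2 table using the standard log-odds (Woolf) interval. This is a simplification: the ORs reported above come from multivariate logistic regression and are adjusted for covariates, which this sketch does not do. The function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(group_cases, group_noncases, ref_cases, ref_noncases, z=1.96):
    """Unadjusted odds ratio of a group versus a reference group.

    Returns (odds ratio, lower bound, upper bound) using the Woolf
    (log-odds) method; z = 1.96 gives an approximate 95% interval.
    All four counts must be greater than zero.
    """
    odds_ratio = (group_cases * ref_noncases) / (group_noncases * ref_cases)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / group_cases + 1 / group_noncases + 1 / ref_cases + 1 / ref_noncases
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (NOT study data): GA carriers versus GG as reference.
or_value, lower, upper = odds_ratio_ci(
    group_cases=100, group_noncases=300, ref_cases=300, ref_noncases=600
)
print(f"OR = {or_value:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

An OR below 1 whose interval excludes 1 indicates lower odds of the condition in the compared group than in the reference group, which is how the GA and AA results for hypertension above are interpreted relative to GG.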

Discussion
Based on the age distribution, the enrolled individuals fairly reflected the distribution of Chinese adults. Over the whole 5 years, the frequencies of the GA genotype, the AA genotype, and the A allele were 29.4, 2.7, and 17.4%, respectively, similar to those of previous studies [13,24]. The distribution of ALDH2 rs671 gene polymorphism among different years (from 2013 to 2018) did not show significant differences (p = 0.946), which implied the reliability of the measurements without obvious carry-over. ALDH2 activation, which plays key roles in clearing toxic aldehydes, improving mitochondrial function, and remodeling ventricular function, has been shown to be protective against the development of CVDs [14][15][16][17][18][19]25], suggesting that ALDH2 gene mutation would be harmful to human health. However, the results of clinical trials have been inconsistent, with many of them indicating a protective effect of the A allele against hypertension, dyslipidemia, and diabetes [3,4,21,26]. In this study, we found that the A allele may be protective against clustering CRFs, especially hypertension and diabetes, in males, though not in females. The contradictory results between basic research and clinical studies, and between males and females, could be explained by the influence of lifestyle, especially the amount and pattern of alcohol consumption. A study based on the China Kadoorie Biobank reported that 33% of males drank alcohol in most weeks, mainly as spirits, while only 2% of females did so [13]. Because of poor alcohol tolerance, with uncomfortable symptoms such as flushing, dizziness, vomiting, and even exhaustion, individuals carrying the A allele, especially those with the AA genotype, usually drink less (GG: 157 g/week; AG: 37 g/week; AA: 3 g/week) [13]. Furthermore, alcohol intake has been found to be closely associated with an increased risk of CVDs [4][5][6][7], and reducing alcohol intake can lower blood pressure in a dose-dependent manner [25].
Therefore, it is very likely that the influence of the different ALDH2 rs671 genotypes on the prevalence of CRFs is substantially mediated by the amount and pattern of alcohol consumption. Interestingly, we also found that the frequency of AA in individuals ≥65 years old was lower than in other age groups, especially those aged 19-29 years (p = 0.01), which may imply that the ALDH2 rs671 mutation can contribute to other fatal diseases and aging independently of CVDs [26,27].
In this study, we found that, compared with GG carriers, males with GA and AA were less likely to have hypertension. Our results are consistent with a case-control study which found that those carrying the A allele were at a lower risk of essential hypertension in males [AA/AG vs. GG: OR (95% CI) = 0.76 (0.58-0.98)], but not in females [21]. However, our results are contrary to those of a cross-sectional study, which found that individuals with the rs671 A allele were at higher risk for the development of essential hypertension [28]. In that study, the association was not evaluated separately for males and females, and based on our data, that could have substantially influenced the results. Moreover, our data on the relationship between ALDH2 rs671 genotype and the distributions of TC, TG, and HDL-C are consistent with previous studies [4,26,29]. However, the relationship between rs671 and the prevalence of dyslipidemia as such was not recognized in those studies [4,26,29]. Also, we found that the individuals with the rs671 A allele had lower Glu levels and lower prevalence of diabetes, though multivariate  [30]. Interestingly, another study found that individuals with the A allele had a lower incidence of microvascular complications associated with alcohol consumption, but a higher incidence of macrovascular complications irrespective of alcohol consumption [31]. This also implies that the incidence of CRFs could be mediated by both genetics and lifestyle factors such as alcohol consumption.
Although many other studies have explored the association between ALDH2 genotype and diseases including CVDs and their risk factors, most of them were animal experiments. Epidemiological studies did not emerge until recently, and most have focused on the association between ALDH2 and single CRFs, rather than clustering CRFs. In this study, we derived clinical big data from the HIS and LIS of PUMCH, which was simple, cost-efficient, and a good reflection of the general population. With all individuals represented in PUMCH being analyzed over a five-year period by the same analytical systems, variation due to different methods or facilities was avoided, and the demographic information and clinical laboratory measurements were thorough. Furthermore, we were able to analyze hypertension, diabetes, obesity, and dyslipidemia simultaneously, while correcting for covariates via multivariate logistic regression analysis.
However, some limitations of this study are notable. Alcohol intake was not considered in the evaluation, and data on other important factors such as smoking and socioeconomic situation were also lacking. Also, in this cross-sectional study, the major CRFs, including hypertension, diabetes, obesity, and dyslipidemia, were assessed based only on a single test of the corresponding clinical measurements. Causal inferences from this study should therefore be avoided. In the future, long-term follow-up cohort studies considering more details, especially the pattern of alcohol consumption, are needed to further explore the causal relationships suggested by our data.

Conclusion
Our study indicates that the ALDH2 gene polymorphism is associated with clustering CRFs, and that the rs671 A allele may be protective against clustering CRFs in males. This is likely mediated by alcohol intake or related lifestyle factors associated with this genetic variant.
Additional file 1: Supplemental Table 1. The frequency of ALDH2 rs671 genotype by age and sex.