Data from all individuals who volunteered for resting ~5-min high-fidelity ECG studies from 2001 through mid-2007 (training set) or thereafter (test set) were considered for inclusion. These included data from: 1) Cardiac clinic patients who volunteered for individual studies at any of the following clinical sites: Texas Heart Institute (Houston, TX); the University of Texas Medical Branch (Galveston, TX); the University of Texas Health Sciences Center (San Antonio, TX); Brooke Army Medical Center (San Antonio, TX); St. Francis Hospital (Charleston, WV); the Universidad de los Andes (Mérida, Venezuela); and Lund University Hospital (Lund, Sweden); and 2) Asymptomatic individuals who volunteered as "controls" at any of the following sites: Johnson Space Center (Houston, TX); the Universidad de los Andes and Lund University Hospital. For the test set, additional data from patients whose ~5-min ECGs had been collected at the Charleston Area Medical Center as part of earlier studies but that became available to us during 2007 (i.e., the STAFF III database) were also utilized. All participants gave original informed consent, and the Institutional Review Boards of one or more of the institutions approved the studies.
For both the training and the test sets, to define our "Disease" groups, we included data only from those cardiac clinic patients whose disease (CAD, LVH and/or LVSD) was proven based on ECG-independent information derived from standard clinical imaging tests [16, 21–23] performed within one month of ECG testing by investigators or other clinicians blinded to the automatically-produced A-ECG results. Disease was defined as the presence of at least one of the following: 1) CAD, defined as a coronary angiogram showing at least one obstruction ≥50% in at least one major native coronary vessel or coronary graft, or, if for clinical reasons angiography was not performed, then one or more reversible perfusion defects on 99mTc-tetrofosmin single-photon emission computed tomography (SPECT) [16, 21, 23]; 2) LVH, defined as moderate or greater concentric hypertrophy or concentric remodeling according to the guidelines of the American Society of Echocardiography; and/or 3) LVSD of any etiology, defined as LVEF <50% by echocardiography, cardiac magnetic resonance imaging (CMR) or SPECT. Diseased individuals who met none of these three inclusion criteria but who had isolated right ventricular pathology, isolated LV diastolic dysfunction, isolated LV cavity enlargement or an isolated fixed defect on SPECT were excluded from the study.
To derive correspondingly definitive "Healthy" groups for both the training and test sets, we included data only from low-risk asymptomatic controls who had no evidence of cardiovascular or other systemic disease based on a negative history and physical examination. Asymptomatic controls who were hypertensive (BP≥140/90), receiving treatment for hypertension, diabetic or active smokers were excluded. All cardiac clinic patients or asymptomatic individuals who had complete bundle branch block, sinus tachycardia, non-sinus rhythm, paced rhythm, pre-excitation, or an incomplete ECG recording were also excluded from both the training and test sets.
Of the 952 individuals who were considered for the training set, 708 met the above inclusion criteria, including 290 for the Disease group training set and 418 for the Healthy group training set. Of the 290 patients constituting the Disease group training set, 188 had normal LV function (136 had CAD; 25 had LVH; and 27 had both CAD and LVH) and constituted a "Disease without LVSD" training subset, whereas another 102 had LVEF <50% (77 with ischemic cardiomyopathy; 25 with nonischemic dilated cardiomyopathy) and constituted a "Disease with LVSD" training subset. Of the 418 controls in the Healthy group training set, a majority also had their disease-free status further demonstrated through normal or unremarkable results on a conventional or SPECT exercise stress test, echocardiogram, and/or CMR test performed for research purposes within 2 years of their ~5-min ECG. These included 55 elite, endurance-trained normotensive Swedish athletes (38 males) who had had clinically unremarkable CMR results.
Data for the test set were obtained from an additional 315 individuals: 208 diseased patients and 107 healthy controls. The 208 individuals in the Disease group test set consisted of 139 patients with CAD, 17 with concentric LVH, 11 with both CAD and LVH, and 41 with LVSD (27 with ischemic and 14 with nonischemic dilated cardiomyopathy). The Healthy group test set consisted of 107 consecutive individuals over age 35 (including 9 elite athletes) who met the Healthy group inclusion criteria, recruited after mid-2007 mainly at NASA's Human Test Subject Facility in Houston. Within the Disease group test set, data for 97 of the 208 patients came from the pre-procedural portion of the STAFF III database. Since all patients in the STAFF III database had catheterization-proven CAD but unreported LV function, their data, as well as data from another 26 diseased patients with unknown LV function, were by necessity withheld from the LVSD-related sub-analyses in the test set.
ECG data collection and analyses
At all sites, a high-fidelity (1000 samples/sec) computerized 12-lead ECG system (Siemens-Elema AB, Solna, Sweden or CardioSoft, Houston, TX) was used to acquire at least 256 waveforms acceptable for signal averaging and variability analyses.
A. Conventional ECG parameters and criteria
Signals from the first 10 sec of the conventional ECG recording were analyzed automatically in software to quantify all major intervals, axes, and voltages as well as ST segment levels. Initial candidate criteria used for defining these strictly conventional 12-lead ECGs as "abnormal" were: 1) LVH according to traditional Sokolow-Lyon voltage criteria (SV1 + RV5 or RV6 ≥3.5 mV) or to gender-specific Cornell voltage (RaVL + SV3 ≥2.8 mV in men or ≥2.0 mV in women) or Cornell product (244 mV*ms with a 0.8 mV adjustment for women) criteria; 2) old infarction according to Anderson et al's subset of Selvester's criteria; 3) resting ST depressions or T-wave abnormalities according to computerized Minnesota Codes 4.1 to 4.4 and 5.1 to 5.3; 4) prolonged QTc (≥450 ms in men and ≥460 ms in women) or QRS (>110 ms) interval (individuals with complete bundle branch blocks being excluded from the study); or 5) left axis deviation (≤-30°).
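The voltage, interval and axis criteria listed above can be sketched as a simple rule set. The following is a minimal illustration only, not the software used in the study: the class and field names are hypothetical, amplitudes are assumed to be positive magnitudes in mV, the Cornell product threshold is assumed to apply as "at or above" 244 mV*ms, and the infarction (Selvester) and Minnesota Code criteria are omitted because they cannot be reduced to a single threshold.

```python
from dataclasses import dataclass


@dataclass
class ConventionalECG:
    """Hypothetical container for automated 10-sec ECG measurements.

    Amplitudes are assumed to be positive magnitudes in mV.
    """
    sv1: float
    rv5: float
    rv6: float
    ravl: float
    sv3: float
    qrs_ms: float
    qtc_ms: float
    axis_deg: float
    female: bool


def abnormal_findings(ecg):
    """Return a dict of the threshold-based criteria that are met."""
    cornell_mv = ecg.ravl + ecg.sv3
    # Assumption: the 0.8 mV adjustment for women is added before
    # multiplying by QRS duration, and the cutoff is inclusive.
    cornell_product = (cornell_mv + (0.8 if ecg.female else 0.0)) * ecg.qrs_ms
    return {
        "sokolow_lyon": ecg.sv1 + max(ecg.rv5, ecg.rv6) >= 3.5,
        "cornell_voltage": cornell_mv >= (2.0 if ecg.female else 2.8),
        "cornell_product": cornell_product >= 244.0,
        "prolonged_qtc": ecg.qtc_ms >= (460 if ecg.female else 450),
        "prolonged_qrs": ecg.qrs_ms > 110,
        "left_axis_deviation": ecg.axis_deg <= -30,
    }


def is_abnormal(ecg):
    """True if any of the threshold-based criteria is met."""
    return any(abnormal_findings(ecg).values())
```

Returning the individual findings rather than a single flag keeps each candidate criterion available for separate evaluation, in keeping with their description above as "initial candidate criteria".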
B. Advanced ECG parameters obtained after signal averaging
Signal averaging was performed over the entire ~5-min (256-beat) recording using software developed by the authors [10, 13] to generate results for parameters of: 1) 12-lead HF QRS ECG; 2) derived 3-dimensional ECG, using the regression-related Frank-lead reconstruction technique of Kors et al to generate several vectorcardiographic parameters, including for example the spatial mean QRS-T angle [6, 8, 28], the spatial maximum ("peaks") QRS-T angle and the magnitude, direction and beat-to-beat variation of the spatial ventricular gradient and its components; and 3) QRS and T-waveform complexity via singular value decomposition (SVD), to derive for example the principal component analysis (PCA) ratio [11, 13, 30], the relative residuum [12, 13] and the dipolar and nondipolar voltage equivalents of the QRS and T waveforms. The majority of these parameters and their related detailed methods have been described in other recent publications [10, 13, 31]. We also generated results for several other potentially promising parameters (see Additional file 1: Supplemental Table 1 for a partial list), including, for example, the spatial ventricular activation time and the total integral of the Z-lead QRS complex above 5 Hz ("Z integral").
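As an illustration of the vectorcardiographic parameters, the two spatial QRS-T angles can be sketched as follows. This is a simplified example under stated assumptions, not the authors' software: it assumes that the Frank X, Y and Z leads have already been derived and signal averaged, that the QRS and T segments have already been delineated, that the "mean" angle is the angle between the mean QRS and mean T vectors, and that the "peaks" angle is the angle between the maximum-magnitude QRS and T vectors. All function names are hypothetical.

```python
import math


def _norm(v):
    """Euclidean magnitude of a 3-D vector."""
    return math.sqrt(sum(c * c for c in v))


def angle_deg(u, v):
    """Angle in degrees between two 3-D vectors."""
    nu, nv = _norm(u), _norm(v)
    if nu == 0.0 or nv == 0.0:
        raise ValueError("angle is undefined for a zero vector")
    cos_angle = sum(a * b for a, b in zip(u, v)) / (nu * nv)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def mean_vector(samples):
    """Component-wise mean of a list of (x, y, z) samples."""
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n for i in range(3))


def peak_vector(samples):
    """The (x, y, z) sample with the largest spatial magnitude."""
    return max(samples, key=_norm)


def spatial_mean_qrst_angle(qrs_xyz, t_xyz):
    """Angle between the mean QRS vector and the mean T vector."""
    return angle_deg(mean_vector(qrs_xyz), mean_vector(t_xyz))


def spatial_peaks_qrst_angle(qrs_xyz, t_xyz):
    """Angle between the maximum QRS vector and the maximum T vector."""
    return angle_deg(peak_vector(qrs_xyz), peak_vector(t_xyz))
```

Here `qrs_xyz` and `t_xyz` are lists of (x, y, z) samples covering the QRS complex and the T wave, respectively.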
C. Advanced parameters derived from variability analyses
Several parameters of 256-beat RRV and QTV described in previous publications[17, 31, 34] were again evaluated via custom software programs. These included the QT variability index (QTVI), but using the means and variances of the RR interval rather than those of the heart rate in the denominator of the QTVI equation, and the "unexplained" part of QTV[31, 34].
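The modified QTVI described above can be sketched as follows, assuming the commonly used form of the index, i.e. the base-10 logarithm of the ratio of the mean-normalized QT variance to the mean-normalized variance of the denominator series, with RR intervals substituted for heart rate. The function name and the choice of the sample variance are assumptions of this sketch; when both series have the same number of beats, the choice between sample and population variance cancels in the ratio.

```python
import math
from statistics import mean, variance


def qtvi_rr(qt_ms, rr_ms):
    """QT variability index with RR-interval terms in the denominator.

    qt_ms, rr_ms: beat-to-beat QT and RR intervals (ms), e.g. 256 beats each.
    """
    qt_norm = variance(qt_ms) / mean(qt_ms) ** 2   # normalized QT variance
    rr_norm = variance(rr_ms) / mean(rr_ms) ** 2   # normalized RR variance
    return math.log10(qt_norm / rr_norm)
```

More negative values indicate that QT interval variability is small relative to RR interval variability.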
Statistical Analyses (including generation, validation and testing of A-ECG scores)
Using the training set, promising candidate subsets of ECG parameters for potential inclusion in primary ("Healthy versus Disease") and secondary ("Disease with versus without LVSD") A-ECG scores were first identified using a branch-and-bound feature selection procedure implemented in SAS 9.1.3 (Cary, NC). To avoid the so-called "curse of dimensionality", the number of ECG parameters incorporable into any potential A-ECG score was limited to fewer than one-tenth of the minimum number of training samples available in a given group or subgroup. Logistic regression was used to retrospectively estimate the probability of any subject in the training set being a member of the Disease group, and of any diseased subject in the training set being a member of the "Disease with LVSD" subgroup, based strictly on his/her A-ECG-based independent variables and a cutoff predicted probability of >0.5. The best candidate subsets of parameters (A-ECG scores) were then further validated by bootstrap analysis in which, for each fixed score, the data were iteratively resampled 1000 times and the logistic regression coefficients for each parameter in the given score re-estimated. The bootstrap analyses, implemented in Stata 10.0 (College Station, TX), revealed not only the variability in the coefficients, but also those candidate A-ECG scores that should be discarded because of their doubtful utility for classifying later subjects in the test set, for example scores with coefficients that varied greatly or that did not have the expected sign over all 1000 bootstraps. Prior to subsequent evaluation in the test set, the bootstrap-validated A-ECG scores were further evaluated within the training set via a jackknife procedure in which the score's sensitivity, specificity, accuracy or predictive values were assessed by using the data for all but one observation in the training set to classify the omitted observation, then repeating the process for each observation in turn.
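The parameter limit, the bootstrap check of coefficient signs and the jackknife (leave-one-out) classification can be illustrated with the following minimal sketch. It is not the SAS or Stata code used in the study: it fits a single-predictor logistic regression by Newton-Raphson in plain Python, uses invented toy data, and adds a tiny ridge term (an assumption of this sketch) solely to keep the fit finite if a resample is perfectly separable. All names are hypothetical.

```python
import math
import random


def max_parameters(group_sizes):
    """Largest parameter count strictly below one-tenth of the smallest group."""
    return (min(group_sizes) - 1) // 10


def predict(coef, x):
    """Predicted probability of Disease membership from one predictor x."""
    z = max(-30.0, min(30.0, coef[0] + coef[1] * x))  # clamp to avoid overflow
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, iters=25, ridge=1e-3):
    """Intercept and slope of a one-predictor logistic regression."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0, g1 = -ridge * b0, -ridge * b1    # gradient of penalized log-likelihood
        h00, h01, h11 = ridge, 0.0, ridge    # negative Hessian
        for x, y in zip(xs, ys):
            p = predict((b0, b1), x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det    # Newton-Raphson update
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def bootstrap_slopes(xs, ys, n_boot=1000, seed=0):
    """Re-estimate the slope on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(fit_logistic([xs[i] for i in idx], [ys[i] for i in idx])[1])
    return slopes


def jackknife_performance(xs, ys, cutoff=0.5):
    """Leave-one-out sensitivity and specificity at the given probability cutoff."""
    tp = fn = tn = fp = 0
    for i in range(len(xs)):
        coef = fit_logistic(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        called_disease = predict(coef, xs[i]) > cutoff
        if ys[i] == 1:
            tp += called_disease
            fn += not called_disease
        else:
            fp += called_disease
            tn += not called_disease
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy data: one A-ECG-like parameter, 30 healthy (0) and 30 diseased (1).
xs = [0.1 * i for i in range(30)] + [2.0 + 0.1 * i for i in range(30)]
ys = [0] * 30 + [1] * 30
slopes = bootstrap_slopes(xs, ys, n_boot=200)
sign_is_stable = all(s > 0 for s in slopes)  # expected sign in every resample
sensitivity, specificity = jackknife_performance(xs, ys)
```

Applied to the training-set sizes reported above, `max_parameters([290, 418])` gives the limit for the primary score and `max_parameters([188, 102])` the limit for the secondary score.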
Comparisons of accuracies (sensitivities and specificities) and predictive values between strictly conventional and A-ECG classifiers were performed using Cochran's Q and Wald tests, respectively, the latter employing the difference-based weighted least squares method. For simple illustrative comparisons between groups, the Wilcoxon rank sum and receiver operating characteristic (ROC) curve statistics were used.
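For paired comparisons of this kind, Cochran's Q can be computed from a table with one row per subject and one 0/1 column per classifier (1 = classified correctly). The following is a minimal sketch using the standard formula, not the statistical package code used in the study; the function names are hypothetical, and the p-value helper covers only the two-classifier case (1 degree of freedom), in which Q reduces to McNemar's statistic without continuity correction.

```python
import math


def cochrans_q(outcomes):
    """Cochran's Q statistic and its degrees of freedom.

    outcomes: list of rows, one per subject, each with k entries of 0 or 1
    (1 = the corresponding classifier was correct for that subject).
    """
    k = len(outcomes[0])
    col_totals = [sum(row[j] for row in outcomes) for j in range(k)]
    row_totals = [sum(row) for row in outcomes]
    n_total = sum(row_totals)
    denom = k * n_total - sum(r * r for r in row_totals)
    if denom == 0:
        raise ValueError("Q is undefined when all classifiers agree on every subject")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n_total ** 2) / denom
    return q, k - 1


def p_value_df1(q):
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))
```

To compare sensitivities, the rows would be restricted to diseased subjects; to compare specificities, to healthy subjects.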