Physical functional performance and prognosis in patients with heart failure: a systematic review and meta-analysis

Background: Patients with Heart Failure (HF) show impaired functional capacities which have been related to their prognosis. Moreover, physical functional performance in functional tests has also been related to the prognosis in patients with HF. Thus, it would be useful to investigate how physical functional performance in functional tests could determine the prognosis in patients with HF, because HF is the leading cause of hospital admissions for people older than 65 years old. This systematic review and meta-analysis aims to summarise and synthesise the evidence published about the relationship between physical functional performance and prognosis in patients with HF, as well as assess the risk of bias of included studies and the level of evidence per outcome.
Methods: Major electronic databases, such as PubMed, AMED, CINAHL, EMBASE, PEDro, Web of Science, were searched from inception to March 2020 for observational longitudinal cohort studies (prospective or retrospective) examining the relationship between physical functional performance and prognosis in patients with HF.
Results: 44 observational longitudinal cohort studies with a total of 22,598 patients with HF were included. 26 included studies reported a low risk of bias, and 17 included studies showed a moderate risk of bias. Patients with poor physical functional performance in the Six Minute Walking Test (6MWT), in the Short Physical Performance Battery (SPPB) and in the Gait Speed Test showed worse prognosis in terms of larger risk of hospitalisation or mortality than patients with good physical functional performance. However, there was a lack of homogeneity regarding which cut-off points should be used to stratify patients with poor physical functional performance from patients with good physical functional performance.
Conclusion: The review includes a large number of studies which show a strong relationship between physical functional performance and prognosis in patients with HF. Most of the included studies reported a low risk of bias, and GRADE criteria showed a low and a moderate level of evidence per outcome.


Background
Cardiovascular diseases continue to be the leading cause of disability-adjusted life-years (DALYs) due to noncommunicable diseases and the leading cause of death [1][2][3]. Within cardiovascular diseases, Heart Failure (HF) is the only cardiovascular disease which is increasing in incidence and prevalence due to the aging of the world population, because its prevalence increases with age [4][5][6][7][8]. In addition, heart failure constitutes the most important hospital diagnosis in older adults, is the leading cause of hospital admissions for people older than 65 years old and contributes to the increase of medical care costs [5][6][7][8][9].
Heart Failure is characterised by a weak myocardium with decreased cardiac output that is unable to meet the body's metabolic demands [4-6, 8, 10-12]. There are several functional symptoms that appear in patients with HF, such as reduced aerobic capacity, decreased muscle strength, low weekly physical activity and exercise intolerance, which are accompanied by fatigue and dyspnea symptoms [12][13][14][15][16][17]. Furthermore, patients with HF show impaired functional capacities, experience a reduced ability to carry out their activities of daily living and suffer a reduced quality of life [12,14,17]. It has also been reported that patients with chronic HF show a slower gait speed than healthy subjects of the same age [18]. The maximal aerobic capacity has been inversely correlated to the severity of HF and has been directly correlated to the prognosis and the life expectancy [14,19,20]. Similarly, lower-extremity muscle mass and muscle strength have also been related to long-term survival in patients with HF [14,21]. Some functional tests have been used to predict prognosis in patients with HF. Thus, the 6-min walk test (6-MWT) has been proposed as a simple, inexpensive, safe and reproducible exercise test to assess functional capacity in patients with HF, which could also predict the prognosis of patients with HF based on distance walked [12,22-24]. The Short Physical Performance Battery (SPPB) provides a useful and indirect measure of muscle functional capacity [12]. Moreover, the SPPB and the Timed Up and Go test (TUG) could be used to assess physical or functional frailty in patients with HF, which has been associated with an increased risk of hospitalisation and mortality in chronic heart failure [25,26]. Gait speed has also been shown to predict loss of functional independence, cardiovascular disease, hospitalisation and mortality in older adults [27][28][29][30][31].
The 6-MWT measures the distance that patients can walk in 6 min [32]. The test is usually conducted in an enclosed 30 m corridor with two marks placed on the ground 30 m apart, and patients walk from one end to the other for 6 min [32]. The SPPB includes 3 tests: balance (feet together, semi-tandem and tandem stances, held for 10 s each), gait speed (4 m) and standing up from and sitting down on a chair 5 times. Each test is scored from 0 (worst performance) to 4 (best performance). The total score for the whole battery is the sum of the 3 test scores and ranges from 0 to 12 [33]. In the TUG test, patients are seated in a chair and, at the order to "go", they stand up and walk 3 m until they reach a line on the floor. Then, patients turn, walk back to the chair and sit down again [34].
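As a minimal sketch of the test arithmetic described above, the following Python snippet sums three SPPB sub-scores into the 0 to 12 total, converts a timed 4 m walk into a gait speed, and converts corridor lengths into a 6-MWT distance. The function names and the example values are hypothetical and are not taken from the included studies; the rules that map raw test times onto each 0 to 4 sub-score are not reproduced here.

```python
def sppb_total(balance: int, gait: int, chair_stand: int) -> int:
    """Sum the three SPPB sub-scores (each 0-4) into the 0-12 battery total."""
    for name, score in (("balance", balance), ("gait", gait),
                        ("chair_stand", chair_stand)):
        if not 0 <= score <= 4:
            raise ValueError(f"{name} score must be between 0 and 4, got {score}")
    return balance + gait + chair_stand


def gait_speed(distance_m: float, time_s: float) -> float:
    """Gait speed in m/s from a timed walk (e.g. the 4 m SPPB walk)."""
    if time_s <= 0:
        raise ValueError("time_s must be positive")
    return distance_m / time_s


def six_mwt_distance(completed_lengths: int, partial_m: float = 0.0,
                     corridor_m: float = 30.0) -> float:
    """Distance walked in 6 min: full corridor lengths plus the final partial length."""
    if completed_lengths < 0 or not 0 <= partial_m <= corridor_m:
        raise ValueError("invalid number of lengths or partial distance")
    return completed_lengths * corridor_m + partial_m


# Hypothetical patient, for illustration only.
print(sppb_total(balance=2, gait=3, chair_stand=1))
print(gait_speed(distance_m=4.0, time_s=5.0))
print(six_mwt_distance(completed_lengths=9, partial_m=12.0))
```

Keeping the three quantities as separate functions mirrors the way the tests are reported separately in the studies.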
Hence, it would be necessary to conduct a synthesis of evidence that explores the relationship between the physical functional performance in functional tests and the prognosis in patients with HF. A systematic review may permit the formation of firm conclusions through an exhaustive synthesis of data [35]. Thus, the aim of this study was to answer the following PECOS (P, participant; E, exposure; C, comparator; O, outcome; S, study design) question through a systematic review of the literature on observational longitudinal cohort studies (prospective or retrospective) (S): Do older patients with HF (P), who have poor physical functional performance in some functional tests, such as 6-MWT, SPPB, TUG or Gait Speed (E), show a worse prognosis (O) than those patients with good physical functional performance (C)?

Methods
The Systematic Review and Meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [36]. The systematic review protocol was registered at the International Prospective Register of Systematic Reviews (PROSPERO: CRD42020177427).

Data sources and search strategy
Two independent reviewers (IJF-A and AIC-V) conducted a systematic search from inception to 24 March 2020 in the following electronic databases: PubMed, AMED, CINAHL, EMBASE, PEDro and Web of Science (Additional file 1). The search used optimised strategies built on search terms developed from Medical Subject Headings (MeSH) and on keywords from other similar studies. A manual search of the reference lists of both original and review articles was also conducted to identify any eligible studies missed during the electronic search. Grey literature databases, such as the New York Academy of Medicine Grey Literature Report, Open Grey and Google Scholar [37], were examined to identify any relevant unpublished data. References were exported, and duplicates were removed using the Mendeley desktop V.1.19.2 citation management software.

Eligibility criteria
The aforementioned PECOS framework was followed to determine which studies were included in the present systematic review and meta-analysis. Each study had to meet the following inclusion criteria:
The exclusion criteria were as follows:
1. Studies without an observational longitudinal cohort design (e.g. cross-sectional studies, randomised controlled trials).
2. Studies exploring the prognostic value of functional tests in patients with cardiovascular diseases other than HF.
3. Studies examining the relationship between physical functional performance in functional tests and outcomes other than mortality or hospitalisation.
4. Studies investigating the prognostic value of physical activity assessed as daily activity, exercise time per week or physical activity scales.

Study selection
Two independent reviewers (IJF-A and AIC-V) screened titles and abstracts to detect potentially relevant records and excluded documents that were not original papers. The same reviewers then screened the articles that met all inclusion criteria. A short checklist was drawn up and followed in order to select the relevant studies (Additional file 2). In case of disagreement, the article was always included.

Data extraction
Two independent reviewers (IJF-A and AIC-V) extracted the following relevant data from each study: study details (first author and year of publication), region, setting, study design, sample size, functional tests with their cut-off points, characteristics of participants (mean age, % males), HF diagnosis, follow-up, outcome and main results. When necessary, an email was sent to the original authors to request OR or HR data that were not reported in their original articles.

Quality assessment
The same two reviewers (IJF-A and AIC-V) assessed the risk of bias of the included observational longitudinal cohort studies using the Newcastle Ottawa Scale (NOS) [38]. The NOS has been described as a reliable and valid tool for assessing the quality of observational longitudinal cohort studies [38,39].

Data synthesis and analysis
To assess the overall quality and the strength of the evidence per outcome, the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach was used [40,41]. Two researchers (IJF-A and AIC-V) judged whether the GRADE factors were present for each outcome reported in at least two studies. Meta-analysis was conducted for each outcome reported in two or more studies, as long as the studies assessed the same outcome with the same functional test and the same effect measure, that is, HR or OR. Outcomes not included in the meta-analysis were reported using a descriptive quantitative analysis. Thus, the most relevant summary measure with its 95% Confidence Interval (95%CI) was provided for each study and, when possible, was extracted from adjusted multivariate models. Each meta-analysis used the inverse variance as the statistical method, fixed effects as the analysis model and the HR or OR as the effect measure. Heterogeneity was assessed using the I² statistic [42,43]. Values > 25% were considered low heterogeneity, > 50% moderate heterogeneity and > 75% high heterogeneity [42,43]. When heterogeneity was moderate or high, random effects were used as the analysis model. Moreover, when meta-analyses included patients with HF with reduced (HFrEF) and preserved (HFpEF) ejection fraction, or revealed high heterogeneity, and the outcome was reported by three or more studies, sensitivity analyses were conducted including only studies of patients with HFrEF, because the inclusion of patients with different ejection fractions could be a source of heterogeneity or could bias the results. The mean effect sizes, 95% CI and I² were calculated for each outcome and used to create forest plots of each meta-analysis using Review Manager (RevMan) version 5.3 [44].
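The pooling procedure described above can be illustrated with a minimal Python sketch. It is not the RevMan implementation: it assumes that each study reports a ratio (HR or OR) with a symmetric 95% CI on the log scale, it back-calculates the standard error from that interval, and it uses the DerSimonian-Laird estimator for the random-effects step, which is an assumption about the random-effects method. The function names and the three example studies are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def log_effect_and_se(ratio, lower, upper):
    """Convert a ratio (HR or OR) and its 95% CI to a log effect and standard error."""
    return math.log(ratio), (math.log(upper) - math.log(lower)) / (2 * Z_95)


def pool(studies):
    """Inverse-variance pooling of (ratio, ci_lower, ci_upper) tuples.

    Fixed effects by default; switches to random effects (DerSimonian-Laird,
    an assumption of this sketch) when I² exceeds 50%.
    """
    if len(studies) < 2:
        raise ValueError("at least two studies are required")
    effects = [log_effect_and_se(*s) for s in studies]
    weights = [1 / se ** 2 for _, se in effects]
    fixed = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)

    # Cochran's Q and the I² statistic (percentage of variation due to heterogeneity).
    q = sum(w * (y - fixed) ** 2 for w, (y, _) in zip(weights, effects))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    model, estimate, total_weight = "fixed", fixed, sum(weights)
    if i2 > 50:  # moderate or high heterogeneity
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c)  # between-study variance
        re_weights = [1 / (se ** 2 + tau2) for _, se in effects]
        estimate = sum(w * y for w, (y, _) in zip(re_weights, effects)) / sum(re_weights)
        model, total_weight = "random", sum(re_weights)

    se_pooled = math.sqrt(1 / total_weight)
    return {
        "model": model,
        "ratio": math.exp(estimate),
        "ci_low": math.exp(estimate - Z_95 * se_pooled),
        "ci_high": math.exp(estimate + Z_95 * se_pooled),
        "i2": i2,
    }


# Hypothetical hazard ratios with 95% CIs, for illustration only.
print(pool([(2.0, 1.5, 2.67), (2.5, 1.6, 3.9), (1.8, 1.2, 2.7)]))
```

Working on the log scale keeps the confidence interval symmetric around the pooled estimate, which is why the ratios are exponentiated only at the end.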

Characteristics of included studies
A total of 3881 citations were identified through electronic databases, with 263 additional studies identified through grey literature sources and 14 studies identified through manual search. One thousand six hundred seventy-one titles and abstracts were screened and 110 original papers were assessed. The number of studies retrieved from each database and the number of studies excluded in each screening phase are shown in Fig. 1. [...] showed patients with HFrEF and HFpEF. The 6MWT was the most used test (n = 33) followed by the Gait Speed test (n = 8) and the SPPB (n = 4). The characteristics of the included observational longitudinal cohort studies are reported in Table 1.

Meta-analyses
The outcomes assessed by each study, as well as the main results, the risk of bias summary and the GRADE summary, are shown in Table 2. Forest plots and effect sizes of each meta-analysis can also be seen in Additional file 5. Patients with HFrEF, HFpEF and acute HF who showed poor physical functional performance in the 6MWT had a higher risk of all-cause mortality [HR = 2.29, 95%CI (1.86-2.82), p < 0.001] than patients who showed good physical functional performance (Fig. 2a). Moreover, patients with HFrEF who [...] [51]. A score below 7 points on the SPPB was also associated with a higher risk of HF hospitalisation [OR = 6.7, 95%CI (1.5-30.4), p < 0.05] in patients with acute HF [78].
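As a rough consistency check on the effect sizes quoted above, a ratio and its 95% CI can be converted into an approximate z statistic and two-sided p-value. The Python sketch below uses a Wald-type normal approximation on the log scale, which is an assumption of this example and not the procedure used by RevMan or by the original studies; the function name is hypothetical.

```python
import math


def z_and_p_from_ci(ratio, lower, upper, z_crit=1.959964):
    """Approximate z statistic and two-sided p-value from a ratio and its 95% CI.

    Assumes the interval is symmetric on the log scale (Wald-type interval).
    """
    log_ratio = math.log(ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_ratio / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under a normal model
    return z, p


# Values quoted in the text: pooled HR for all-cause mortality (6MWT)
# and the OR for HF hospitalisation (SPPB below 7 points).
z_hr, p_hr = z_and_p_from_ci(2.29, 1.86, 2.82)
z_or, p_or = z_and_p_from_ci(6.7, 1.5, 30.4)
print(f"HR: z = {z_hr:.2f}, p < 0.001 is {p_hr < 0.001}")
print(f"OR: z = {z_or:.2f}, p < 0.05 is {p_or < 0.05}")
```

Under this approximation both intervals are compatible with the reported significance levels. The much wider interval around the OR reflects a far less precise estimate.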

Risk of Bias assessment
The risk of bias of the included observational longitudinal cohort studies is shown in Table 3. In summary, 26 studies (59.10%) reported a low risk of bias, and 17 studies (38.63%) showed a moderate risk of bias. Selection bias (97.72%) was common across the included studies. Using GRADE criteria, observational longitudinal cohort studies reported a low level of evidence in most of the prognostic outcomes. However, HF mortality and all-cause mortality showed a moderate level of evidence in the 6-MWT (Table 4).
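The classification behind these counts follows the thresholds given in the note to Table 3: 7 to 9 NOS points for low, 4 to 6 for moderate and 0 to 3 for high risk of bias. A minimal Python sketch of that rule is shown below; the function names and the example scores are hypothetical and do not reproduce the scores of the included studies.

```python
from collections import Counter


def nos_risk_of_bias(score: int) -> str:
    """Map a Newcastle Ottawa Scale total (0-9) to a risk-of-bias category."""
    if not 0 <= score <= 9:
        raise ValueError(f"NOS score must be between 0 and 9, got {score}")
    if score >= 7:
        return "Low"
    if score >= 4:
        return "Moderate"
    return "High"


def tally(scores):
    """Count how many studies fall into each risk-of-bias category."""
    return Counter(nos_risk_of_bias(s) for s in scores)


# Hypothetical NOS totals for six studies, for illustration only.
print(tally([9, 8, 7, 6, 5, 3]))
```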

Main findings and comparison with other studies
The current systematic review and meta-analysis showed that patients with HFrEF and HFpEF who reported poor physical functional performance in the 6MWT have an increased risk of all-cause mortality and an increased risk of HF mortality. There was consistency in the risk of all-cause mortality and HF mortality between the studies included in each meta-analysis (Fig. 2a and Fig. 2b), and the GRADE criteria also reported a moderate level of evidence per outcome. Although patients with HFrEF who decreased the meters they walked in the 6MWT during follow-up showed an increased risk of all-cause mortality, there was no decreased risk of all-cause mortality among patients with HFrEF and HFpEF who increased the meters they walked in the 6MWT during follow-up [52,53,59,60,65,67,68,70,73,75,77]. This may be because most of the studies included in the meta-analysis reported the decrease in mortality risk for every 1 m [53,65,67,68,70,73] or every 10 m [52,60,77] increase in distance walked, while a systematic review determined that 45 m is the clinically meaningful change in the 6MWT [89]. Patients with HF who showed poor physical functional performance in the 6MWT also reported an increased risk of the combined endpoint of hospitalisation and mortality for any cause (Fig. 2c and Fig. 2d), an increased risk of HF hospitalisation (Additional file 5) and an increased risk of all-cause hospitalisation [48,51]. However, the level of evidence of those outcomes was low according to the GRADE criteria. Moreover, there was a lack of homogeneity regarding which cut-off point should be used to stratify patients with HF based on their physical functional performance in the 6MWT.
A distance walked of < 300 m was the cut-off most often used to define poor physical performance in the 6MWT in this review [47,49,55,56,58,59,61,62,64,69,74], while a previous review reported that a distance of ≤ 350 m in the 6MWT could be the most indicative of poor physical functional performance and worse prognosis in patients with HF [24].
A score between 1 and 4 points on the SPPB was associated with an increased risk of all-cause mortality in this systematic review [80]. However, in the current study a score below 7 points on the SPPB seems to be the most indicative of a worse prognosis in patients with HF, since it was associated with a higher risk of the combined endpoint of hospitalisation and mortality for any cause and a higher risk of HF hospitalisation [78]. [...] not be performed. As in the present review, a score below 7 points on the SPPB was also associated with a high risk of all-cause mortality in older adults [90]. However, other studies reported a high risk of mortality or hospitalisation in older adults who showed a score below 5 points [80,91-93].
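The 6MWT argument made above, that hazard ratios expressed per 1 m or per 10 m look small next to a clinically meaningful change of 45 m, can be illustrated with a short Python sketch. It assumes a log-linear relationship between distance walked and hazard, as in a Cox model with distance as a continuous covariate, so that a per-unit HR can be rescaled by exponentiation. The function name and the per-metre and per-10 m values are hypothetical and are not taken from the included studies.

```python
def rescale_hazard_ratio(hr_per_unit: float, units: float) -> float:
    """Rescale a per-unit hazard ratio to a change of `units` units.

    Assumes the log-hazard is linear in the covariate, so HR(k units) = HR(1 unit) ** k.
    """
    if hr_per_unit <= 0:
        raise ValueError("a hazard ratio must be positive")
    return hr_per_unit ** units


# Hypothetical per-unit hazard ratios, for illustration only.
hr_per_1m = 0.995   # per 1 m increase in distance walked
hr_per_10m = 0.95   # per 10 m increase in distance walked

print(round(rescale_hazard_ratio(hr_per_1m, 45), 2))    # 45 steps of 1 m
print(round(rescale_hazard_ratio(hr_per_10m, 4.5), 2))  # 4.5 steps of 10 m
```

With these hypothetical inputs, a per-metre HR that looks negligible corresponds to a reduction in hazard of roughly one fifth over 45 m, which shows why the unit of change matters when effects are compared or pooled.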
Patients who showed a slower gait speed also reported an increased risk of all-cause mortality (Fig. 3), above all when gait speed was slower than 0.65 m/s (Additional file 5). Moreover, patients with HF who showed a slower gait speed also reported an increased risk of all-cause hospitalisation (Additional file 5) and an increased risk of the combined endpoint of hospitalisation and mortality for any cause [84], especially when gait speed was slower than 0.80 m/s [83,84,86]. GRADE criteria reported a low level of evidence for each prognostic outcome in the Gait Speed Test. Other studies have shown the relationship between gait speed and survival, death and hospitalisation due to HF [27,94]. In fact, Dodson et al. [95] revealed that patients who showed a gait speed slower than 0.8 m/s were more likely to experience one-year mortality or hospitalisation than patients with gait speed faster than 0.8 m/s. Alfredsson et al. [96] also reported that patients with a gait speed slower than 0.8 m/s after a transcatheter aortic valve replacement had 35% higher 30-day mortality than patients with faster gait speed. Chainani et al. [97] reported that gait speed and handgrip strength are associated with increased risk of cardiovascular mortality. A meta-analysis published by Yamamoto et al. [98] reported that 6MWT performance was significantly associated with mortality and cardiovascular disease. Frailty has also been associated with a higher risk of mortality and hospitalisation in patients with chronic HF [25,26,30,31,99]. Bagnall et al. [100] revealed that frail patients had a 2- to 4-fold risk of mortality compared with non-frail patients after cardiac surgery or transcatheter aortic valve implantation. Gait speed is a marker of frailty, although frailty could also be assessed by the 6MWT, the SPPB or the TUG [25,26,30,31,99]. In this way, the use of functional tests seems to be useful to stratify patients with HF based on their physical functional performance and to determine their prognosis.

Table 3 Risk of Bias Assessment of Cohort Studies (The Newcastle Ottawa Scale (NOS))
Note: The NOS assigns up to a maximum of nine points for the least risk of bias based on 3 domains: selection of study groups (four points); comparability of groups (two points); and ascertainment of exposure and outcomes (three points). This checklist has been recommended for cohort studies. The risk of bias based on the NOS was classified as: Low Risk of Bias (7-9 points), Moderate Risk of Bias (4-6 points) and High Risk of Bias (0-3 points).
Abbreviations: Quality: High Risk of Bias (H); Moderate Risk of Bias (M); Low Risk of Bias (L).
Newcastle-Ottawa Quality Assessment Scale, cohort studies: 1 = Representativeness of the exposed cohort; 2 = Selection of the non-exposed cohort; 3 = Ascertainment of exposure; 4 = Demonstration that outcome of interest was not present at start of study; 5-6 = Comparability of cohorts on the basis of the design or analysis; 7 = Assessment of outcome; 8 = Was follow-up long enough for outcomes to occur; 9 = Adequacy of follow-up of cohorts.

In brief, the GRADE classification was carried out according to the presence, or not, of the following identified factors: (1) study design, (2) risk of bias, (3) inconsistency of results, (4) indirectness, (5) imprecision, and (6) other considerations (e.g. reporting bias). The quality of the evidence based on the GRADE criteria was classified as: (1) high (further research is unlikely to change our confidence in the estimate of effect and there is no known or suspected reporting bias); (2) moderate (further research is likely to have an important effect on our confidence in the estimate of effect and could change the estimate); (3) low (further research is likely to have an important effect on our confidence in the estimate of effect and is likely to change the estimate); or (4) very low (we are uncertain about the estimate) [38].
a Design: Observational Longitudinal Cohort Studies show a Low Level of Evidence according to GRADE.
b Risk of Bias: > 50% (NO) of the information is from studies with low risk of bias, which can rarely affect the interpretation of results. 50% (Not Serious) of the information is from studies with moderate risk of bias, which could affect the interpretation of results, and 50% of the information is from studies with low risk of bias. > 50% (Serious) or > 75% (Very Serious) of the information is from studies with high/moderate risk of bias, which can sufficiently affect the interpretation of results.
c Inconsistency: > 50% (Consistency) presence of a high degree of consistency in the results, such as effects in the same direction and no variations in the degree to which the outcome is affected (large significant effects (Hazard Ratio or Odds Ratio > 2)). > 50% (Not Serious) presence of a high degree of consistency in the results, such as effects in the same direction, although with variations in the degree to which the outcome is affected (small significant effects or large significant effects). > 50% (Serious) or > 75% (Very Serious) presence of a high degree of inconsistency in the results, such as effects in opposite directions, or large variations in the degree to which the outcome is affected (e.g. very large and very small effects or no significant effect).
d Indirectness: > 50% (NO) of included studies report a similar population (similar HF diagnosis and similar age), as well as the same functional test (although different distances or cut-off points) and the same outcome. > 50% (Not Serious) of included studies show a different HF diagnosis but a population of similar age, and the same functional test (although different distances or cut-off points) and the same outcome are reported.
e Imprecision: > 50% (NO) of included studies report a 95% CI with a narrow range (it excludes 1.0), include large effects in the same direction and the sample size is large. > 50% (Not Serious) of included studies report a 95% CI with a narrow range (it excludes 1.0), include large or small effects in the same direction and the sample size could be small. > 50% (Serious) or > 75% (Very Serious) of included studies present 95% CIs with a wide range (it does not exclude 1.0) and include small effects in both directions.
f Other: Publication Bias is not suspected, and > 75% of included studies included the outcome data in multivariate models adjusted for variables which could change the effect (NO).
To our knowledge, our review is the first systematic review reporting the level of evidence for each prognostic outcome using GRADE criteria. Other reviews showed the prognostic role of the 6MWT or the impact of physical performance on prognosis in patients with HF, but did not report the risk of bias of included studies or the level of evidence per outcome according to GRADE criteria [22,23,98,101-103].

Implications for clinical practice
The current findings may be useful to promote functional assessments that allow patients with HF to be stratified according to their functional impairment. Furthermore, accurate prognostic stratification could be essential for optimizing clinical management and treatment decision making, with the aim of maintaining functionality, improving quality of life and reducing the number of hospitalisations, as well as increasing the life expectancy of patients with HF.
Adjusted medical-pharmacological treatment, in addition to improving symptoms, could prevent further cardiovascular events and prolong the life expectancy of patients with HF [13]. Moreover, adjusted exercise programs could reduce mortality, may improve functional capacity and quality of life, and may reduce hospitalisations [5,8]. It has also been shown that patients who performed more physical activity per week had a lower risk of mortality [104][105][106]. Functional tests such as 6MWT, Gait Speed or SPPB may provide incremental prognostic value and could help to individualize the exercise prescription [107].

Future research
Future research should aim to determine the optimal cut-off points for prognostic prediction and to determine the utility of functional assessments in the management and treatment of patients with HF. The following recommendations should guide future research: 1) use the same cut-off points in functional tests; 2) include a large sample of patients with HF who show different characteristics.

Strengths and limitations of the study
The strengths of this systematic review and meta-analysis included the use of a pre-specified protocol registered on PROSPERO, the PRISMA checklist, the NOS to determine the risk of bias of each study, the GRADE criteria to assess the overall quality and the strength of the evidence per outcome, and a robust search strategy complemented by a manual search, so that all studies that met the eligibility criteria could be identified. Thus, our systematic review included 44 studies, while a previous similar review carried out by Yamamoto et al. [98] included only 22 studies.
However, there are several limitations that should be mentioned. First, the lack of uniformity among included studies, which used different cut-off points in functional tests, should be taken into account when interpreting the results. Finally, most of the prognostic outcomes showed a low level of evidence according to GRADE criteria.

Conclusion
Patients with HF who report poor physical functional performance in the 6MWT, in the SPPB or in the Gait Speed Test show a worse prognosis than patients who report good physical functional performance, in terms of an increased risk of hospitalisation or an increased risk of mortality. However, there is a lack of homogeneity regarding which cut-off point should be used to stratify patients with HF based on their physical functional performance in the different functional tests, and GRADE criteria show a low level of evidence in most of the examined prognostic outcomes.

Abbreviations
HF: Heart failure; 6MWT: Six minute walking test; SPPB: Short physical performance battery; DALYs: Disability-adjusted life-years; TUG: Timed up and go test; PECOS: Participant, exposure, comparator, outcome, study design; PRISMA: Preferred reporting items for systematic reviews and meta-analyses statement; PROSPERO: International prospective register of systematic reviews; NYHA: New York heart association; OR: Odds ratio; HR: Hazard ratio; NOS: The Newcastle Ottawa scale; GRADE: Grading of recommendations assessment, development and evaluation; HFrEF: Heart failure with reduced ejection fraction; HFpEF: Heart failure with preserved ejection fraction; CI: Confidence interval; m: Meters