Reliability and Validity of Laboratory and Field Cardiorespiratory Exercise Tests for Wheelchair Users: A Systematic Review
Iker Garate, Javier Yanci, Josu Ascondo, Aitor Iturricastillo, Cristina Granados

TL;DR
This study reviews which cardio tests are reliable and valid for wheelchair users, finding some tests work but more research is needed.
Contribution
The paper systematically evaluates the reliability and validity of cardio tests specifically for wheelchair users.
Findings
Moderate evidence was found for reliability in one field test.
Moderate evidence was found for validity in two lab and two field tests.
Sample sizes in studies were small, limiting conclusions.
Abstract
Background: cardiorespiratory fitness is one of the most important components of physical fitness. In this paper, we set out to identify cardiopulmonary tests evaluated for measurement properties in wheelchair users and determine which are reliable and valid for this population. Methods: Articles were collected from PubMed, Scopus, SPORTDiscus, and Web of Science. The initial search was conducted in October 2022 and updated in July 2023 for recent publications. From 1257 screened studies, 42 met the criteria: (a) participants were wheelchair users, (b) tests measured cardiorespiratory fitness, (c) test reliability or validity was reported, (d) articles were original, and (e) full text was in English. Two independent researchers extracted participant details (number, gender, age, disability) and test information, with a third researcher resolving disagreements. Statistical analyses of…
Figure 1- —University of the Basque Country (UPV/EHU)
Taxonomy
Topics: Spinal Cord Injury Research · Cerebral Palsy and Movement Disorders · Cardiovascular and exercise physiology
1. Introduction
The wheelchair-using population comprises a wide range of people with disabilities, such as tetraplegia, paraplegia, spina bifida, cerebral palsy, sclerosis or amputations [1]. Nevertheless, physical activity is highly recommended for all of them in order to improve their physical fitness and also their quality of life [1,2,3,4]. One of the most important components of physical fitness is cardiorespiratory fitness (CRF), a significant factor in morbidity and mortality [5,6,7]. CRF can be measured directly, expressed as maximal oxygen consumption (VO2max), or estimated from the peak work rate achieved on a treadmill. Adequate cardiorespiratory fitness may improve the quality of life of wheelchair users, because it is associated with better health [6], reduced mortality [5,6], greater functional capacity and greater autonomy to move around without depending on others [8]. Due to different factors, such as impairments, activity limitations or participation restrictions, people with disabilities are more likely to be physically inactive [9,10,11], have poorer cardiorespiratory fitness, and tend to develop more chronic diseases and comorbidities [9,12,13]. For this reason, measuring cardiorespiratory fitness in wheelchair users is crucial.
Cardiorespiratory fitness is commonly measured in a laboratory test [14,15,16]. These tests are performed in controlled and stable situations, and they typically measure variables such as gas exchange [16,17,18], blood lactate concentration [16,17,18,19,20,21] or power output [22,23,24]. However, as laboratory tests require expensive and sophisticated equipment, qualified personnel and a high investment of time, field tests have also been used for wheelchair users to measure or estimate cardiorespiratory fitness, because they allow a large number of participants to be tested cheaply, in less time and with greater ecological validity [25,26]. As in the general population, both laboratory [27] and field [28] protocols have been used for wheelchair users. Due to differences in this population’s functional and situational capacities, the protocols used to measure cardiorespiratory fitness in people with disabilities have often been adapted from protocols used for the general population [29]. For example, Eriksson et al. [30] adapted the traditional maximal protocols used in able-bodied people for wheelchair users by creating a fixed resistance roller and measuring wheel speed. Later, more sophisticated systems were utilized for the same purpose. Klaesner et al. [31] developed a dynamometer, adjustable to any wheelchair, that could simulate different resistances and slopes and measure the forces. Arm-crank ergometers [24] and treadmills with special harnesses that attach to wheelchairs to stabilize the person in them [32] have also been used. In the same way, the field protocols used in the general population have also been adapted to the characteristics and needs of wheelchair users [33]. For example, Vanderthommen et al. [34] adapted Léger and Boucher’s shuttle run test [35] for wheelchair tennis players and Yanci et al. [36] did the same with the Yo-Yo recovery test in wheelchair basketball.
Regardless of whether they are laboratory or field tests, when any variable is measured, the tool used (in this case a cardiopulmonary test) must be valid and reliable [37]. This means that it has to measure what it is designed to measure, and whenever it measures the same thing, it must give the same result [37]. The availability of valid and reliable laboratory and field protocols for wheelchair users allows their cardiorespiratory capacity to be measured adequately. This is important for identifying people who could benefit from a prevention program [38] or even for sporting purposes [33]. It also allows wheelchair users to be classified by level, and in the case of large groups, to be sorted into subgroups for individualized programs. Whatever the objective of the exercise program, testing is necessary to prescribe, monitor and evaluate it correctly [22,39]. For this reason, it seems essential to deepen our scientific knowledge on the validity and reliability of laboratory and field tests for wheelchair users.
Most of the protocols used in the general population have been widely validated; however, they undergo important modifications when adapted to the needs and characteristics of wheelchair users [36,40,41,42,43]. Therefore, the validity and reliability of these new or adapted protocols should be analyzed. Although there are studies on the validity of cardiopulmonary tests for wheelchair users, the conclusions that can be drawn are limited due to the small number of participants [22,44,45]. For this reason, this study aims to unify the information obtained in those studies, assess their quality, and show the existing scientific evidence for using each test. Similar reviews have been performed in other populations or with other types of tests as targets [29,46,47,48,49], but not for cardiorespiratory tests for wheelchair users. Eerden et al. [27] gathered the existing cardiopulmonary laboratory tests for wheelchair users with spinal cord injury, but did not assess their reliability and/or validity. The only review that is close to our topic is the one by Goosey-Tolfrey et al. [28]; however, it is not a systematic review, includes only a few articles on this topic, does not assess the quality of the studies, focuses only on field tests, and does not treat the validity and reliability of the tests as its main research topic. Therefore, given that there is no comparable systematic review and the topic to be addressed is relevant, the main aim of this review is to analyze the existing scientific evidence on the validity and reliability of laboratory and field tests for measuring cardiorespiratory fitness in wheelchair users.
2. Materials and Methods
2.1. Search Strategy
This systematic review was conducted following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) recommendations [50]. Research articles were gathered using the PubMed, Scopus, SPORTDiscus, and Web of Science database platforms, representing databases from multiple health and physical activity disciplines. The literature search was conducted in October 2022 and repeated in July 2023 for recently published articles. The search strategy included the following keywords with the relevant Boolean operators inserted: (“disability” OR “physical impairment” OR “wheelchair” OR “cerebral palsy” OR “amputation”) AND (“cardio*” OR “aerobic capacity” OR “respiratory”) AND “test” AND (“reliability” OR “validity”). The protocol was registered and published in OSF (https://doi.org/10.17605/OSF.IO/G45BR, accessed on 1 February 2025).
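The Boolean structure of the search string can be made explicit with a short sketch. The term groups are copied from the strategy above; the function and variable names are illustrative assumptions, and each database may need its own field tags or wildcard syntax, which are not shown here.

```python
# Term groups copied from the search strategy described in the text.
# Names (POPULATION, FITNESS, PROPERTY, or_group, build_query) are hypothetical.
POPULATION = ["disability", "physical impairment", "wheelchair",
              "cerebral palsy", "amputation"]
FITNESS = ["cardio*", "aerobic capacity", "respiratory"]
PROPERTY = ["reliability", "validity"]


def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query():
    """Combine the OR-groups with AND; the single term "test" stands alone."""
    parts = [or_group(POPULATION), or_group(FITNESS), '"test"', or_group(PROPERTY)]
    return " AND ".join(parts)


if __name__ == "__main__":
    print(build_query())
```

Keeping each concept in its own list makes it easy to see that a record must match at least one term from every group to be retrieved.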
2.2. Eligibility Criteria
To be included in this systematic review, studies needed to meet the following criteria: (a) at least 50% of participants had to be regular wheelchair users, either for everyday life or sporting purposes; (b) the test under study had to measure or estimate cardiorespiratory fitness; (c) the article had to provide information on reliability or validity of the test; (d) studies had to be original articles published in peer-reviewed impact journals; and (e) full-text articles had to be written in English. No publication date limit was set. Failure to meet any of the criteria meant being excluded from the review. No cardiovascular cycling tests were included in this systematic review.
2.3. Screening
The selection process is presented in Figure 1. The search results were merged, and duplicate records of the same document were removed. This resulted in 1257 articles. Articles identified in the systematic search were initially checked for relevance by 2 independent researchers (the first and last authors). Articles were selected after sequentially reading the title and then the abstract. Subsequently, the researchers reviewed the full texts of potentially eligible articles. A third researcher (the second author) resolved reviewer disagreements regarding the study’s inclusion. The references of the articles that were fully read were consulted to identify possible additional studies. In the case of articles found in systematic reviews, they were considered for inclusion only when the full text was available.
2.4. Data Extraction
Data extraction was performed by 2 independent researchers (the first and last authors) and supported by a third researcher (the second author) when necessary (Figure 1). The authors critically analyzed the selected articles and extracted the data on participants (number, gender, age, disability, and characteristics), the tests studied and statistical analysis on reliability and validity.
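The review does not state at this point which reliability statistics were extracted. As an assumption, the sketch below uses two statistics that are commonly reported in test-retest studies: an intraclass correlation coefficient (here ICC(3,1), two-way mixed, consistency, single measurement) as relative reliability, and the standard error of measurement (taken as the square root of the residual mean square) as absolute reliability. The function name and the data are hypothetical and are not taken from any included study.

```python
from math import sqrt


def icc_consistency(scores):
    """Return (ICC(3,1), SEM) for a participants x trials table.

    `scores` is a list of rows, one per participant, each holding one
    value per trial (same number of trials in every row).
    """
    n = len(scores)            # participants
    k = len(scores[0])         # trials
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    icc = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    sem = sqrt(max(ms_error, 0.0))  # guard against tiny negative round-off
    return icc, sem


if __name__ == "__main__":
    # Hypothetical peak oxygen uptake values (test, retest) for five people.
    example = [[21.0, 22.1], [30.5, 29.8], [18.2, 18.9], [25.4, 26.0], [33.1, 32.4]]
    icc, sem = icc_consistency(example)
    print(f"ICC(3,1) = {icc:.3f}, SEM = {sem:.2f}")
```

Because this is the consistency form, a constant shift between trials (for example, everyone scoring slightly higher on the retest) does not lower the coefficient; an agreement-type ICC would penalize it.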
2.5. Quality Assessment
The methodological quality of the articles was evaluated using the COSMIN (Consensus-Based Standards for Selecting Health Measurement Instruments) checklist [51]. This methodology was used in previous reviews to assess the quality of studies on the measurement properties of different tests in other populations [29,47,48]. The checklist consists of 10 boxes that evaluate different measurement properties as “inadequate”, “doubtful”, “adequate” and “very good”. For this review, Boxes 6 (relative reliability), 7 (absolute reliability), 8 (criterion validity) and 9 (convergent validity) were rated if applicable. The overall score for each box is determined by the lowest score obtained on any of the items [29].
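The "lowest score counts" rule described above can be sketched in a few lines. The four rating labels come from the text; the function name and the example item ratings are illustrative assumptions, not actual COSMIN items.

```python
# Ratings ordered from worst to best, as listed in the text.
RATING_ORDER = ["inadequate", "doubtful", "adequate", "very good"]


def box_rating(item_ratings):
    """Return the overall rating of one box: the lowest rating of any item."""
    if not item_ratings:
        raise ValueError("a box needs at least one rated item")
    # list.index raises ValueError for a label that is not a known rating.
    return min(item_ratings, key=RATING_ORDER.index)


if __name__ == "__main__":
    # Hypothetical item ratings for one reliability box.
    print(box_rating(["very good", "adequate", "doubtful", "very good"]))
```

Under this rule a single weak item, such as a small sample, caps the rating of the whole box, which helps explain why so many reports were rated "inadequate" despite otherwise sound designs.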
2.6. Level of Evidence
Only studies with “doubtful” or better methodological quality were used for the evidence level analysis [29]. Evidence was considered “strong” if there was 1 study with “very good” quality or multiple studies with “adequate” quality with similar results; “moderate” if there was 1 study with “adequate” quality or multiple studies with “doubtful” quality with similar results; and “limited” if there was only 1 study with “doubtful” quality [29].
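The evidence rules above can be written as a small decision function. This is a minimal sketch: the function name is hypothetical, "similar results" is reduced to a boolean flag supplied by the caller, and cases the text does not cover (for example, several studies with conflicting results) return None rather than a guessed label.

```python
from collections import Counter


def evidence_level(study_ratings, similar_results=True):
    """Map the quality ratings of all studies on one test to an evidence level.

    Returns "strong", "moderate", "limited", or None when no usable study
    exists or the text gives no rule for the situation.
    """
    # Studies rated "inadequate" are excluded from the analysis.
    usable = Counter(r for r in study_ratings
                     if r in ("doubtful", "adequate", "very good"))
    if usable["very good"] >= 1:
        return "strong"
    if usable["adequate"] >= 2:
        return "strong" if similar_results else None
    if usable["adequate"] == 1:
        return "moderate"
    if usable["doubtful"] >= 2:
        return "moderate" if similar_results else None
    if usable["doubtful"] == 1:
        return "limited"
    return None


if __name__ == "__main__":
    print(evidence_level(["adequate"]))                # moderate
    print(evidence_level(["inadequate", "doubtful"]))  # limited
```

A single "adequate" study is enough for moderate evidence, so a moderate rating can rest on one article.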
3. Results
3.1. Quality of Included Studies
A total of 42 studies were included in the review (Table 1): 16 assessed both the reliability and validity of a test, 13 assessed only reliability and 13 assessed only validity; therefore, 29 studies assessed reliability and 29 assessed validity. The quality of reliability studies was rated as “inadequate” 24 times, “doubtful” 4 times and “adequate” once. The quality of validity studies was rated as “inadequate” 11 times, “doubtful” 14 times and “adequate” 4 times.
3.2. Characteristics of the Participants
Regarding gender, 22 of the 42 included studies were mixed, 19 included only men, and 1 included only women. The total number of participants was 1065 (236 females and 829 males). The studies’ sample sizes varied from 6 persons [18,20] to 102 persons [57], and the age range was 4–83 years. A total of 6 articles were focused on young people (4–18 years) [43,61,64,66,67,71], 1 article studied young and adult people together (14–46 years) [56], and the participants in the remaining 35 articles were adults (>18 years). Only three studies included people older than 60 years in their samples [17,69,73]. Concerning the disabilities of the participants, the samples in 15 papers were composed of people with spinal cord injuries (2 of them included people with poliomyelitis too), 5 articles were focused on people with cerebral palsy, 2 studies’ participants had unilateral amputations and 1 article studied people with osteogenesis imperfecta. The remaining 19 articles had people with different disabilities in the sample. In 18 studies, only wheelchair athletes were included.
3.3. Laboratory Tests
Twenty of the included studies evaluated laboratory tests. The total number of laboratory tests studied was 29. Thirteen protocols were performed in a wheelchair ergometer: eleven were incremental maximal tests (six by increasing the resistance, four by increasing the speed and one by increasing the cadence), and two were simulated races. A further six protocols were carried out with an arm-crank ergometer, of which four were incremental maximal tests, and two were submaximal tests. Another seven protocols were conducted by pushing a wheelchair on a treadmill, including six incremental maximal tests (two by increasing inclination and speed, three by increasing inclination only and one by increasing speed only) and one submaximal test. The last three protocols were incremental maximal tests performed on an arm–crank ergometer or recumbent stepper. All laboratory protocols are summarized in Table 2.
Among the 20 articles, 7 assessed both the reliability and validity of a test, 9 assessed only the reliability, and 4 assessed only the validity. A total of 16 reliability reports and 11 validity reports were registered. The quality of reliability reports was evaluated as “inadequate” in 13 cases and “doubtful” in 3 cases. The quality of validity reports was “inadequate” in six cases, “doubtful” in three cases and “adequate” in two cases. There were no “very good” reports. Considering the best quality articles, two tests showed a moderate evidence level for validity: the maximal wheelchair ergometer resistance test [23] and the 6-min arm test [17]. Both tests had limited evidence for reliability. No laboratory test showed moderate or strong evidence for reliability (Table 3).
3.4. Field Tests
Twenty-two of the included studies evaluated field tests. The total number of field tests studied was 18. A total of 12 protocols were maximal incremental wheelchair tests (8 performed in a straight line, 2 in a “figure of eight”, 1 in an octagon and 1 on a 400 m track), of which 3 were intermittent tests, and the rest were continuous. Another five tests involved covering as long a distance as possible in a limited time (5, 6 or 12 min). In two of them, the pushing cadence was set with a metronome; in the other three, the participants were free to perform as best they could. The remaining test was a submaximal incremental test. All the field protocols are summarized in Table 4.
Among the 22 articles, 9 assessed both the reliability and validity of a test, 4 assessed only the reliability and 9 assessed only the validity. A total of 13 reliability reports and 18 validity reports were registered. The quality of reliability reports was evaluated as “inadequate” in 11 cases, “doubtful” in 1 case and “adequate” in 1 case. The quality of validity reports was “inadequate” in 5 cases, “doubtful” in 11 cases and “adequate” in 2 cases. There were no “very good” reports. Considering the best quality articles, the shuttle wheelchair test [67] and the adapted Léger and Boucher test [58] showed a moderate evidence level for validity. Only one test showed a moderate evidence level for reliability: the 6-min push test [43] (Table 5).
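The report counts given for laboratory and field tests can be cross-checked with simple arithmetic. The sketch below copies the rating counts from the two subsections above and verifies that they add up to the stated numbers of reports; the dictionary and function names are illustrative assumptions.

```python
# Quality ratings per report, copied from the laboratory and field subsections.
LAB = {
    "reliability": {"inadequate": 13, "doubtful": 3, "adequate": 0},
    "validity": {"inadequate": 6, "doubtful": 3, "adequate": 2},
}
FIELD = {
    "reliability": {"inadequate": 11, "doubtful": 1, "adequate": 1},
    "validity": {"inadequate": 5, "doubtful": 11, "adequate": 2},
}


def report_totals(setting):
    """Number of reports per measurement property in one setting."""
    return {prop: sum(ratings.values()) for prop, ratings in setting.items()}


def combined_totals(*settings):
    """Number of reports per measurement property across several settings."""
    totals = {}
    for setting in settings:
        for prop, count in report_totals(setting).items():
            totals[prop] = totals.get(prop, 0) + count
    return totals


if __name__ == "__main__":
    print(report_totals(LAB))            # laboratory: 16 reliability, 11 validity
    print(report_totals(FIELD))          # field: 13 reliability, 18 validity
    print(combined_totals(LAB, FIELD))   # both settings: 29 and 29
```

The combined totals match the 29 reliability and 29 validity assessments reported for the 42 included studies.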
4. Discussion
The main aim of this review was to analyze the existing scientific evidence on the reliability and validity of laboratory and field tests for measuring cardiorespiratory fitness in wheelchair users. To the authors’ knowledge, this is the first systematic review to analyze the level of evidence of measurement properties of cardiopulmonary tests for wheelchair users. The validity and reliability of cardiorespiratory fitness assessments are of paramount importance when conducting laboratory and field tests. High validity ensures that measures reflect true fitness levels, while reliability guarantees consistent results across different testing conditions, enhancing the effectiveness of fitness evaluations [74]. The results showed that although several studies had measured the reliability and validity of cardiopulmonary tests for wheelchair users, there was no test with strong evidence. The main reason for this was the small sample size of the studies, which is very common when studying a particular population, because it is often difficult to gather many participants. Therefore, different studies should analyze the same tests, instead of trying to evaluate new ones, as most existing studies have positive results. Moreover, in line with Janssen et al. [75], there were too many tests (47 different tests in 42 articles) and a need for unification between them.
As is often the case in general physical activity studies [76], women were under-represented in the included studies: 236 women (22.2%) compared with 829 men (77.8%). Just over half of the studies were mixed, with 19 involving only men and 1 involving only women. The sample sizes varied widely, ranging between 6 and 102 persons. Regarding age, most of the investigations were conducted in adults; some studies focused on young people, and very few included people older than 60 years. Regarding disability type, most studies were carried out with samples of individuals with different impairments or people with spinal cord injuries; there were few studies on other disabilities or impairments, such as cerebral palsy, amputations or osteogenesis imperfecta. Another important point to consider is that 16 studies were conducted with an entirely athletic population. It is likely that, in some cases, tests with good reliability and validity values in a specific sample may not be suitable for other samples [24]. Moreover, given that wheelchair users are a population that can benefit significantly in terms of health from understanding aerobic fitness [1,2,3,4], more studies on validity and reliability in wheelchair users are needed, especially in women, in people under 18 and over 60 years of age, and in people with certain specific disabilities.
In relation to the laboratory tests, many different tests (29) were analyzed across the included studies (20). The only test studied in 2 different articles was the maximal wheelchair ergometer speed test [20,21]. Therefore, although most of the studies have shown positive results, the limited samples and lack of corroboration mean that there is currently a low level of evidence for determining whether one laboratory test is more appropriate than another. The highest level of evidence was obtained for the maximal wheelchair ergometer resistance test [17] and 6-min arm test [23], with both obtaining a moderate level of evidence for validity. However, both had limited evidence for reliability. This shows the need for further studies on the reliability and validity of laboratory tests for wheelchair users. Furthermore, it is unclear which type of laboratory test is the gold standard for this population, so more research should be carried out on this issue. For example, Bloemen et al. [64] recommended using a wheelchair ergometer over an arm ergometer because they obtained higher oxygen consumption and heart rate values. This is likely due to both the specificity of the movement and the involvement of more muscle mass. The results of Hartung et al. [44] suggested that when performing a wheelchair treadmill laboratory test, combining increases in both incline and speed was more appropriate, because higher peak values were obtained by increasing both of them than by increasing the incline or speed alone (in the speed protocol, they obtained the lowest peak values). Other studies [69,77] suggested that for certain individuals, such as unilateral amputees, it may be more appropriate to use a recumbent stepper, as this would involve more muscle mass, obtaining higher peak values, as in the aforementioned cases. These results highlight the possible need to use different laboratory tests depending on the type of population and measurement objectives. Also, there may not be a single gold-standard test.
Regarding the field tests, the three most analyzed tests appeared in three articles each: the shuttle wheelchair test [61,66,67], the adapted Léger and Boucher test [54,55,58] and the multistage octagonal field test [34,63,65]. All three tests were continuous maximal multistage wheelchair tests: the first over a 10 m straight, the second on a 400 m athletics track, and the third in a 15 × 15 m octagon. The 6-min push test (performed over a 10-m straight) was analyzed in two articles, and the remaining tests appeared in only one. Most field tests analyzed were maximal multistage wheelchair tests (12 tests), which increased speed until exhaustion. These tests can be intermittent or continuous, and can be performed on different routes (e.g., going around an athletics track, going back and forth in a straight line, turning around an octagon or performing a figure eight-shaped route). Having rests between bouts or changes in direction can affect the outcome of the test. De Groot et al. [33] found that the test they used with tennis players was more related to skills (such as turning and the ability to propel the chair correctly at high speeds) than cardiorespiratory fitness. They argued that the increase in speed is the primary limiting factor of the test, and that an increase in resistance, often seen in laboratory tests, is the most appropriate way to obtain maximal oxygen consumption values. This is in line with the aforementioned results of Hartung et al. [44], who obtained worse values in the laboratory test when they increased only the speed. The remaining six tests included a submaximal field test [52] and five time-limited tests, two of which required participants to maintain a constant cadence [39], whereas the other three had no cadence limitations [40,41,43,71]. Different routes were also used in this type of test, such as 200 m indoor tracks, basketball courts or 10 and 15 m straights, which could affect the result. 
Only two time-limited tests obtained positive results: the 6-min push tests, one over a 10 m straight [43,71] and the other over a 15 m straight [40]. This might suggest that 6 min is more appropriate than 12 min for this type of test, and that constraining the cadence may not be the most appropriate approach. This agrees with Christensen et al. [22], who reported that upper limb tests should last between 5 and 9 min, unlike lower limb tests. Regarding evidence levels, there were two field tests with moderate evidence for validity: the shuttle wheelchair test [67] and the adapted Léger and Boucher test [58]. On the other hand, there was only one field test with a moderate level of evidence for reliability: the 6-min push test over a 10 m straight [43]. Therefore, although good results were obtained in most of the articles, no field test has even moderate evidence for both reliability and validity in wheelchair users, so future studies should examine the psychometric properties of the above-mentioned tests in this population.
Study Limitations
Although this review was conducted with high standards of methodological and scientific rigor, it is not without limitations. Only articles written in English were included, so studies in other languages were left out of this review; in addition, only four databases were searched, so some studies may have been missed. Furthermore, the requirement that at least 50% of participants be wheelchair users meant that interesting studies with non-disabled people were not included in this review.
5. Conclusions
The main conclusion of this review is that most studies appear to obtain good levels of reliability and validity, but they are sparse and their samples are too small to draw solid conclusions, so more studies evaluating existing tests are needed to strengthen the level of evidence. Furthermore, the existence of many similar tests with different protocol variations makes it impossible to group the results and draw more consistent conclusions. This is why we encourage researchers to replicate studies that assess the validity and reliability of cardiopulmonary tests in wheelchair users in order to establish which laboratory and field tests are most appropriate for this population. In particular, there is a lack of studies on validity and reliability in wheelchair users who are women, children, adolescents, people over 60 years of age or people with certain specific disabilities.
References
- 1. Selph S.S., Skelly A.C., Wasson N., Dettori J.R., Brodt E.D., Ensrud E., Elliot D., Dissinger K.M., McDonagh M. Physical Activity and the Health of Wheelchair Users: A Systematic Review in Multiple Sclerosis, Cerebral Palsy, and Spinal Cord Injury. Arch. Phys. Med. Rehabil. 2021, 102, 2464–2481.e33. doi:10.1016/j.apmr.2021.10.002. PMID: 34653376.
- 2. Gauthier C., Arel J., Brosseau R., Hicks A.L., Gagnon D.H. Reliability and minimal detectable change of a new treadmill-based progressive workload incremental test to measure cardiorespiratory fitness in manual wheelchair users. J. Spinal Cord. Med. 2017, 40, 759–767. doi:10.1080/10790268.2017.1369213. PMID: 28903627. PMCID: PMC5778939.
- 3. Kressler J., Cowan R.E., Bigford G.E., Nash M.S. Reducing cardiometabolic disease in spinal cord injury. Phys. Med. Rehabil. Clin. N. Am. 2014, 25, 573–604. doi:10.1016/j.pmr.2014.04.006. PMID: 25064789.
- 4. Van der Scheer J.W., Martin Ginis K.A., Ditor D.S., Goosey-Tolfrey V.L., Hicks A.L., West C.R., Wolfe D.L. Effects of exercise on fitness and health of adults with spinal cord injury: A systematic review. Neurology 2017, 89, 736–745. doi:10.1212/WNL.0000000000004224. PMID: 28733344.
- 5. Al-Mallah M.H., Sakr S., Al-Qunaibet A. Cardiorespiratory Fitness and Cardiovascular Disease Prevention: An Update. Curr. Atheroscler. Rep. 2018, 20, 1. doi:10.1007/s11883-018-0711-4. PMID: 29340805.
- 6. Bermejo-Cantarero A., Álvarez-Bueno C., Martinez-Vizcaino V., García-Hermoso A., Torres-Costoso A.I., Sánchez-López M. Association between physical activity, sedentary behavior, and fitness with health related quality of life in healthy children and adolescents: A protocol for a systematic review and meta-analysis. Medicine 2017, 96, e6407. doi:10.1097/MD.0000000000006407. PMID: 28328839. PMCID: PMC5371476.
- 7. Kodama S., Saito K., Tanaka S., Maki M., Yachi Y., Asumi M., Sugawara A., Totsuka K., Shimano H., Ohashi Y. Cardiorespiratory fitness as a quantitative predictor of all-cause mortality and cardiovascular events in healthy men and women: A meta-analysis. JAMA 2009, 301, 2024–2035. doi:10.1001/jama.2009.681. PMID: 19454641.
- 8. Bouzas S., Molina A.J., Fernández-Villa T., Miller K., Sanchez-Lastra M.A., Ayán C. Effects of exercise on the physical fitness and functionality of people with amputations: Systematic review and meta-analysis. Disabil. Health J. 2021, 14, 100976. doi:10.1016/j.dhjo.2020.100976. PMID: 32819852.
