Validity of the Quarq Cycling Power Meter

Jon Oteo-Gorostidi; Jesús Camara; Diego Ojanguren-Rodríguez; Jon Iriberri; Iván Vadillo-Ventura; Almudena Montalvo-Pérez

PMC · DOI: 10.3390/s25092717 · April 25, 2025

Open Access

TL;DR

This study confirms that the Quarq D-Zero power meter accurately measures cycling power output compared to other validated devices.

Contribution

The study validates the Quarq D-Zero against two previously validated power meters, addressing the absence of any direct comparison with the SRM gold standard.

Findings

1. The Quarq power meter showed strong correlations and low variability when compared to the Favero Assioma Duo.

2. Significant differences were observed between the Quarq and the Hammer Saris H3 across all intervals except the all-out sprints.

3. The Quarq power meter is valid for measuring cycling power output in both seated and standing positions.

Abstract

Technological advancements have led to the development of various devices designed to monitor training loads and athletic performance. Power meters, particularly in cycling, allow for the precise quantification of power output, which is crucial for managing training loads and evaluating performance improvements. This study evaluates the validity of the Quarq D-Zero power meter for measuring cycling power output by comparing it with two previously validated devices—the Favero Assioma Duo (FAD) and the Hammer Saris H3 (H3)—noting that, although it shares the same measurement location as the SRM (the gold standard), it has not been directly validated against it. Thirty-one trained male cyclists participated in this study, undergoing tests across various power outputs (100–500 W) and three 10-s sprint efforts. The protocol incorporated different cadences (70, 85, and 100 revolutions per…

Keywords

power output; laboratory testing; pedaling position; cadence

Taxonomy

Topics: Sports Performance and Training · Cardiovascular and exercise physiology · Children's Physical and Motor Development

Full text

1. Introduction

In Sports Science, there has been a continuous increase in the implementation of technological resources aimed at controlling, monitoring, and precisely quantifying athlete performance and the associated training loads and competitive demands. These tools are fundamental for optimizing sports performance and systematically planning training loads. The range of available technologies is extensive, and their applicability varies according to the specific characteristics inherent to each sports discipline.

In endurance-dominant sports, performance is primarily influenced by the athlete’s ability to sustain maximum power output over the competition-specific distance and by the energy cost associated with maintaining a given race speed [1]. The widespread adoption of power meters in cycling has transformed performance analysis, because power output serves as a precise indicator of exercise intensity and performance. This is attributed to its immediate responsiveness to changes in exercise intensity and to the fact that it can be quantified even at supra-maximal intensities [2]. These devices allow for precise quantification of cyclists’ power output while also providing supplementary data such as cadence, torque, and other critical performance parameters [3]. Over the years, the variety of power meters has expanded significantly, and the market now offers a growing number of devices that incorporate various technologies from multiple manufacturers [4]. This growth has reduced costs, thereby making power meters accessible to a larger population of cyclists [5].

The primary function of these devices is to measure an athlete’s power output throughout the pedaling cycle, providing an objective quantification of exercise intensity. Notably, portable power meters can be integrated into several mechanical components of bicycles, including pedals, cranks, chainrings, rear hubs, or the spider of the bottom bracket axle [5]. However, all sensors and technological tools introduced to the market face a critical validation challenge. These devices must demonstrate reliability; therefore, they undergo a validation process involving comparison with a “gold standard” to assess measurement accuracy and repeatability [6]. Many power meters currently in use have undergone comparative validation against the SRM, widely recognized as the gold standard in this field [4].

The Quarq D-Zero is a power meter positioned between the bottom bracket axle and the crankset of the bicycle [7]. Since it shares the same location as the SRM, its validity has not been directly tested against this device. To our knowledge, no validation study of the Quarq D-Zero has been published to date. However, previous research has evaluated the performance of another model from the same manufacturer, the Quarq Quatro [8]. Therefore, this study aims to assess whether the Quarq D-Zero power meter is a valid and reliable tool for measuring power output in cycling.

2. Materials and Methods

2.1. Experimental Approach to the Problem

A cross-sectional descriptive study was conducted, with recruitment and data collection taking place between February and March 2024. All tests were performed under standardized temperature and humidity conditions (a temperature of 20.3 ± 0.9 °C and a humidity of 28.2 ± 1.3%). Before participation, all subjects received both verbal and written information about the study procedures and provided their written informed consent. Participation in the study was entirely voluntary. The research adhered to the principles of the Declaration of Helsinki and received ethical approval from the Research Commission of the European University of Madrid (CI 2024-524).

2.2. Participants

A total of 31 well-trained male participants (age: 33 ± 12.19 years; weight: 70 ± 7.42 kg; height: 1.79 ± 0.06 m) were selected for the study, all of whom engaged in regular and structured training, averaging 11.5 ± 3.9 h per week. Inclusion criteria required participants to be in good physical health, with no known injuries or medical conditions that could interfere with high-intensity cycling performance. Additionally, participants had to be capable of sustaining a power output of 500 W while seated for a continuous duration of one minute.

2.3. Procedures

For this study, a bicycle was utilized that could be adjusted to accommodate the measurements of each participating athlete. Both saddle height and frame length were modified to optimize comfort. Three different power meters were affixed to the bicycle for data collection. The first, a Quarq Force D-Zero power meter (Quarq) (SRAM, Chicago, IL, USA), was positioned between the bottom bracket axle and the 170 mm crankset. The second, the Favero Assioma Duo pedal-based system (FAD) (Favero Electronics, Srl., Arcade, TV, Italy) [9,10], measured power at both pedals. The third, the Hammer Saris H3 (H3) (CycleOps, Madison, WI, USA) [2], functioned as a cycle-ergometer. Prior to each test, all three devices were calibrated following the manufacturers’ instructions. First, with the crank arm in a vertical position, the Favero and Quarq devices were calibrated. Then, the Saris was calibrated by pedaling until reaching a specific speed and then stopping pedaling, allowing the roller to decelerate on its own. Data from each power meter were collected simultaneously at a frequency of 1 Hz using three Garmin 530 cycle computers (Garmin International Inc., Olathe, KS, USA), which recorded power output in watts (W). Each device (Quarq, FAD, H3) recorded data independently via ANT+, and the recordings were later synchronized post hoc using timestamp alignment on a per-second basis, enabling consistent comparison across power meters. To avoid any performance inconsistencies related to battery status, all devices were fully recharged before each testing session. Participants used their own cycling shoes, equipped with Look brand cleats.
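
The per-second timestamp alignment described above can be illustrated with a minimal sketch. The paper does not publish its synchronization code, so the function name `align_by_timestamp`, the data layout (lists of `(second, watts)` pairs), and the sample values are all hypothetical. The sketch assumes that only seconds logged by every device are kept.

```python
from typing import Dict, List, Tuple

# Each recording is a list of (timestamp_in_seconds, watts) samples logged at 1 Hz.
Recording = List[Tuple[int, float]]


def align_by_timestamp(
    recordings: Dict[str, Recording],
) -> List[Tuple[int, Dict[str, float]]]:
    """Keep only the seconds that every device logged, in time order."""
    lookups = {name: dict(samples) for name, samples in recordings.items()}
    common = set.intersection(*(set(lookup) for lookup in lookups.values()))
    return [
        (t, {name: lookup[t] for name, lookup in lookups.items()})
        for t in sorted(common)
    ]


# Hypothetical sample data: the three head units start and stop at slightly
# different seconds, so only seconds 101 and 102 are shared by all of them.
recordings = {
    "quarq": [(100, 201.0), (101, 203.0), (102, 199.0)],
    "fad": [(101, 205.0), (102, 202.0), (103, 204.0)],
    "h3": [(100, 190.0), (101, 192.0), (102, 188.0)],
}
aligned = align_by_timestamp(recordings)
```

Dropping unshared seconds, rather than interpolating them, keeps every compared value an actual recorded sample. Whether the authors did the same is not stated.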

To validate the Quarq power meter, a protocol similar to that of Montalvo-Pérez et al. (2021) [9] was implemented. This protocol incorporated varying intensities, pedaling cadences, and cycling positions. It began with a 5-min warm-up at 100 W, allowing athletes to select their preferred cadence. Following the warm-up, three identical work blocks were conducted. Each block consisted of six incremental power stages, starting at a submaximal workload of 100 W and increasing to a maximum of 350 W in 50 W increments. The designated cadences (70, 85, and 100 revolutions per minute) were randomized for each cyclist. The first three work intervals (block 1) were performed in a seated position, each lasting 75 s, with 5-min recovery periods at a self-selected cadence and an intensity of 75 W. This was followed by a high-intensity bout (block 2) at 500 W for 75 s, marking the peak intensity of the test. Afterward, three standing intervals of equal duration (block 3) were executed, with power outputs of 250 W for the first, 350 W for the second, and 450 W for the third, interspersed with 2-min recovery periods at 75 W. Following a 5-min rest, the protocol concluded with an all-out sprint phase (block 4), consisting of three maximal 10-s sprints in a seated position, each separated by 2-min recovery periods at 75 W (Figure 1).

Mean power data were collected from all blocks. Of the 75 s recorded for each interval in the initial three blocks, only 60 s of power and cadence data were analyzed. The initial 10 s and the final 5 s were excluded to ensure the cyclists had sufficient time to stabilize and maintain each designated workload [11]. In the fourth block, which consisted of three 10-s sprints, both the mean power over the 10-s duration and the 1-s power recorded within the first 5 s of the sprint were analyzed [12,13]. To preserve raw signal fidelity, no on-device smoothing or filtering was applied. During post-processing, a 3-s rolling average was implemented to minimize transient noise.
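
As an illustration of the windowing and smoothing steps described above, the following sketch trims a 75-sample (1 Hz) interval to its central 60 s and applies a 3-s rolling average. The function names and sample values are hypothetical. The paper does not say whether the rolling window was trailing or centered, so the trailing form used here is an assumption.

```python
from statistics import mean
from typing import List


def steady_state_mean(
    samples: List[float], skip_start: int = 10, skip_end: int = 5
) -> float:
    """Mean power after dropping the first and last seconds of an interval."""
    if len(samples) <= skip_start + skip_end:
        raise ValueError("interval is too short for the requested trimming")
    return mean(samples[skip_start : len(samples) - skip_end])


def rolling_mean(samples: List[float], width: int = 3) -> List[float]:
    """Trailing rolling mean (assumption: trailing, not centered).

    The first width - 1 outputs average only the samples available so far.
    """
    smoothed = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - width + 1) : i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical 75-s interval: 10 s of ramp-up, 60 s on target, 5 s of overshoot.
interval = [0.0] * 10 + [250.0] * 60 + [400.0] * 5
interval_mean = steady_state_mean(interval)

# Hypothetical sprint samples smoothed with a 3-s trailing window.
smoothed = rolling_mean([300.0, 330.0, 360.0, 390.0])
```

With these inputs, the trimmed window contains only the 60 on-target samples, so the ramp-up and overshoot do not affect the interval mean.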

2.4. Data Analysis

Data are presented as mean ± SD. The relationship and level of agreement between power meters were analyzed with Pearson correlation coefficients (r), intraclass correlation coefficients (ICC) (2,1), and standard errors of measurement. The typical error of measurement (TEM) was calculated as the standard deviation of the measurement differences divided by the square root of 2 [14]. The coefficient of variation (CV) was also determined. Values of r of 0.1, 0.3, 0.5, 0.7, and 0.9 were considered small, moderate, strong, very strong, and extremely strong, respectively [15]. ICC values less than 0.5, between 0.5 and 0.75, between 0.75 and 0.9, and greater than 0.90 were considered poor, moderate, good, and excellent reliability, respectively [16]. In addition, a two-way repeated-measures ANOVA (power meter [Quarq vs. FAD vs. H3] by power) was conducted to assess differences between the power meters across each cyclist’s pedaling position (sitting/standing) and cadence. To minimize type I error, post hoc Bonferroni tests were only applied when a significant power interaction was found. Effect sizes (ES) were expressed as partial eta squared (ηp²), with reference values of 0.01 (small), 0.06 (medium), and 0.14 (large). Agreement between power meters was also determined using the Bland–Altman method. The mean difference (bias) and the 95% limits of agreement (bias ± 1.96 × SD of the differences) were calculated. Results were graphically examined through Bland–Altman plots, where the differences were plotted against their mean values. The Breusch–Pagan test was performed to assess heteroscedasticity. Statistical analyses were performed using custom-written software (Python v3.12.7; Python Software Foundation) with an α of 0.05.
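
The agreement statistics named above can be sketched with the standard library alone. This is not the authors' custom software: the function name `agreement_stats` and the sample values are hypothetical. The bias and limits of agreement follow the standard Bland–Altman definition (mean difference ± 1.96 × SD of the differences), and the TEM follows the definition in the text. Expressing the CV as the TEM relative to the grand mean is an assumption, since the paper does not give its CV formula.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Dict, Sequence


def agreement_stats(a: Sequence[float], b: Sequence[float]) -> Dict[str, float]:
    """Bland-Altman bias and limits of agreement, typical error, and CV%."""
    if len(a) != len(b):
        raise ValueError("both devices need the same number of paired samples")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD of the paired differences
    tem = sd_diff / sqrt(2)  # typical error of measurement, as defined in the text
    grand_mean = mean(list(a) + list(b))
    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd_diff,
        "loa_upper": bias + 1.96 * sd_diff,
        "tem": tem,
        # Assumption: CV% expressed as the TEM relative to the grand mean.
        "cv_pct": 100.0 * tem / grand_mean,
    }


# Hypothetical paired interval means (W) from two power meters.
device_a = [100.0, 150.0, 200.0, 250.0]
device_b = [98.0, 149.0, 197.0, 248.0]
stats = agreement_stats(device_a, device_b)
```

For these hypothetical values the paired differences are 2, 1, 3, and 2 W, so the bias is 2 W and the limits of agreement are symmetric around it.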

3. Results

Significant differences (p < 0.05) were found between the power measurements of the three power meters, except during all-out sprints. Specifically, in the pairwise comparison between the Quarq and the H3, significant differences in power output were observed across all intervals except for the all-out sprints. On the other hand, the Quarq and the FAD showed significant differences in the 500 W interval. Effect sizes were large when comparing the three power meters, except for the all-out sprints, which exhibited a medium effect size. When analyzing different power intervals across the power meters, effect sizes were large at 70 and 85 rev·min⁻¹, medium at 100 rev·min⁻¹, and small when pedaling standing or during maximal sprints. Pearson and intraclass correlations between the FAD and the Quarq indicated a strong relationship (r > 0.883) and good reliability (ICC > 0.879), while the values, when comparing them to the H3, were considerably lower, except for the sprints, maximum power, and the 500 W interval (r > 0.843, ICC > 0.659) (Table 1).

When analyzing all power data together, Pearson and intraclass correlations were extremely strong in both cases (r = 0.999 and 0.993; ICC = 0.999 and 0.986) (Figure 2). A repeated measures ANOVA revealed significant differences between the power measurements from the three power meters (p < 0.001) with an effect size of ηp² = 0.364.

The coefficient of variation (CV%) ranged from 0.62% to 4.89% for the power values recorded by the Quarq and the FAD, with higher values observed at 100 W during the first block and in sprints. For the Quarq and the H3, very high CV% values were noted at 100 W and during sprints or at maximal power, while the values were below 2% for the remaining intervals (Table 2).

The Bland–Altman analysis showed no systematic bias between the FAD and the Quarq across intensities, and both devices exhibited similar values at all power levels (100 W to 1400 W), with a mean bias of 3.20 W (Figure 3a). As power levels increased, the limits of agreement (LoA) widened, indicating greater differences between devices, while the random bias also showed a slight increase. Specifically, the bias between the power values of the FAD and the Quarq during the first block (100 W to 500 W) was low: 1.7 W (LoA: −3.17 to 6.59) at 70 rev·min⁻¹, 1.9 W (LoA: −4.81 to 8.66) at 85 rev·min⁻¹, and 3.2 W (LoA: −3.79 to 10.17) at 100 rev·min⁻¹. Blocks 2 (bias: 4.2 W, LoA: −5.41 to 13.86), 3 (bias: 5.37 W, LoA: −9.71 to 20.44), and 4 (bias: 3.3 W, LoA: −47.96 to 54.67) showed similar results. During the all-out 10-s sprint (fourth block), the bias was 11.9 W (LoA: −46.55 to 70.36), representing a 0.14% difference.
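
As a worked check of how the reported bias and limits of agreement relate, the arithmetic below uses the 70 rev·min⁻¹ figures from the first block. It relies only on the standard Bland–Altman definition, in which the limits are symmetric around the bias. The implied standard deviation of the differences is derived here and is not a value reported by the authors.

```latex
\begin{align*}
\text{LoA} &= \bar{d} \pm 1.96\, s_d \\
\bar{d} &= \frac{-3.17 + 6.59}{2} = 1.71 \approx 1.7\ \text{W} \\
s_d &= \frac{6.59 - (-3.17)}{2 \times 1.96} = \frac{9.76}{3.92} \approx 2.49\ \text{W}
\end{align*}
```

The same midpoint check can be applied to any bias and LoA pair in this section to confirm that the three numbers are mutually consistent.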

On the other hand, the Bland–Altman analysis revealed a higher bias between the Quarq and the H3 across all intensities (Figure 3b), indicating less consistent agreement than between the FAD and the Quarq. In the first block, the bias was 13.6 W (LoA: −3.68 to 30.90) at 70 rev·min⁻¹, 20.5 W (LoA: 4.23 to 36.37) at 85 rev·min⁻¹, and 27.1 W (LoA: 8.30 to 45.98) at 100 rev·min⁻¹. In blocks 2, 3, and 4, the bias values were 39.1 W (LoA: −2.82 to 81.10), 21.8 W (LoA: −2.52 to 45.94), and 76.7 W (LoA: −2.31 to 155.84), respectively. The bias for peak PO was particularly high, reaching 179 W (LoA: −13.55 to 371.48), highlighting substantial measurement discrepancies at maximal effort levels.

4. Discussion

The main objective of this study was to assess the reliability and validity of the Quarq power meter in measuring power output across different intensities and cadences. Results indicate that the Quarq power meter provides consistent and reliable measurements across various intensities and cadences. This study represents the first validation of the Quarq’s accuracy, noting challenges in comparing it to the SRM power meter, considered the gold standard, due to their shared mounting location on the bike (spider). In contrast, previous studies have typically validated power meters using devices located in different positions, such as rear hub-based systems [17] or ergometers [18], to avoid this potential source of measurement redundancy.

Post hoc analyses using the Bonferroni method confirmed that power measurements from the Quarq and the FAD were similar, with no significant differences except for certain high-intensity intervals. These findings support the reliability and accuracy of the Quarq’s measurements. Previous validation studies [9,10] yield consistent results between the FAD and the SRM, reinforcing the conclusions of this study. However, significant differences were observed between the power outputs of the H3 and the Quarq, with the former showing lower values. During the validation of this device [2], significant differences were found at certain power levels and cadences compared to the SRM.

The observed bias between power meters was minimal, with a reduced bias (<4 W) for power stages up to 350 W across different cadences. However, during all-out sprints, the FAD–Quarq bias increased slightly (<7 W), with the FAD measuring higher values. This small bias, particularly during maximum efforts, may be due to differences in device placement and in mechanical power transmission (e.g., crank deformation and pedal-crank interface losses). Similar results have been reported in previous studies [19] comparing the FAD and the SRM. Consistent with the findings of Fremeaux et al. [20], the H3 recorded higher power values compared to power meters positioned in the spider.

The study observed that the FAD tends to record higher power values due to its placement. Previous studies [21,22,23] comparing power meters (e.g., PowerTap, Stages, Garmin Vector) with the SRM found that pedal-mounted devices (Garmin Vector) and crank-mounted devices (Stages) typically reported higher values, while hub-mounted ones (PowerTap G3) recorded lower values. These findings align with the present study, where the FAD showed slightly higher power than the Quarq, with a similar coefficient of variation (CV) of 2.4% [22]. This discrepancy may be partly attributed to crank deformation, particularly in carbon cranks, which could explain the lower power recorded by the Quarq. However, another study [24] reported that the SRM measured higher power than the pedal-mounted PowerTap P1. Additionally, other research [10] noted that the SRM tends to yield higher readings than the FAD, potentially due to differences in strain gauge sensitivity or signal processing.

Regarding the CV for FAD–Quarq, values remained low (<4%) across most tests, with no cases exceeding 5%. A previous study [9] reported similarly low CVs (<2.82%), except during sprints. Yeh et al. (2022) [19] also found mean biases of 3.6%, aligning with the present findings. According to Hopkins [25], CVs below 5% are generally acceptable in sports science. In this study, the overall CV between the FAD and the Quarq was 3.6%, with a bias of 3.1 W ± 11.6 W, indicating strong precision and consistency. The importance of using reliable devices to detect small performance changes has been emphasized by Hopkins et al. [14]. Despite the stricter criterion for elite performance analysis (CV below 2%) [26], the Quarq still provides valid and reliable data. These findings suggest that users can rely on the consistency of their daily training measurements and that changes in power output reflect real performance variations. Furthermore, the precision values for 54 power meters (including pedal-based, crank arm, spider, and wheel hub models) reported in Maier et al.’s study [27] align with our results, showing CV values below 2%. Although the tests were conducted in a controlled indoor environment, the inclusion of both steady-state and high-intensity sprint efforts (up to 1400 W) ensured a broad representation of typical cycling demands [28,29], supporting ecological validity.

The ICC analysis showed excellent reliability (ICC > 0.90) for most tests, further supporting the Quarq’s validity in power measurement and analysis compared with the FAD. Similar to previous validation studies of power meters installed in pedals and spiders [9,10,22], this study also observed increasing typical errors proportional to average power, indicating greater precision at lower power levels compared to maximum power efforts (e.g., all-out and Pmax).

5. Conclusions

This study is the first validation of the Quarq D-Zero power meter. Comparing it with the gold standard (SRM) is challenging due to their shared placement on the bicycle (the crank spider). The analysis conducted in this study demonstrates that the Quarq power meter is a reliable and valid device for measuring cycling power, even under varying intensities, cadences, and cycling positions. The results indicate that the Quarq provides power measurements that are consistent with and comparable to those of other devices, such as the FAD, particularly in intervals up to 350 W and maximum efforts. These findings establish the Quarq as a valuable tool for athletes and coaches aiming to monitor and optimize performance in training and competition, offering a robust and accurate alternative to existing reference measurement devices.

Bibliography

1. Mujika I. Quantification of training and competition loads in endurance sports: Methods and applications. Int. J. Sports Physiol. Perform. 2017, 12, S2-9–S2-17. doi:10.1123/ijspp.2016-0403. PMID: 27918666.
2. Lillo-Bevia J.R., Pallarés J.G. Validity and reliability of the cycleops hammer cycle ergometer. Int. J. Sports Physiol. Perform. 2018, 13, 853–859. doi:10.1123/ijspp.2017-0403. PMID: 29182415.
3. Sitko S., Cirer-Sastre R., Corbi F., López-Laval I. Power assessment in road cycling: A narrative review. Sustainability 2020, 12, 5216. doi:10.3390/su12125216.
4. Bouillod A., Soto-Romero G., Grappe F., Bertucci W., Brunet E., Cassirame J. Caveats and recommendations to assess the validity and reliability of cycling power meters: A systematic scoping review. Sensors 2022, 22, 386. doi:10.3390/s22010386. PMID: 35009945. PMCID: PMC8749704.
5. Valenzuela P.L., Montalvo-Perez A., Alejo L.B., Castellanos M., Gil-Cabrera J., Talavera E., Lucia A., Barranco-Gil D. Are unilateral devices valid for power output determination in cycling? Insights from the Favero Assioma power meter. Int. J. Sports Physiol. Perform. 2022, 17, 484–488. doi:10.1123/ijspp.2021-0278. PMID: 34969007.
6. Linnamo V. Sensor Technology for Sports Monitoring. Sensors 2023, 23, 572. doi:10.3390/s23020572. PMID: 36679367. PMCID: PMC9866738.
7. Quarq. Available online: https://www.sram.com/en/quarq (accessed on 28 October 2024).
8. Miller M.C., Macdermid P.W., Fink P.W., Stannard S.R. Agreement between powertap, quarq and stages power meters for cross-country mountain biking. Sports Technol. 2015, 8, 44–50. doi:10.1080/19346182.2015.1108979.