# Comparison of consumer-grade wearable devices with a research-grade instrument for measuring physical activity in a free-living setting

**Authors:** Takuya Miwa, Kazuma Mii, Ryouichi Chatani, Yasuo Sugitani

PMC · DOI: 10.1371/journal.pone.0342543 · PLOS One · 2026-02-23

## TL;DR

This study compares three consumer wearables (Apple Watch, Fitbit, and Oura Ring) with the research-grade ActiGraph for measuring step count, MVPA, and PAEE in office workers under free-living conditions.

## Contribution

The study provides empirical validation of consumer-grade wearables against a research-grade device in a free-living setting.

## Key findings

- Apple Watch and Oura Ring step counts were within 10% of ActiGraph measurements, while Fitbit overestimated by 18%.
- Fitbit showed minimal mean difference in MVPA but overestimated PAEE by 139%.
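- Correlations with ActiGraph were strong for step counts (r = 0.84–0.92) but lower for MVPA and PAEE; Bland–Altman analysis showed proportional bias in the Fitbit's PAEE and the Apple Watch's MVPA, with errors increasing at higher activity levels.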

## Abstract

Wearable accelerometer devices are now widely used in both research and daily life settings. This study aimed to compare the accuracy of three commercially available consumer-grade activity monitors with the medical-grade ActiGraph device in a free-living setting in Japan.

Thirty-six office workers were enrolled and provided with an ActiGraph. Data were analyzed from participants who also wore Apple Watch (n = 21), Fitbit (n = 22), and Oura Ring (n = 5) over a 3-week period. Step count, physical activity energy expenditure (PAEE), and moderate-to-vigorous physical activity (MVPA) data were collected from all devices. Data were analyzed using correlation coefficients, mean differences, and Bland–Altman plots.
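The three analysis methods named above can be illustrated with a minimal sketch. It assumes paired daily values from a consumer device and the reference instrument, and it uses the conventional Bland–Altman limits of agreement (bias ± 1.96 SD of the differences). The function names and the sample numbers are hypothetical and are not taken from the paper, whose exact computation is not described in this summary.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def bland_altman(device, reference):
    """Return (bias, lower limit, upper limit) of agreement.

    bias is the mean of (device - reference); the limits are
    bias -/+ 1.96 * sample SD of the differences.
    """
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical daily step counts (illustrative only, not study data).
reference_steps = [4000, 6000, 8000, 10000]
device_steps = [4200, 6100, 8300, 10400]

r = pearson_r(device_steps, reference_steps)
bias, lower, upper = bland_altman(device_steps, reference_steps)
print(f"r = {r:.3f}, bias = {bias:.1f}, limits = [{lower:.1f}, {upper:.1f}]")
```

A high correlation alone does not imply agreement: a device can track the reference closely while reading consistently high, which is why the mean difference and the limits of agreement are reported alongside r.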

ActiGraph data confirmed comparable physical activity levels across the participant subgroups, ensuring a valid basis for the subsequent inter-device comparisons. Step counts were largely consistent across devices, with Apple Watch and Oura Ring measurements within 10% of ActiGraph measurements (mean percentage differences 2.12% and −6.24%, respectively), while the Fitbit overestimated step count by 18.00%. MVPA showed greater variability, with Apple Watch and Oura Ring underestimating by 46.22% and 11.64% respectively, whereas the Fitbit showed minimal mean difference (0.62%). PAEE showed the largest discrepancies, with Apple Watch and Fitbit overestimating by 25.91% and 139.19% respectively, and Oura Ring underestimating by 16.87%. Correlation coefficients were strong for step counts (r = 0.84–0.92) but lower for MVPA and PAEE across all devices. Bland–Altman analysis revealed proportional bias in the Fitbit’s PAEE and the Apple Watch’s MVPA, with errors increasing at higher activity levels.
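As a rough illustration of the two quantities reported above, the sketch below computes a mean percentage difference and a simple proportional-bias check. Both definitions are assumptions: the percentage difference is taken per observation as 100 × (device − reference) / reference and then averaged, and proportional bias is approximated by the least-squares slope of the differences regressed on the pair means (a slope near zero suggests the error does not grow with activity level). The paper may define or test these differently, and the sample values are hypothetical.

```python
from statistics import mean


def mean_percentage_difference(device, reference):
    """Average of per-observation percentage differences (assumed definition)."""
    return mean(100.0 * (d - r) / r for d, r in zip(device, reference))


def proportional_bias_slope(device, reference):
    """Least-squares slope of (device - reference) on the pair means.

    A clearly non-zero slope indicates that the disagreement changes
    with the magnitude of the measurement (proportional bias).
    """
    avgs = [(d + r) / 2 for d, r in zip(device, reference)]
    diffs = [d - r for d, r in zip(device, reference)]
    ma, md = mean(avgs), mean(diffs)
    sxx = sum((a - ma) ** 2 for a in avgs)
    sxy = sum((a - ma) * (d - md) for a, d in zip(avgs, diffs))
    return sxy / sxx


# Hypothetical daily energy expenditure values in kcal (not study data).
reference_paee = [300.0, 450.0, 600.0, 750.0]
device_paee = [330.0, 520.0, 720.0, 930.0]

print(mean_percentage_difference(device_paee, reference_paee))
print(proportional_bias_slope(device_paee, reference_paee))
```

In this made-up example the device's overestimate widens as the reference value rises, so the slope is positive, which is the pattern the abstract describes as errors increasing at higher activity levels.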

Step counts were largely consistent with the ActiGraph for most devices, although the Fitbit showed a notable overestimation. The ability of these devices to accurately measure MVPA and PAEE appeared to be more limited, particularly at higher activity levels. These findings underscore that the selection of a consumer-grade wearable for research or clinical use must be carefully guided by the specific metric of interest. The findings for the Oura Ring should be interpreted with caution because of the small sample size.

## Full-text entities

- **Diseases:** dyslipidemia (MESH:D050171), rashes (MESH:D005076), hypertension (MESH:D006973), COVID-19 (MESH:D000086382)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12928483/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12928483/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/PMC12928483/full.md

---
Source: https://tomesphere.com/paper/PMC12928483