# Validation of the Transformer-Based Monocular System (Capture4D): A Real-Time Kinematic Analysis in Coaching/Teaching Tennis

**Authors:** Yue Zhao, Shuo Wang, Zan Gao, Haijun Wu, Yixiong Cui, Yuanlong Liu

PMC · DOI: 10.3390/s26051411 · Sensors (Basel, Switzerland) · 2026-02-24

## TL;DR

Capture4D is a cost-effective, real-time monocular motion capture system validated for tennis coaching. It is slightly less precise than optical motion capture (OMC), but its accuracy is acceptable for coaching and it is far more convenient to set up and use.

## Contribution

A novel Transformer-based monocular motion capture system, paired with a universal biomechanical analysis framework (K0-K5) covering twelve fundamental tennis stroke types and validated in detail on the tennis serve.

## Key findings

- Capture4D achieved an average NMPJPE of 69.5–88.3 mm for tennis serves, within the 70–130 mm range considered acceptable for coaching.
- The system reduced setup time by approximately 50% and costs by 80% compared with traditional optical motion capture (OMC).
- Joint angle trajectories from Capture4D were comparable to those from the gold-standard OMC system.

## Abstract

Human motion capture is crucial in many fields, but traditional optical motion capture (OMC) systems are costly and restrictive. Monocular video-based methods are more accessible, yet they face accuracy challenges, especially in dynamic sports such as tennis. This study validates Capture4D, a novel Transformer-based monocular system, for capturing a wide range of tennis strokes. We developed a universal biomechanical analysis framework (K0-K5) applicable to twelve fundamental stroke types. To demonstrate the system’s capabilities, this paper focuses on a detailed validation using the tennis serve as a representative example. We conducted experiments with 9 high-level tennis players, capturing motion data simultaneously with Capture4D (a single RGB camera) and a Qualisys OMC system (the gold standard). Accuracy was evaluated by comparing 3D joint coordinates and joint angles using Normalized Mean Per Joint Position Error (NMPJPE), root mean square error (RMSE), and mean absolute error (MAE). Capture4D effectively captured the players’ motion: the average NMPJPE for tennis serves ranged from 69.5 mm to 88.3 mm, within the acceptable range (70–130 mm) for coaching purposes. Joint angle trajectories were comparable to those from OMC, and Capture4D offered advantages in operational convenience, cost-effectiveness, and applicability, including an approximately 50% reduction in setup time and 80% cost savings. Capture4D is therefore a valid and practical monocular motion capture solution for tennis coaching and for broader applications in sports. Although slightly less precise than OMC, its accuracy is acceptable for many coaching and teaching use cases. Its convenience and low cost pave the way for accessible motion analysis in settings where OMC cannot be used, such as outdoor environments and multi-person scenarios. The technology holds promise for democratizing motion capture in sports training, coaching, and teaching.
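
The three accuracy metrics named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the commonly used definition of NMPJPE (root-center both poses, rescale the prediction by the least-squares optimal scale factor, then average the per-joint Euclidean distances), and the function names, the `root` parameter, and the sample values are hypothetical.

```python
import math

Point = tuple[float, float, float]


def _sub(a: Point, b: Point) -> Point:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def nmpjpe(pred: list[Point], ref: list[Point], root: int = 0) -> float:
    """Scale-normalized mean per-joint position error for one frame.

    Assumed definition (not confirmed by the paper): root-center both poses,
    scale the prediction to best fit the reference, then average distances.
    """
    # Root-center so that global translation does not count as error.
    p = [_sub(j, pred[root]) for j in pred]
    r = [_sub(j, ref[root]) for j in ref]
    # Least-squares scale factor mapping the prediction onto the reference.
    num = sum(a * b for pj, rj in zip(p, r) for a, b in zip(pj, rj))
    den = sum(a * a for pj in p for a in pj)
    s = num / den if den else 1.0
    scaled = [(s * x, s * y, s * z) for x, y, z in p]
    return sum(math.dist(pj, rj) for pj, rj in zip(scaled, r)) / len(r)


def rmse(a: list[float], b: list[float]) -> float:
    """Root mean square error between two joint-angle trajectories."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def mae(a: list[float], b: list[float]) -> float:
    """Mean absolute error between two joint-angle trajectories."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical data: a three-joint reference pose (mm) and a prediction
# that differs from it only by a global scale factor of 2.
ref_pose = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 0.0)]
pred_pose = [(0.0, 0.0, 0.0), (200.0, 0.0, 0.0), (0.0, 400.0, 0.0)]
pose_error = nmpjpe(pred_pose, ref_pose)

# Hypothetical joint-angle trajectories (degrees) from the two systems.
angles_mono = [10.0, 20.0, 30.0]
angles_omc = [13.0, 16.0, 30.0]
angle_rmse = rmse(angles_mono, angles_omc)
angle_mae = mae(angles_mono, angles_omc)
```

Because the scale factor is fitted before distances are measured, a prediction that is correct up to global scale scores zero; this matters for monocular systems, where absolute scale is ambiguous from a single camera.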

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12986830/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12986830/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/PMC12986830/full.md

---
Source: https://tomesphere.com/paper/PMC12986830