# A hybrid approach for real-time hand tracking using fiducial markers and inertial sensors

**Authors:** Ranjeet Bidwe, Shubhangi Deokar, Yash Parkhi, Tanisha Vyas, Nimita Jestin, Utkarsh Kumar, Satviki Budhia, Armaan Jeswani

PMC · DOI: 10.1016/j.mex.2025.103609 · 2025-09-08

## TL;DR

This paper introduces a hybrid hand-tracking system that combines fiducial markers, capacitive touch sensing, and an inertial measurement unit for accurate, real-time gesture recognition in immersive environments.

## Contribution

A novel hybrid hand-tracking system combining fiducial markers, capacitive touch, and inertial sensors for real-time gesture recognition.

## Key findings

- The system achieved 3.4 mm localization accuracy and 85–91% orientation accuracy.
- Capacitive sensing provided 96.1% accuracy in finger-state recognition.
- Wi-Fi and BLE played complementary roles: Wi-Fi delivered high throughput (∼467 msg/s), while BLE delivered low latency (∼50 ms) with >98% reliability.

## Abstract

This paper presents a cost-effective hybrid hand-tracking technique that integrates fiducial marker detection, capacitive touch sensing, and inertial measurement for real-time gesture recognition in immersive environments. The system is implemented on lightweight hardware comprising a Raspberry Pi Zero 2 W and an ESP32, with OpenCV’s ArUco marker detection enabling 3D hand pose estimation, capacitive sensors supporting finger-state recognition, and an Inertial Measurement Unit (IMU) providing orientation tracking. Optimizations such as exposure adjustment and region-of-interest processing ensure robust marker detection under variable illumination, while sensor data is transmitted via Bluetooth Low Energy (BLE) and WebSocket protocols for synchronization with external devices.
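The region-of-interest idea mentioned above can be illustrated with a minimal sketch. The abstract does not describe how the ROI is chosen, so this assumes one common strategy: pad the bounding box of the marker corners found in the previous frame, search only that crop in the next frame, and map the results back to full-frame coordinates. The function names, the `margin` parameter, and the frame size are hypothetical and are not taken from the paper; the marker detector itself (for example OpenCV's ArUco module) would run on the cropped region and is left out here.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Roi = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def roi_from_corners(corners: List[Point], frame_w: int, frame_h: int,
                     margin: float = 0.5) -> Roi:
    """Padded bounding box around the last known marker corners.

    `margin` (hypothetical) is the padding on each side, expressed as a
    fraction of the marker's bounding-box size. The result is clamped
    to the frame so it can be used directly as an array slice.
    """
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    pad_x = (max(xs) - min(xs)) * margin
    pad_y = (max(ys) - min(ys)) * margin
    x0 = max(0, math.floor(min(xs) - pad_x))
    y0 = max(0, math.floor(min(ys) - pad_y))
    x1 = min(frame_w, math.ceil(max(xs) + pad_x))
    y1 = min(frame_h, math.ceil(max(ys) + pad_y))
    return x0, y0, x1 - x0, y1 - y0


def to_frame_coords(corners_in_roi: List[Point], roi: Roi) -> List[Point]:
    """Shift corners detected inside the crop back to full-frame pixels."""
    x0, y0, _, _ = roi
    return [(x + x0, y + y0) for x, y in corners_in_roi]


# Usage: a 40 px marker seen in the previous 640x480 frame.
previous = [(100.0, 100.0), (140.0, 100.0), (140.0, 140.0), (100.0, 140.0)]
roi = roi_from_corners(previous, frame_w=640, frame_h=480)
# The next frame would be cropped as frame[y:y+h, x:x+w] before detection.
```

If the marker is not found inside the crop, a practical loop would fall back to a full-frame search; that fallback is also an assumption rather than something the abstract states.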

The methodological novelty of this work is highlighted as follows:

- High Accuracy Across Modalities: Achieved 3.4 mm localization accuracy, 85–91% orientation accuracy, and ∼2.9 mm hand pose keypoint accuracy, with trajectory fidelity maintained at 80–81%.
- Robust Finger-State Recognition: The capacitive sensing module consistently delivered 96.1% accuracy in detecting finger states across multiple runs.
- Validated Communication Trade-offs: Latency testing established complementary roles of Wi-Fi (high throughput, ∼467 msg/s) and BLE (low latency, ∼50 ms, >98% reliability) for real-time applications.
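To make the communication metrics concrete, the sketch below shows one way throughput (msg/s), mean latency (ms), and reliability (fraction of messages delivered) could be computed from send and receive logs. It is an illustration only and not the authors' test harness: the `LinkStats` and `summarize_link` names are hypothetical, and it assumes every message carries a sequence number and that sender and receiver timestamps share a clock (in practice a round-trip measurement would often be used instead).

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict


@dataclass
class LinkStats:
    throughput_msg_per_s: float
    mean_latency_ms: float
    reliability: float  # delivered / sent, in [0, 1]


def summarize_link(sent: Dict[int, float],
                   received: Dict[int, float]) -> LinkStats:
    """Summarize a link test from per-message timestamps.

    `sent` and `received` map a sequence number to a timestamp in
    seconds. Messages missing from `received` count as lost.
    """
    delivered = [seq for seq in sent if seq in received]
    if not delivered:
        return LinkStats(0.0, float("nan"), 0.0)
    latencies_ms = [(received[s] - sent[s]) * 1000.0 for s in delivered]
    # Time from the first send to the last successful receive.
    span = max(received[s] for s in delivered) - min(sent.values())
    throughput = len(delivered) / span if span > 0 else float("inf")
    return LinkStats(throughput, mean(latencies_ms),
                     len(delivered) / len(sent))


# Usage: four messages sent 100 ms apart, one of them lost.
sent_log = {0: 0.0, 1: 0.1, 2: 0.2, 3: 0.3}
recv_log = {0: 0.05, 1: 0.15, 3: 0.35}
stats = summarize_link(sent_log, recv_log)
print(stats)
```

Running the same summary over a Wi-Fi (WebSocket) log and a BLE log would give directly comparable figures for the trade-off described in the last bullet.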

By fusing multiple sensing modalities, the method delivers enhanced accuracy, responsiveness, and stability while minimizing computational overhead. The system provides a reproducible, modular, and scalable solution suitable for VR/AR interaction, assistive technology, education, and human–computer interaction.


## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12765117/full.md

---
Source: https://tomesphere.com/paper/PMC12765117