Duo Streamers: A Streaming Gesture Recognition Framework

Boxuan Zhu; Sicheng Yang; Zhuo Wang; Haining Liang; Junxiao Shen

arXiv:2502.12297·cs.CV·February 26, 2025

TL;DR

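Duo Streamers is a lightweight, real-time streaming gesture recognition framework for resource-constrained devices. It matches mainstream methods on accuracy metrics while cutting the real-time factor by approximately 92.3% and shrinking parameter counts to 1/38 (idle state) and 1/9 (busy state) of mainstream models.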

Contribution

The paper introduces a three-stage sparse recognition mechanism and an RNN-lite model with an external hidden state, together with specialized training and post-processing pipelines, for efficient streaming gesture recognition.
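To make the phrase "external hidden state" concrete, here is a minimal sketch of one common reading of it: the recurrent step is a pure function, and the caller owns the hidden state and passes it in with each incoming frame. This is not the authors' RNN-lite model. The function names, the plain tanh cell, and the toy weights are all assumptions chosen for illustration; the official repository should be consulted for the real architecture.

```python
import math


def rnn_step(x, h, w_in, w_rec, bias):
    """One stateless recurrent step (hypothetical helper, not the paper's API).

    x: input features for one frame, h: hidden state owned by the caller.
    Returns the new hidden state; nothing is stored inside the function.
    """
    new_h = []
    for j in range(len(h)):
        acc = bias[j]
        acc += sum(w_in[j][i] * x[i] for i in range(len(x)))
        acc += sum(w_rec[j][k] * h[k] for k in range(len(h)))
        new_h.append(math.tanh(acc))
    return new_h


def run_stream(frames, hidden_size, w_in, w_rec, bias):
    """Process frames one at a time, carrying the hidden state externally."""
    h = [0.0] * hidden_size  # state lives here, outside the step function
    for x in frames:
        h = rnn_step(x, h, w_in, w_rec, bias)
        yield h


# Toy weights (assumed values, for illustration only).
W_IN = [[0.5, 0.0], [0.0, 0.5]]
W_REC = [[0.1, 0.0], [0.0, 0.1]]
BIAS = [0.0, 0.0]

for state in run_stream([[1.0, 0.0], [0.0, 1.0]], 2, W_IN, W_REC, BIAS):
    print([round(v, 4) for v in state])
```

Because the state is held by the caller, each frame costs a single step, and the state can be reset or handed to a different model between frames without re-running earlier input.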

Findings

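1. Matches mainstream methods on accuracy metrics
2. Reduces the real-time factor by approximately 92.3% (a nearly 13-fold speedup)
3. Shrinks parameter counts to 1/38 (idle state) and 1/9 (busy state) of mainstream models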

Abstract

Gesture recognition in resource-constrained scenarios faces significant challenges in achieving high accuracy and low latency. The streaming gesture recognition framework, Duo Streamers, proposed in this paper, addresses these challenges through a three-stage sparse recognition mechanism, an RNN-lite model with an external hidden state, and specialized training and post-processing pipelines, thereby making innovative progress in real-time performance and lightweight design. Experimental results show that Duo Streamers matches mainstream methods in accuracy metrics, while reducing the real-time factor by approximately 92.3%, i.e., delivering a nearly 13-fold speedup. In addition, the framework shrinks parameter counts to 1/38 (idle state) and 1/9 (busy state) compared to mainstream models. In summary, Duo Streamers not only offers an efficient and practical solution for streaming gesture…
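The link between the 92.3% reduction and the "nearly 13-fold" speedup can be checked with a short calculation. The sketch below assumes the usual definition of real-time factor (processing time divided by the duration of the input), which this page does not spell out, and the function name is hypothetical. If the new factor is (1 - r) times the old one, the speedup is 1 / (1 - r).

```python
def speedup_from_rtf_reduction(reduction):
    """Convert a fractional real-time-factor reduction into a speedup.

    reduction: fraction in [0, 1), e.g. 0.923 for a 92.3% reduction.
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


speedup = speedup_from_rtf_reduction(0.923)
print(f"{speedup:.2f}x")  # prints 12.99x
```

A 92.3% reduction leaves 7.7% of the original factor, and 1 / 0.077 is roughly 12.99, consistent with the abstract's figure.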

Code & Models

Repositories

X-Intelligence-Labs/Duo-Streamers (PyTorch, official)

Taxonomy

Topics: Hand Gesture Recognition Systems · Robotics and Automated Systems · Speech and dialogue systems