Duo Streamers: A Streaming Gesture Recognition Framework
Boxuan Zhu, Sicheng Yang, Zhuo Wang, Haining Liang, Junxiao Shen

TL;DR
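Duo Streamers is a lightweight, real-time streaming gesture recognition framework for resource-constrained devices. It matches mainstream methods in accuracy while reducing the real-time factor by approximately 92.3% and shrinking parameter counts to 1/38 (idle state) and 1/9 (busy state) of mainstream models.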
Contribution
The paper introduces a novel three-stage sparse recognition mechanism and an RNN-lite model with external hidden state for efficient streaming gesture recognition.
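As a rough illustration of what an "external hidden state" can mean in a streaming setting, the sketch below keeps the recurrent state outside the model: the caller passes it in with each frame and receives the updated state back. This is not the paper's RNN-lite architecture; the Elman-style cell, the names `make_cell` and `step`, and the sizes are all hypothetical choices made for illustration.

```python
import math
import random


def make_cell(input_size, hidden_size, seed=0):
    """Build random weights for a tiny Elman-style cell (illustrative only)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_x": matrix(hidden_size, input_size),   # input-to-hidden weights
        "W_h": matrix(hidden_size, hidden_size),  # hidden-to-hidden weights
        "b": [0.0] * hidden_size,                 # bias
    }


def step(cell, frame, hidden):
    """Process one frame. The caller owns `hidden`; a new state list is returned."""
    new_hidden = []
    for i, bias in enumerate(cell["b"]):
        total = bias
        total += sum(w * x for w, x in zip(cell["W_x"][i], frame))
        total += sum(w * h for w, h in zip(cell["W_h"][i], hidden))
        new_hidden.append(math.tanh(total))
    return new_hidden


# Streaming loop: each call sees one frame plus the state carried by the caller.
cell = make_cell(input_size=3, hidden_size=4)
hidden = [0.0] * 4
stream = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4], [0.5, 0.5, 0.5]]
for frame in stream:
    hidden = step(cell, frame, hidden)
```

Because the state lives with the caller, each call only touches the newest frame, and the caller decides when to reset or hand over the state.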
Findings
Matches mainstream methods in accuracy metrics
Reduces the real-time factor by ~92.3% (a nearly 13-fold speedup)
Shrinks parameter counts to 1/38 (idle state) and 1/9 (busy state) of mainstream models
Abstract
Gesture recognition in resource-constrained scenarios faces significant challenges in achieving high accuracy and low latency. The streaming gesture recognition framework, Duo Streamers, proposed in this paper, addresses these challenges through a three-stage sparse recognition mechanism, an RNN-lite model with an external hidden state, and specialized training and post-processing pipelines, thereby making innovative progress in real-time performance and lightweight design. Experimental results show that Duo Streamers matches mainstream methods in accuracy metrics, while reducing the real-time factor by approximately 92.3%, i.e., delivering a nearly 13-fold speedup. In addition, the framework shrinks parameter counts to 1/38 (idle state) and 1/9 (busy state) compared to mainstream models. In summary, Duo Streamers not only offers an efficient and practical solution for streaming gesture…
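The step from a 92.3% reduction to a "nearly 13-fold" speedup can be checked with a short calculation. The sketch below assumes the real-time factor scales with processing time for a fixed input, so a fractional reduction r leaves (1 - r) of the original cost and the speedup is 1 / (1 - r). The function names are hypothetical, and the parameter-ratio conversion is added only for comparison.

```python
def speedup_from_reduction(reduction):
    """Speedup factor implied by a fractional reduction in real-time factor."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


def reduction_from_ratio(ratio):
    """Fractional reduction implied by shrinking a quantity to `ratio` of its size."""
    return 1.0 - ratio


speedup = speedup_from_reduction(0.923)
print(f"{speedup:.2f}x")  # prints 12.99x

# Parameter counts of 1/38 and 1/9 expressed as percentage reductions.
idle_cut = reduction_from_ratio(1 / 38)
busy_cut = reduction_from_ratio(1 / 9)
print(f"{idle_cut:.1%} {busy_cut:.1%}")  # prints 97.4% 88.9%
```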
Taxonomy
Topics: Hand Gesture Recognition Systems · Robotics and Automated Systems · Speech and dialogue systems
