# Lightweight Visual Dynamic Gesture Recognition System Based on CNN-LSTM-DSA

**Authors:** Zhenxing Wang, Ziyan Wu, Ruidi Qi, Xuan Dou

PMC · DOI: 10.3390/s26051558 · Sensors (Basel, Switzerland) · 2026-03-02

## TL;DR

A lightweight system using a CNN-LSTM-DSA model achieves high accuracy in recognizing both static and dynamic gestures with low latency, enabling real-time robotic hand control.

## Contribution

A lightweight CNN-LSTM-DSA hybrid model for efficient and accurate visual dynamic gesture recognition on resource-constrained devices.

## Key findings

- The system achieved 96% accuracy for static gestures and 90.19% for dynamic gestures.
- It maintained low response latency (<300 ms) and robust performance under varying lighting and background conditions.
- The model is suitable for deployment on devices like the Jetson Nano, minimizing computational overhead.

## Abstract

What are the main findings?
A lightweight CNN-LSTM-DSA hybrid model was developed for visual dynamic gesture recognition, achieving high-precision recognition of both static (96%) and dynamic (90.19%) gestures with low response latency (<300 ms).

The system demonstrated excellent robustness under varying lighting and background conditions, with successful real-time mapping of gestures to robotic hand movements.

What are the implications of the main findings?
This model provides an efficient solution for embedded gesture recognition, ensuring high accuracy while minimizing computational overhead, making it suitable for deployment on resource-constrained devices like the Jetson Nano.

The proposed approach enhances human–robot interaction, offering practical applications in virtual reality, intelligent robotics, and other real-time interactive systems.

To address the large size of gesture recognition models, their high computational complexity, and their inefficient deployment on embedded devices, this study designs and implements a visual dynamic gesture recognition system based on a lightweight CNN-LSTM-DSA model. The system captures images of the user's hand with a camera, extracts the 3D coordinates of 21 hand keypoints using MediaPipe, and applies a lightweight hybrid model that captures spatial and temporal features of the keypoint sequences, achieving high-precision recognition of complex dynamic gestures. For static gestures, the system determines the gesture state from joint angle calculations and a sliding-window smoothing algorithm, which keeps the mapping to servo motor angles smooth and the robotic hand’s movements stable. For dynamic gestures, the system models the keypoint time series with the CNN-LSTM-DSA hybrid model, enabling accurate classification and reproduction of gesture actions. Experimental results show that the proposed system is robust under various lighting and background conditions, with a static gesture recognition accuracy of up to 96%, a dynamic gesture recognition accuracy of 90.19%, and an overall response delay of less than 300 ms.
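
The static-gesture path described above (joint angles computed from keypoints, then smoothed over a sliding window) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which joints are measured, how the angle is defined, or how large the window is. The names `joint_angle` and `SlidingWindowSmoother`, the three-point angle definition, the moving-average smoothing, and the window size of 5 are all assumptions made for illustration. The sketch takes plain `(x, y, z)` tuples as input, so it does not depend on MediaPipe itself.

```python
import math
from collections import deque


def joint_angle(a, b, c):
    """Return the angle in degrees at keypoint b between segments b->a and b->c.

    a, b, c are (x, y, z) tuples, e.g. three consecutive keypoints of one
    finger. Which keypoints the paper uses is not stated; this is an assumption.
    """
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0.0 or n2 == 0.0:
        # Degenerate case: two keypoints coincide, so the angle is undefined.
        return 0.0
    cos_t = sum(x * y for x, y in zip(v1, v2)) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos_t = max(-1.0, min(1.0, cos_t))
    return math.degrees(math.acos(cos_t))


class SlidingWindowSmoother:
    """Moving average over the last `window` values (hypothetical helper)."""

    def __init__(self, window=5):
        if window < 1:
            raise ValueError("window must be at least 1")
        self._buf = deque(maxlen=window)

    def update(self, value):
        """Add a new sample and return the mean of the current window."""
        self._buf.append(value)
        return sum(self._buf) / len(self._buf)


# Usage: smooth the per-frame angle of one joint before sending it to a servo.
smoother = SlidingWindowSmoother(window=5)
frames = [
    ((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),  # bent joint
    ((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (0.0, -1.0, 0.0)),  # straight joint
]
smoothed = [smoother.update(joint_angle(a, b, c)) for a, b, c in frames]
```

A moving average trades responsiveness for stability: a larger window suppresses keypoint jitter more strongly but adds lag, which matters given the sub-300 ms response delay reported for the system.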

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12987035/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12987035/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/PMC12987035/full.md

---
Source: https://tomesphere.com/paper/PMC12987035