Resource-Efficient RGB-Only Action Recognition for Edge Deployment
Dongsik Yoon, Jongeun Kim, Dayeon Lee

TL;DR
This paper presents a compact RGB-only action recognition network optimized for edge devices, balancing accuracy and efficiency without relying on additional sensors or complex pose estimation.
Contribution
A novel efficient RGB-only action recognition model with selective temporal adaptation and parameter-free attention, suitable for deployment on resource-constrained edge devices.
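To make "parameter-free attention" concrete, here is a minimal NumPy sketch of one well-known attention of this kind, a SimAM-style energy-based reweighting. The summary does not say which formulation the authors use, so the choice of SimAM, the function name `simam_attention`, and the regularizer `lam` are all assumptions made for illustration only.

```python
import numpy as np

def simam_attention(x, lam=1e-4):
    """SimAM-style attention with no learnable weights (illustrative assumption).

    x: feature map of shape (C, H, W) with H * W > 1.
    Each activation is rescaled by a sigmoid of how far it deviates
    from its channel mean, relative to the channel variance.
    """
    n = x.shape[-2] * x.shape[-1] - 1                # neighbours per channel
    mu = x.mean(axis=(-2, -1), keepdims=True)        # per-channel mean
    d = (x - mu) ** 2                                # squared deviation
    var = d.sum(axis=(-2, -1), keepdims=True) / n    # per-channel variance
    e_inv = d / (4.0 * (var + lam)) + 0.5            # inverse energy
    return x * (1.0 / (1.0 + np.exp(-e_inv)))        # sigmoid gating
```

The function only computes statistics of its input, so it adds no weights to the model, which is the property that matters for a storage-constrained deployment.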
Findings
Achieves a strong accuracy-efficiency balance on the NTU RGB+D 60 and 120 benchmarks.
Demonstrates a smaller on-device footprint and practical resource utilization on the Jetson Orin Nano.
Outperforms existing RGB-based methods in practical deployment scenarios.
Abstract
Action recognition on edge devices poses stringent constraints on latency, memory, storage, and power consumption. While auxiliary modalities such as skeleton and depth information can enhance recognition performance, they often require additional sensors or computationally expensive pose-estimation pipelines, limiting practicality for edge use. In this work, we propose a compact RGB-only network tailored for efficient on-device inference. Our approach builds upon an X3D-style backbone augmented with Temporal Shift, and further introduces selective temporal adaptation and parameter-free attention. Extensive experiments on the NTU RGB+D 60 and 120 benchmarks demonstrate a strong accuracy-efficiency balance. Moreover, deployment-level profiling on the Jetson Orin Nano verifies a smaller on-device footprint and practical resource utilization compared to existing RGB-based action…
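For readers unfamiliar with Temporal Shift, the sketch below shows the generic operation in NumPy: a fraction of the channels is moved one frame forward in time, an equal fraction one frame backward, and the rest is left in place, so neighbouring frames exchange information without extra parameters or multiplications. This is not the authors' implementation; the function name `temporal_shift`, the `(T, C, H, W)` layout, and the `fold_div=8` default are assumptions for illustration.

```python
import numpy as np

def temporal_shift(x, fold_div=8):
    """Generic temporal shift over a clip (illustrative assumption).

    x: features of shape (T, C, H, W).
    C // fold_div channels are shifted forward in time, another
    C // fold_div backward, and the remainder is copied unchanged.
    Vacated positions at the clip boundaries are zero-filled.
    """
    t, c, h, w = x.shape
    fold = c // fold_div
    out = np.zeros_like(x)
    out[1:, :fold] = x[:-1, :fold]                    # frame t receives frame t-1
    out[:-1, fold:2 * fold] = x[1:, fold:2 * fold]    # frame t receives frame t+1
    out[:, 2 * fold:] = x[:, 2 * fold:]               # untouched channels
    return out
```

For example, with `x` of shape `(16, 64, 7, 7)` and the default `fold_div`, 8 channels are shifted in each direction and the remaining 48 pass through unchanged.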
Taxonomy
Topics: Human Pose and Action Recognition · Context-Aware Activity Recognition Systems · Hand Gesture Recognition Systems
