TL;DR
SASI is a novel framework that leverages sub-action semantics and graph convolution networks to enable early and accurate human action recognition in real time, enhancing human-robot interaction capabilities.
Contribution
The paper introduces SASI, a new approach integrating sub-action semantics with graph convolution networks for improved early action recognition in HRI.
Findings
SASI improves recognition accuracy over conventional methods.
SASI operates in real time at 29 Hz.
SASI demonstrates its effectiveness on the BABEL dataset.
Abstract
Understanding human actions is critical for advancing behavior analysis in human-robot interaction. Particularly in tasks that demand quick and proactive feedback, robots must recognize human actions as early as possible from incomplete observations. Sub-actions offer the semantic and hierarchical cues needed for this, since human actions are inherently structured and can be decomposed into smaller, meaningful units. However, conventional approaches focus primarily on holistic actions and often overlook the rich semantic structure embedded in sub-actions, making them poorly suited for early recognition. To address this gap, we introduce SASI (Sub-Action Semantics Integrated cross-modal fusion), a novel framework that integrates existing graph convolution networks to fuse spatiotemporal features with sub-action semantics. SASI exploits a segmentation model with a traditional…
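
The abstract is cut off before it describes the fusion mechanism, so the following is only an illustrative sketch of the general idea: run a graph convolution over the skeleton frames observed so far, pool the result, and combine it with a sub-action semantic vector. Everything here is an assumption for illustration and not SASI's actual architecture: the function names, the chain-shaped toy skeleton, the feature sizes, the random weights, and in particular the use of plain concatenation as the fusion step.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2, common in GCN variants."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(x, a_norm, weight):
    """One graph convolution with ReLU.

    x: (T, J, C_in) joint features for T observed frames and J joints.
    a_norm: (J, J) normalized adjacency. weight: (C_in, C_out).
    Returns an array of shape (T, J, C_out).
    """
    return np.maximum(np.einsum("ij,tjc,cd->tid", a_norm, x, weight), 0.0)

def fuse_partial_observation(frames, adj, weight, semantic_vec):
    """Pool graph features over the observed prefix and append semantics.

    Concatenation is a hypothetical stand-in for the paper's fusion step.
    """
    feats = graph_conv(frames, normalize_adjacency(adj), weight)
    pooled = feats.mean(axis=(0, 1))            # average over time and joints
    return np.concatenate([pooled, semantic_vec])

# Toy example: 5 joints connected in a chain, 8 observed frames, 3-D coordinates.
rng = np.random.default_rng(0)
num_joints = 5
adj = np.zeros((num_joints, num_joints))
for j in range(num_joints - 1):
    adj[j, j + 1] = adj[j + 1, j] = 1.0

frames = rng.normal(size=(8, num_joints, 3))    # incomplete observation
weight = rng.normal(size=(3, 16))               # untrained, illustrative weights
semantic_vec = rng.normal(size=4)               # hypothetical sub-action embedding

fused = fuse_partial_observation(frames, adj, weight, semantic_vec)
print(fused.shape)
```

Because the pooling averages over however many frames have been observed, the same function accepts a short prefix of a motion as readily as a full sequence, which is the property early recognition needs. A classifier on top of the fused vector is omitted here.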
