Generative Model-Based Feature Attention Module for Video Action Analysis

Guiqin Wang; Peng Zhao; Cong Zhao; Jing Huang; Siyan Guo; Shusen Yang

arXiv:2508.13565 · cs.CV · August 20, 2025

TL;DR

This paper introduces a generative attention-based model that enhances video action analysis by effectively learning feature semantics, improving accuracy in action recognition and detection tasks for IoT applications.

Contribution

The paper presents a novel generative attention module that captures feature semantics and temporal dependencies, addressing limitations of existing methods in IoT-oriented video analysis.

Findings

01 Outperforms existing methods on benchmark datasets

02 Improves accuracy in action recognition and detection

03 Effectively learns feature semantics from foreground and background differences

Abstract

Video action analysis is a foundational technology for intelligent video comprehension, particularly in Internet of Things (IoT) applications. However, existing methods overlook feature semantics during feature extraction and focus instead on optimizing action proposals. Their limited precision makes them unsuitable for widespread adoption in high-performance IoT applications, such as autonomous driving, which require robust and scalable intelligent video analytics. To address this issue, we propose a novel generative attention-based model that learns the relations among feature semantics. Specifically, by leveraging the differences between the foreground and background of actions, our model simultaneously learns the frame- and segment-level dependencies of temporal action feature semantics, which takes advantage of feature semantics in the feature…
