Faster Diffusion Action Segmentation

Shuaibing Wang; Shunli Wang; Mingcheng Li; Dingkang Yang; Haopeng Kuang; Ziyun Qian; Lihua Zhang

arXiv:2408.02024·cs.CV·August 6, 2024

Open Access

TL;DR

EffiDiffAct is a novel, efficient diffusion-based approach for temporal action segmentation that reduces computational costs and improves accuracy by using lightweight encoders and adaptive strategies.

Contribution

The paper introduces EffiDiffAct, which combines a lightweight encoder with an adaptive skip strategy to make diffusion-based TAS less computationally demanding.

Findings

01. Outperforms existing methods on the 50Salads, Breakfast, and GTEA datasets.

02. Reduces computational overhead significantly.

03. Improves segmentation accuracy and efficiency.

Abstract

Temporal Action Segmentation (TAS) is an essential task in video analysis that aims to segment and classify continuous frames into distinct action segments. However, the ambiguous boundaries between actions make high-precision segmentation difficult. Recent diffusion models have demonstrated substantial success in TAS owing to their stable training process and high-quality generation capabilities. Yet the many sampling steps that diffusion models require impose a substantial computational burden, limiting their practicality in real-time applications. Additionally, most related works use Transformer-based encoder architectures. Although these architectures excel at capturing long-range dependencies, they incur high computational costs and suffer from feature smoothing when processing long video sequences. To address these challenges, we propose…
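To make the task definition in the abstract concrete, the sketch below shows what "segmenting and classifying continuous frames into distinct action segments" can look like as data: a per-frame label sequence is collapsed into contiguous segments. It illustrates the output format only and is not the paper's method; the function name `labels_to_segments` and the example labels are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def labels_to_segments(frame_labels: List[str]) -> List[Tuple[str, int, int]]:
    """Collapse frame-wise labels into (label, start, end) segments.

    `start` is inclusive and `end` is exclusive, both as frame indices.
    Hypothetical helper for illustration; not taken from the paper.
    """
    segments = []
    start = 0
    # groupby yields one group per run of identical consecutive labels.
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        segments.append((label, start, start + length))
        start += length
    return segments


# Made-up per-frame predictions for a six-frame clip.
frames = ["pour", "pour", "pour", "stir", "stir", "pour"]
for label, start, end in labels_to_segments(frames):
    print(f"{label}: frames [{start}, {end})")
```

An ambiguous boundary of the kind the abstract mentions would show up here as uncertainty about which frame index separates two adjacent segments.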

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications

Methods: Diffusion