C2F-TCN: A Framework for Semi and Fully Supervised Temporal Action Segmentation

Dipika Singhania; Rahul Rahaman; Angela Yao

arXiv:2212.11078 · cs.CV · December 22, 2022

Open Access

TL;DR

This paper introduces C2F-TCN, a flexible encoder-decoder framework for temporal action segmentation that achieves state-of-the-art results in supervised, semi-supervised, and unsupervised settings, with a novel augmentation strategy.

Contribution

The paper presents a new architecture with a coarse-to-fine ensemble of decoder outputs, a model-agnostic temporal feature augmentation method, and a semi-supervised learning scheme called ICC.
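To make the idea of a coarse-to-fine ensemble concrete, here is a minimal sketch in plain Python. It assumes each decoder layer yields per-frame class probabilities at its own temporal resolution, that coarse outputs are upsampled by nearest-neighbour repetition, and that the layers are combined by a weighted average. The function names, the upsampling choice, and the uniform default weights are assumptions of this sketch, not details confirmed by the summary above.

```python
def upsample_nearest(probs, target_len):
    """Repeat frames so a coarse sequence covers target_len frames."""
    src_len = len(probs)
    return [probs[min(t * src_len // target_len, src_len - 1)]
            for t in range(target_len)]


def coarse_to_fine_ensemble(layer_probs, weights=None):
    """Average per-layer class probabilities at full temporal resolution.

    layer_probs: list of sequences ordered coarse to fine; each sequence is
    a list of per-frame probability vectors, and the last one has the full
    length. weights: one weight per layer (hypothetical; uniform by default).
    """
    target_len = len(layer_probs[-1])
    num_classes = len(layer_probs[-1][0])
    if weights is None:
        weights = [1.0 / len(layer_probs)] * len(layer_probs)

    ensemble = [[0.0] * num_classes for _ in range(target_len)]
    for probs, w in zip(layer_probs, weights):
        # Bring this layer to full length, then add its weighted vote.
        for t, frame in enumerate(upsample_nearest(probs, target_len)):
            for c, p in enumerate(frame):
                ensemble[t][c] += w * p
    return ensemble


# Two hypothetical decoder outputs: 2 coarse frames and 4 fine frames.
coarse = [[0.8, 0.2], [0.2, 0.8]]
fine = [[0.6, 0.4], [0.6, 0.4], [0.4, 0.6], [0.0, 1.0]]
combined = coarse_to_fine_ensemble([coarse, fine])
labels = [row.index(max(row)) for row in combined]
```

With weights that sum to one, each combined frame remains a probability distribution, so the per-frame label can be read off with an argmax as in the last line.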

Findings

01  Produces more accurate and better-calibrated supervised results on three benchmark action segmentation datasets.

02  The semi-supervised scheme (ICC), trained with 40% labeled data, performs comparably to its fully supervised counterpart.

03  Introduces a novel temporal feature augmentation strategy and an unsupervised representation learning approach.

Abstract

Temporal action segmentation tags action labels for every frame in an input untrimmed video containing multiple actions in a sequence. For the task of temporal action segmentation, we propose an encoder-decoder-style architecture named C2F-TCN featuring a "coarse-to-fine" ensemble of decoder outputs. The C2F-TCN framework is enhanced with a novel model agnostic temporal feature augmentation strategy formed by the computationally inexpensive strategy of the stochastic max-pooling of segments. It produces more accurate and well-calibrated supervised results on three benchmark action segmentation datasets. We show that the architecture is flexible for both supervised and representation learning. In line with this, we present a novel unsupervised way to learn frame-wise representation from C2F-TCN. Our unsupervised learning approach hinges on the clustering capabilities of the input…
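The augmentation described in the abstract, stochastic max-pooling of segments, can be illustrated with a short sketch. It assumes the input is a sequence of T frame-level feature vectors, that segment boundaries are sampled uniformly at random, and that each segment is reduced with an element-wise max. The function name, the boundary sampling rule, and the `num_segments` parameter are hypothetical choices for illustration; the abstract does not specify them.

```python
import random


def stochastic_segment_max_pool(features, num_segments, rng=None):
    """Pool a T x D feature sequence down to num_segments x D.

    Segment boundaries are drawn at random (an assumption of this sketch),
    and each segment is reduced with an element-wise max.
    """
    rng = rng or random.Random()
    T = len(features)
    if not 1 <= num_segments <= T:
        raise ValueError("num_segments must be between 1 and the sequence length")

    # Pick num_segments - 1 distinct cut points so every segment is non-empty.
    cuts = sorted(rng.sample(range(1, T), num_segments - 1))
    bounds = [0] + cuts + [T]

    pooled = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        segment = features[start:end]
        # Element-wise max over the frames in this segment.
        pooled.append([max(column) for column in zip(*segment)])
    return pooled


# Ten frames of toy 2-D features; each call yields a different temporal view.
frames = [[float(t), float(-t)] for t in range(10)]
view_a = stochastic_segment_max_pool(frames, 4, random.Random(0))
view_b = stochastic_segment_max_pool(frames, 4, random.Random(1))
```

Because the operation only touches the temporal axis of precomputed features, it does not depend on the model that consumes them, which is consistent with the abstract calling the strategy model agnostic and computationally inexpensive.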

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods