C2F-TCN: A Framework for Semi and Fully Supervised Temporal Action Segmentation
Dipika Singhania, Rahul Rahaman, Angela Yao

TL;DR
This paper introduces C2F-TCN, a flexible encoder-decoder framework for temporal action segmentation that covers supervised, semi-supervised, and unsupervised representation learning, together with a model-agnostic temporal feature augmentation strategy.
Contribution
The paper presents an encoder-decoder architecture with a coarse-to-fine ensemble of decoder outputs, a model-agnostic temporal feature augmentation method, and a semi-supervised learning scheme called ICC.
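The coarse-to-fine ensemble can be pictured as combining per-frame class probabilities predicted at several temporal resolutions. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the nearest-neighbour upsampling, and the uniform default weights are all assumptions made for illustration.

```python
import numpy as np

def upsample_nearest(probs, target_len):
    """Stretch a (T_coarse, C) probability array to target_len frames
    by repeating frames (nearest-neighbour upsampling; an assumption)."""
    idx = (np.arange(target_len) * probs.shape[0]) // target_len
    return probs[idx]

def coarse_to_fine_ensemble(layer_probs, weights=None):
    """Weighted average of decoder outputs at different resolutions.

    layer_probs: list of (T_i, C) arrays, one per decoder layer
                 (hypothetical input format).
    weights:     optional per-layer weights; uniform if omitted.
    """
    target_len = max(p.shape[0] for p in layer_probs)
    if weights is None:
        weights = np.ones(len(layer_probs))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalise so rows still sum to 1
    out = np.zeros((target_len, layer_probs[0].shape[1]))
    for w, p in zip(weights, layer_probs):
        out += w * upsample_nearest(p, target_len)
    return out
```

Because the weights are normalised, the result remains a valid probability distribution per frame whenever each input layer is one.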
Findings
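Produces more accurate and well-calibrated supervised results on three benchmark action segmentation datasets.
The semi-supervised scheme, trained with 40% labeled data, matches fully supervised performance.
Introduces a temporal feature augmentation strategy based on stochastic max-pooling of segments, and an unsupervised approach to learning frame-wise representations.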
Abstract
Temporal action segmentation tags action labels for every frame in an input untrimmed video containing multiple actions in a sequence. For the task of temporal action segmentation, we propose an encoder-decoder-style architecture named C2F-TCN featuring a "coarse-to-fine" ensemble of decoder outputs. The C2F-TCN framework is enhanced with a novel model-agnostic temporal feature augmentation strategy formed by the computationally inexpensive strategy of stochastic max-pooling of segments. It produces more accurate and well-calibrated supervised results on three benchmark action segmentation datasets. We show that the architecture is flexible for both supervised and representation learning. In line with this, we present a novel unsupervised way to learn frame-wise representation from C2F-TCN. Our unsupervised learning approach hinges on the clustering capabilities of the input…
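The augmentation by stochastic max-pooling of segments mentioned in the abstract can be illustrated as follows. This is a rough sketch under stated assumptions rather than the paper's procedure: the function name, the uniform sampling of segment widths up to a hypothetical max_window, and the majority-vote label for each pooled segment are illustrative choices not taken from the source.

```python
import numpy as np

def stochastic_segment_max_pool(features, labels, max_window, rng):
    """Split a (T, D) feature sequence into consecutive segments of random
    width in [1, max_window] and max-pool each segment over time.

    Returns the pooled features and one label per segment (majority vote
    within the segment; this labelling rule is an assumption).
    """
    T = features.shape[0]
    pooled_feats, pooled_labels = [], []
    start = 0
    while start < T:
        width = int(rng.integers(1, max_window + 1))  # upper bound exclusive
        end = min(start + width, T)
        pooled_feats.append(features[start:end].max(axis=0))
        vals, counts = np.unique(labels[start:end], return_counts=True)
        pooled_labels.append(vals[np.argmax(counts)])
        start = end
    return np.stack(pooled_feats), np.array(pooled_labels)
```

Each call with a fresh random draw yields a differently subsampled version of the same video, which is what makes the operation usable as a cheap augmentation; with max_window set to 1 the sequence is returned unchanged.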
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods
