CLOT: Closed Loop Optimal Transport for Unsupervised Action Segmentation
Elena Bueno-Benito, Mariella Dimiccoli

TL;DR
CLOT introduces a cyclic optimal transport framework that improves unsupervised action segmentation by jointly learning action representations and pseudo-labels through a multi-level feedback loop.
Contribution
The paper presents a novel OT-based cyclic learning framework with an encoder-decoder architecture for unsupervised action segmentation.
Findings
Improves segmentation accuracy on benchmark datasets.
Leverages multi-level cyclic feature learning for better pseudo-label refinement.
Demonstrates the effectiveness of cyclic learning in unsupervised settings.
Abstract
Unsupervised action segmentation has recently pushed its limits with ASOT, an optimal transport (OT)-based method that simultaneously learns action representations and performs clustering using pseudo-labels. Unlike other OT-based approaches, ASOT makes no assumptions about action ordering and can decode a temporally consistent segmentation from a noisy cost matrix between video frames and action labels. However, the resulting segmentation lacks segment-level supervision, limiting the effectiveness of feedback between frames and action representations. To address this limitation, we propose Closed Loop Optimal Transport (CLOT), a novel OT-based framework with a multi-level cyclic feature learning mechanism. Leveraging its encoder-decoder architecture, CLOT learns pseudo-labels alongside frame and segment embeddings by solving two separate OT problems. It then refines both frame…
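To make the idea of OT-derived pseudo-labels concrete, the following is a minimal sketch that assigns frames to actions by solving a balanced entropic OT problem with Sinkhorn iterations. It is a simplified stand-in under stated assumptions, not the ASOT or CLOT formulation: the uniform marginals, the cosine cost, the random embeddings, and the names `sinkhorn_pseudo_labels`, `eps`, and `n_iters` are all hypothetical choices made for illustration.

```python
import numpy as np


def sinkhorn_pseudo_labels(cost, eps=0.1, n_iters=200):
    """Balanced entropic OT between N frames and K actions.

    Assumes uniform marginals on both sides (an illustrative choice,
    not taken from the paper). Returns the transport plan and the
    hard pseudo-label (argmax action) for each frame.
    """
    n, k = cost.shape
    a = np.full(n, 1.0 / n)  # mass per frame
    b = np.full(k, 1.0 / k)  # mass per action
    kernel = np.exp(-cost / eps)  # Gibbs kernel
    u = np.ones(n)
    v = np.ones(k)
    for _ in range(n_iters):
        u = a / (kernel @ v)      # rescale rows toward marginal a
        v = b / (kernel.T @ u)    # rescale columns toward marginal b
    plan = u[:, None] * kernel * v[None, :]
    return plan, plan.argmax(axis=1)


# Toy data: random frame embeddings and action prototypes (hypothetical).
rng = np.random.default_rng(0)
frames = rng.normal(size=(12, 8))
prototypes = rng.normal(size=(3, 8))

# Cosine distance as the frame-to-action cost matrix.
f = frames / np.linalg.norm(frames, axis=1, keepdims=True)
p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
cost = 1.0 - f @ p.T

plan, labels = sinkhorn_pseudo_labels(cost)
```

Here `plan[i, j]` is the soft assignment of frame `i` to action `j`, and `labels` holds one hard pseudo-label per frame. This sketch has no temporal-consistency term and no segment-level problem, so it should be read only as an illustration of how a cost matrix can be turned into pseudo-labels via OT.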
