Leveraging triplet loss for unsupervised action segmentation

E. Bueno-Benito; B. Tura; M. Dimiccoli

arXiv:2304.06403 · cs.CV · July 20, 2023 · 1 citation

TL;DR

This paper introduces an unsupervised deep metric learning framework using triplet loss and a novel triplet selection strategy to improve action segmentation in videos without requiring labeled training data.

Contribution

It presents a new fully unsupervised approach that learns action representations directly from a single input video and recovers temporal boundaries with higher quality than existing unsupervised approaches.

Findings

01. Recovers temporal boundaries with higher quality than existing unsupervised approaches

02. Achieves competitive results on two widely used action segmentation benchmarks

03. Discovers actions without labeled training data

Abstract

In this paper, we propose a novel fully unsupervised framework that learns action representations suitable for the action segmentation task from the single input video itself, without requiring any training data. Our method is a deep metric learning approach rooted in a shallow network with a triplet loss operating on similarity distributions and a novel triplet selection strategy that effectively models temporal and semantic priors to discover actions in the new representational space. Under these circumstances, we successfully recover temporal boundaries in the learned action representations with higher quality compared with existing unsupervised approaches. The proposed method is evaluated on two widely used benchmark datasets for the action segmentation task and it achieves competitive performance by applying a generic clustering algorithm on the learned representations.
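
The abstract combines a triplet loss with a triplet selection strategy that uses temporal priors. The sketch below illustrates that general idea only; it is not the authors' implementation. It uses the standard triplet margin loss on Euclidean distances (the paper's loss operates on similarity distributions, which is not reproduced here), and a simple temporal rule for picking triplets. The names `triplet_margin_loss`, `sample_temporal_triplets`, `pos_window`, and `neg_gap` are hypothetical and chosen for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    Assumption: plain Euclidean distances, not the similarity
    distributions described in the abstract.
    """
    d_ap = euclidean(anchor, positive)
    d_an = euclidean(anchor, negative)
    return max(0.0, d_ap - d_an + margin)


def sample_temporal_triplets(num_frames, pos_window, neg_gap, rng):
    """Pick (anchor, positive, negative) frame indices using a temporal prior.

    Assumption (hypothetical rule): frames within `pos_window` of the
    anchor are treated as the same action, and frames at least `neg_gap`
    away are treated as a different action.
    """
    triplets = []
    for a in range(num_frames):
        lo = max(0, a - pos_window)
        hi = min(num_frames, a + pos_window + 1)
        positives = [i for i in range(lo, hi) if i != a]
        negatives = [i for i in range(num_frames) if abs(i - a) >= neg_gap]
        if positives and negatives:
            triplets.append((a, rng.choice(positives), rng.choice(negatives)))
    return triplets


# Toy per-frame features: two "actions" of five frames each.
features = [[0.0, 0.1 * t] for t in range(5)] + [[5.0, 5.0 + 0.1 * t] for t in range(5)]
rng = random.Random(0)
triplets = sample_temporal_triplets(len(features), pos_window=2, neg_gap=5, rng=rng)
mean_loss = sum(
    triplet_margin_loss(features[a], features[p], features[n]) for a, p, n in triplets
) / len(triplets)
```

In a training loop, the features would come from a learnable embedding network and the mean loss would be minimized by gradient descent; a generic clustering algorithm could then be applied to the learned per-frame embeddings, as the abstract describes.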

Code & Models

Repositories

elenabbbuenob/tsa-actionseg (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Gait Recognition and Analysis

Methods: Triplet Loss