2by2: Weakly-Supervised Learning for Global Action Segmentation

Elena Bueno-Benito; Mariella Dimiccoli

arXiv:2412.12829 · cs.CV · December 18, 2024

TL;DR

This paper introduces a weakly-supervised method using triadic learning and Siamese transformers to improve global action segmentation across videos with varying activities, addressing the challenge of non-shared action order.

Contribution

It proposes a novel triadic learning approach with a Siamese transformer backbone for weakly-supervised global action segmentation, outperforming existing methods.

Findings

01 Outperforms state-of-the-art on Breakfast and YouTube Instructions datasets.

02 Effectively learns action representations with weak supervision.

03 Handles diverse activity sequences without shared temporal order.

Abstract

This paper presents a simple yet effective approach for the poorly investigated task of global action segmentation, which aims at grouping frames capturing the same action across videos of different activities. Unlike the case where all videos depict the same activity, the temporal order of actions is not even roughly shared among videos, making the task more challenging. We propose to use activity labels to learn, in a weakly-supervised fashion, action representations suitable for global action segmentation. For this purpose, we introduce a triadic learning approach for video pairs, to ensure intra-video action discrimination, as well as inter-video and inter-activity action association. For the backbone architecture, we use a Siamese network based on sparse transformers that takes video pairs as input and determines whether they belong to the same activity. The proposed approach is…
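
The pairwise weak supervision described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the mean-pooling linear encoder is a hypothetical stand-in for the sparse-transformer backbone, and the names `make_pairs`, `encode`, `same_activity_prob`, `pair_loss`, and the `scale` parameter are assumptions introduced only for illustration. It shows how video-level activity labels alone yield a same-activity target for each video pair, and how one set of shared weights (the Siamese idea) scores both videos of a pair.

```python
import math
from itertools import combinations


def make_pairs(activity_labels):
    """Turn video-level activity labels into (i, j, same_activity) pairs."""
    return [(i, j, int(activity_labels[i] == activity_labels[j]))
            for i, j in combinations(range(len(activity_labels)), 2)]


def encode(frames, weights):
    """Hypothetical shared encoder: mean-pool frame features, then project.

    A stand-in for the sparse-transformer branch; both videos of a pair
    go through this same function with the same weights.
    """
    dim = len(frames[0])
    pooled = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    return [sum(w * x for w, x in zip(row, pooled)) for row in weights]


def same_activity_prob(video_a, video_b, weights, scale=5.0):
    """Score a pair: sigmoid of the scaled cosine similarity of embeddings."""
    za, zb = encode(video_a, weights), encode(video_b, weights)
    dot = sum(a * b for a, b in zip(za, zb))
    norm = math.sqrt(sum(a * a for a in za)) * math.sqrt(sum(b * b for b in zb))
    cos = dot / norm if norm else 0.0
    return 1.0 / (1.0 + math.exp(-scale * cos))


def pair_loss(prob, same_activity):
    """Binary cross-entropy against the weak same-activity label."""
    eps = 1e-12
    return -(same_activity * math.log(prob + eps)
             + (1 - same_activity) * math.log(1.0 - prob + eps))


# Toy usage: three videos, two activities, 2-D frame features.
labels = ["coffee", "coffee", "tea"]
videos = [[[1.0, 0.0], [0.9, 0.1]], [[1.0, 0.1]], [[0.0, 1.0]]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # untrained placeholder weights
for i, j, same in make_pairs(labels):
    p = same_activity_prob(videos[i], videos[j], identity)
    print(i, j, same, round(pair_loss(p, same), 3))
```

Only the pair-level activity objective is sketched here; the intra-video and inter-video terms of the triadic approach are not covered because the abstract does not specify them.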

Taxonomy

Methods: Siamese Network