C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition

Rongchang Li; Zhenhua Feng; Tianyang Xu; Linze Li; Xiao-Jun Wu; Muhammad Awais; Sara Atito; Josef Kittler

arXiv:2407.06113 · cs.CV · July 22, 2024

TL;DR

This paper introduces the Zero-Shot Compositional Action Recognition (ZS-CAR) task and a benchmark for it, and proposes a Component-to-Composition (C2C) learning method that improves generalization to unseen action compositions.

Contribution

The paper presents the ZS-CAR task, the Something-composition (Sth-com) benchmark built on Something-Something V2, and the C2C learning framework, together with an enhanced training strategy for better compositional generalization.

Findings

01. C2C outperforms existing methods on the new benchmark
02. The framework effectively recognizes unseen action compositions
03. Enhanced training improves generalization to component variations

Abstract

Compositional actions consist of dynamic (verbs) and static (objects) concepts. Humans can easily recognize unseen compositions using the learned concepts. For machines, solving such a problem requires a model to recognize unseen actions composed of previously observed verbs and objects, thus requiring so-called compositional generalization ability. To facilitate this research, we propose a novel Zero-Shot Compositional Action Recognition (ZS-CAR) task. For evaluating the task, we construct a new benchmark, Something-composition (Sth-com), based on the widely used Something-Something V2 dataset. We also propose a novel Component-to-Composition (C2C) learning method to solve the new ZS-CAR task. C2C includes an independent component learning module and a composition inference module. Last, we devise an enhanced training strategy to address the challenges of component variations between…
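The zero-shot setting described in the abstract can be illustrated with a small sketch. It checks that a test composition is unseen while its verb and object were each observed during training, then picks a composition from separate verb and object scores. The function names, the toy labels, and the product-of-scores rule are all assumptions made for illustration; this is not the C2C composition inference module, whose details are not given here.

```python
from itertools import product


def is_valid_zero_shot_split(train_pairs, test_pairs):
    """Return True if every test (verb, object) pair is unseen as a pair,
    while its verb and its object each appear somewhere in training."""
    train_set = set(train_pairs)
    seen_verbs = {verb for verb, _ in train_set}
    seen_objects = {obj for _, obj in train_set}
    return all(
        (verb, obj) not in train_set
        and verb in seen_verbs
        and obj in seen_objects
        for verb, obj in test_pairs
    )


def infer_composition(verb_scores, object_scores, candidates):
    """Pick the candidate pair with the highest product of component scores.
    The product rule is a simplifying assumption, not the paper's method."""
    return max(
        candidates,
        key=lambda pair: verb_scores[pair[0]] * object_scores[pair[1]],
    )


# Hypothetical toy labels: ("close", "door") never occurs in training,
# although "close" and "door" each do.
train_pairs = [("open", "door"), ("close", "book"), ("open", "book")]
test_pairs = [("close", "door")]

# Hypothetical component scores for one test video.
verb_scores = {"open": 0.2, "close": 0.8}
object_scores = {"door": 0.7, "book": 0.3}

candidates = list(product(verb_scores, object_scores))
prediction = infer_composition(verb_scores, object_scores, candidates)
print(is_valid_zero_shot_split(train_pairs, test_pairs), prediction)
```

In this toy case the unseen pair ("close", "door") is selected because both of its components score highest, which is the behaviour compositional generalization asks for.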

Code & Models

Repositories

rongchangli/zscar_c2c (PyTorch, official)

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Medical Imaging and Analysis