Zero-shot Compositional Action Recognition with Neural Logic Constraints
Gefan Ye, Lin Li, Kexin Li, Jun Xiao, Long Chen

TL;DR
This paper introduces LogicCAR, a neural framework that incorporates symbolic logic constraints to improve zero-shot compositional action recognition. It explicitly models the compositional structure of verb-object pairs and the semantic hierarchy of primitives, leading to better generalization to unseen compositions.
Contribution
It proposes a novel logic-driven approach that embeds compositional and hierarchical constraints into neural networks for zero-shot action recognition.
Findings
Outperforms baseline methods on the Sth-com dataset
Effectively models compositional and hierarchical structures
Enhances reasoning capacity in zero-shot scenarios
Abstract
Zero-shot compositional action recognition (ZS-CAR) aims to identify unseen verb-object compositions in videos by exploiting knowledge of verb and object primitives learned during training. Despite progress in compositional learning for ZS-CAR, two critical challenges persist: 1) a missing compositional structure constraint, which leads to spurious correlations between primitives; 2) a neglected semantic hierarchy constraint, which leads to semantic ambiguity and impairs the training process. In this paper, we argue that human-like symbolic reasoning offers a principled solution to these challenges by explicitly modeling compositional and hierarchical structured abstraction. To this end, we propose a logic-driven ZS-CAR framework, LogicCAR, that integrates dual symbolic constraints: Explicit Compositional Logic and Hierarchical Primitive Logic. Specifically, the former models the…
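To make the ZS-CAR task setup concrete, the following is a minimal sketch of a naive baseline that scores every verb-object composition as the product of independent verb and object probabilities. It is not LogicCAR and does not implement either logic constraint; the primitives, the seen/unseen split, the probabilities, and the helper name `score_compositions` are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical primitives and splits (illustration only, not from the paper).
verbs = ["open", "close"]
objects = ["door", "bottle"]
seen = {("open", "door"), ("close", "bottle")}   # compositions available in training
unseen = set(product(verbs, objects)) - seen     # compositions held out for zero-shot testing


def score_compositions(verb_probs, object_probs):
    """Naive independent baseline: score(v, o) = p(v) * p(o)."""
    return {
        (v, o): verb_probs[v] * object_probs[o]
        for v, o in product(verbs, objects)
    }


# Made-up primitive predictions for a single video.
verb_probs = {"open": 0.7, "close": 0.3}
object_probs = {"door": 0.2, "bottle": 0.8}

scores = score_compositions(verb_probs, object_probs)
prediction = max(scores, key=scores.get)  # highest-scoring composition
print(prediction, prediction in unseen)
```

Because this baseline treats verbs and objects as independent, it has no mechanism for ruling out implausible pairs or for using relations between primitives, which is the kind of gap the abstract attributes to missing compositional and hierarchical constraints.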
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Action Observation and Synchronization
