The impact of Compositionality in Zero-shot Multi-label action recognition for Object-based tasks
Carmela Calabrese, Stefano Berti, Giulia Pasquale, Lorenzo Natale

TL;DR
This paper introduces Dual-VCLIP, a simple yet effective zero-shot multi-label action recognition method for videos, enhancing robotic understanding of object-based actions with minimal training prompts.
Contribution
The paper presents Dual-VCLIP, which combines VCLIP and DualCoOp for improved zero-shot multi-label action recognition with only two learned prompts, validated on the Charades dataset.
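To make the idea of "two learned prompts" concrete, here is a minimal sketch of dual-prompt multi-label scoring. It assumes the two prompts act as a positive and a negative context per class, in the style of DualCoOp, and that a class counts as present when the video is closer to its positive prompt than to its negative one. The function names, the toy 3-dimensional embeddings, and the zero threshold are all hypothetical illustrations, not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def dual_prompt_scores(video_feat, pos_text, neg_text):
    """Per-class score: sim(video, positive prompt) - sim(video, negative prompt).

    Assumption: pos_text / neg_text map each class label to the text embedding
    produced by the positive / negative prompt for that class.
    """
    return {
        label: cosine(video_feat, pos_text[label]) - cosine(video_feat, neg_text[label])
        for label in pos_text
    }

def predict_labels(scores, threshold=0.0):
    """Multi-label decision: keep every class whose score exceeds the threshold."""
    return sorted(label for label, s in scores.items() if s > threshold)

# Toy embeddings, made up for illustration only.
video_feat = [1.0, 1.0, 0.0]
pos_text = {
    "hold cup": [1.0, 0.0, 0.0],
    "open door": [0.0, 0.0, 1.0],
    "put book": [0.0, 1.0, 0.0],
}
neg_text = {
    "hold cup": [0.0, 0.0, 1.0],
    "open door": [1.0, 0.0, 0.0],
    "put book": [0.0, 0.0, 1.0],
}

scores = dual_prompt_scores(video_feat, pos_text, neg_text)
labels = predict_labels(scores)
print(labels)
```

Because each class is scored independently, several actions can be predicted for the same video, which is what distinguishes the multi-label setting from single-label classification. A real system would obtain the embeddings from a video-text encoder rather than from hand-written vectors.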
Findings
Performs favorably compared to existing methods on the Charades dataset.
Shows promising results on unseen actions.
Highlights the importance of verb-object class-splits in training.
Abstract
Addressing multi-label action recognition in videos represents a significant challenge for robotic applications in dynamic environments, especially when the robot is required to cooperate with humans in tasks that involve objects. Existing methods still struggle to recognize unseen actions or require extensive training data. To overcome these problems, we propose Dual-VCLIP, a unified approach for zero-shot multi-label action recognition. Dual-VCLIP enhances VCLIP, a zero-shot action recognition method, with the DualCoOp method for multi-label image classification. The strength of our method is that at training time it only learns two prompts, and it is therefore much simpler than other methods. We validate our method on the Charades dataset that includes a majority of object-based actions, demonstrating that -- despite its simplicity -- our method performs favorably with respect to…
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications
