Disentangled Action Recognition with Knowledge Bases
Zhekun Luo, Shalini Ghosh, Devin Guillory, Keizo Kato, Trevor Darrell, Huijuan Xu

TL;DR
This paper introduces DARK, a scalable, knowledge graph-based model for action recognition that generalizes well to unseen verb-noun combinations by disentangling features and leveraging external knowledge bases.
Contribution
DARK is a novel approach that factorizes action features and uses knowledge graphs to improve compositional action recognition, addressing scalability issues of previous methods.
Findings
Achieves state-of-the-art performance on the Charades dataset.
Demonstrates improved generalization to unseen verb-noun combinations.
Proposes a new large-scale benchmark based on the Epic-kitchen dataset.
Abstract
Actions in video usually involve the interaction of humans with objects. Action labels are typically composed of various combinations of verbs and nouns, but we may not have training data for all possible combinations. In this paper, we aim to improve the generalization ability of the compositional action recognition model to novel verbs or novel nouns that are unseen during training, by leveraging the power of knowledge graphs. Previous work utilizes verb-noun compositional action nodes in the knowledge graph, making it inefficient to scale, since the number of compositional action nodes grows quadratically with respect to the number of verbs and nouns. To address this issue, we propose our approach: Disentangled Action Recognition with Knowledge-bases (DARK), which leverages the inherent compositionality of actions. DARK trains a factorized model by first extracting disentangled…
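The scaling argument in the abstract can be made concrete with a small sketch. The Python below compares how many graph nodes are needed when every verb-noun pair gets its own node against a factorized layout with separate verb and noun nodes. The function names are hypothetical, and combining verb and noun scores by multiplication is an assumption made purely for illustration; the truncated abstract does not say how DARK composes its predictions.

```python
from itertools import product


def compositional_node_count(num_verbs: int, num_nouns: int) -> int:
    """Nodes needed if every verb-noun pair is its own graph node."""
    return num_verbs * num_nouns


def factorized_node_count(num_verbs: int, num_nouns: int) -> int:
    """Nodes needed if verbs and nouns are kept as separate nodes."""
    return num_verbs + num_nouns


def score_actions(verb_scores: dict, noun_scores: dict) -> dict:
    """Combine per-verb and per-noun scores into verb-noun action scores.

    Assumption: scores are multiplied. This is an illustrative choice,
    not the composition rule described in the paper.
    """
    return {
        (verb, noun): v_score * n_score
        for (verb, v_score), (noun, n_score) in product(
            verb_scores.items(), noun_scores.items()
        )
    }


if __name__ == "__main__":
    # Hypothetical vocabulary sizes, chosen only to show the growth rates.
    print(compositional_node_count(100, 300))  # 30000
    print(factorized_node_count(100, 300))     # 400

    actions = score_actions(
        {"open": 0.9, "close": 0.1},
        {"door": 0.8, "box": 0.2},
    )
    print(max(actions, key=actions.get))  # ('open', 'door')
```

A factorized layout can also score a verb-noun pair that never appeared together in training, as long as the verb and the noun each have their own score, which is the kind of generalization the abstract targets.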
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Multimodal Machine Learning Applications
