HOLa: Zero-Shot HOI Detection with Low-Rank Decomposed VLM Feature Adaptation
Qinqian Lei, Bo Wang, Robby T. Tan

TL;DR
HOLa is a zero-shot human-object interaction (HOI) detection method that applies low-rank decomposition to vision-language model (VLM) text features, improving both generalization to unseen classes and action distinction, and reaching an unseen-class mAP of 27.91 on HICO-DET.
Contribution
HOLa introduces a low-rank feature decomposition and weight adaptation framework for zero-shot HOI detection, enhancing unseen class generalization and action differentiation.
Findings
Achieves unseen-class mAP of 27.91 on HICO-DET
Sets new state-of-the-art in zero-shot HOI detection
Improves action distinction through human-object tokens and LLM-guided regularization
Abstract
Zero-shot human-object interaction (HOI) detection remains a challenging task, particularly in generalizing to unseen actions. Existing methods address this challenge by tapping Vision-Language Models (VLMs) to access knowledge beyond the training data. However, they either struggle to distinguish actions involving the same object or demonstrate limited generalization to unseen classes. In this paper, we introduce HOLa (Zero-Shot HOI Detection with Low-Rank Decomposed VLM Feature Adaptation), a novel approach that both enhances generalization to unseen classes and improves action distinction. In training, HOLa decomposes VLM text features for given HOI classes via low-rank factorization, producing class-shared basis features and adaptable weights. These features and weights form a compact HOI representation that preserves shared information across classes, enhancing generalization to…
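The low-rank factorization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain truncated SVD as a stand-in for whatever factorization HOLa actually trains, and it uses synthetic random vectors in place of real VLM text features. The function name `low_rank_decompose` and all sizes are hypothetical, chosen only for illustration.

```python
import numpy as np


def low_rank_decompose(text_features, rank):
    """Factor a (num_classes, dim) matrix into weights @ basis.

    Assumption: truncated SVD stands in for the paper's factorization.
    Returns per-class weights (num_classes, rank) and class-shared
    basis features (rank, dim).
    """
    u, s, vt = np.linalg.svd(text_features, full_matrices=False)
    weights = u[:, :rank] * s[:rank]  # per-class coefficients
    basis = vt[:rank]                 # rows are shared basis features
    return weights, basis


# Synthetic stand-in for VLM text features of HOI classes (hypothetical sizes).
rng = np.random.default_rng(0)
num_classes, dim, rank = 12, 32, 4
text_features = (
    rng.normal(size=(num_classes, rank)) @ rng.normal(size=(rank, dim))
    + 0.01 * rng.normal(size=(num_classes, dim))  # small noise
)

weights, basis = low_rank_decompose(text_features, rank)
reconstruction = weights @ basis
rel_error = np.linalg.norm(text_features - reconstruction) / np.linalg.norm(text_features)
print(weights.shape, basis.shape)
```

In this toy setting every class is expressed through the same `rank` basis rows, and only the small `weights` matrix differs per class, which is the sense in which the representation is compact and class-shared.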
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
