Mitigating Long-Tail Bias in HOI Detection via Adaptive Diversity Cache
Yuqiu Jiang, Xiaozhen Qiao, Yifan Chen, Ye Zheng, Zhe Sun, Xuelong Li

TL;DR
This paper introduces the Adaptive Diversity Cache (ADC), a training-free, plug-and-play module that mitigates long-tail bias in HOI detection by building class-specific feature caches during inference, improving recognition of rare interaction categories.
Contribution
The paper proposes ADC, a novel training-free mechanism that enhances HOI detection by addressing long-tail bias through adaptive feature caching and augmentation.
Findings
ADC improves rare category detection in HOI tasks.
ADC enhances existing detectors without additional training.
Experimental results on HICO-DET and V-COCO validate the effectiveness of ADC.
Abstract
Human-Object Interaction (HOI) detection is a fundamental task in computer vision, empowering machines to comprehend human-object relationships in diverse real-world scenarios. Recent advances in vision-language models (VLMs) have significantly improved HOI detection by leveraging rich cross-modal representations. However, most existing VLM-based approaches rely heavily on additional training or prompt tuning, resulting in substantial computational overhead and limited scalability, particularly in long-tailed scenarios where rare interactions are severely underrepresented. In this paper, we propose the Adaptive Diversity Cache (ADC) module, a novel training-free and plug-and-play mechanism designed to mitigate long-tail bias in HOI detection. ADC constructs class-specific caches that accumulate high-confidence and diverse feature representations during inference. The method incorporates adaptive capacity…
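To make the idea of a class-specific cache concrete, here is a minimal sketch, not the authors' implementation. The abstract only says that caches accumulate high-confidence and diverse features during inference; the admission rule, the thresholds `min_conf` and `max_sim`, the fixed `capacity` (the paper's adaptive capacity is not described in this excerpt), and the mean-similarity `score` are all assumptions made for illustration, as are the class and method names.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class DiversityCache:
    """Hypothetical per-class feature cache (illustrative, not the paper's ADC)."""

    def __init__(self, capacity=3, min_conf=0.5, max_sim=0.95):
        self.capacity = capacity  # assumed fixed here; the paper adapts it
        self.min_conf = min_conf  # assumed confidence threshold
        self.max_sim = max_sim    # assumed redundancy threshold
        self.entries = defaultdict(list)  # class_id -> [(confidence, feature)]

    def add(self, class_id, feature, confidence):
        """Try to cache a feature; return True if it was stored."""
        if confidence < self.min_conf:
            return False  # keep only high-confidence features
        slot = self.entries[class_id]
        if any(cosine(feature, f) > self.max_sim for _, f in slot):
            return False  # near-duplicate: adds no diversity
        if len(slot) < self.capacity:
            slot.append((confidence, feature))
            return True
        # Cache full: replace the least confident entry if the new one is better.
        weakest = min(range(len(slot)), key=lambda i: slot[i][0])
        if confidence > slot[weakest][0]:
            slot[weakest] = (confidence, feature)
            return True
        return False

    def score(self, class_id, feature):
        """Mean cosine similarity of a query feature to the cached entries."""
        slot = self.entries.get(class_id, [])
        if not slot:
            return 0.0
        return sum(cosine(feature, f) for _, f in slot) / len(slot)


cache = DiversityCache(capacity=2)
cache.add("ride_horse", [1.0, 0.0], 0.9)    # stored
cache.add("ride_horse", [1.0, 0.01], 0.95)  # rejected as a near-duplicate
cache.add("ride_horse", [0.0, 1.0], 0.8)    # stored: different direction
```

In a training-free setup of this kind, a score such as `cache.score(class_id, feature)` would typically be blended with the base detector's logits at inference time; how ADC actually combines them is not stated in the excerpt above.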
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
