Funnel-HOI: Top-Down Perception for Zero-Shot HOI Detection

Sandipan Sarma; Agney Talwarr; Arijit Sur

arXiv:2507.12628 · cs.CV · July 18, 2025

Open Access

TL;DR

Funnel-HOI introduces a top-down, encoder-focused approach with a novel co-attention mechanism for zero-shot human-object interaction detection, achieving state-of-the-art results on the HICO-DET and V-COCO datasets.

Contribution

The paper proposes a new top-down framework with an asymmetric co-attention mechanism and a novel loss for better scene understanding in human-object interaction detection (HOID), especially in zero-shot scenarios.

Findings

01. Achieves improvements of up to 12.4% and 8.4% on unseen and rare HOI categories, respectively.

02. Outperforms existing methods on the HICO-DET and V-COCO datasets.

03. Effective in both fully supervised and zero-shot settings.

Abstract

Human-object interaction detection (HOID) refers to localizing interactive human-object pairs in images and identifying the interactions. Since there could be an exponential number of object-action combinations, labeled data is limited, leading to a long-tail distribution problem. Recently, zero-shot learning has emerged as a solution, with end-to-end transformer-based object detectors adapted for HOID becoming successful frameworks. However, their primary focus is on designing improved decoders for learning entangled or disentangled interpretations of interactions. We advocate that HOI-specific cues must be anticipated at the encoder stage itself to obtain a stronger scene interpretation. Consequently, we build a top-down framework named Funnel-HOI, inspired by the human tendency to grasp well-defined concepts first and then associate them with abstract concepts during scene understanding. We…

Taxonomy

Topics: Radiation Detection and Scintillator Technologies · Infrared Target Detection Methodologies