PanoAffordanceNet: Towards Holistic Affordance Grounding in 360° Indoor Environments
Guoliang Zhu, Wanjun Jia, Caoyang Shao, Yuheng Zhang, Zhiyong Li, Kailun Yang

TL;DR
This paper introduces PanoAffordanceNet, a novel framework for holistic affordance grounding in 360-degree indoor environments, addressing geometric distortions and semantic challenges to improve scene perception for embodied agents.
Contribution
The paper presents a new end-to-end model with distortion-aware calibration and topological restoration, along with the first high-quality panoramic affordance dataset for indoor scenes.
Findings
PanoAffordanceNet outperforms existing methods in experiments.
The framework effectively suppresses semantic drift.
The new dataset enables better scene-level perception.
Abstract
Global perception is essential for embodied agents in 360° spaces, yet current affordance grounding remains largely object-centric and restricted to perspective views. To bridge this gap, we introduce a novel task: Holistic Affordance Grounding in 360° Indoor Environments. This task faces unique challenges, including severe geometric distortions from Equirectangular Projection (ERP), semantic dispersion, and cross-scale alignment difficulties. We propose PanoAffordanceNet, an end-to-end framework featuring a Distortion-Aware Spectral Modulator (DASM) for latitude-dependent calibration and an Omni-Spherical Densification Head (OSDH) to restore topological continuity from sparse activations. By integrating multi-level constraints comprising pixel-wise, distributional, and region-text contrastive objectives, our framework effectively suppresses semantic drift under low…
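To make the latitude dependence of ERP distortion concrete, the following is a minimal sketch of the standard geometry only; it is not the paper's DASM, and the function names are hypothetical. It assumes an ERP image whose rows span latitude +π/2 (top) to −π/2 (bottom), and uses the well-known fact that the solid angle covered by a pixel shrinks as cos(latitude), so content near the poles is stretched horizontally by roughly 1/cos(latitude).

```python
import math

def erp_row_latitudes(height):
    """Latitude in radians at the center of each ERP row, top (+pi/2) to bottom (-pi/2)."""
    return [math.pi / 2 - (row + 0.5) * math.pi / height for row in range(height)]

def erp_area_weights(height):
    """Relative solid angle per pixel in each row, proportional to cos(latitude)."""
    return [math.cos(lat) for lat in erp_row_latitudes(height)]

def horizontal_stretch(height):
    """Approximate horizontal stretch factor per row, 1 / cos(latitude).

    Row centers never sit exactly on a pole, so the division is safe.
    """
    return [1.0 / w for w in erp_area_weights(height)]

if __name__ == "__main__":
    # Illustrative only: print the per-row weights for a tiny 8-row panorama.
    for lat, w, s in zip(erp_row_latitudes(8), erp_area_weights(8), horizontal_stretch(8)):
        print(f"lat={math.degrees(lat):7.2f} deg  area_weight={w:.3f}  stretch={s:.3f}")
```

Rows near the equator get weights close to 1, while rows near the poles get small weights and large stretch factors, which is the kind of row-dependent behavior that a latitude-dependent calibration step has to account for.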
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Robot Manipulation and Learning
