Affostruction: 3D Affordance Grounding with Generative Reconstruction

Chunghyun Park; Seunghyeon Lee; Minsu Cho

arXiv:2601.09211 · cs.CV · April 14, 2026

TL;DR

This paper introduces Affostruction, a generative framework that reconstructs full object geometry from partial RGBD images to improve affordance grounding, outperforming existing methods on benchmarks.

Contribution

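The paper presents a generative approach that combines sparse voxel fusion of multi-view features, a flow-based formulation of affordance ambiguity, and affordance-guided active view selection, so that affordances are grounded on the complete 3D shape rather than only on visible surfaces.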
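
As a rough illustration of what sparse voxel fusion could look like, the sketch below averages per-point features from any number of views into a fixed voxel grid, so the fused representation is bounded by the grid rather than growing with the view count. This is a toy under stated assumptions and not the authors' implementation: the function name `fuse_views_to_voxels`, the mean pooling, the grid size, and the bounds are all hypothetical.

```python
import numpy as np

def fuse_views_to_voxels(points, feats, grid_size=64, bounds=(-0.5, 0.5)):
    """Average per-point features from all views into a sparse voxel set.

    points: (N, 3) world-space points unprojected from every RGBD view.
    feats:  (N, C) image features sampled at the corresponding pixels.
    Returns (coords, fused): (M, 3) occupied voxel indices and (M, C) mean features.
    All names and defaults here are illustrative assumptions.
    """
    points = np.asarray(points, dtype=np.float64)
    feats = np.asarray(feats, dtype=np.float64)
    lo, hi = bounds
    # Map continuous coordinates to integer voxel indices.
    idx = np.floor((points - lo) / (hi - lo) * grid_size).astype(np.int64)
    # Drop points that fall outside the grid.
    inside = np.all((idx >= 0) & (idx < grid_size), axis=1)
    idx, feats = idx[inside], feats[inside]
    # Group points that land in the same voxel, regardless of source view.
    coords, inverse = np.unique(idx, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    sums = np.zeros((len(coords), feats.shape[1]), dtype=np.float64)
    np.add.at(sums, inverse, feats)
    counts = np.bincount(inverse, minlength=len(coords))
    return coords, sums / counts[:, None]
```

Because only occupied voxels are stored, adding more views changes the per-voxel averages but the output can never exceed `grid_size ** 3` entries.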

Findings

01. Achieves 19.1 aIoU on affordance grounding

02. Attains 32.67 IoU for 3D reconstruction

03. Outperforms existing methods by large margins
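
For readers unfamiliar with the metrics, here is a minimal sketch of IoU on binary masks or occupancy grids, plus one convention for an averaged IoU that binarizes soft scores at several thresholds and averages the results. The threshold sweep and the function names are assumptions for illustration; the evaluation protocol behind the numbers above is not specified on this page.

```python
import numpy as np

def binary_iou(pred, target):
    """Intersection over union of two boolean arrays of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty; treat this as perfect agreement (an assumption).
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)

def threshold_averaged_iou(scores, target, thresholds=None):
    """Mean IoU over several binarization thresholds (hypothetical aIoU variant).

    scores: per-point or per-voxel affordance scores in [0, 1].
    target: ground-truth boolean mask of the same shape.
    """
    if thresholds is None:
        # Assumed sweep: 0.00, 0.01, ..., 0.99.
        thresholds = np.arange(0.0, 1.0, 0.01)
    scores = np.asarray(scores, dtype=np.float64)
    ious = [binary_iou(scores >= t, target) for t in thresholds]
    return float(np.mean(ious))
```

Scores reported as percentages, such as 32.67, would correspond to multiplying these values by 100.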

Abstract

This paper addresses the problem of affordance grounding from RGBD images of an object, which aims to localize surface regions corresponding to a text query that describes an action on the object. While existing methods predict affordance regions only on visible surfaces, we propose Affostruction, a generative framework that reconstructs complete object geometry from partial RGBD observations and grounds affordances on the full shape including unobserved regions. Our approach introduces sparse voxel fusion of multi-view features for constant-complexity generative reconstruction, a flow-based formulation that captures the inherent ambiguity of affordance distributions, and an active view selection strategy guided by predicted affordances. Affostruction outperforms existing methods by large margins on challenging benchmarks, achieving 19.1 aIoU on affordance grounding and 32.67 IoU for 3D…

Code & Models

Hugging Face model: chrockey/Affostruction
