Learning Affordance Grounding from Exocentric Images

Hongchen Luo; Wei Zhai; Jing Zhang; Yang Cao; Dacheng Tao

arXiv:2203.09905·cs.CV·March 21, 2022

TL;DR

This paper introduces a cross-view framework for affordance grounding that transfers knowledge from exocentric human-object interactions to egocentric object images, improving localization accuracy while using only affordance labels as supervision.

Contribution

It proposes a new task and a cross-view knowledge transfer method for affordance grounding, leveraging exocentric interactions to enhance egocentric affordance perception.

Findings

01 Outperforms existing models on objective metrics

02 Constructed a large-scale AGD20K dataset with 20K images

03 Effectively localizes affordance regions using minimal supervision

Abstract

Affordance grounding, the task of grounding (i.e., localizing) action possibility regions in objects, faces the challenge of establishing an explicit link with object parts due to the diversity of interactive affordances. Humans have the ability to transform various exocentric interactions into invariant egocentric affordances, countering the impact of interactive diversity. To empower an agent with such an ability, this paper proposes the task of affordance grounding from the exocentric view: given exocentric human-object interaction images and egocentric object images, the agent learns the affordance knowledge of the object and transfers it to the egocentric image using only the affordance label as supervision. To this end, we devise a cross-view knowledge transfer framework that extracts affordance-specific features from exocentric interactions and enhances the perception of affordance regions…

Taxonomy

Topics: Robot Manipulation and Learning · Advanced Vision and Imaging · Human Pose and Action Recognition