Fine-grained Affordance Annotation for Egocentric Hand-Object Interaction Videos
Zecheng Yu, Yifei Huang, Ryosuke Furuta, Takuma Yagi, Yusuke Goutsu, Yoichi Sato

TL;DR
This paper introduces a new annotation scheme for object affordance in egocentric videos that addresses definitional problems in existing datasets, and shows that it improves affordance recognition and cross-domain generalization.
Contribution
It proposes a novel annotation scheme combining goal-irrelevant actions and grasp types, and applies it to enhance affordance understanding in egocentric datasets.
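As a rough illustration of what a label under such a scheme might look like, the sketch below pairs a goal-irrelevant motor action with a grasp type and attaches the pair to a video segment. All class names, field names, and example values ("pull", "power", "video_0001") are hypothetical placeholders chosen for illustration; they are not the paper's actual label vocabulary or file format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AffordanceLabel:
    """Hypothetical affordance label: a motor action combined with a grasp type."""
    motor_action: str  # goal-irrelevant motor action (placeholder value, e.g. "pull")
    grasp_type: str    # grasp type (placeholder value, e.g. "power")

    def as_string(self) -> str:
        # Join the two components into a single class name.
        return f"{self.motor_action}+{self.grasp_type}"


@dataclass(frozen=True)
class SegmentAnnotation:
    """Hypothetical annotation record for one hand-object interaction segment."""
    video_id: str
    start_frame: int
    end_frame: int
    affordance: AffordanceLabel
    # Optional object-object action possibility ("mechanical action" in the abstract).
    mechanical_action: Optional[str] = None


# Example record with placeholder values only.
ann = SegmentAnnotation(
    video_id="video_0001",
    start_frame=120,
    end_frame=180,
    affordance=AffordanceLabel(motor_action="pull", grasp_type="power"),
)
print(ann.affordance.as_string())
```

Keeping the motor action and grasp type as separate fields, rather than a single fused string, would let a model be trained or evaluated on either component alone as well as on the combination.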
Findings
Models trained with the new annotations distinguish affordance from other concepts.
They predict fine-grained interaction hotspots more accurately.
They generalize better across domains.
Abstract
Object affordance is an important concept in hand-object interaction: it provides information on action possibilities based on human motor capacity and objects' physical properties, thus benefiting tasks such as action anticipation and robot imitation learning. However, definitions of affordance in existing datasets often 1) mix up affordance with object functionality; 2) confuse affordance with goal-related action; and 3) ignore human motor capacity. This paper proposes an efficient annotation scheme to address these issues by combining goal-irrelevant motor actions and grasp types as affordance labels and introducing the concept of mechanical action to represent the action possibilities between two objects. We provide new annotations by applying this scheme to the EPIC-KITCHENS dataset and test our annotation with tasks such as affordance recognition, hand-object interaction hotspots…
Videos
Fine-grained Affordance Annotation for Egocentric Hand-Object Interaction Videos (YouTube)
Taxonomy
Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Action Observation and Synchronization
Methods: Test
