Multi-label affordance mapping from egocentric vision
Lorenzo Mur-Labadia, Jose J. Guerrero, Ruben Martinez-Cantin

TL;DR
This paper introduces a novel multi-label affordance segmentation method from egocentric videos, creating a comprehensive dataset and enabling spatial mapping and task-oriented navigation in interaction-rich environments.
Contribution
It presents a new approach for pixel-level multi-label affordance detection, a large annotated dataset, and a method for spatial mapping and navigation based on affordance hotspots.
Findings
Multi-label detection improves affordance segmentation accuracy.
EPIC-Aff is presented as the largest affordance dataset, with interaction-grounded, multi-label, metric and spatial annotations.
Spatial affordance maps enable task-oriented navigation.
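To make the idea of affordance hotspots concrete, here is a minimal sketch that bins interaction locations into a top-down grid and picks the busiest cell as a navigation target. It is an illustration only and not the paper's pipeline: the function names `hotspot_grid` and `strongest_hotspot`, the 2D floor-plane coordinates, and the toy points are all assumptions.

```python
import numpy as np

def hotspot_grid(points_xy, bounds, cells):
    """Count interaction points per cell of a top-down grid (hypothetical helper).

    points_xy: (N, 2) array of floor-plane positions, in metres.
    bounds:    ((xmin, xmax), (ymin, ymax)) extent of the mapped area.
    cells:     number of grid cells along each axis.
    """
    (xmin, xmax), (ymin, ymax) = bounds
    counts, _, _ = np.histogram2d(
        points_xy[:, 0], points_xy[:, 1],
        bins=cells, range=[[xmin, xmax], [ymin, ymax]],
    )
    return counts

def strongest_hotspot(counts):
    """Return the (row, col) index of the cell with the most interactions."""
    row, col = np.unravel_index(np.argmax(counts), counts.shape)
    return int(row), int(col)

# Toy data (assumed): three observed interactions for one affordance, e.g. "cut".
points = np.array([[0.2, 0.3], [0.25, 0.35], [1.7, 1.8]])
grid = hotspot_grid(points, bounds=((0.0, 2.0), (0.0, 2.0)), cells=2)
goal_cell = strongest_hotspot(grid)
```

A planner could then treat `goal_cell` as the target for a task that requires that affordance. A real system would more likely keep one grid per affordance class and weight cells by detection confidence.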
Abstract
Accurate affordance detection and segmentation with pixel precision is an important piece in many complex systems based on interactions, such as robots and assistive devices. We present a new approach to affordance perception which enables accurate multi-label segmentation. Our approach can be used to automatically extract grounded affordances from first-person videos of interactions using a 3D map of the environment, providing pixel-level precision for the affordance location. We use this method to build EPIC-Aff, the largest and most complete dataset on affordances, based on the EPIC-Kitchens dataset; it provides interaction-grounded, multi-label, metric and spatial affordance annotations. Then, we propose a new approach to affordance segmentation based on multi-label detection, which enables multiple affordances to co-exist in the same space, for example if they are associated with…
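The difference between multi-label and single-label segmentation can be sketched in a few lines. This is a generic illustration, not the authors' model: the per-class sigmoid head, the 0.5 threshold, the two affordance classes, and the toy logits are all assumptions.

```python
import numpy as np

def multilabel_masks(logits, threshold=0.5):
    """Independent sigmoid per class: a pixel may carry several affordances.

    logits: (C, H, W) array of per-class scores.
    Returns a boolean (C, H, W) array, one binary mask per class.
    """
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs >= threshold

def single_label_map(logits):
    """Argmax over classes: every pixel is forced into exactly one class."""
    return np.argmax(logits, axis=0)

# Toy logits (assumed): 2 affordance classes on a 1x2 image.
logits = np.array([
    [[2.0, -3.0]],   # class 0, e.g. "cut"
    [[1.5, -2.0]],   # class 1, e.g. "wash"
])

masks = multilabel_masks(logits)    # left pixel: both classes; right pixel: none
labels = single_label_map(logits)   # left pixel: class 0; right pixel: class 1
```

In the multi-label output the left pixel keeps both affordances and the right pixel keeps none. The argmax map must commit each pixel to a single class, which is the limitation that multi-label detection avoids.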
Videos
Multi-label Affordance Mapping from Egocentric Vision · YouTube
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Robot Manipulation and Learning · Human Pose and Action Recognition
