Object-Centric Action-Enhanced Representations for Robot Visuo-Motor Policy Learning
Nikos Giannakakis, Argyris Manetas, Panagiotis P. Filntisis, Petros Maragos, George Retsinas

TL;DR
This paper introduces an object-centric encoder using Slot Attention and pretrained models to improve robot visuo-motor learning by integrating semantic segmentation and visual representation, reducing the need for robot-specific datasets.
Contribution
The work presents a novel integrated object-centric encoder that combines semantic segmentation with visual representation learning, leveraging pretrained models and fine-tuning on human action videos.
Findings
Models pretrained on out-of-domain datasets benefit robot learning.
Fine-tuning on human action videos improves performance.
Integrated segmentation and encoding enhance reinforcement and imitation learning.
Abstract
Learning visual representations from observing actions to benefit robot visuo-motor policy generation is a promising direction that closely resembles human cognitive function and perception. Motivated by this, and further inspired by psychological theories suggesting that humans process scenes in an object-based fashion, we propose an object-centric encoder that performs semantic segmentation and visual representation generation in a coupled manner, unlike other works, which treat these as separate processes. To achieve this, we leverage the Slot Attention mechanism and use the SOLV model, pretrained on large out-of-domain datasets, to bootstrap fine-tuning on human action video data. Through simulated robotic tasks, we demonstrate that visual representations can enhance reinforcement and imitation learning training, highlighting the effectiveness of our integrated approach for semantic…
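To make the coupling mentioned in the abstract more concrete, here is a minimal NumPy sketch of the generic Slot Attention update loop. It is not the authors' encoder or the SOLV model: the projection matrices are random stand-ins for learned weights, the GRU and MLP update of the original Slot Attention formulation is replaced by a plain weighted mean, and the function name `slot_attention` and its parameters are hypothetical. The point it illustrates is that a single attention map yields both a soft assignment of input features to slots (a segmentation-like output) and the slot vectors themselves (an object-centric representation).

```python
import numpy as np


def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def slot_attention(inputs, num_slots=4, iters=3, seed=0, eps=1e-8):
    """Simplified Slot Attention over `inputs` of shape (N, D).

    Returns (slots, attn) with shapes (num_slots, D) and (N, num_slots).
    """
    rng = np.random.default_rng(seed)
    n, d = inputs.shape
    # Assumption: random projections stand in for learned query/key/value weights.
    w_q, w_k, w_v = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))
    slots = rng.normal(size=(num_slots, d))
    k, v = inputs @ w_k, inputs @ w_v
    for _ in range(iters):
        q = slots @ w_q
        logits = k @ q.T / np.sqrt(d)  # (N, num_slots)
        # Softmax over the slot axis: slots compete for each input feature.
        attn = softmax(logits, axis=1)
        # Normalize over inputs so each slot takes a weighted mean of values.
        weights = attn / (attn.sum(axis=0, keepdims=True) + eps)
        # Simplification: direct assignment instead of a GRU + MLP update.
        slots = weights.T @ v
    return slots, attn


# Usage: 16 feature vectors of dimension 8, grouped into 4 slots.
features = np.random.default_rng(1).normal(size=(16, 8))
slots, attn = slot_attention(features, num_slots=4)
segment_ids = attn.argmax(axis=1)  # hard per-feature slot assignment
```

In this sketch `attn` plays the role of a soft segmentation over input features, while `slots` is the compact representation a downstream policy could consume; how the paper's encoder actually exposes these outputs is not specified in the text above.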
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Softmax · Attention Is All You Need
