R3L: Relative Representations for Reinforcement Learning

Antonio Pio Ricciardi; Valentino Maiorca; Luca Moschella; Riccardo Marin; Emanuele Rodolà

arXiv:2404.12917·cs.LG·February 19, 2025

Open Access

TL;DR

This paper introduces R3L, a framework that uses relative representations to enable reinforcement learning agents to adapt to new visual and task variations without retraining, reducing resource use.

Contribution

It adapts the relative representations framework to visual reinforcement learning, allowing for zero-shot generalization across different visual and task domains.

Findings

01 Enables agents to handle unseen visual-task pairs effectively

02 Reduces retraining time and computational resources

03 Demonstrates improved generalization in visual RL environments

Abstract

Visual Reinforcement Learning is a popular and powerful framework that takes full advantage of the Deep Learning breakthrough. It is known that variations in input domains (e.g., different panorama colors due to seasonal changes) or task domains (e.g., altering the target speed of a car) can disrupt agent performance, necessitating new training for each variation. Recent advancements in the field of representation learning have demonstrated the possibility of combining components from different neural networks to create new models in a zero-shot fashion. In this paper, we build upon relative representations, a framework that maps encoder embeddings to a universal space. We adapt this framework to the Visual Reinforcement Learning setting, allowing components of different agents to be combined into new agents capable of effectively handling novel visual-task pairs not encountered during training.…
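To make the idea of a "universal space" concrete, here is a minimal NumPy sketch of a relative representation. It assumes the cosine-similarity formulation of the original relative representations framework, in which each embedding is re-expressed as its similarities to a fixed set of anchor embeddings. The function name `relative_projection`, the array sizes, and the toy "encoder B" (encoder A's latents under a random orthogonal transform and rescaling) are illustrative assumptions, not details taken from this page or the paper's implementation.

```python
import numpy as np


def relative_projection(embeddings, anchors, eps=1e-12):
    """Re-express each embedding as its cosine similarity to every anchor.

    embeddings: (n, d) array of encoder outputs
    anchors:    (k, d) array of encoder outputs for k fixed anchor samples
    returns:    (n, k) array with entries in [-1, 1]
    """
    e = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + eps)
    a = anchors / (np.linalg.norm(anchors, axis=1, keepdims=True) + eps)
    return e @ a.T


rng = np.random.default_rng(0)
dim, n_anchors, n_obs = 16, 8, 5

# Hypothetical latents from "encoder A" for the anchor frames and new observations.
anchors_a = rng.normal(size=(n_anchors, dim))
obs_a = rng.normal(size=(n_obs, dim))

# Toy stand-in for "encoder B": the same latents, rotated and rescaled.
# Real encoders trained on different visual variations only approximately
# differ by such a transformation; this is an idealized assumption.
q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))  # random orthogonal matrix
scale = 3.0
anchors_b = scale * anchors_a @ q
obs_b = scale * obs_a @ q

rel_a = relative_projection(obs_a, anchors_a)
rel_b = relative_projection(obs_b, anchors_b)

# Largest elementwise difference between the two relative representations.
max_gap = float(np.abs(rel_a - rel_b).max())
print(rel_a.shape, max_gap)
```

In this idealized setting the two relative representations agree up to floating-point error, even though the absolute latents differ. That property is what would let a downstream component trained on one encoder's relative space accept another encoder's output without retraining, provided both encoders use corresponding anchor samples.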

Taxonomy

Topics: Face recognition and analysis · Face and Expression Recognition · Hand Gesture Recognition Systems

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings