# Learning Sparse Control Tasks from Pixels by Latent Nearest-Neighbor-Guided Explorations

**Authors:** Ruihan Zhao, Ufuk Topcu, Sandeep Chinchali, Mariano Phielipp

arXiv: 2302.14242 · 2023-03-01

## TL;DR

This paper presents a method for training vision-based reinforcement learning agents on sparse-reward manipulation tasks, using a few demonstrations to guide exploration in a learned latent space. It reports state-of-the-art sample efficiency in simulation and efficient training of a real Franka Emika Panda manipulator.

## Contribution

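The method combines three components: an embedded neural dynamics model learned from demonstration transitions and fine-tuned with the replay buffer, a reward that keeps the agent close to the demonstrated trajectories under a distance metric defined in the embedding space, and an off-policy, model-free vision RL algorithm that updates the control policy.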

## Key findings

- State-of-the-art sample efficiency on simulated sparse-reward manipulation tasks
- Efficient training of a real Franka Emika Panda manipulator
- Exploration guided by a few demonstration trajectories, using image observations only

## Abstract

Recent progress in deep reinforcement learning (RL) and computer vision enables artificial agents to solve complex tasks, including locomotion, manipulation, and video games, from high-dimensional pixel observations. However, domain-specific reward functions are often engineered to provide sufficient learning signals, which requires expert knowledge. While it is possible to train vision-based RL agents using only sparse rewards, additional challenges in exploration arise. We present a novel and efficient method to solve sparse-reward robot manipulation tasks from only image observations by utilizing a few demonstrations. First, we learn an embedded neural dynamics model from demonstration transitions and further fine-tune it with the replay buffer. Next, we reward the agents for staying close to the demonstrated trajectories using a distance metric defined in the embedding space. Finally, we use an off-policy, model-free vision RL algorithm to update the control policies. Our method achieves state-of-the-art sample efficiency in simulation and enables efficient training of a real Franka Emika Panda manipulator.
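
The second step of the abstract, rewarding the agent for staying close to demonstrated trajectories in the embedding space, can be illustrated with a minimal sketch. This summary does not specify the paper's distance metric, encoder, or reward scaling, so everything below is an assumption made for illustration: Euclidean distance stands in for the metric, the latents are hand-written 2-D tuples rather than encoder outputs, and the names `nearest_demo_distance`, `shaped_reward`, and `scale` are hypothetical.

```python
import math


def nearest_demo_distance(z, demo_latents):
    """Distance from latent z to its nearest demonstration latent.

    Assumption: Euclidean distance; the paper's actual metric may differ.
    """
    return min(math.dist(z, d) for d in demo_latents)


def shaped_reward(sparse_reward, z, demo_latents, scale=0.1):
    """Sparse task reward minus a penalty for drifting away from the demos.

    `scale` is a hypothetical weighting term, not a value from the paper.
    """
    return sparse_reward - scale * nearest_demo_distance(z, demo_latents)


# Toy 2-D latents standing in for embeddings of demonstration frames.
demo_latents = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

# A state lying on the demonstrated path incurs no penalty.
on_path = shaped_reward(0.0, (1.0, 0.0), demo_latents)

# A state far from every demonstration latent is penalized.
off_path = shaped_reward(0.0, (1.0, 3.0), demo_latents)
```

In this toy setup, a state whose latent coincides with a demonstration latent keeps its sparse reward unchanged, while a state three units away from the nearest demonstration latent receives a lower reward. In a full pipeline, a shaped reward of this kind would be attached to transitions before they are used by the off-policy learner.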

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.14242/full.md

## Figures

25 figures with captions in the complete paper: https://tomesphere.com/paper/2302.14242/full.md

## References

30 references; the full list is in the complete paper: https://tomesphere.com/paper/2302.14242/full.md

---
Source: https://tomesphere.com/paper/2302.14242