Accelerating Residual Reinforcement Learning with Uncertainty Estimation

Lakshita Dodeja; Karl Schmeckpeper; Shivam Vats; Thomas Weng; Mingxi Jia; George Konidaris; Stefanie Tellex

arXiv:2506.17564 · cs.LG · March 16, 2026

TL;DR

This paper introduces uncertainty-aware residual reinforcement learning techniques that improve sample efficiency and handle stochastic base policies, demonstrated through simulation benchmarks and real-world deployment.

Contribution

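The paper proposes two improvements to Residual RL: using the base policy's uncertainty estimates to focus exploration on regions where the base policy is not confident, and a modification to off-policy residual learning that lets the residual policy observe base actions, so that it better handles stochastic base policies.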

Findings

01. Significantly outperforms existing residual RL baselines.

02. Effective in both simulation and real-world tasks.

03. Handles stochastic base policies better than prior methods.

Abstract

Residual Reinforcement Learning (RL) is a popular approach for adapting pretrained policies by learning a lightweight residual policy that provides corrective actions. While Residual RL is more sample-efficient than finetuning the entire base policy, existing methods struggle with sparse rewards and are designed for deterministic base policies. We propose two improvements to Residual RL that further enhance its sample efficiency and make it suitable for stochastic base policies. First, we leverage uncertainty estimates of the base policy to focus exploration on regions in which the base policy is not confident. Second, we propose a simple modification to off-policy residual learning that allows it to observe base actions and better handle stochastic base policies. We evaluate our method with both Gaussian-based and Diffusion-based stochastic base policies on tasks from Robosuite and…
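
The two ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the abstract does not say how uncertainty is estimated or how it gates exploration. The sketch assumes a Gaussian base policy whose mean standard deviation serves as the uncertainty estimate, a hypothetical `threshold` below which the residual is switched off, and a hypothetical `scale` on the correction. The names `residual_input` and `combined_action` are illustrative only.

```python
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def residual_input(state: Sequence[float], base_action: Sequence[float]) -> Vector:
    """Concatenate the state with the sampled base action.

    Assumption: the residual policy "observes base actions" by receiving
    them as extra input features alongside the state.
    """
    return list(state) + list(base_action)


def combined_action(
    state: Sequence[float],
    base_mean: Sequence[float],
    base_std: Sequence[float],
    residual_policy: Callable[[Vector], Vector],
    threshold: float,
    rng: random.Random,
    scale: float = 0.1,
) -> Tuple[Vector, Vector, bool]:
    """Return (executed action, sampled base action, residual_used)."""
    # Sample from the stochastic (Gaussian) base policy.
    base_action = [rng.gauss(m, s) for m, s in zip(base_mean, base_std)]

    # Assumed uncertainty estimate: mean per-dimension standard deviation.
    uncertainty = sum(base_std) / len(base_std)

    # Where the base policy is confident, execute its action unchanged.
    if uncertainty <= threshold:
        return list(base_action), base_action, False

    # Otherwise add a scaled correction that is conditioned on the
    # base action that was actually sampled.
    correction = residual_policy(residual_input(state, base_action))
    action = [a + scale * c for a, c in zip(base_action, correction)]
    return action, base_action, True
```

For example, calling `combined_action(state, mean, std, policy, threshold=0.05, rng=random.Random(0))` with every entry of `std` equal to zero returns the base mean unchanged, while a larger `std` routes the sampled base action through `policy` before the correction is added. Conditioning the residual on the sampled base action matters for a stochastic base policy because the same state can yield different base actions, and the correction should depend on which one was drawn.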

Taxonomy

Methods: Balanced Selection · Focus