Confounding Robust Continuous Control via Automatic Reward Shaping

Mateo Juliani; Mingxuan Li; Elias Bareinboim

arXiv:2602.10305·cs.LG·February 12, 2026

TL;DR

This paper introduces a method that automatically learns reward shaping functions for continuous control from offline data that may be contaminated by unobserved confounders. It builds on causal inference techniques and reports strong results on benchmark tasks.

Contribution

The paper presents an approach that automatically learns confounding-robust reward shaping functions from offline data by combining the causal Bellman equation with Potential-Based Reward Shaping (PBRS).

Findings

01 Strong performance on continuous control benchmarks

02 Robustness to unobserved confounders demonstrated

03 A first causal-perspective approach to confounding-robust continuous control

Abstract

Reward shaping has been applied widely to accelerate Reinforcement Learning (RL) agents' training. However, a principled way of designing effective reward shaping functions, especially for complex continuous control problems, remains largely under-explored. In this work, we propose to automatically learn a reward shaping function for continuous control problems from offline datasets, potentially contaminated by unobserved confounding variables. Specifically, our method builds upon the recently proposed causal Bellman equation to learn a tight upper bound on the optimal state values, which is then used as the potentials in the Potential-Based Reward Shaping (PBRS) framework. Our proposed reward shaping algorithm is tested with Soft-Actor-Critic (SAC) on multiple commonly used continuous control benchmarks and exhibits strong performance guarantees under unobserved confounders. More…
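As a rough illustration of the PBRS step mentioned in the abstract, the sketch below applies the standard potential-based shaping rule r' = r + γΦ(s') − Φ(s). It is not the authors' implementation: `shaped_reward` and `toy_potential` are hypothetical names, the potential here is a hand-written placeholder rather than the learned upper bound on optimal state values, and treating the terminal potential as zero is an assumed convention.

```python
from typing import Callable, Sequence

State = Sequence[float]


def shaped_reward(
    reward: float,
    state: State,
    next_state: State,
    potential: Callable[[State], float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Standard potential-based shaping: r + gamma * phi(s') - phi(s)."""
    # Assumption: the potential of a terminal state is taken to be zero.
    next_phi = 0.0 if done else potential(next_state)
    return reward + gamma * next_phi - potential(state)


def toy_potential(state: State) -> float:
    """Hypothetical placeholder potential: negative distance of the first
    coordinate from the origin. The paper would instead plug in a learned
    upper bound on the optimal state value."""
    return -abs(state[0])


# Moving from x = 2.0 to x = 1.0 with environment reward 1.0:
# 1.0 + 0.5 * (-1.0) - (-2.0) = 2.5
print(shaped_reward(1.0, [2.0], [1.0], toy_potential, gamma=0.5))
```

In a training loop, a function like this would typically be applied to each transition before it is handed to the learner (SAC in the paper's experiments), leaving the environment's own reward signal untouched.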

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Neurological disorders and treatments