# Adaptive Scaling of Policy Constraints for Offline Reinforcement Learning

**Authors:** Tan Jing, Xiaorui Li, Chao Yao, Xiaojuan Ban, Yuetong Fang, Renjing Xu, Zhaolin Yuan

arXiv: 2508.19900 · 2026-04-30

## TL;DR

The paper introduces ASPC, a framework for offline RL that dynamically balances the RL objective against a behavior cloning (BC) constraint during training. This reduces the need for dataset-specific hyperparameter tuning while improving performance across diverse datasets.

## Contribution

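ASPC is a second-order differentiable framework that adaptively scales policy constraints, balancing RL and behavior cloning (BC) during training. The authors give a theoretical performance improvement guarantee and report that a single hyperparameter configuration outperforms methods tuned per dataset, with only minimal computational overhead.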

## Key findings

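- On 39 datasets across four D4RL domains, ASPC outperforms other adaptive constraint methods and state-of-the-art offline RL algorithms that require per-dataset tuning.
- ASPC uses a single hyperparameter configuration across all datasets.
- The authors state that the code will be released at https://github.com/Colin-Jing/ASPC.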

## Abstract

Offline reinforcement learning (RL) enables learning effective policies from fixed datasets without any environment interaction. Existing methods typically employ policy constraints to mitigate the distribution shift encountered during offline RL training. However, because the scale of the constraints varies across tasks and datasets of differing quality, existing methods must meticulously tune hyperparameters to match each dataset, which is time-consuming and often impractical. We propose Adaptive Scaling of Policy Constraints (ASPC), a second-order differentiable framework that dynamically balances RL and behavior cloning (BC) during training. We theoretically analyze its performance improvement guarantee. In experiments on 39 datasets across four D4RL domains, ASPC using a single hyperparameter configuration outperforms other adaptive constraint methods and state-of-the-art offline RL algorithms that require per-dataset tuning while incurring only minimal computational overhead. The code will be released at https://github.com/Colin-Jing/ASPC.
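To make concrete what "balancing RL and BC" means, here is a minimal sketch of a generic policy-constraint actor objective in which a scale `alpha` weights the RL term against the BC term. This is an illustrative assumption in the style of TD3+BC-like losses, not the ASPC update rule itself, which this summary does not spell out. The function name `constrained_actor_loss` and all of its parameters are hypothetical.

```python
import numpy as np

def constrained_actor_loss(q_values, policy_actions, dataset_actions, alpha):
    """Weighted sum of an RL term and a BC term (generic form, assumed here).

    q_values:        critic estimates Q(s, pi(s)), shape (batch,)
    policy_actions:  actions proposed by the policy, shape (batch, action_dim)
    dataset_actions: actions stored in the offline dataset, same shape
    alpha:           scale that trades the RL term off against the BC term
    """
    # RL term: maximizing Q is written as minimizing its negative mean.
    rl_term = -np.mean(q_values)
    # BC term: mean squared distance between policy and dataset actions.
    bc_term = np.mean(np.sum((policy_actions - dataset_actions) ** 2, axis=-1))
    return alpha * rl_term + bc_term, rl_term, bc_term

# Toy batch with made-up numbers, for illustration only.
q = np.array([1.0, 3.0])
pi_a = np.array([[0.0, 0.0], [1.0, 1.0]])
data_a = np.array([[1.0, 0.0], [1.0, 0.0]])

for alpha in (0.0, 0.5, 2.0):
    loss, rl, bc = constrained_actor_loss(q, pi_a, data_a, alpha)
    print(f"alpha={alpha}: loss={loss:.2f} (rl={rl:.2f}, bc={bc:.2f})")
```

With `alpha = 0` the objective reduces to pure behavior cloning, and larger values of `alpha` give the RL term more weight. In a fixed-scale setup of this kind, `alpha` is the quantity that would have to be tuned per dataset; the abstract describes ASPC as adjusting this balance dynamically during training instead.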

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.19900/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/2508.19900/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/2508.19900/full.md

---
Source: https://tomesphere.com/paper/2508.19900