Improving Actor-Critic Training with Steerable Action-Value Approximation Errors

Bahareh Tasdighi; Nicklas Werge; Yi-Shan Wu; Melih Kandemir

arXiv:2406.03890 · cs.LG · August 21, 2025 · 1 citation

Open Access

TL;DR

This paper introduces USAC, a flexible framework for actor-critic reinforcement learning that dynamically balances optimism and pessimism in value estimates, improving performance in continuous control tasks.

Contribution

USAC allows independent control of optimism and pessimism in actor-critic algorithms, enabling adaptive exploration strategies based on critic uncertainty.

Findings

01. USAC outperforms state-of-the-art algorithms in various control tasks.

02. Adjusting optimism and pessimism impacts learning stability and performance.

03. Dynamic adaptation of exploration improves policy refinement.

Abstract

Off-policy actor-critic algorithms have shown strong potential in deep reinforcement learning for continuous control tasks. Their success primarily comes from leveraging pessimistic state-action value function updates, which reduce function approximation errors and stabilize learning. However, excessive pessimism can limit exploration, preventing the agent from effectively refining its policies. Conversely, optimism can encourage exploration but may lead to high-risk behaviors and unstable learning if not carefully managed. To address this trade-off, we propose Utility Soft Actor-Critic (USAC), a novel framework that allows independent, interpretable control of pessimism and optimism for both the actor and the critic. USAC dynamically adapts its exploration strategy based on the uncertainty of critics using a utility function, enabling a task-specific balance between optimism and…
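
To make the optimism/pessimism trade-off concrete, here is a minimal sketch of one simple way to steer a value estimate by critic disagreement. It is an illustration under stated assumptions, not the utility function defined in the paper: the function name uncertainty_adjusted_value, the knob lam, and the mean-plus-scaled-spread rule are all hypothetical choices made for this example.

```python
from statistics import fmean, pstdev


def uncertainty_adjusted_value(q_estimates, lam):
    """Combine several critics' Q estimates for one state-action pair.

    q_estimates: list of floats, one estimate per critic.
    lam: hypothetical steering knob (an assumption of this sketch).
         lam < 0 is pessimistic, lam > 0 is optimistic,
         lam = 0 gives the plain ensemble mean.
    """
    mean = fmean(q_estimates)
    # Disagreement between critics serves as a crude uncertainty proxy.
    spread = pstdev(q_estimates)
    return mean + lam * spread


# Two critics that disagree about the same state-action pair (made-up numbers).
q_estimates = [1.0, 3.0]

# Independent knobs: a pessimistic target for the critic update,
# a mildly optimistic objective for the actor.
critic_target = uncertainty_adjusted_value(q_estimates, lam=-1.0)
actor_objective = uncertainty_adjusted_value(q_estimates, lam=0.5)

# When the critics agree, the spread is zero and the knob has no effect.
agreed = uncertainty_adjusted_value([2.0, 2.0], lam=-1.0)
```

With two critics and the population standard deviation, lam = -1 reduces to the minimum of the two estimates, which is the familiar clipped double-Q target. Using separate values of lam for the actor and the critic mirrors the independent control the abstract describes, although the paper's own utility function may take a different form.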

Taxonomy

Topics: Optimism, Hope, and Well-being · Mental Health Research Topics