Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, and Sergey Levine

TL;DR
The paper describes Soft Actor-Critic (SAC), an off-policy deep reinforcement learning algorithm built on the maximum entropy RL framework. It targets the high sample complexity and hyperparameter brittleness of model-free methods, which makes it better suited to real-world robotics applications.
Contribution
The paper extends SAC with modifications for faster training and hyperparameter stability, including automatic temperature tuning, demonstrating state-of-the-art performance on benchmarks and real-world tasks.
Findings
SAC outperforms prior methods in sample efficiency and asymptotic performance.
SAC demonstrates high stability across different random seeds.
SAC is effective in real-world robotics tasks like locomotion and manipulation.
Abstract
Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample complexity and brittleness to hyperparameters. Both of these challenges limit the applicability of such methods to real-world domains. In this paper, we describe Soft Actor-Critic (SAC), our recently introduced off-policy actor-critic algorithm based on the maximum entropy RL framework. In this framework, the actor aims to simultaneously maximize expected return and entropy. That is, to succeed at the task while acting as randomly as possible. We extend SAC to incorporate a number of modifications that accelerate training and improve stability with respect to the hyperparameters, including a constrained formulation that automatically tunes the…
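To make the two ideas in the abstract concrete, here is a minimal sketch of the entropy-augmented objective and of a temperature update in the spirit of the constrained formulation. It is an illustration under stated assumptions, not the authors' implementation: it uses a single state with a discrete action distribution (SAC itself targets continuous actions), and the function names, the log-parameterized temperature, and the learning rate are all hypothetical choices made for this example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def soft_objective(probs, rewards, alpha):
    """Expected reward plus alpha-weighted policy entropy for one state.

    alpha is the temperature: alpha = 0 recovers the ordinary
    expected-reward objective.
    """
    expected_reward = sum(p * r for p, r in zip(probs, rewards))
    return expected_reward + alpha * entropy(probs)


def temperature_step(log_alpha, probs, target_entropy, lr=0.1):
    """One gradient-descent step on J(alpha) = alpha * (H(pi) - target_entropy).

    Assumption: alpha is parameterized as exp(log_alpha) to keep it positive.
    The temperature falls when the policy is more random than the target
    and rises when it is less random.
    """
    alpha = math.exp(log_alpha)
    grad = alpha * (entropy(probs) - target_entropy)  # dJ / d(log_alpha)
    return log_alpha - lr * grad


# Two candidate policies over two actions with nearly equal rewards.
rewards = [1.0, 0.9]
greedy = [1.0, 0.0]
uniform = [0.5, 0.5]

# Without the entropy bonus the greedy policy scores higher;
# with a modest temperature the uniform policy does.
print(soft_objective(greedy, rewards, alpha=0.0), soft_objective(uniform, rewards, alpha=0.0))
print(soft_objective(greedy, rewards, alpha=0.2), soft_objective(uniform, rewards, alpha=0.2))
```

The design point worth noting is that the temperature is not fixed by hand: in this sketch it is adjusted so that the policy's entropy moves toward a chosen target, which is the role the abstract assigns to the constrained formulation.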
Taxonomy
Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Evolutionary Algorithms and Applications
Methods: Experience Replay · Dense Connections · Adam · Soft Actor-Critic (Autotuned Temperature)
