Weighted Entropy Modification for Soft Actor-Critic
Yizhou Zhao, Song-Chun Zhu

TL;DR
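This paper generalizes maximum-entropy reinforcement learning to a weighted entropy objective, where weights on state-action pairs can reflect prior knowledge, experience replay, and the evolution of the policy. The resulting self-balancing exploration algorithm is reported to reach state-of-the-art performance on MuJoCo tasks.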
Contribution
It generalizes maximum Shannon entropy to weighted entropy, proposing a self-balancing exploration algorithm that improves RL performance.
Findings
Achieved state-of-the-art performance on MuJoCo tasks
Introduced a weighted entropy framework for RL exploration
Demonstrated simplicity and effectiveness of the method
Abstract
We generalize the existing principle of maximum Shannon entropy in reinforcement learning (RL) to weighted entropy by characterizing the state-action pairs with some qualitative weights, which can be connected with prior knowledge, experience replay, and the evolution process of the policy. We propose an algorithm motivated by self-balancing exploration with the introduced weight function, which leads to state-of-the-art performance on MuJoCo tasks despite its simplicity in implementation.
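To make the idea of weighted entropy concrete, here is a minimal sketch for a discrete action distribution. It assumes the classical weighted entropy form, -sum_i w_i * p_i * log(p_i), which reduces to Shannon entropy when every weight equals 1. The function name, the example policy, and the weight values are illustrative assumptions, not the paper's weight function or implementation.

```python
import math


def weighted_entropy(probs, weights):
    """Weighted entropy -sum_i w_i * p_i * log(p_i) of a discrete distribution.

    Assumption: this is the classical weighted entropy form; the paper's
    actual weight function over state-action pairs is not reproduced here.
    """
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    # Terms with p_i == 0 are skipped, following the convention 0 * log 0 = 0.
    return -sum(w * p * math.log(p) for p, w in zip(probs, weights) if p > 0)


# Hypothetical action probabilities pi(a|s) for one state with three actions.
policy = [0.5, 0.25, 0.25]

# With all weights equal to 1 the result is the ordinary Shannon entropy.
shannon = weighted_entropy(policy, [1.0, 1.0, 1.0])

# Made-up weights: the second action's entropy term counts double,
# the third action's term counts half.
weighted = weighted_entropy(policy, [1.0, 2.0, 0.5])
```

Raising the weight of a state-action pair increases the contribution of its entropy term, so an objective that maximizes this quantity would reward uncertainty about that action more than about the others.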
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural dynamics and brain function · Evolutionary Algorithms and Applications
