Soft Actor-Critic With Integer Actions
Ting-Han Fan, Yubo Wang

TL;DR
This paper combines Soft Actor-Critic (SAC) with an integer reparameterization to handle high-dimensional integer action spaces. The resulting method matches continuous-action SAC on robot control tasks and outperforms Proximal Policy Optimization (PPO) on power distribution system control tasks.
Contribution
It proposes a low-dimensional integer reparameterization for SAC that exploits the comparability of integer actions, so no one-hot encoding is needed. The method is evaluated on robot control and power distribution system control tasks.
Findings
SAC with integer reparameterization matches continuous-action SAC on robot control tasks.
It outperforms PPO on power distribution system control tasks.
The reparameterization avoids one-hot encoding, which keeps the action representation low-dimensional.
Abstract
Reinforcement learning is well studied under discrete actions. The integer-action setting is popular in industry yet remains challenging due to its high dimensionality. To this end, we study reinforcement learning under integer actions by combining the Soft Actor-Critic (SAC) algorithm with an integer reparameterization. Our key observation is that the discrete structure of integer actions can be simplified using their comparability property. Hence, the proposed integer reparameterization does not need one-hot encoding and is low-dimensional. Experiments show that the proposed SAC under integer actions performs as well as the continuous-action version on robot control tasks and outperforms Proximal Policy Optimization on power distribution system control tasks.
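The abstract does not spell out how the reparameterization is built. As a rough illustration only, the sketch below assumes one plausible construction: draw a Gumbel-softmax sample per action dimension, harden it to a one-hot vector, and collapse that vector to a single integer by an inner product with the index vector [0, 1, ..., K-1]. The function names, the temperature parameter, and the example sizes are hypothetical and are not taken from the paper. Plain Python has no automatic differentiation, so only the forward pass is shown; the straight-through gradient trick is noted in a comment.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature, rng):
    """Return a relaxed (soft) categorical sample over len(logits) values."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)          # avoid log(0)
        gumbel = -math.log(-math.log(u))      # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    m = max(noisy)                            # subtract max for numerical stability
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def integer_action(logits_per_dim, temperature=1.0, rng=None):
    """Map per-dimension logits to one integer per action dimension (assumed construction)."""
    rng = rng or random.Random()
    action = []
    for logits in logits_per_dim:
        soft = gumbel_softmax_sample(logits, temperature, rng)
        k = max(range(len(soft)), key=soft.__getitem__)
        hard = [1.0 if i == k else 0.0 for i in range(len(soft))]
        # In an autodiff framework the straight-through estimator would use
        # hard + soft - stop_gradient(soft); here only the forward value is kept.
        # Inner product with [0, 1, ..., K-1] turns the one-hot vector into an integer.
        action.append(sum(i * h for i, h in enumerate(hard)))
    return action


def representation_sizes(logits_per_dim):
    """Compare the one-hot encoding size with the integer encoding size."""
    one_hot_size = sum(len(logits) for logits in logits_per_dim)
    integer_size = len(logits_per_dim)
    return one_hot_size, integer_size


if __name__ == "__main__":
    # Hypothetical setting: 3 action dimensions, each taking values in {0, ..., 4}.
    logits = [[0.0] * 5 for _ in range(3)]
    print(integer_action(logits, temperature=0.5, rng=random.Random(0)))
    print(representation_sizes(logits))
```

Under these assumptions, the value passed downstream has one entry per action dimension rather than one entry per (dimension, value) pair, which is the dimensionality saving the abstract refers to.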
Taxonomy
Topics: Reinforcement Learning in Robotics · Smart Grid Security and Resilience · Adversarial Robustness in Machine Learning
Methods: Average Pooling · Global Average Pooling · Convolution · Dilated Convolution · 1x1 Convolution · Switchable Atrous Convolution
