A Benchmark for Low-Switching-Cost Reinforcement Learning
Shusheng Xu, Yancheng Liang, Yunfei Li, Simon Shaolei Du, Yi Wu

TL;DR
This paper introduces the first empirical benchmark for low-switching-cost reinforcement learning. It evaluates a range of policy-switching approaches across multiple environments, measuring how well each one maintains reward while reducing the number of policy switches.
Contribution
It systematically compares different low-switching-cost RL methods and provides insights into reducing policy switches without sacrificing sample efficiency.
Findings
Certain approaches significantly reduce the number of policy switches.
These approaches maintain sample efficiency under low-switching constraints.
The benchmark results can guide future development of low-switching-cost RL methods.
Abstract
A ubiquitous requirement in many practical reinforcement learning (RL) applications, including medical treatment, recommendation systems, education, and robotics, is that the deployed policy that actually interacts with the environment cannot change frequently. Such an RL setting is called low-switching-cost RL, i.e., achieving the highest reward while reducing the number of policy switches during training. Despite the recent trend of theoretical studies aiming to design provably efficient RL algorithms with low switching costs, none of the existing approaches have been thoroughly evaluated in popular RL testbeds. In this paper, we systematically studied a wide collection of policy-switching approaches, including theoretically guided criteria, policy-difference-based methods, and non-adaptive baselines. Through extensive experiments on a medical treatment environment, the Atari games, and…
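To make the setting concrete, here is a minimal Python sketch of how a switching criterion decides when the deployed policy is replaced by the policy being trained, and how the number of switches is counted. It is an illustration under stated assumptions, not the benchmark's implementation: the tabular policies, the random "learning update", and the names `fixed_interval_criterion`, `policy_difference_criterion`, and `count_switches` are all hypothetical. The two criteria loosely correspond to the non-adaptive and policy-difference-based families named in the abstract.

```python
import random


def fixed_interval_criterion(interval):
    """Non-adaptive baseline (assumed form): switch every `interval` steps."""
    def should_switch(step, deployed, online, states):
        return step > 0 and step % interval == 0
    return should_switch


def policy_difference_criterion(threshold):
    """Assumed policy-difference rule: switch when the fraction of states on
    which the deployed and online policies choose different actions exceeds
    `threshold`."""
    def should_switch(step, deployed, online, states):
        if not states:
            return False
        disagree = sum(deployed[s] != online[s] for s in states)
        return disagree / len(states) > threshold
    return should_switch


def count_switches(should_switch, num_steps=100, num_states=10,
                   num_actions=4, seed=0):
    """Run a toy training loop and return how often the deployed policy changed."""
    rng = random.Random(seed)
    states = list(range(num_states))
    online = {s: 0 for s in states}   # policy being trained
    deployed = dict(online)           # policy that interacts with the environment
    switches = 0
    for step in range(num_steps):
        # Stand-in for a learning update: change the online policy at one state.
        s = rng.choice(states)
        online[s] = rng.randrange(num_actions)
        # Deploy the online policy only when the criterion fires.
        if should_switch(step, deployed, online, states):
            deployed = dict(online)
            switches += 1
    return switches


if __name__ == "__main__":
    print("fixed interval:", count_switches(fixed_interval_criterion(10)))
    print("policy difference:", count_switches(policy_difference_criterion(0.3)))
```

In this toy loop the switching cost is simply the count returned by `count_switches`. A real evaluation would also track the reward obtained by the deployed policy, since the goal is to keep reward high while that count stays low.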
Taxonomy
Topics: Reinforcement Learning in Robotics
