Tactical Reward Shaping: Bypassing Reinforcement Learning with Strategy-Based Goals
Yizheng Zhang, Andre Rosendo

TL;DR
This paper introduces a strategy-based goal setting approach in Deep Reinforcement Learning that accelerates convergence to winning strategies in robotic competitions by focusing on geometric advantages rather than traditional win/loss rewards.
Contribution
It proposes a goal formulation that replaces the usual win/loss reward with a geometric-strategic objective, and reports that geometric-based objectives can outperform standard DRL algorithms such as Deep Q Learning in certain scenarios.
Findings
Geometric goal setting accelerates learning convergence.
Geometric-based searches outperform Deep Q Learning in certain scenarios.
Multi-agent path planning improves robot cooperation.
Abstract
Deep Reinforcement Learning (DRL) has shown promising capabilities in learning optimal policies directly from trial and error. However, learning can be hindered if the goal of the learning, defined by the reward function, is "not optimal". We demonstrate that by setting the goal/target of competition in a counter-intuitive but intelligent way, instead of heuristically trying solutions for many hours, the DRL simulation can quickly converge to a winning strategy. The ICRA-DJI RoboMaster AI Challenge is a game of cooperation and competition between robots in a partially observable environment, quite similar to the Counter-Strike game. Unlike the traditional approach to games, where the reward is given for winning the match or hitting the enemy, our DRL algorithm rewards our robots when they are in a geometric-strategic advantage, which implicitly increases the winning chances. Furthermore, we…
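To make the idea of rewarding a geometric-strategic advantage concrete, here is a minimal Python sketch. It is not the authors' implementation: the grid representation, the function names `line_of_sight` and `geometric_reward`, and the specific criterion (at least two allied robots having an unobstructed view of the enemy) are all assumptions chosen for illustration, since the abstract does not define the advantage precisely.

```python
from typing import Iterable, Set, Tuple

Cell = Tuple[int, int]  # (x, y) grid coordinates; an assumed map representation


def line_of_sight(a: Cell, b: Cell, walls: Set[Cell]) -> bool:
    """Return True if no wall cell lies strictly between a and b.

    The segment is sampled once per grid step along its longer axis,
    which is a coarse approximation of a ray cast.
    """
    steps = max(abs(b[0] - a[0]), abs(b[1] - a[1]))
    for i in range(1, steps):
        t = i / steps
        cell = (round(a[0] + t * (b[0] - a[0])),
                round(a[1] + t * (b[1] - a[1])))
        if cell in walls:
            return False
    return True


def geometric_reward(allies: Iterable[Cell], enemy: Cell, walls: Set[Cell]) -> float:
    """Hypothetical shaped reward: 1.0 when two or more allies see the enemy.

    This replaces a sparse win/loss signal with a positional one. The
    two-observer threshold is an illustrative assumption, not taken
    from the paper.
    """
    observers = sum(line_of_sight(p, enemy, walls) for p in allies)
    return 1.0 if observers >= 2 else 0.0


if __name__ == "__main__":
    walls = {(2, 0)}
    # One ally is blocked by the wall, so only one observer: no reward.
    print(geometric_reward([(0, 0), (0, 2)], (4, 0), walls))
    # With no walls both allies see the enemy: reward granted.
    print(geometric_reward([(0, 0), (0, 2)], (4, 0), set()))
```

A reward of this kind is available at every time step, unlike a win/loss signal that arrives only at the end of a match, which is one plausible reason a positional goal could speed up convergence.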
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Evolutionary Game Theory and Cooperation
