Environment as Policy: Learning to Race in Unseen Tracks
Hongze Wang, Jiaxu Xing, Nico Messikommer, Davide Scaramuzza

TL;DR
This paper introduces an adaptive environment-shaping framework for reinforcement learning agents, enabling drone racing policies to generalize effectively to unseen tracks without retraining, by dynamically balancing environment difficulty during training.
Contribution
We propose a novel adaptive environment-shaping method using a secondary RL policy to improve generalization of drone racing agents to new tracks without retraining.
Findings
Agents successfully race on unseen, challenging tracks
Method outperforms existing environment-shaping techniques
Effective in both simulation and real-world experiments
Abstract
Reinforcement learning (RL) has achieved outstanding success in complex robot control tasks, such as drone racing, where RL agents have outperformed human champions on a known racing track. However, these agents fail in unseen track configurations and require complete retraining whenever they are presented with a new track layout. This work aims to develop RL agents that generalize effectively to novel track configurations without retraining. The naive solution of training directly on a diverse set of track layouts can overburden the agent, resulting in suboptimal policy learning as the increased complexity of the environment impairs the agent's ability to learn to fly. To enhance the generalizability of the RL agent, we propose an adaptive environment-shaping framework that dynamically adjusts the training environment based on the agent's performance. We achieve this by leveraging a…
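The idea of adjusting the training environment based on the agent's performance can be illustrated with a toy sketch. This is not the authors' implementation: the abstract is cut off before it describes the mechanism, and the Contribution section only states that a secondary RL policy is used. Everything below is an assumption made for illustration, including the names `ToyRacingAgent`, `EnvironmentPolicy`, and `run_training`, the scalar "skill" model of the racing agent, the five difficulty levels, and the choice of an epsilon-greedy bandit rewarded by learning progress as the stand-in for the environment policy.

```python
import math
import random


class ToyRacingAgent:
    """Hypothetical stand-in for the racing policy: one scalar skill in [0, 1]."""

    def __init__(self, skill=0.1):
        self.skill = skill

    def success_probability(self, difficulty):
        # Logistic model (an assumption): success is likely when skill exceeds difficulty.
        return 1.0 / (1.0 + math.exp(8.0 * (difficulty - self.skill)))

    def train_on(self, difficulty):
        # Assumed learning rule: the agent improves most on tracks that are
        # neither trivial nor impossible, i.e. where p * (1 - p) is largest.
        p = self.success_probability(difficulty)
        new_skill = min(1.0, self.skill + 0.1 * p * (1.0 - p))
        progress = new_skill - self.skill
        self.skill = new_skill
        return progress


class EnvironmentPolicy:
    """Hypothetical environment-shaping policy: epsilon-greedy bandit over difficulty levels."""

    def __init__(self, levels, epsilon=0.2, step_size=0.3, seed=0):
        self.levels = list(levels)
        self.epsilon = epsilon
        self.step_size = step_size
        self.values = {level: 0.0 for level in self.levels}
        self.rng = random.Random(seed)

    def select(self):
        # Explore a random level with probability epsilon, otherwise exploit
        # the level with the highest estimated learning progress.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.levels)
        return max(self.levels, key=lambda level: self.values[level])

    def update(self, level, reward):
        # Exponential moving average, so estimates track the agent's changing skill.
        self.values[level] += self.step_size * (reward - self.values[level])


def run_training(steps=200, seed=0):
    agent = ToyRacingAgent()
    env_policy = EnvironmentPolicy(levels=[0.1, 0.3, 0.5, 0.7, 0.9], seed=seed)
    history = []
    for _ in range(steps):
        level = env_policy.select()           # environment policy picks a track difficulty
        progress = agent.train_on(level)      # agent trains on that track
        env_policy.update(level, progress)    # environment policy is rewarded by agent progress
        history.append(level)
    return agent, env_policy, history


if __name__ == "__main__":
    agent, env_policy, history = run_training()
    print("final skill:", round(agent.skill, 3))
    print("last 10 difficulty levels:", history[-10:])
```

In this sketch the environment policy's reward is the agent's learning progress, so difficulty levels that no longer teach the agent anything lose value over time. A real system would replace the scalar skill with an actual racing policy and the discrete levels with a parameterized track generator.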
Taxonomy
Topics: American Environmental and Regional History
