Automatic Environment Shaping is the Next Frontier in RL
Younghyo Park, Gabriel B. Margolis, Pulkit Agrawal

TL;DR
This position paper argues that automating environment shaping is crucial for scaling reinforcement learning to diverse robotic tasks, because setting up the training environment, rather than tuning the RL algorithm, is where most of the human effort currently goes.
Contribution
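The paper identifies environment shaping (the design of observations, actions, rewards, and simulation dynamics) as the primary bottleneck in sim-to-real RL for robotics, and argues that research effort should shift from algorithm tuning toward automating that design process.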
Findings
Current RL success depends heavily on manual environment design.
Automating environment shaping can significantly reduce human effort in robotics RL.
Most practitioners tune environment parameters rather than the RL algorithm, so automating that tuning is key to scaling RL across diverse tasks.
Abstract
Many roboticists dream of presenting a robot with a task in the evening and returning the next morning to find the robot capable of solving the task. What is preventing us from achieving this? Sim-to-real reinforcement learning (RL) has achieved impressive performance on challenging robotics tasks, but requires substantial human effort to set up the task in a way that is amenable to RL. It's our position that algorithmic improvements in policy optimization and other ideas should be guided towards resolving the primary bottleneck of shaping the training environment, i.e., designing observations, actions, rewards and simulation dynamics. Most practitioners don't tune the RL algorithm, but other environment parameters to obtain a desirable controller. We posit that scaling RL to diverse robotic tasks will only be achieved if the community focuses on automating environment shaping…
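To make the four shaping targets named in the abstract (observations, actions, rewards, simulation dynamics) concrete, here is a minimal sketch in plain Python. It is not the authors' method or code. The toy task and every name in it (PointMassEnv, ShapedEnv, ShapingConfig, and all parameters) are hypothetical, chosen only to show where each kind of hand-tuned design choice typically sits around a fixed reference task.

```python
import random
from dataclasses import dataclass


@dataclass
class ShapingConfig:
    """Hypothetical knobs a practitioner might hand-tune instead of the RL algorithm."""
    action_scale: float = 1.0         # action shaping: policy output -> applied force
    dense_reward_weight: float = 0.0  # reward shaping: weight on a distance penalty
    relative_obs: bool = False        # observation shaping: goal-relative position
    mass_range: tuple = (1.0, 1.0)    # dynamics shaping: per-episode mass randomization


class PointMassEnv:
    """Toy 1-D reference task (an assumption): push a point mass to `goal`."""

    def __init__(self, goal=1.0, dt=0.05, tol=0.05):
        self.goal, self.dt, self.tol = goal, dt, tol
        self.mass = 1.0
        self.pos = self.vel = 0.0

    def reset(self):
        self.pos = self.vel = 0.0
        return (self.pos, self.vel)

    def step(self, force):
        # Semi-implicit Euler integration of a point mass.
        self.vel += (force / self.mass) * self.dt
        self.pos += self.vel * self.dt
        success = abs(self.pos - self.goal) < self.tol
        # Sparse reward: 1.0 only when the goal is reached.
        return (self.pos, self.vel), float(success), success


class ShapedEnv:
    """Wraps the reference task with the four kinds of shaping."""

    def __init__(self, env, cfg, seed=0):
        self.env, self.cfg = env, cfg
        self.rng = random.Random(seed)

    def _obs(self, raw):
        pos, vel = raw
        # Observation shaping: expose the error to the goal, not absolute position.
        return (self.env.goal - pos, vel) if self.cfg.relative_obs else raw

    def reset(self):
        # Dynamics shaping: resample the mass at the start of every episode.
        self.env.mass = self.rng.uniform(*self.cfg.mass_range)
        return self._obs(self.env.reset())

    def step(self, action):
        # Action shaping: clip the policy output to [-1, 1], then scale it.
        action = max(-1.0, min(1.0, action))
        raw, reward, done = self.env.step(action * self.cfg.action_scale)
        # Reward shaping: add a dense penalty on the distance to the goal.
        reward -= self.cfg.dense_reward_weight * abs(self.env.goal - raw[0])
        return self._obs(raw), reward, done


# Usage: one shaped variant of the same underlying task.
cfg = ShapingConfig(action_scale=5.0, dense_reward_weight=0.1,
                    relative_obs=True, mass_range=(0.8, 1.2))
env = ShapedEnv(PointMassEnv(), cfg)
obs = env.reset()
obs, reward, done = env.step(1.0)
```

In this framing, automating environment shaping would mean searching over something like ShapingConfig (or over the wrapper code itself) rather than having a person adjust it by hand between training runs. The underlying task, PointMassEnv, stays fixed throughout.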
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Software Testing and Debugging Techniques · Modular Robots and Swarm Intelligence
Methods: Sparse Evolutionary Training
