Learning to Guide Multiple Heterogeneous Actors from a Single Human Demonstration via Automatic Curriculum Learning in StarCraft II
Nicholas Waytowich, James Hare, Vinicius G. Goecks, Mark Mittrick, John Richardson, Anjon Basak, Derrik E. Asher

TL;DR
This paper introduces a method that uses a single human demonstration to automatically generate curricula, guiding reinforcement learning agents to effectively command multiple heterogeneous units in StarCraft II, outperforming standard baselines.
Contribution
The work presents a novel automatic curriculum learning approach that leverages a single human demonstration to train complex multi-actor agents in StarCraft II.
Findings
Automated curriculum learning improves training efficiency.
Agents match human expert performance.
Outperforms state-of-the-art reinforcement learning baselines.
Abstract
Traditionally, learning from human demonstrations via direct behavior cloning can lead to high-performance policies, provided that the algorithm has access to large amounts of high-quality data covering the scenarios the agent is most likely to encounter during operation. However, in real-world scenarios, expert data is limited, and the goal is to train an agent whose behavior policy is general enough to handle situations that the human expert did not demonstrate. An alternative is to learn these policies with no supervision via deep reinforcement learning; however, these algorithms require a large amount of computing time to perform well on complex tasks with high-dimensional state and action spaces, such as those found in StarCraft II. Automatic curriculum learning is a recent mechanism composed of techniques designed to speed up deep reinforcement learning by…
Taxonomy
Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
