School of hard knocks: Curriculum analysis for Pommerman with a fixed computational budget

Omkar Shelke; Hardik Meisheri; Harshad Khadilkar

arXiv:2102.11762·cs.AI·January 11, 2022

TL;DR

This paper investigates curriculum strategies for training reinforcement learning agents in Pommerman under a fixed computational budget of 100,000 games. It finds that training against a diverse set of opponents from the start yields more robust policies than gradually increasing opponent difficulty.

Contribution

The paper shows that, under a constrained computational budget, training against all opponent policies from the start outperforms curriculum-based approaches.

Findings

01. Early exposure to all opponent policies improves robustness.

02. Curriculum strategies are less effective under strict computational budgets.

03. Modifying environment properties impacts agent performance.

Abstract

Pommerman is a hybrid cooperative/adversarial multi-agent environment, with challenging characteristics in terms of partial observability, limited or no communication, sparse and delayed rewards, and restrictive computational time limits. This makes it a challenging environment for reinforcement learning (RL) approaches. In this paper, we focus on developing a curriculum for learning a robust and promising policy in a constrained computational budget of 100,000 games, starting from a fixed base policy (which is itself trained to imitate a noisy expert policy). All RL algorithms starting from the base policy use vanilla proximal-policy optimization (PPO) with the same reward function, and the only difference between their training is the mix and sequence of opponent policies. One expects that beginning training with simpler opponents and then gradually increasing the opponent difficulty…
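The comparison set up in the abstract (same budget, same learner, only the mix and sequence of opponents changing) can be illustrated with a minimal Python sketch. Everything here except the 100,000-game budget is an assumption made for illustration: the opponent names `easy`, `medium`, `hard`, the equal-length stages, and the uniform sampling rule are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter

TOTAL_GAMES = 100_000  # training budget stated in the abstract

# Hypothetical opponent pool, ordered from easiest to hardest (not from the paper).
OPPONENTS = ["easy", "medium", "hard"]


def staged_curriculum(game_idx, total=TOTAL_GAMES, opponents=OPPONENTS):
    """Easy-to-hard schedule: split the budget into equal consecutive stages."""
    stage_len = total // len(opponents)
    # Clamp so leftover games from integer division go to the last stage.
    stage = min(game_idx // stage_len, len(opponents) - 1)
    return opponents[stage]


def uniform_mix(game_idx, rng, opponents=OPPONENTS):
    """All opponents from the first game: sample one uniformly per game."""
    return rng.choice(opponents)


def count_opponents(schedule, total=TOTAL_GAMES):
    """Count how many games each opponent receives under a schedule."""
    return Counter(schedule(i) for i in range(total))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
staged_counts = count_opponents(staged_curriculum)
mixed_counts = count_opponents(lambda i: uniform_mix(i, rng))

# Opponents encountered during the first 1,000 games under each schedule.
early_staged = {staged_curriculum(i) for i in range(1_000)}
early_mixed = {uniform_mix(i, rng) for i in range(1_000)}
```

Both schedules spend exactly the same number of games, so any difference in the resulting policy would come from the ordering alone. Under the staged schedule the agent sees only the easiest opponent early on, while the uniform mix exposes it to every opponent almost immediately.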

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)