Imitation Learning with Concurrent Actions in 3D Games
Jack Harmer, Linus Gisslén, Jorge del Val, Henrik Holst, Joakim Bergdahl, Tom Olsson, Kristoffer Sjöö, Magnus Nordin

TL;DR
This paper introduces a deep reinforcement learning architecture enabling multiple actions per step, significantly improving training efficiency and performance in complex 3D game environments through imitation learning and TD RL.
Contribution
A novel multi-action policy architecture that combines imitation learning with TD reinforcement learning for enhanced efficiency and performance in 3D game agents.
Findings
4x faster training than single-action TD RL
2.5x performance improvement over single-action TD RL
Enhanced exploration and rapid learning in complex 3D environments
Abstract
In this work we describe a novel deep reinforcement learning architecture that allows multiple actions to be selected at every time-step in an efficient manner. Multi-action policies allow complex behaviours to be learnt that would otherwise be hard to achieve when using single action selection techniques. We use both imitation learning and temporal difference (TD) reinforcement learning (RL) to provide a 4x improvement in training time and 2.5x improvement in performance over single action selection TD RL. We demonstrate the capabilities of this network using a complex in-house 3D game. Mimicking the behavior of the expert teacher significantly improves world state exploration and allows the agent's vision system to be trained more rapidly than TD RL alone. This initial training technique kick-starts TD learning and the agent quickly learns to surpass the capabilities of the expert.
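The abstract does not say how the policy output is parameterised, so the following is only a minimal sketch of one common way to select several actions in the same time-step: each action gets its own logit and is switched on or off by an independent Bernoulli draw, instead of a single softmax choosing exactly one action. The action names, the logit values, and the use of binary cross-entropy as the imitation loss are all assumptions made for illustration, not details taken from the paper.

```python
import math
import random

# Hypothetical action set; the game's real action space is not given in the summary.
ACTIONS = ["forward", "turn_left", "jump", "fire"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_concurrent_actions(logits, rng):
    """Draw an independent on/off decision for every action (assumed Bernoulli heads)."""
    return [1 if rng.random() < sigmoid(z) else 0 for z in logits]


def log_prob(logits, actions):
    """Log-probability of a binary action vector under independent Bernoullis."""
    total = 0.0
    for z, a in zip(logits, actions):
        p = sigmoid(z)
        total += math.log(p) if a else math.log(1.0 - p)
    return total


def imitation_loss(logits, expert_actions):
    """Binary cross-entropy between the policy and the expert's action vector."""
    return -log_prob(logits, expert_actions)


rng = random.Random(0)
logits = [2.0, -1.0, 0.5, 3.0]  # stand-in for one forward pass of the policy network
chosen = sample_concurrent_actions(logits, rng)
active = [name for name, on in zip(ACTIONS, chosen) if on]
print("actions taken this step:", active)
print("imitation loss vs. expert [1, 0, 0, 1]:", imitation_loss(logits, [1, 0, 0, 1]))
```

Under this assumption the agent can, for example, move forward and fire in the same step, and the same per-action log-probabilities can feed both an imitation term and a policy-gradient term during training.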
