Behaviour-conditioned policies for cooperative reinforcement learning tasks
Antti Keurulainen (1, 3), Isak Westerlund (3), Ariel Kwiatkowski (3), Samuel Kaski (1, 2), Alexander Ilin (1) ((1) Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, (2) Department of Computer Science

TL;DR
This paper introduces a meta-learning approach for cooperative reinforcement learning, enabling agents to quickly adapt to unknown partner behaviors by training on synthetic behavioral data, improving cooperation efficiency.
Contribution
It proposes a novel meta-learner architecture trained on synthetic agent behaviors, allowing rapid adaptation in cooperative tasks with unknown partners.
Findings
Meta-learner enables quick adaptation to new partner behaviors.
Synthetic behavioral data improves training efficiency.
Method supports automatic task distribution formation.
Abstract
Cooperation among AI systems, and between AI systems and humans, is becoming increasingly important. In various real-world tasks, an agent needs to cooperate with unknown partner agent types. This requires the agent to assess the behaviour of the partner agent during a cooperative task and to adjust its own policy to support the cooperation. Deep reinforcement learning models can be trained to deliver the required functionality but are known to suffer from sample inefficiency and slow learning. However, adapting to a partner agent's behaviour during the ongoing task requires the ability to assess the partner agent type quickly. We suggest a method in which we synthetically produce populations of agents with different behavioural patterns together with ground truth data of their behaviour, and use this data to train a meta-learner. We additionally suggest an agent architecture, which can…
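The data-generation idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the single behaviour parameter `p_goal`, the helper names, and the frequency-based `estimate_behaviour`, which merely stands in for a trained meta-learner. The sketch shows only how a synthetic population yields pairs of observed behaviour and ground-truth labels.

```python
import random


def sample_partner(rng):
    """Draw a hypothetical behaviour parameter for one synthetic partner:
    the probability that it makes a goal-directed move at each step."""
    return rng.uniform(0.1, 0.9)


def rollout(p_goal, steps, rng):
    """Simulate the partner for `steps` steps.
    Each entry is 1 for a goal-directed move and 0 otherwise."""
    return [1 if rng.random() < p_goal else 0 for _ in range(steps)]


def make_dataset(n_agents, steps, seed=0):
    """Build (observed moves, ground-truth parameter) pairs for a population."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_agents):
        p_goal = sample_partner(rng)
        data.append((rollout(p_goal, steps, rng), p_goal))
    return data


def estimate_behaviour(observed_moves):
    """Stand-in for a learned model (assumption): estimate the behaviour
    parameter as the fraction of goal-directed moves observed so far."""
    return sum(observed_moves) / len(observed_moves)


dataset = make_dataset(n_agents=200, steps=50)

# Because the population is synthetic, the true parameter is known for every
# agent, so the estimate can be scored directly against it.
mean_abs_error = sum(
    abs(estimate_behaviour(moves) - p_goal) for moves, p_goal in dataset
) / len(dataset)
print(f"mean absolute error over {len(dataset)} agents: {mean_abs_error:.3f}")
```

In a fuller setup, the labelled pairs would serve as supervised training data for a model that infers the partner type from a short observation window, and the cooperating agent's policy would then be conditioned on that inferred type.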
Taxonomy
Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Machine Learning and Data Classification
