Do deep reinforcement learning agents model intentions?

Tambet Matiisen; Aqeel Labash; Daniel Majoral; Jaan Aru; Raul Vicente

arXiv:1805.06020·cs.AI·May 22, 2018



TL;DR

This paper investigates whether deep reinforcement learning agents explicitly encode other agents' intentions, demonstrating that their neural representations contain goal-related information and proposing training modifications for better generalization.

Contribution

The study shows that deep RL agents explicitly model intentions and introduces training adjustments to improve generalization to unseen agents.

Findings

01. Agents' hidden states encode explicit information about other agents' goals.

02. During training, agents develop differential preferences for goals, which hinders generalization.

03. Modified training algorithms improve generalization performance.

Abstract

Inferring other agents' mental states such as their knowledge, beliefs and intentions is thought to be essential for effective interactions with other agents. Recently, multiagent systems trained via deep reinforcement learning have been shown to succeed in solving different tasks, but it remains unclear how each agent modeled or represented other agents in their environment. In this work we test whether deep reinforcement learning agents explicitly represent other agents' intentions (their specific aims or goals) during a task in which the agents had to coordinate the covering of different spots in a 2D environment. In particular, we tracked over time the performance of a linear decoder trained to predict the final goal of all agents from the hidden state of each agent's neural network controller. We observed that the hidden layers of agents represented explicit information about other…
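The linear-decoder analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the hidden states below are synthetic stand-ins, the decoder is a plain softmax regression fitted by gradient descent, and all names (`train_linear_decoder`, `decoder_accuracy`, the dimensions, the noise level) are hypothetical choices made for illustration. It shows the decoder for a single time step only; tracking performance over time would mean repeating the fit for each step of an episode.

```python
import numpy as np

def train_linear_decoder(hidden, goals, n_goals, lr=0.1, epochs=300):
    """Fit a softmax-regression probe mapping hidden states to goal labels."""
    n, d = hidden.shape
    W = np.zeros((d, n_goals))
    b = np.zeros(n_goals)
    onehot = np.eye(n_goals)[goals]
    for _ in range(epochs):
        logits = hidden @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                  # cross-entropy gradient w.r.t. logits
        W -= lr * hidden.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def decoder_accuracy(W, b, hidden, goals):
    """Fraction of samples whose goal the linear decoder predicts correctly."""
    return float(np.mean(np.argmax(hidden @ W + b, axis=1) == goals))

# Synthetic stand-in for recorded hidden states (assumption: in a real analysis
# these would be activations logged from a trained agent's controller network).
rng = np.random.default_rng(0)
n_goals, dim, n = 3, 16, 600
goals = rng.integers(0, n_goals, size=n)          # final goal label per sample
directions = rng.normal(size=(n_goals, dim))      # one "goal direction" per label
hidden = directions[goals] + 0.5 * rng.normal(size=(n, dim))

train, test = slice(0, 400), slice(400, n)
W, b = train_linear_decoder(hidden[train], goals[train], n_goals)
acc = decoder_accuracy(W, b, hidden[test], goals[test])
print(f"held-out decoding accuracy: {acc:.2f} (chance = {1 / n_goals:.2f})")
```

Held-out accuracy well above chance would suggest that goal information is linearly readable from the hidden state. A linear decoder is a deliberately weak probe: if it succeeds, the information is represented in an easily accessible form rather than being extracted by the probe itself.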


Code & Models

Repositories

NeuroCSUT/intentions (TensorFlow, official)


Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Data Stream Mining Techniques

Methods: Experience Replay · Dense Connections · Weight Decay · Adam · Convolution · Batch Normalization · MADDPG