Observational Learning by Reinforcement Learning
Diana Borsa, Bilal Piot, Rémi Munos, Olivier Pietquin

TL;DR
This paper shows that reinforcement learning agents can acquire observational learning capabilities from the effects other agents have on a shared environment, without explicitly modeling those agents, which suggests a simpler route to social learning.
Contribution
It argues, and demonstrates in simple scenarios, that explicit modeling of other agents is unnecessary for observational learning, which can emerge from pure reinforcement learning, potentially coupled with memory, in shared environments.
Findings
RL agents can learn from observing others without explicit models
Observational learning emerges through environmental effects of other agents' actions
Memory and reward correlation facilitate observational learning
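The findings above can be illustrated with a toy sketch. It is not the authors' experimental setup: the corridor environment, the scripted "teacher", the tabular Q-learning learner, and every name and hyperparameter below are assumptions chosen for illustration. The rewarding end of a short corridor is hidden from the learner, and a teacher that knows it walks there. The learner never models the teacher; the teacher's position is simply one more component of the observation. Memory is not needed in this toy version because the teacher stays in view.

```python
import random

ACTIONS = (-1, +1)  # move left, move right


def step_teacher(teacher, goal):
    """Hypothetical scripted teacher: walks one cell toward the goal, then stays."""
    if teacher < goal:
        return teacher + 1
    if teacher > goal:
        return teacher - 1
    return teacher


def observe(learner, teacher, observe_teacher):
    """The teacher appears only as part of the observation, never as a model."""
    return (learner, teacher if observe_teacher else None)


def choose(q, state, eps, rng):
    """Epsilon-greedy action choice with random tie-breaking."""
    values = q.setdefault(state, [0.0, 0.0])
    if rng.random() < eps:
        return rng.randrange(2)
    best = max(values)
    return rng.choice([a for a, v in enumerate(values) if v == best])


def run_episode(q, rng, observe_teacher, eps, learn, size=5,
                alpha=0.2, gamma=0.95):
    goal = rng.choice([0, size - 1])         # rewarding end, hidden from the learner
    learner = size // 2
    teacher = step_teacher(size // 2, goal)  # teacher gets a one-step head start
    for _ in range(4 * size):
        state = observe(learner, teacher, observe_teacher)
        action = choose(q, state, eps, rng)
        learner += ACTIONS[action]
        teacher = step_teacher(teacher, goal)
        done = learner in (0, size - 1)
        reward = 0.0 if not done else (1.0 if learner == goal else -1.0)
        if learn:
            # Standard one-step Q-learning update on the observed transition.
            nxt = q.setdefault(observe(learner, teacher, observe_teacher),
                               [0.0, 0.0])
            target = reward + (0.0 if done else gamma * max(nxt))
            q[state][action] += alpha * (target - q[state][action])
        if done:
            return reward
    return 0.0


def train_and_evaluate(observe_teacher, episodes=3000, eval_episodes=200, seed=0):
    """Train with exploration, then return the mean greedy return."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        run_episode(q, rng, observe_teacher, eps=0.1, learn=True)
    returns = [run_episode(q, rng, observe_teacher, eps=0.0, learn=False)
               for _ in range(eval_episodes)]
    return sum(returns) / eval_episodes


if __name__ == "__main__":
    print("learner that sees the teacher:", train_and_evaluate(True))
    print("learner that does not:", train_and_evaluate(False))
```

In this sketch the learner that sees the teacher should reach a clearly higher average return than the one that does not, because the teacher's position correlates with where the reward is. A variant in which the teacher leaves the learner's view would presumably require some form of memory, which is outside what this tabular example covers.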
Abstract
Observational learning is a type of learning that occurs as a function of observing, retaining and possibly replicating or imitating the behaviour of another agent. It is a core mechanism appearing in various instances of social learning and has been found to be employed in several intelligent species, including humans. In this paper, we investigate to what extent the explicit modelling of other agents is necessary to achieve observational learning through machine learning. In particular, we argue that observational learning can emerge from pure Reinforcement Learning (RL), potentially coupled with memory. Through simple scenarios, we demonstrate that an RL agent can leverage the information provided by the observations of another agent performing a task in a shared environment. The other agent is only observed through the effect of its actions on the environment and never explicitly…
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Evolutionary Algorithms and Applications
