A Song of Ice and Fire: Analyzing Textual Autotelic Agents in ScienceWorld
Laetitia Teodorescu, Xingdi Yuan, Marc-Alexandre Côté, Pierre-Yves Oudeyer

TL;DR
This paper investigates how autotelic reinforcement learning agents can effectively learn from social feedback, rare goal examples, and multi-stage exploration within a rich textual environment to improve autonomous behavior discovery.
Contribution
It introduces methods for making a social peer's hindsight feedback selective, over-sampling rare goals in experience replay, and combining exploration strategies to enhance autotelic agent learning.
Findings
Selective social feedback improves goal learning.
Over-sampling rare goals enhances experience replay.
Sequential exploration with intermediate goals boosts performance.
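The second finding can be illustrated with a minimal sketch of goal-balanced replay sampling. This is not the paper's implementation: the class name `GoalBalancedReplay`, its methods, and the goal strings in the usage note are hypothetical, and the scheme shown (pick a goal uniformly, then pick one of its stored transitions) is only one simple way to over-sample goals that have few examples.

```python
import random
from collections import defaultdict


class GoalBalancedReplay:
    """Hypothetical replay buffer that over-samples rare goals.

    Transitions are grouped by goal. Sampling first draws a goal uniformly,
    then draws one transition stored for that goal, so a goal with a single
    example is drawn as often as a goal with thousands.
    """

    def __init__(self, seed=0):
        self._by_goal = defaultdict(list)
        self._rng = random.Random(seed)

    def add(self, goal, transition):
        # Store the transition under the (language) goal it achieved.
        self._by_goal[goal].append(transition)

    def sample(self, batch_size):
        if not self._by_goal:
            raise ValueError("cannot sample from an empty buffer")
        goals = list(self._by_goal)
        batch = []
        for _ in range(batch_size):
            goal = self._rng.choice(goals)  # uniform over goals, not transitions
            batch.append((goal, self._rng.choice(self._by_goal[goal])))
        return batch
```

For example, if a buffer holds 99 transitions for a common goal such as "open door" and one for a rare goal such as "freeze water" (both goal strings are made up for illustration), sampling uniformly over transitions would return the rare goal about 1% of the time, while this sketch returns it about half of the time.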
Abstract
Building open-ended agents that can autonomously discover a diversity of behaviours is one of the long-standing goals of artificial intelligence. This challenge can be studied in the framework of autotelic RL agents, i.e. agents that learn by selecting and pursuing their own goals, self-organizing a learning curriculum. Recent work identified language as a key dimension of autotelic learning, in particular because it enables abstract goal sampling and guidance from social peers for hindsight relabelling. Within this perspective, we study the following open scientific questions: What is the impact of hindsight feedback from a social peer (e.g. selective vs. exhaustive)? How can the agent learn from very rare language goal examples in its experience replay? How can multiple forms of exploration be combined, and take advantage of easier goals as stepping stones to reach harder ones? To…
Taxonomy
Topics: Topic Modeling · Multi-Agent Systems and Negotiation · Reinforcement Learning in Robotics
Methods: Experience Replay
