Learning Dynamic Belief Graphs to Generalize on Text-Based Games
Ashutosh Adhikari, Xingdi Yuan, Marc-Alexandre Côté, Mikuláš Zelinka, Marc-Antoine Rondeau, Romain Laroche, Pascal Poupart, Jian Tang, Adam Trischler, William L. Hamilton

TL;DR
This paper introduces GATA, a graph-aided transformer agent that learns latent belief graphs from raw text to improve planning and generalization in text-based games, outperforming text-only baselines.
Contribution
The paper presents a graph-structured representation for text-based game agents, learned end-to-end from raw text, that enables better decision-making and generalization.
Findings
GATA outperforms text-only baselines by 24.2% on average.
Learned belief graphs improve policy convergence.
GATA generalizes effectively across 500+ games.
Abstract
Playing text-based games requires skills in processing natural language and sequential decision making. Achieving human-level performance on text-based games remains an open challenge, and prior research has largely relied on hand-crafted structured representations and heuristics. In this work, we investigate how an agent can plan and generalize in text-based games using graph-structured representations learned end-to-end from raw text. We propose a novel graph-aided transformer agent (GATA) that infers and updates latent belief graphs during planning to enable effective action selection by capturing the underlying game dynamics. GATA is trained using a combination of reinforcement and self-supervised learning. Our work demonstrates that the learned graph-based representations help agents converge to better policies than their text-only counterparts and facilitate effective…
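The idea of updating a belief graph after each observation can be illustrated with a toy sketch. This is not GATA's implementation: in the paper the graph updater is a learned neural module, whereas here the belief graph is assumed to be a plain dictionary of weighted (subject, relation, object) triples, the evidence is written by hand, and the function name `update_belief` and its fixed `gate` parameter are hypothetical.

```python
from typing import Dict, Tuple

# A triple is (subject, relation, object); its weight is a belief in [0, 1].
Triple = Tuple[str, str, str]
BeliefGraph = Dict[Triple, float]


def update_belief(graph: BeliefGraph, evidence: BeliefGraph, gate: float = 0.5) -> BeliefGraph:
    """Return a new belief graph that blends old beliefs with new evidence.

    Assumption: a fixed scalar gate stands in for what would be a learned
    update function. Triples absent from the evidence keep their old weight.
    """
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    updated = dict(graph)
    for triple, observed in evidence.items():
        previous = graph.get(triple, 0.0)
        # Interpolate between the previous belief and the observed value.
        updated[triple] = (1.0 - gate) * previous + gate * observed
    return updated


# Step 1: the observation text suggests where the player and the carrot are.
belief: BeliefGraph = {}
belief = update_belief(
    belief,
    {("player", "at", "kitchen"): 1.0, ("carrot", "in", "fridge"): 1.0},
)

# Step 2: after "take carrot", the evidence contradicts the old location.
belief = update_belief(
    belief,
    {("carrot", "in", "fridge"): 0.0, ("carrot", "in", "inventory"): 1.0},
)
```

In this sketch the belief that the carrot is in the fridge decays rather than disappearing at once, which is one simple way to keep a belief state robust to a single noisy observation. A learned updater would instead decide from data how much each observation should change the graph.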
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Artificial Intelligence in Games
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Multi-Head Attention · Adam · Dropout · Byte Pair Encoding
