DiffG-RL: Leveraging Difference between State and Common Sense
Tsunehiko Tanaka, Daiki Kimura, Michiaki Tatsubori

TL;DR
DiffG-RL is a novel reinforcement learning agent that constructs a difference graph between environment states and common sense to improve decision-making in text-based games, outperforming baselines by 17%.
Contribution
The paper introduces a difference graph approach and a framework for using common sense effectively when interpreting the environment in text-based games.
Findings
Outperforms baselines by 17% in text-based game scores.
Effectively models the difference between environment states and common sense.
Demonstrates the importance of selective common sense usage in decision-making.
Abstract
Taking into account background knowledge as the context has always been an important part of solving tasks that involve natural language. One representative example of such tasks is text-based games, where players need to make decisions based on both description text previously shown in the game, and their own background knowledge about the language and common sense. In this work, we investigate not simply giving common sense, as can be seen in prior research, but also its effective usage. We assume that a part of the environment states different from common sense should constitute one of the grounds for action selection. We propose a novel agent, DiffG-RL, which constructs a Difference Graph that organizes the environment states and common sense by means of interactive objects with a dedicated graph encoder. DiffG-RL also contains a framework for extracting the appropriate amount and…
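To make the idea of a difference between state and common sense concrete, the following is a minimal sketch, not the authors' implementation. It assumes that both the environment state and common sense can be written as (object, relation, location) triples, and that an object is of interest when its observed placement is not one that common sense expects. The function name `build_difference_graph`, the triple format, and the toy data are all hypothetical and chosen only for illustration; the paper's dedicated graph encoder is not modeled here.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Assumed representation: (object, relation, location). Not taken from the paper.
Triple = Tuple[str, str, str]


def build_difference_graph(
    state: Set[Triple],
    common_sense: Set[Triple],
    interactive_objects: Iterable[str],
) -> Dict[str, Dict[str, List[Triple]]]:
    """Collect, per interactive object, where the state departs from common sense.

    An object is kept only if common sense says something about it and at
    least one of its observed state triples is absent from common sense.
    """
    graph: Dict[str, Dict[str, List[Triple]]] = {}
    for obj in interactive_objects:
        state_edges = {t for t in state if t[0] == obj}
        sense_edges = {t for t in common_sense if t[0] == obj}
        if not sense_edges:
            continue  # no common-sense knowledge about this object
        unexpected = state_edges - sense_edges
        if unexpected:
            graph[obj] = {
                "state": sorted(unexpected),
                "common_sense": sorted(sense_edges),
            }
    return graph


# Toy data (hypothetical): the milk is where common sense expects it,
# the apple and the shirt are not.
state = {
    ("apple", "on", "floor"),
    ("milk", "in", "fridge"),
    ("shirt", "on", "sofa"),
}
common_sense = {
    ("apple", "in", "fridge"),
    ("milk", "in", "fridge"),
    ("shirt", "in", "wardrobe"),
}
graph = build_difference_graph(state, common_sense, ["apple", "milk", "shirt"])

for obj, edges in sorted(graph.items()):
    print(obj, edges["state"], "vs", edges["common_sense"])
```

In this sketch only the apple and the shirt end up in the graph, which mirrors the abstract's assumption that the part of the state differing from common sense is what should inform action selection.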
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
