TRAC: A Textual Benchmark for Reasoning about Actions and Change
Weinan He, Canming Huang, Zhanhao Xiao, Yongmei Liu

TL;DR
TRAC is a new textual benchmark designed to evaluate how well transformer-based language models can reason about actions and change, focusing on structural generalization in complex scenarios.
Contribution
The paper introduces four RAC tasks as a comprehensive benchmark, TRAC, to assess language models' reasoning about actions and change with minimal linguistic confounds.
Findings
Transformers struggle with RAC tasks in TRAC.
TRAC enables detailed evaluation of models' reasoning capabilities.
Additional research is needed to improve model performance on RAC.
Abstract
Reasoning about actions and change (RAC) is essential to understand and interact with the ever-changing environment. Previous AI research has shown the importance of fundamental and indispensable knowledge of actions, i.e., preconditions and effects. However, traditional methods rely on logical formalization which hinders practical applications. With recent transformer-based language models (LMs), reasoning over text is desirable and seemingly feasible, leading to the question of whether LMs can effectively and efficiently learn to solve RAC problems. We propose four essential RAC tasks as a comprehensive textual benchmark and generate problems in a way that minimizes the influence of other linguistic requirements (e.g., grounding) to focus on RAC. The resulting benchmark, TRAC, encompassing problems of various complexities, facilitates a more granular evaluation of LMs, precisely…
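The abstract's notion of action knowledge as preconditions and effects can be illustrated with a minimal sketch. It assumes a STRIPS-style representation, where a state is a set of true facts and an action adds and deletes facts. The block-stacking facts, the `Action` class, and the function names are hypothetical and are not taken from the TRAC paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A STRIPS-style action (illustrative, not TRAC's own format)."""
    name: str
    preconditions: frozenset   # facts that must hold before the action
    add_effects: frozenset     # facts that become true afterwards
    delete_effects: frozenset  # facts that stop being true afterwards


def is_executable(state: frozenset, action: Action) -> bool:
    # An action is executable when all of its preconditions hold in the state.
    return action.preconditions <= state


def apply_action(state: frozenset, action: Action) -> frozenset:
    # Remove the deleted facts, then add the new ones.
    if not is_executable(state, action):
        raise ValueError(f"{action.name} is not executable in this state")
    return (state - action.delete_effects) | action.add_effects


# Hypothetical example domain: picking up block "a" from the table.
PICKUP_A = Action(
    name="pickup(a)",
    preconditions=frozenset({"on_table(a)", "clear(a)", "hand_empty"}),
    add_effects=frozenset({"holding(a)"}),
    delete_effects=frozenset({"on_table(a)", "clear(a)", "hand_empty"}),
)

if __name__ == "__main__":
    initial = frozenset({"on_table(a)", "clear(a)", "hand_empty"})
    print(is_executable(initial, PICKUP_A))
    print(sorted(apply_action(initial, PICKUP_A)))
```

Under these assumptions, checking whether an action can be performed is a subset test, and predicting the resulting state is a set update. A textual benchmark poses the same kind of question in natural language instead of a logical formalization.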
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies
