Knowledge-Aware Procedural Text Understanding with Multi-Stage Training
Zhihan Zhang, Xiubo Geng, Tao Qin, Yunfang Wu, Daxin Jiang

TL;DR
This paper introduces KOALA, a knowledge-aware model with multi-stage training that enhances procedural text understanding by integrating external knowledge and fine-tuning strategies, achieving state-of-the-art results.
Contribution
The paper proposes a novel approach combining external knowledge retrieval and multi-stage training to improve procedural text understanding.
Findings
KOALA outperforms baselines on the ProPara and Recipes datasets.
External knowledge integration improves reasoning accuracy.
Multi-stage training enhances model performance on procedural tasks.
Abstract
Procedural text describes dynamic state changes during a step-by-step natural process (e.g., photosynthesis). In this work, we focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process. Although recent approaches have achieved substantial progress, their results are far behind human performance. Two challenges, the difficulty of commonsense reasoning and data insufficiency, still remain unsolved, which require the incorporation of external knowledge bases. Previous works on external knowledge injection usually rely on noisy web mining tools and heuristic rules with limited applicable scenarios. In this paper, we propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge in this task. Specifically, we retrieve…
Taxonomy
Methods: Linear Layer · Softmax · Dense Connections · Dropout · Linear Warmup With Linear Decay · Layer Normalization · Attention Dropout · WordPiece · Weight Decay
