Towards Evaluating Plan Generation Approaches with Instructional Texts
Debajyoti Paul Chowdhury, Arghya Biswas, Tomasz Sosnowski, and Kristina Yordanova

TL;DR
This paper introduces a new benchmark dataset of 83 English textual instructions, their structured refinements, and corresponding plans, enabling objective comparison of behaviour model generation methods from instructional texts.
Contribution
It provides the first publicly available dataset for benchmarking plan generation approaches from textual instructions, facilitating fair evaluation and comparison.
Findings
Dataset includes 83 instructions with structured refinements and plans
Enables objective benchmarking of behaviour model generation methods
Supports future research in language grounding and planning
Abstract
Recent research in behaviour understanding through language grounding has shown that it is possible to automatically generate behaviour models from textual instructions. These models usually have a goal-oriented structure and are expressed in formalisms from the planning domain, such as the Planning Domain Definition Language. One major remaining problem is that there are no benchmark datasets for comparing the different model generation approaches, as each approach is usually evaluated on a domain-specific application. To allow the objective comparison of different methods for model generation from textual instructions, in this report we introduce a dataset consisting of 83 textual instructions in English, their refinement into a more structured form, and manually developed plans for each of the instructions. The dataset is publicly available to the community.
Taxonomy
Topics: Speech and dialogue systems · AI-based Problem Solving and Planning · Multi-Agent Systems and Negotiation
