ASP-Bench: From Natural Language to Logic Programs
Stefan Szeider

TL;DR
ASP-Bench is a comprehensive benchmark for translating natural language specifications into Answer Set Programs, covering diverse reasoning features and enabling evaluation of different systems' modeling capabilities.
Contribution
Introduces ASP-Bench, a new benchmark with 128 natural language problems and systematic coverage of ASP features for evaluating translation systems.
Findings
ReAct framework achieves full saturation on the benchmark.
Feedback-driven iterative refinement improves natural language to ASP translation.
Analysis reveals key factors influencing problem modeling difficulty.
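The feedback-driven refinement mentioned in the findings can be pictured as a loop that proposes a program, checks it, and feeds the resulting message into the next attempt. The sketch below is illustrative only and is not the paper's agent: `refine`, `toy_propose`, and `toy_check` are hypothetical names, the proposer is a hard-coded stand-in for a language model, and the checker is a string test standing in for a solver run plus a reference validator.

```python
def refine(spec, propose, check, max_rounds=5):
    """Propose, check, and retry with feedback until a program passes.

    Returns (program, rounds_used), or (None, max_rounds) if no attempt passed.
    """
    feedback = None
    for round_no in range(1, max_rounds + 1):
        program = propose(spec, feedback)
        ok, feedback = check(program)
        if ok:
            return program, round_no
    return None, max_rounds


def toy_propose(spec, feedback):
    # Hypothetical stand-in for a model call: the first draft omits the
    # constraint, and any feedback prompts the proposer to add it.
    program = "1 { assign(N,C) : color(C) } 1 :- node(N).\n"
    if feedback is not None:
        program += ":- edge(X,Y), assign(X,C), assign(Y,C).\n"
    return program


def toy_check(program):
    # Hypothetical stand-in for "run the solver, then validate the answer set".
    if ":- edge(X,Y)" in program:
        return True, None
    return False, "adjacent nodes may share a color"


program, rounds = refine("color a graph with two colors", toy_propose, toy_check)
print(rounds)
print(program)
```

In this toy run the first draft fails the check, and the second draft, written after seeing the feedback message, passes. A real setup would replace both stand-ins with an actual model call and an actual solver invocation.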
Abstract
Automating the translation of natural-language specifications into logic programs is a challenging task that affects neurosymbolic engineering. We present ASP-Bench, a benchmark comprising 128 natural-language problem instances: 64 base problems, each in an easy and a hard variant. It evaluates systems that translate natural-language problems into Answer Set Programs (ASPs), a prominent form of logic programming. It provides systematic coverage of ASP features, including choice rules, aggregates, and optimization. Each problem includes reference validators that check whether solutions satisfy the problem specification. We characterize problems along seven largely independent reasoning aspects (optimization, temporal reasoning, default logic, resource allocation, recursion, spatial reasoning, and quantitative complexity), providing a multidimensional view of modeling difficulty. We test the…
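To make the idea of a reference validator concrete, here is a minimal sketch for a tiny graph-coloring problem. Everything in it is an assumption made for illustration: the problem, the `assign/2` predicate, the `ASP_PROGRAM` encoding, and the `validate_coloring` function are hypothetical and not taken from ASP-Bench. The encoding shows a choice rule and an integrity constraint. The validator checks a candidate answer set, given as atom strings, directly against the specification rather than against the encoding.

```python
import re

# Hypothetical ASP encoding of a tiny graph-coloring problem (illustrative only,
# not taken from ASP-Bench). It uses a choice rule and an integrity constraint.
ASP_PROGRAM = """
node(1..3). edge(1,2). edge(2,3).
color(red). color(green).
1 { assign(N,C) : color(C) } 1 :- node(N).
:- edge(X,Y), assign(X,C), assign(Y,C).
"""

NODES = {1, 2, 3}
EDGES = {(1, 2), (2, 3)}
COLORS = {"red", "green"}

ASSIGN = re.compile(r"assign\((\d+),(\w+)\)")


def validate_coloring(atoms, nodes=NODES, edges=EDGES, colors=COLORS):
    """Check a candidate answer set, given as atom strings, against the spec.

    Returns (ok, message) so the message can be reported back to a modeler.
    """
    assignment = {}
    for atom in atoms:
        match = ASSIGN.fullmatch(atom.strip())
        if match is None:
            continue  # skip atoms that are not part of the solution
        node, color = int(match.group(1)), match.group(2)
        if node not in nodes or color not in colors:
            return False, f"unknown node or color in {atom}"
        if assignment.get(node, color) != color:
            return False, f"node {node} has more than one color"
        assignment[node] = color
    missing = sorted(nodes - set(assignment))
    if missing:
        return False, f"uncolored nodes: {missing}"
    for x, y in sorted(edges):
        if assignment[x] == assignment[y]:
            return False, f"edge ({x},{y}) joins two {assignment[x]} nodes"
    return True, "ok"


print(validate_coloring(["assign(1,red)", "assign(2,green)", "assign(3,red)"]))
print(validate_coloring(["assign(1,red)", "assign(2,red)", "assign(3,green)"]))
```

Because the validator works from the problem statement alone, it can judge any encoding's output, which matters when different systems model the same problem in different ways.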
Taxonomy
Topics: Logic, Reasoning, and Knowledge · Multimodal Machine Learning Applications · Topic Modeling
