Archer: A Human-Labeled Text-to-SQL Dataset with Arithmetic, Commonsense and Hypothetical Reasoning

Danna Zheng; Mirella Lapata; Jeff Z. Pan

arXiv:2402.12554 · cs.CL · February 27, 2024 · 1 citation

TL;DR

Archer is a bilingual (English and Chinese) text-to-SQL dataset focused on complex reasoning: arithmetic, commonsense, and hypothetical. A high-ranked model on the Spider leaderboard reaches only 6.73% execution accuracy on its test set, which highlights the need for more advanced approaches.

Contribution

The paper introduces Archer, a human-labeled bilingual text-to-SQL dataset whose questions require arithmetic, commonsense, and hypothetical reasoning, and which is significantly more complex than existing publicly available datasets.

Findings

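01. A high-ranked model on the Spider leaderboard achieves only 6.73% execution accuracy on the Archer test set.

02. Archer covers 20 English databases across 20 domains and includes arithmetic, commonsense, and hypothetical reasoning.

03. Archer is significantly more complex than existing publicly available datasets, which points to the need for improved models.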

Abstract

We present Archer, a challenging bilingual text-to-SQL dataset specific to complex reasoning, including arithmetic, commonsense and hypothetical reasoning. It contains 1,042 English questions and 1,042 Chinese questions, along with 521 unique SQL queries, covering 20 English databases across 20 domains. Notably, this dataset demonstrates a significantly higher level of complexity compared to existing publicly available datasets. Our evaluation shows that Archer challenges the capabilities of current state-of-the-art models, with a high-ranked model on the Spider leaderboard achieving only 6.73% execution accuracy on the Archer test set. Thus, Archer presents a significant challenge for future research in this field.
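The abstract reports results as execution accuracy. As a rough illustration of what that metric measures, the sketch below runs a predicted query and a gold query against the same database and counts the pair as correct when the result rows match. This is a minimal sketch and not the paper's evaluation script: the `product` table, the example queries, and the helper names `run_query` and `execution_accuracy` are all assumptions made for illustration, and comparing results as unordered multisets is one possible matching rule among several.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Run a query and return its rows as a multiset, or None if it fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None  # invalid or non-executable SQL counts as a miss
    return Counter(rows)  # ignores row order, keeps duplicate rows


def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, gold_sql) pairs with matching results."""
    correct = 0
    for predicted_sql, gold_sql in pairs:
        gold = run_query(conn, gold_sql)
        predicted = run_query(conn, predicted_sql)
        if gold is not None and predicted == gold:
            correct += 1
    return correct / len(pairs) if pairs else 0.0


# Hypothetical toy database, not taken from Archer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price REAL, quantity INTEGER)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("pen", 2.0, 10), ("book", 15.0, 3)],
)

gold_sql = "SELECT SUM(price * quantity) FROM product"
pairs = [
    # Written differently, but executes to the same result as the gold query.
    ("SELECT SUM(quantity * price) FROM product", gold_sql),
    # Executes, but omits the arithmetic and returns a different value.
    ("SELECT SUM(price) FROM product", gold_sql),
    # Fails to execute because the column does not exist.
    ("SELECT total FROM product", gold_sql),
]

accuracy = execution_accuracy(conn, pairs)
print(f"execution accuracy: {accuracy:.2%}")
```

Because the metric compares results and not query text, the first prediction counts as correct even though its SQL string differs from the gold query, while a query that drops the arithmetic step or fails to run does not.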

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Mathematics, Computing, and Information Processing