Battle of the Large Language Models: Dolly vs LLaMA vs Vicuna vs Guanaco vs Bard vs ChatGPT -- A Text-to-SQL Parsing Comparison

Shuo Sun; Yuchen Zhang; Jiahuan Yan; Yuze Gao; Donovan Ong; Bin Chen; Jian Su

arXiv:2310.10190·cs.CL·October 17, 2023·1 cites

Open Access

TL;DR

This paper systematically compares six large language models, both open-source and closed-source, on Text-to-SQL parsing across nine benchmark datasets and five prompting strategies, revealing a significant performance gap between the open-source models and closed-source models like GPT-3.5.

Contribution

It provides a comprehensive evaluation of open-source LLMs for Text-to-SQL, highlighting their current limitations relative to proprietary models.

Findings

01

Open-source models underperform GPT-3.5 in Text-to-SQL tasks.

02

Performance varies across different prompting strategies.

03

Significant gap remains between open-source and closed-source models.

Abstract

The success of ChatGPT has ignited an AI race, with researchers striving to develop new large language models (LLMs) that can match or surpass the language understanding and generation abilities of commercial ones. In recent times, a number of models have emerged, claiming performance near that of GPT-3.5 or GPT-4 through various instruction-tuning methods. As practitioners of Text-to-SQL parsing, we are grateful for their valuable contributions to open-source research. However, it is important to approach these claims with a sense of scrutiny and ascertain the actual effectiveness of these models. Therefore, we pit six popular large language models against each other, systematically evaluating their Text-to-SQL parsing capability on nine benchmark datasets with five different prompting strategies, covering both zero-shot and few-shot scenarios. Regrettably, the open-sourced models fell…
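
The zero-shot and few-shot scenarios mentioned in the abstract differ only in whether worked question/SQL pairs are placed before the target question. The following is a minimal sketch of that difference in Python. The function name `build_prompt`, the prompt layout, the schema, and the example questions are all hypothetical illustrations and are not taken from the paper; the five prompting strategies the paper evaluates are not reproduced here.

```python
def build_prompt(schema, question, examples=()):
    """Assemble a Text-to-SQL prompt (hypothetical layout, not the paper's).

    An empty `examples` sequence yields a zero-shot prompt; one or more
    (question, sql) pairs yield a few-shot prompt.
    """
    parts = ["-- Database schema", schema.strip(), ""]
    # Few-shot demonstrations, if any, go before the target question.
    for ex_question, ex_sql in examples:
        parts += [f"-- Question: {ex_question}", ex_sql.strip(), ""]
    # The prompt ends with an open SELECT for the model to complete.
    parts += [f"-- Question: {question}", "SELECT"]
    return "\n".join(parts)


# Illustrative inputs only (assumed, not from any benchmark in the paper).
schema = "CREATE TABLE singer (singer_id INT, name TEXT, age INT);"
question = "How many singers are there?"

zero_shot = build_prompt(schema, question)
few_shot = build_prompt(
    schema,
    question,
    examples=[("List all singer names.", "SELECT name FROM singer;")],
)
```

Here `zero_shot` contains only the schema and the target question, while `few_shot` additionally carries one demonstration pair ahead of it.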

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Absolute Position Encodings · Layer Normalization · Dense Connections · Linear Layer · Attention Dropout