TL;DR
PICARD is a method that constrains auto-regressive language model decoding using incremental parsing, significantly improving the validity and performance of models generating structured outputs like SQL.
Contribution
It introduces an incremental parsing approach that constrains language model decoding, improving the validity of generated output and lifting fine-tuned T5 models to state-of-the-art performance on text-to-SQL tasks.
Findings
PICARD improves validity of generated SQL code.
PICARD achieves state-of-the-art results on Spider and CoSQL.
Rejecting inadmissible tokens at each decoding step significantly reduces the number of invalid outputs.
Abstract
Large pre-trained language models for textual data have an unconstrained output space; at each decoding step, they can produce any of 10,000s of sub-word tokens. When fine-tuned to target constrained formal languages like SQL, these models often generate invalid code, rendering it unusable. We propose PICARD (code and trained models available at https://github.com/ElementAI/picard), a method for constraining auto-regressive decoders of language models through incremental parsing. PICARD helps to find valid output sequences by rejecting inadmissible tokens at each decoding step. On the challenging Spider and CoSQL text-to-SQL translation tasks, we show that PICARD transforms fine-tuned T5 models with passable performance into state-of-the-art solutions.
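The core idea in the abstract, rejecting inadmissible tokens at each decoding step, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the toy schema, the word-level vocabulary, the one-pattern grammar `select <col> from <table>`, the fixed-score stand-in for a language model (`toy_scores`), and the use of greedy decoding. The real system works on sub-word tokens and a full SQL parser.

```python
import math
from typing import Callable, Dict, List, Sequence

# Toy database schema (hypothetical): table name -> set of column names.
SCHEMA = {"singer": {"name", "age"}, "concert": {"year"}}


def is_valid_prefix(tokens: Sequence[str]) -> bool:
    """Incremental check: can `tokens` still be extended to
    `select <col> from <table> <eos>` with <col> belonging to <table>?"""
    all_columns = set().union(*SCHEMA.values())
    if len(tokens) > 5:
        return False
    for i, tok in enumerate(tokens):
        if i == 0 and tok != "select":
            return False
        if i == 1 and tok not in all_columns:
            return False
        if i == 2 and tok != "from":
            return False
        # The table must exist and must contain the column chosen earlier.
        if i == 3 and (tok not in SCHEMA or tokens[1] not in SCHEMA[tok]):
            return False
        if i == 4 and tok != "<eos>":
            return False
    return True


def toy_scores(prefix: Sequence[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a model: fixed log-scores that ignore the prefix."""
    return {
        "banana": -0.1, "select": -0.5, "age": -0.7, "name": -0.9, "from": -1.0,
        "concert": -1.1, "singer": -1.2, "year": -1.5, "<eos>": -2.0,
    }


def constrained_decode(
    score_fn: Callable[[Sequence[str]], Dict[str, float]], max_len: int = 8
) -> List[str]:
    """Greedy decoding in which inadmissible tokens get a score of -inf."""
    prefix: List[str] = []
    for _ in range(max_len):
        scores = score_fn(prefix)
        masked = {
            tok: (s if is_valid_prefix(prefix + [tok]) else -math.inf)
            for tok, s in scores.items()
        }
        best = max(masked, key=masked.get)
        if masked[best] == -math.inf:
            break  # no admissible continuation exists
        prefix.append(best)
        if best == "<eos>":
            break
    return prefix


print(constrained_decode(toy_scores))
# ['select', 'age', 'from', 'singer', '<eos>']
```

Without the mask, this toy model would emit `banana` at every step. With it, the highest-scoring admissible token wins each time; note that `concert` is rejected at the fourth step even though it scores higher than `singer`, because `age` is not one of its columns.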
Code & Models
- 🤗 tscholak/1wnr382e · model · 21 dl · ♡ 3
- 🤗 tscholak/1zha5ono · model · 41 dl · ♡ 4
- 🤗 tscholak/2e826ioa · model · 2 dl · ♡ 7
- 🤗 tscholak/2jrayxos · model · 6 dl · ♡ 2
- 🤗 tscholak/3vnuv1vf · model · 5 dl · ♡ 10
- 🤗 tscholak/cxmefzzi · model · 289 dl · ♡ 32
- 🤗 alpineai/cosql · model · 7 dl
- 🤗 patrickNLP/Graphix-3B · model · 8 dl · ♡ 18
- 🤗 czurita/tscholak-cxmefzzi-sharded-bf16-2GB · model · 2 dl
Taxonomy
Methods: Gated Linear Unit · Attention Is All You Need · Linear Layer · Parsing Incrementally for Constrained Auto-Regressive Decoding · Inverse Square Root Schedule · Softmax · Byte Pair Encoding · Attention Dropout · SentencePiece · Dropout
