Training Naturalized Semantic Parsers with Very Little Data

Subendhu Rongali; Konstantine Arkoudas; Melanie Rubino; Wael Hamza

arXiv:2204.14243·cs.CL·May 5, 2022·1 cites

Training Naturalized Semantic Parsers with Very Little Data

Subendhu Rongali, Konstantine Arkoudas, Melanie Rubino, Wael Hamza

PDF

Open Access 1 Repo

TL;DR

This paper introduces a novel semi-supervised approach combining multiple techniques to significantly improve few-shot semantic parsing performance, especially in low-resource scenarios, by leveraging unannotated data.

Contribution

It presents an automated methodology that integrates auxiliary tasks, constrained decoding, self-training, and paraphrasing to enhance natural language semantic parsers with minimal annotated data.

Findings

01

Achieved new SOTA few-shot results on the Overnight dataset.

02

Demonstrated strong performance in very low-resource settings.

03

Provided compelling results on a new semantic parsing dataset.

Abstract

Semantic parsing is an important NLP problem, particularly for voice assistants such as Alexa and Google Assistant. State-of-the-art (SOTA) semantic parsers are seq2seq architectures based on large language models that have been pretrained on vast amounts of text. To better leverage that pretraining, recent work has explored a reformulation of semantic parsing whereby the output sequences are themselves natural language sentences, but in a controlled fragment of natural language. This approach delivers strong results, particularly for few-shot semantic parsing, which is of key importance in practice and the focus of our paper. We push this line of work forward by introducing an automated methodology that delivers very significant additional improvements by utilizing modest amounts of unannotated data, which is typically easy to obtain. Our method is based on a novel synthesis of four…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

amazon-research/resource-constrained-naturalized-semantic-parsing
noneOfficial

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsNatural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

MethodsSigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence