Addressing Resource and Privacy Constraints in Semantic Parsing Through Data Augmentation

Kevin Yang; Olivia Deng; Charles Chen; Richard Shin; Subhro Roy; Benjamin Van Durme

arXiv:2205.08675 · cs.CL · May 19, 2022 · 1 citation

Open Access

TL;DR

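This paper proposes a data augmentation method for low-resource semantic parsing: it generates structured canonical utterances for logical forms, simulates corresponding natural language, and filters the resulting pairs. Under realistic resource and privacy constraints, it reports a 33% relative improvement in top-1 match on the SMCalFlow dataset.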

Contribution

It introduces a novel data augmentation approach tailored for low-resource, privacy-sensitive semantic parsing scenarios with no reliance on related datasets or direct grammar sampling.

Findings

01 · 33% relative improvement in top-1 match on the SMCalFlow dataset

02 · Data augmentation remains effective despite restrictive real-world constraints

03 · Structured utterance generation is shown to be viable for low-resource parsing
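
For readers unsure how to read the first finding, a relative improvement compares the gain to the baseline score rather than reporting a difference in percentage points. The symbols below are introduced here for illustration only (a_base for the baseline top-1 match, a_aug for top-1 match with augmentation); this page does not state the absolute values or which baseline is used.

```latex
\frac{a_{\text{aug}} - a_{\text{base}}}{a_{\text{base}}} \approx 0.33
\quad\Longleftrightarrow\quad
a_{\text{aug}} \approx 1.33\, a_{\text{base}}
```

So a 33% relative improvement is not the same as a gain of 33 percentage points; the absolute gain depends on how high the baseline already is.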

Abstract

We introduce a novel setup for low-resource task-oriented semantic parsing which incorporates several constraints that may arise in real-world scenarios: (1) lack of similar datasets/models from a related domain, (2) inability to sample useful logical forms directly from a grammar, and (3) privacy requirements for unlabeled natural utterances. Our goal is to improve a low-resource semantic parser using utterances collected through user interactions. In this highly challenging but realistic setting, we investigate data augmentation approaches involving generating a set of structured canonical utterances corresponding to logical forms, before simulating corresponding natural language and filtering the resulting pairs. We find that such approaches are effective despite our restrictive setup: in a low-resource setting on the complex SMCalFlow calendaring dataset (Andreas et al., 2020), we…
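
The three stages named in the abstract (canonical utterance generation, natural-language simulation, filtering) can be sketched as a small pipeline. This is a minimal sketch and not the authors' implementation: the function `augment`, the `toy_*` stand-ins, and the `CreateEvent(lunch)` logical-form string are all hypothetical, and how candidate logical forms are obtained is left outside the sketch. In practice each stage would presumably be a learned model rather than a string rule.

```python
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]  # (natural utterance, logical form)


def augment(
    logical_forms: Iterable[str],
    to_canonical: Callable[[str], str],
    simulate_natural: Callable[[str], List[str]],
    keep: Callable[[str, str], bool],
) -> List[Pair]:
    """Build synthetic (utterance, logical form) training pairs.

    For each logical form: render a structured canonical utterance,
    simulate natural-language variants of it, and keep only the
    variants that pass the filter.
    """
    pairs: List[Pair] = []
    for lf in logical_forms:
        canonical = to_canonical(lf)
        for natural in simulate_natural(canonical):
            if keep(natural, lf):
                pairs.append((natural, lf))
    return pairs


# --- Toy stand-ins (hypothetical, for illustration only) ---

def toy_canonical(lf: str) -> str:
    # "CreateEvent(lunch)" -> "createevent lunch"
    name, arg = lf.rstrip(")").split("(", 1)
    return f"{name.lower()} {arg}"


def toy_simulate(canonical: str) -> List[str]:
    # Two crude "paraphrases" of the canonical utterance.
    return [f"please {canonical}", canonical.upper()]


def toy_keep(natural: str, lf: str) -> bool:
    # Toy filter: keep a variant only if the argument survives verbatim.
    arg = lf.rstrip(")").split("(", 1)[1]
    return arg in natural


if __name__ == "__main__":
    for utterance, lf in augment(
        ["CreateEvent(lunch)"], toy_canonical, toy_simulate, toy_keep
    ):
        print(utterance, "->", lf)
```

Passing the stages in as callables keeps the pipeline shape separate from any particular generator, paraphraser, or filter, so each can be swapped independently.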

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis