Compositional generalization in semantic parsing with pretrained transformers

A. Emin Orhan

arXiv:2109.15101 · cs.CL · December 23, 2022

TL;DR

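This paper investigates the limits of the generalization benefits of pretraining on two semantic parsing benchmarks, SCAN and COGS. Pretraining on non-English or programming-language corpora improves out-of-distribution generalization on these English-based benchmarks, pretraining on protein sequences generally worsens performance, and larger models benefit more from pretraining.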

Contribution

The paper demonstrates that pretrained models transfer across domains and highlights the importance of pretraining scale and domain similarity for semantic parsing.

Findings

01. Pretraining on non-English corpora or programming-language corpora improves out-of-distribution generalization on English semantic parsing benchmarks.

02. Pretraining on protein sequences generally worsens performance on the benchmarks.

03. Larger models benefit more from pretraining and are harder to train from scratch.

Abstract

Large-scale pretraining instills large amounts of knowledge in deep neural networks. This, in turn, improves the generalization behavior of these models in downstream tasks. What exactly are the limits to the generalization benefits of large-scale pretraining? Here, we report observations from some simple experiments aimed at addressing this question in the context of two semantic parsing tasks involving natural language, SCAN and COGS. We show that language models pretrained exclusively with non-English corpora, or even with programming language corpora, significantly improve out-of-distribution generalization in these benchmarks, compared with models trained from scratch, even though both benchmarks are English-based. This demonstrates the surprisingly broad transferability of pretrained representations and knowledge. Pretraining with a large-scale protein sequence prediction task, on…

Code & Models

Repositories

eminorhan/parsing-transformers
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications