Improving Compositional Generalization in Classification Tasks via Structure Annotations

Juyong Kim; Pradeep Ravikumar; Joshua Ainslie; Santiago Ontañón

arXiv:2106.10434 · cs.LG · June 22, 2021 · 1 citation

TL;DR

This paper studies how structure annotations, specifically parse trees and entity links supplied as attention masks, improve a Transformer's ability to generalize compositionally in classification tasks.

Contribution

The paper studies ways to convert a natural language sequence-to-sequence dataset into a classification dataset that still requires compositional generalization, and shows that structural hints (parse trees and entity links provided as attention masks) help a Transformer generalize compositionally.

Findings

01

Structural hints, given as parse trees and entity links encoded in attention masks, improve compositional generalization in Transformers.

02

Conversion methods enable new classification datasets for compositional tasks.

03

Structural annotations lead to better systematic generalization.

Abstract

Compositional generalization is the ability to generalize systematically to a new data distribution by combining known components. Although humans seem to have a great ability to generalize compositionally, state-of-the-art neural models struggle to do so. In this work, we study compositional generalization in classification tasks and present two main contributions. First, we study ways to convert a natural language sequence-to-sequence dataset to a classification dataset that also requires compositional generalization. Second, we show that providing structural hints (specifically, providing parse trees and entity links as attention masks for a Transformer model) helps compositional generalization.
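
The idea of passing structure to a Transformer as an attention mask can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`build_structure_mask`, `masked_softmax`), the parent-pointer encoding of the parse tree, and the rule that linked tokens may attend to each other are all assumptions made for illustration. The sketch only shows how a parse tree and entity links could restrict which token pairs receive attention weight.

```python
import math


def build_structure_mask(num_tokens, parents, entity_groups):
    """Return an n x n boolean mask; mask[i][j] is True if token i may attend to token j.

    Assumed (hypothetical) encoding:
      parents[i]     index of token i's parent in the parse tree, or None for the root
      entity_groups  lists of token indices that refer to the same entity
    """
    # Every token may always attend to itself.
    mask = [[i == j for j in range(num_tokens)] for i in range(num_tokens)]
    # Parse-tree edges: allow attention in both directions between child and parent.
    for child, parent in enumerate(parents):
        if parent is not None:
            mask[child][parent] = True
            mask[parent][child] = True
    # Entity links: tokens that mention the same entity may attend to each other.
    for group in entity_groups:
        for i in group:
            for j in group:
                mask[i][j] = True
    return mask


def masked_softmax(scores, mask_row):
    """Softmax over one row of attention scores; masked-out positions get weight 0."""
    peak = max(s for s, allowed in zip(scores, mask_row) if allowed)
    exps = [math.exp(s - peak) if allowed else 0.0
            for s, allowed in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 4-token example: token 1 is the root; tokens 0 and 3 mention the same entity.
parents = [1, None, 1, 1]
entity_groups = [[0, 3]]
mask = build_structure_mask(4, parents, entity_groups)
weights = masked_softmax([0.0, 0.0, 0.0, 0.0], mask[2])
print(weights)  # [0.0, 0.5, 0.5, 0.0]
```

In this toy example, token 2 can attend only to itself and its parent (token 1), so uniform scores yield equal weight on those two positions and zero elsewhere. A real model would apply such a mask inside each attention head before the softmax; how the paper assigns masks to heads or layers is not stated in the abstract.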

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Dropout · Layer Normalization · Label Smoothing