Learning compositionally through attentive guidance

Dieuwke Hupkes; Anand Singh; Kris Korrel; German Kruszewski; Elia Bruni

arXiv:1805.09657 · cs.CL · July 8, 2019 · 19 cites

Open Access

TL;DR

This paper introduces Attentive Guidance, a mechanism that helps sequence-to-sequence models learn more compositional solutions, improving their generalization and addressing the challenge of systematic compositionality in neural networks.

Contribution

The paper proposes Attentive Guidance to enhance the compositional capabilities of neural sequence models without additional components.
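
One way to picture guiding attention without adding model components is as an extra term in the training objective. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes that a target input position is available for each decoding step and that the guidance term is the negative log-probability the attention distribution assigns to that position. The names `guidance_loss`, `total_loss`, and `weight` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw attention scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def guidance_loss(attention_scores, target_positions):
    """Mean negative log-probability that the attention assigns to the
    assumed target input position at each decoding step (hypothetical)."""
    losses = []
    for scores, target in zip(attention_scores, target_positions):
        probs = softmax(scores)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)


def total_loss(task_loss, attention_scores, target_positions, weight=1.0):
    """Task loss plus a weighted attention-guidance term (assumed form)."""
    return task_loss + weight * guidance_loss(attention_scores, target_positions)


# Two decoding steps attending over three input tokens.
targets = [0, 1]
aligned = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]]  # attention peaks on the targets
diffuse = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # uniform attention

print(guidance_loss(aligned, targets) < guidance_loss(diffuse, targets))
```

In this toy setup, attention that peaks on the target positions incurs a smaller guidance term than uniform attention, so minimizing the combined objective pushes the model toward the supplied attention pattern while it still fits the task.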

Findings

01 Guided models find more compositional solutions than vanilla models.

02 Guided models generalize well even when the training and testing distributions strongly diverge.

03 Vanilla sequence-to-sequence models with attention tend to overfit the training distribution.

Abstract

While neural network models have been successfully applied to domains that require substantial generalisation skills, recent studies have implied that they struggle when solving the task they are trained on requires inferring its underlying compositional structure. In this paper, we introduce Attentive Guidance, a mechanism to direct a sequence to sequence model equipped with attention to find more compositional solutions. We test it on two tasks, devised precisely to assess the compositional capabilities of neural models, and we show that vanilla sequence to sequence models with attention overfit the training distribution, while the guided versions come up with compositional solutions that fit the training and testing distributions almost equally well. Moreover, the learned solutions generalise even in cases where the training and testing distributions strongly diverge. In this way, we…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling