The Fine Line between Linguistic Generalization and Failure in Seq2Seq-Attention Models

Noah Weber; Leena Shekhar; Niranjan Balasubramanian

arXiv:1805.01445 · cs.CL · May 10, 2018

TL;DR

This paper investigates the sensitivity of Seq2Seq-Attention models' ability to generalize linguistic structure beyond training data, revealing that performance can vary significantly depending on random seed even when standard metrics are unchanged.

Contribution

It demonstrates that Seq2Seq models' capacity to generalize structured tasks is highly sensitive to initialization, highlighting limitations of standard evaluation methods.

Findings

01 Model generalization varies with random seed

02 Standard test performance may not reflect structural understanding

03 Sensitivity impacts real-world robustness

Abstract

Seq2Seq-based neural architectures have become the go-to choice for sequence-to-sequence language tasks. Despite their excellent performance on these tasks, recent work has noted that these models usually do not fully capture the linguistic structure required to generalize beyond the dense sections of the data distribution \cite{ettinger2017towards}, and as such are likely to fail on samples from the tail end of the distribution (such as inputs that are noisy \citep{belkinovnmtbreak} or of different lengths \citep{bentivoglinmtlength}). In this paper, we look at a model's ability to generalize on a simple symbol rewriting task with a clearly defined structure. We find that the model's ability to generalize this structure beyond the training distribution depends greatly on the chosen random seed, even when performance on the standard test set remains the same. This…
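The comparison the abstract describes (the same model configuration trained under different random seeds, scored on both a standard test set and a set drawn from outside the training distribution) can be sketched as a small harness. This is a minimal illustration, not the authors' code: the `train_and_eval` callable, its return shape, and the summary keys are all hypothetical names introduced here.

```python
import statistics
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical signature (an assumption, not from the paper): train one model
# with the given seed and return
# (accuracy on the standard test set, accuracy on a generalization set).
TrainAndEval = Callable[[int], Tuple[float, float]]


def summarize(values: List[float]) -> Dict[str, float]:
    """Report the center and spread of one metric across seeds."""
    return {
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


def seed_sensitivity(
    train_and_eval: TrainAndEval, seeds: Iterable[int]
) -> Dict[str, Dict[str, float]]:
    """Run the same configuration under several seeds and summarize both metrics."""
    standard: List[float] = []
    generalization: List[float] = []
    for seed in seeds:
        std_acc, gen_acc = train_and_eval(seed)
        standard.append(std_acc)
        generalization.append(gen_acc)
    if not standard:
        raise ValueError("at least one seed is required")
    return {
        "standard": summarize(standard),
        "generalization": summarize(generalization),
    }
```

A small range on the standard metric combined with a large range on the generalization metric would correspond to the pattern the abstract reports. Any real use needs an actual training routine in place of the placeholder callable.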
