Shortcut-Stacked Sentence Encoders for Multi-Domain Inference

Yixin Nie; Mohit Bansal

arXiv:1708.02312·cs.CL·November 29, 2017·37 cites

Shortcut-Stacked Sentence Encoders for Multi-Domain Inference

Yixin Nie, Mohit Bansal

PDF

Open Access 3 Repos

TL;DR

This paper introduces a simple yet effective stacked bidirectional LSTM sentence encoder with shortcut connections, significantly improving multi-domain natural language inference performance and setting new state-of-the-art results.

Contribution

The novel encoder architecture with shortcut connections and fine-tuned embeddings enhances inference accuracy across multiple datasets.

Findings

01

Achieved top non-ensemble single-model results in the EMNLP 2017 Shared Task.

02

Set new state-of-the-art on the SNLI dataset.

03

Demonstrated strong improvements over existing encoders.

Abstract

We present a simple sequential sentence encoder for multi-domain natural language inference. Our encoder is based on stacked bidirectional LSTM-RNNs with shortcut connections and fine-tuning of word embeddings. The overall supervised model uses the above encoder to encode two input sentences into two vectors, and then uses a classifier over the vector combination to label the relationship between these two sentences as that of entailment, contradiction, or neural. Our Shortcut-Stacked sentence encoders achieve strong improvements over existing encoders on matched and mismatched multi-domain natural language inference (top non-ensemble single-model result in the EMNLP RepEval 2017 Shared Task (Nangia et al., 2017)). Moreover, they achieve the new state-of-the-art encoding result on the original SNLI dataset (Bowman et al., 2015).

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsTopic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications