Learning Natural Language Inference with LSTM

Shuohang Wang; Jing Jiang

arXiv:1512.08849·cs.CL·November 11, 2016·62 cites

TL;DR

This paper introduces match-LSTM, an architecture for natural language inference that matches the hypothesis against the premise word by word, reaching 86.1% accuracy on the SNLI dataset, state of the art at the time of publication.

Contribution

It presents a new LSTM-based model that emphasizes important word matches without relying on fixed sentence embeddings, improving NLI performance.

Findings

01. Achieved 86.1% accuracy on the SNLI dataset.

02. Outperformed previous state-of-the-art models.

03. Effectively captures critical mismatches for inference.

Abstract

Natural language inference (NLI) is a fundamentally important task in natural language processing that has many applications. The recently released Stanford Natural Language Inference (SNLI) corpus has made it possible to develop and evaluate learning-centered methods such as deep neural networks for natural language inference (NLI). In this paper, we propose a special long short-term memory (LSTM) architecture for NLI. Our model builds on top of a recently proposed neural attention model for NLI but is based on a significantly different idea. Instead of deriving sentence embeddings for the premise and the hypothesis to be used for classification, our solution uses a match-LSTM to perform word-by-word matching of the hypothesis with the premise. This LSTM is able to place more emphasis on important word-level matching results. In particular, we observe that this LSTM remembers important…
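The word-by-word matching described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the premise and hypothesis have already been encoded into per-word hidden states, it uses an additive (tanh) attention score as one plausible choice, and all weights are random and untrained. The names `match_lstm`, `lstm_step`, `rand_matrix`, and the toy dimensions are hypothetical and chosen only for illustration.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rng, rows, cols, scale=0.1):
    """Random (untrained) weights; a stand-in for learned parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def lstm_step(params, x, h_prev, c_prev):
    """One standard LSTM step over the concatenated input [x; h_prev]."""
    z = x + h_prev  # list concatenation
    i = [sigmoid(v) for v in matvec(params["i"], z)]    # input gate
    f = [sigmoid(v) for v in matvec(params["f"], z)]    # forget gate
    o = [sigmoid(v) for v in matvec(params["o"], z)]    # output gate
    g = [math.tanh(v) for v in matvec(params["g"], z)]  # candidate cell state
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def match_lstm(premise_states, hypothesis_states, hidden, seed=0):
    """For each hypothesis word, attend over the premise, then feed
    [attended premise; hypothesis word] into an LSTM (assumed formulation)."""
    rng = random.Random(seed)
    d = len(premise_states[0])
    W_s = rand_matrix(rng, hidden, d)       # projects premise states
    W_t = rand_matrix(rng, hidden, d)       # projects hypothesis states
    W_m = rand_matrix(rng, hidden, hidden)  # projects previous match state
    w_e = [rng.uniform(-0.1, 0.1) for _ in range(hidden)]
    lstm = {gate: rand_matrix(rng, hidden, 2 * d + hidden) for gate in "ifog"}

    h = [0.0] * hidden
    c = [0.0] * hidden
    attention = []
    for h_t in hypothesis_states:
        proj_t = matvec(W_t, h_t)
        proj_m = matvec(W_m, h)
        scores = []
        for h_s in premise_states:
            proj_s = matvec(W_s, h_s)
            scores.append(sum(w * math.tanh(a + b + m)
                              for w, a, b, m in zip(w_e, proj_s, proj_t, proj_m)))
        alpha = softmax(scores)  # one weight per premise word
        attention.append(alpha)
        # Attention-weighted summary of the premise for this hypothesis word.
        a_k = [sum(al * h_s[j] for al, h_s in zip(alpha, premise_states))
               for j in range(d)]
        h, c = lstm_step(lstm, a_k + h_t, h, c)
    return h, attention


# Toy inputs: 3 premise words and 2 hypothesis words, 4-dimensional states.
rng = random.Random(1)
premise = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
hypothesis = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
final_h, attention = match_lstm(premise, hypothesis, hidden=5)
```

In a trained model, the final state `final_h` would feed a classifier over the three SNLI labels (entailment, contradiction, neutral), and the LSTM gates are what allow some word-level matching results to be kept and others forgotten.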

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory