A Decomposable Attention Model for Natural Language Inference

Ankur P. Parikh; Oscar Täckström; Dipanjan Das; Jakob Uszkoreit

arXiv:1606.01933·cs.CL·September 27, 2016·63 cites

TL;DR

This paper introduces a simple attention-based neural model for natural language inference. It achieves state-of-the-art results on SNLI with far fewer parameters than previous work and without relying on word order, and adding intra-sentence attention improves it further.

Contribution

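The paper presents a decomposable attention model that splits NLI into subproblems that can be solved separately and in parallel. It reaches state-of-the-art results on SNLI with almost an order of magnitude fewer parameters than previous work and without using word-order information.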

Findings

01 Achieves state-of-the-art results on the SNLI dataset

02 Uses almost an order of magnitude fewer parameters than previous models

03 Improves further when intra-sentence attention is added

Abstract

We propose a simple neural architecture for natural language inference. Our approach uses attention to decompose the problem into subproblems that can be solved separately, thus making it trivially parallelizable. On the Stanford Natural Language Inference (SNLI) dataset, we obtain state-of-the-art results with almost an order of magnitude fewer parameters than previous work and without relying on any word-order information. Adding intra-sentence attention that takes a minimum amount of order into account yields further improvements.
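
The decomposition described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the three stages (attend, compare, aggregate) follow the general shape of the model, but the single-layer ReLU maps `F` and `G`, the linear classifier `H`, the tiny dimensions, and the random weights and embeddings are all assumptions made for illustration. Each token's attention and comparison step depends only on that token and the other sentence, which is what makes the subproblems separable.

```python
import math
import random

def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu_layer(W, x):
    # Hypothetical one-layer stand-in for a feed-forward network.
    return [max(0.0, v) for v in matvec(W, x)]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]

def decomposable_attention(a, b, F, G, H):
    """a, b: lists of token embeddings for the two sentences."""
    # Attend: soft-align every token of a with b, and every token of b with a.
    fa = [relu_layer(F, x) for x in a]
    fb = [relu_layer(F, y) for y in b]
    e = [[dot(u, v) for v in fb] for u in fa]
    beta = [weighted_sum(softmax(row), b) for row in e]
    alpha = [
        weighted_sum(softmax([e[i][j] for i in range(len(a))]), a)
        for j in range(len(b))
    ]
    # Compare: each token is compared with its aligned phrase independently.
    v1 = [relu_layer(G, x + bt) for x, bt in zip(a, beta)]
    v2 = [relu_layer(G, y + al) for y, al in zip(b, alpha)]
    # Aggregate: sum over tokens, so token order plays no role.
    s1 = [sum(col) for col in zip(*v1)]
    s2 = [sum(col) for col in zip(*v2)]
    return matvec(H, s1 + s2)  # unnormalized class scores

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Toy usage with assumed sizes: embedding dim 4, hidden dim 5, 3 classes.
rng = random.Random(0)
d, h, n_classes = 4, 5, 3
F = random_matrix(h, d, rng)
G = random_matrix(h, 2 * d, rng)
H = random_matrix(n_classes, 2 * h, rng)
a = random_matrix(3, d, rng)  # first sentence, 3 tokens
b = random_matrix(2, d, rng)  # second sentence, 2 tokens

logits = decomposable_attention(a, b, F, G, H)
# Reversing the token order of a should leave the scores unchanged
# up to floating-point rounding.
logits_reversed = decomposable_attention(a[::-1], b, F, G, H)
```

Because the aggregate step is a plain sum, reordering the tokens of either sentence changes the scores only by floating-point rounding, which matches the abstract's claim that the base model does not rely on word order.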

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications