Sentence-State LSTM for Text Representation

Yue Zhang; Qi Liu; Linfeng Song

arXiv:1805.02474·cs.CL·May 8, 2018·39 cites

TL;DR

This paper introduces Sentence-State LSTM, an alternative to BiLSTMs that keeps a parallel state for each word and uses recurrent steps to exchange local and global information between words simultaneously, rather than reading the sequence incrementally.

Contribution

The paper proposes an alternative LSTM structure, Sentence-State LSTM, that addresses limitations arising from the sequential nature of BiLSTMs by maintaining a parallel state for each word and updating all word states at every recurrent step.

Findings

01

Achieves highly competitive performance on classification and sequence labelling benchmarks

02

Demonstrates strong representation power with a parameter count similar to that of the baselines

03

Is highly competitive with stacked BiLSTM models of similar parameter size

Abstract

Bi-directional LSTMs are a powerful tool for text representation. On the other hand, they have been shown to suffer various limitations due to their sequential nature. We investigate an alternative LSTM structure for encoding text, which consists of a parallel state for each word. Recurrent steps are used to perform local and global information exchange between words simultaneously, rather than incremental reading of a sequence of words. Results on various classification and sequence labelling benchmarks show that the proposed model has strong representation power, giving highly competitive performances compared to stacked BiLSTM models with similar parameter numbers.
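
The update pattern described in the abstract (one state per word, all updated in parallel, with local and global exchange in the same step) can be sketched roughly as follows. This is a simplified illustration, not the paper's model: it replaces the LSTM gating with plain tanh updates, assumes a context window of one neighbour on each side, and uses randomly initialised weights. The names sentence_state_step, encode, and the keys of params are hypothetical and chosen only for this example.

```python
import numpy as np

def sentence_state_step(h, g, x, params):
    """One recurrent step: every word state is updated in parallel.

    h: (n, d) word states, g: (d,) sentence-level state, x: (n, d) word inputs.
    Simplified: plain tanh updates instead of the gated LSTM cell.
    """
    n, d = h.shape
    pad = np.zeros((1, d))
    left = np.vstack([pad, h[:-1]])   # state of the left neighbour (zeros at the start)
    right = np.vstack([h[1:], pad])   # state of the right neighbour (zeros at the end)
    # Local exchange (neighbours) and global exchange (sentence state g)
    # feed into the same update, so both happen in a single step.
    h_new = np.tanh(
        left @ params["W_left"]
        + h @ params["W_self"]
        + right @ params["W_right"]
        + x @ params["W_in"]
        + g @ params["W_global"]
    )
    # The sentence state reads from the average of the previous word states.
    g_new = np.tanh(g @ params["U_g"] + h.mean(axis=0) @ params["U_h"])
    return h_new, g_new

def encode(x, steps=3, seed=0):
    """Run a fixed number of parallel steps over a sentence of n word vectors."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    names = ["W_left", "W_self", "W_right", "W_in", "W_global", "U_g", "U_h"]
    # Assumption: small random weights stand in for trained parameters.
    params = {name: rng.normal(scale=0.1, size=(d, d)) for name in names}
    h = np.zeros((n, d))
    g = np.zeros(d)
    for _ in range(steps):
        h, g = sentence_state_step(h, g, x, params)
    return h, g

if __name__ == "__main__":
    sentence = np.random.default_rng(1).normal(size=(5, 8))  # 5 words, 8 dimensions
    word_states, sentence_state = encode(sentence, steps=3)
    print(word_states.shape, sentence_state.shape)
```

In this sketch the number of recurrent steps is independent of sentence length: after a few steps each word state can already depend on every other word through the sentence-level state, whereas a BiLSTM needs one sequential step per word.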

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies

Methods: Sigmoid Activation · Tanh Activation · Bidirectional LSTM · Long Short-Term Memory