Sentence Compression as Deletion with Contextual Embeddings

Minh-Tien Nguyen; Bui Cong Minh; Dung Tien Le; Le Thai Linh

arXiv:2006.03210·cs.IR·June 8, 2020

TL;DR

This paper introduces a deletion-based sentence compression method that feeds contextual embeddings into a bidirectional LSTM with a CRF output layer. By capturing the context of the input, the model achieves a new state-of-the-art F-score on the Google benchmark dataset.

Contribution

The paper treats sentence compression by deletion as a sequence labeling problem over contextual embeddings, and reports better results than earlier methods built on non-contextual embeddings such as GloVe or Word2Vec.
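To make the sequence labeling view concrete, here is a minimal sketch of compression by deletion: each token gets a keep or delete label, and the compression is the sequence of kept tokens. The function names, the example sentence, and the labels are hypothetical, and the F-score shown is assumed to be a token-level F1 over kept tokens; the abstract does not spell out the exact metric definition.

```python
from typing import List, Sequence

KEEP, DELETE = 1, 0  # hypothetical label encoding for this sketch


def apply_deletion_labels(tokens: Sequence[str], labels: Sequence[int]) -> List[str]:
    """Return the compressed sentence: only tokens labeled KEEP survive."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return [tok for tok, lab in zip(tokens, labels) if lab == KEEP]


def kept_token_f1(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Token-level F1 over KEEP labels (an assumed metric definition)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    true_pos = sum(1 for g, p in zip(gold, pred) if g == KEEP and p == KEEP)
    if true_pos == 0:
        return 0.0
    precision = true_pos / sum(1 for p in pred if p == KEEP)
    recall = true_pos / sum(1 for g in gold if g == KEEP)
    return 2 * precision * recall / (precision + recall)


# Made-up example sentence and labels, for illustration only.
tokens = "The company , based in Hanoi , announced a new product yesterday".split()
gold = [1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0]
pred = [1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # wrongly keeps "yesterday"

print(" ".join(apply_deletion_labels(tokens, gold)))  # The company announced a new product
print(round(kept_token_f1(gold, pred), 3))            # 0.923
```

Because the output is always a subsequence of the input, the model only has to decide which tokens to drop, never which words to generate.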

Findings

01 Achieved a new state-of-the-art F-score on the Google dataset

02 Utilizing contextual embeddings improves compression quality

03 Model effectively captures input context for better compression

Abstract

Sentence compression is the task of creating a shorter version of an input sentence while keeping important information. In this paper, we extend the task of compression by deletion with the use of contextual embeddings. Unlike prior work, which usually relies on non-contextual embeddings (GloVe or Word2Vec), we exploit contextual embeddings that enable our model to capture the context of its inputs. More precisely, we stack a bidirectional Long Short-Term Memory network and Conditional Random Fields on top of contextual embeddings to perform sequence labeling. Experimental results on the benchmark Google dataset show that, by utilizing contextual embeddings, our model achieves a new state-of-the-art F-score compared to strong methods reported on the leaderboard.
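As a rough illustration of what the CRF layer adds on top of per-token scores, the sketch below runs Viterbi decoding for a linear-chain CRF with two labels (0 = delete, 1 = keep). This is not the authors' implementation: Viterbi is the standard decoder for linear-chain CRFs and is assumed here, and the emission and transition scores are made-up numbers. In the described model, emission scores would come from the bidirectional LSTM over contextual embeddings.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring label path for a linear-chain CRF.

    emissions[t][y] is the score of label y at position t;
    transitions[a][b] is the score of moving from label a to label b.
    """
    if not emissions:
        return []
    n_labels = len(transitions)
    score = list(emissions[0])  # best path score ending in each label
    backpointers: List[List[int]] = []
    for emit in emissions[1:]:
        next_score, pointers = [], []
        for cur in range(n_labels):
            # Best previous label for reaching `cur` at this position.
            best_prev = max(
                range(n_labels),
                key=lambda prev: score[prev] + transitions[prev][cur],
            )
            pointers.append(best_prev)
            next_score.append(score[best_prev] + transitions[best_prev][cur] + emit[cur])
        score = next_score
        backpointers.append(pointers)
    # Follow the backpointers from the best final label.
    path = [max(range(n_labels), key=lambda lab: score[lab])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for a 4-token sentence; columns are [delete, keep].
emissions = [[0.2, 1.0], [0.6, 0.5], [1.2, 0.1], [0.1, 0.9]]
transitions = [[0.5, -0.2], [-0.2, 0.5]]  # mildly favors repeating a label

greedy = [max(range(2), key=lambda y: e[y]) for e in emissions]
print(greedy)                                  # [1, 0, 0, 1]
print(viterbi_decode(emissions, transitions))  # [1, 1, 1, 1]
```

With these toy numbers, picking the best label per token independently deletes the two middle tokens, while joint decoding keeps all four because the transition scores reward label continuity. Scoring the whole label sequence together, rather than each token on its own, is the effect a CRF layer is meant to capture.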
