Neural Sequence Segmentation as Determining the Leftmost Segments

Yangming Li; Lemao Liu; Kaisheng Yao

arXiv:2104.07217·cs.CL·April 16, 2021

TL;DR

This paper introduces a neural framework for sequence segmentation that works at the segment level: at each step it identifies the leftmost segment of the remaining sequence. The approach is designed to capture long-term dependencies among segments, and it outperforms previous token-level methods on syntactic chunking and Chinese POS tagging.

Contribution

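It proposes a segment-level segmentation framework that uses the LSTM-minus technique to construct phrase representations and an RNN to model the iterations of determining the leftmost segments, moving beyond token-level methods to better model long-term dependencies among segments.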

Findings

01

Outperforms all baselines in syntactic chunking and Chinese POS tagging

02

Achieves new state-of-the-art results on three datasets

03

Effectively models long-term dependencies in long sentences

Abstract

Prior methods for text segmentation mostly operate at the token level. Despite their adequacy, this limits their potential to capture long-term dependencies among segments. In this work, we propose a novel framework that incrementally segments natural language sentences at the segment level. At every step of segmentation, it recognizes the leftmost segment of the remaining sequence. Implementations involve the LSTM-minus technique to construct phrase representations and recurrent neural networks (RNN) to model the iterations of determining the leftmost segments. We have conducted extensive experiments on syntactic chunking and Chinese part-of-speech (POS) tagging across 3 datasets, demonstrating that our methods significantly outperform all previous baselines and achieve new state-of-the-art results. Moreover, qualitative analysis and the study on segmenting long-length…
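
The leftmost-segment procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `lstm_minus`, `segment_leftmost`, and `toy_score` are hypothetical, the prefix states are running sums standing in for the hidden states of a trained LSTM, and the scorer is a hand-written toy rather than a learned model. The sketch also leaves out segment labels and the RNN that models the history of earlier steps.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def lstm_minus(prefix_states: Sequence[Vector], start: int, end: int) -> Vector:
    """Represent tokens[start:end] as a difference of two prefix states.

    Assumption: prefix_states[k] summarises tokens[:k], so prefix_states[0]
    is the state before any token has been read.
    """
    return [b - a for a, b in zip(prefix_states[start], prefix_states[end])]


def segment_leftmost(
    n_tokens: int, score_span: Callable[[int, int], float]
) -> List[Tuple[int, int]]:
    """Greedily pick the best-scoring leftmost segment until the input is consumed."""
    segments: List[Tuple[int, int]] = []
    start = 0
    while start < n_tokens:
        # Every candidate begins at `start`; only the end boundary is chosen.
        best_end = max(
            range(start + 1, n_tokens + 1),
            key=lambda end: score_span(start, end),
        )
        segments.append((start, best_end))
        start = best_end  # Continue with the remaining sequence.
    return segments


# Toy stand-in for LSTM hidden states: one-dimensional running sums.
token_values = [1.0, 1.0, -2.0, 3.0]
prefix_states: List[Vector] = [[0.0]]
for value in token_values:
    prefix_states.append([prefix_states[-1][0] + value])


def toy_score(start: int, end: int) -> float:
    """Hypothetical scorer: span representation minus a length penalty."""
    representation = lstm_minus(prefix_states, start, end)
    return representation[0] - 1.5 * (end - start - 1)


segments = segment_leftmost(len(token_values), toy_score)
print(segments)  # [(0, 1), (1, 2), (2, 4)]
```

Each segment is a half-open `(start, end)` pair over token positions. Because every step only chooses where the current segment ends, the scorer sees a whole candidate span at once, which is the segment-level view that the abstract contrasts with token-level labelling.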

Code & Models

Repositories

LeePleased/LeftmostSeg
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification