Highly Fast Text Segmentation With Pairwise Markov Chains

Elie Azeraf; Emmanuel Monfrini; Emmanuel Vignon; Wojciech Pieczynski

arXiv:2102.11037·cs.CL·August 12, 2025

TL;DR

This paper introduces a Pairwise Markov Chain model for NLP segmentation tasks that requires no extra data, achieves accuracy comparable to CRFs, and significantly reduces training time, addressing both computational and environmental concerns.

Contribution

The paper proposes a novel Pairwise Markov Chain approach for NLP segmentation that is fast, data-efficient, and competitive with established models like CRFs.

Findings

01

PMC achieves similar accuracy to CRFs without extra data.

02

PMC's training time is 30 times shorter than that of CRFs.

03

The method is effective for POS tagging, NER, and Chunking.

Abstract

The current trend in Natural Language Processing (NLP) is to use ever more extra data to build the best possible models. This implies higher computational costs and longer training times, makes deployment harder, and raises concerns about these models' carbon footprint, which points to a critical problem for the future. Against this trend, our goal is to develop NLP models that require no extra data and minimize training time. To do so, in this paper, we explore Markov chain models, the Hidden Markov Chain (HMC) and the Pairwise Markov Chain (PMC), for NLP segmentation tasks. We apply these models to three classic applications: POS Tagging, Named-Entity Recognition, and Chunking. We develop an original method to adapt these models to the specific challenges of text segmentation, obtaining relevant performance with very short training and execution times. PMC achieves equivalent results to those…
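To make the "no extra data, short training time" idea concrete, here is a minimal Python sketch of a count-based HMC tagger. It is not the authors' implementation: the function names (`train_hmc`, `log_prob`, `viterbi`), the add-one smoothing, the Viterbi decoder, and the toy corpus are all assumptions chosen for illustration, and the paper's own adaptations for text segmentation are not described in the abstract above. Training is a single counting pass over the labeled sentences, which is what keeps it fast. A PMC differs in that each (tag, word) pair is conditioned on the whole previous (tag, word) pair rather than on the previous tag alone.

```python
import math
from collections import Counter


def train_hmc(tagged_sentences):
    """Collect HMC counts in a single pass over sentences of (word, tag) pairs."""
    init, trans, emit = Counter(), Counter(), Counter()
    tag_totals, prev_totals = Counter(), Counter()
    vocab = set()
    for sentence in tagged_sentences:
        prev = None
        for word, tag in sentence:
            vocab.add(word)
            emit[(tag, word)] += 1
            tag_totals[tag] += 1
            if prev is None:
                init[tag] += 1
            else:
                trans[(prev, tag)] += 1
                prev_totals[prev] += 1
            prev = tag
    return {
        "init": init, "trans": trans, "emit": emit,
        "tag_totals": tag_totals, "prev_totals": prev_totals,
        "tags": sorted(tag_totals),
        "n_words": len(vocab) + 1,  # +1 reserves mass for unseen words
    }


def log_prob(count, total, n_outcomes):
    """Add-one smoothed log-probability (an illustrative assumption)."""
    return math.log((count + 1) / (total + n_outcomes))


def viterbi(model, words):
    """Most probable tag sequence for a non-empty list of words."""
    tags, n_tags = model["tags"], len(model["tags"])
    n_init = sum(model["init"].values())

    def emit_lp(tag, word):
        return log_prob(model["emit"][(tag, word)],
                        model["tag_totals"][tag], model["n_words"])

    def trans_lp(prev, tag):
        return log_prob(model["trans"][(prev, tag)],
                        model["prev_totals"][prev], n_tags)

    scores = {t: log_prob(model["init"][t], n_init, n_tags) + emit_lp(t, words[0])
              for t in tags}
    backpointers = []
    for word in words[1:]:
        new_scores, pointers = {}, {}
        for t in tags:
            best = max(tags, key=lambda p: scores[p] + trans_lp(p, t))
            pointers[t] = best
            new_scores[t] = scores[best] + trans_lp(best, t) + emit_lp(t, word)
        scores = new_scores
        backpointers.append(pointers)

    # Follow the backpointers from the best final tag to recover the path.
    path = [max(tags, key=lambda t: scores[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy corpus, invented for this example only.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
model = train_hmc(corpus)
print(viterbi(model, ["the", "cat", "barks"]))  # ['DET', 'NOUN', 'VERB']
```

A PMC variant of this sketch would replace the separate transition and emission counts with counts over consecutive (tag, word) pairs. That makes the tables much sparser, so the smoothing or backoff scheme matters more than it does here.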

Taxonomy

Methods: Conditional Random Field