Text Simplification by Tagging

Kostiantyn Omelianchuk; Vipul Raheja; Oleksandr Skurzhanskyi

arXiv:2103.05070 · cs.CL · May 11, 2022

TL;DR

This paper introduces TST, a text simplification system based on sequence tagging with pre-trained Transformer encoders. It depends less on large parallel training data than conventional Seq2Seq models, achieves near state-of-the-art results, and runs inference over 11 times faster than current leading systems.

Contribution

The paper presents a novel sequence tagging approach for text simplification that reduces reliance on large parallel datasets and enhances inference speed using pre-trained models.
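To make the tagging idea concrete, here is a minimal sketch of how per-token edit tags could be turned back into a simplified sentence. It assumes a GECToR-style tag set (keep, delete, replace, append), since the linked repository is grammarly/gector. The tag spellings (`KEEP`, `DELETE`, `REPLACE_x`, `APPEND_x`) and the function name `apply_edit_tags` are illustrative assumptions, not the paper's actual vocabulary or API, and the tags are written by hand here where a real system would have the encoder predict them.

```python
from typing import List


def apply_edit_tags(tokens: List[str], tags: List[str]) -> List[str]:
    """Apply one edit tag per source token and return the edited tokens.

    Hypothetical tag set, loosely modeled on GECToR-style edits:
      KEEP        leave the token unchanged
      DELETE      drop the token
      REPLACE_x   substitute the token with x
      APPEND_x    keep the token and insert x after it
    """
    if len(tokens) != len(tags):
        raise ValueError("expected exactly one tag per token")

    output: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "KEEP":
            output.append(token)
        elif tag == "DELETE":
            continue  # the token is removed from the output
        elif tag.startswith("REPLACE_"):
            output.append(tag[len("REPLACE_"):])
        elif tag.startswith("APPEND_"):
            output.append(token)
            output.append(tag[len("APPEND_"):])
        else:
            raise ValueError(f"unknown tag: {tag}")
    return output


# Hand-written tags standing in for model predictions.
source = ["The", "physician", "subsequently", "examined", "the", "patient"]
tags = ["KEEP", "REPLACE_doctor", "DELETE", "KEEP", "KEEP", "KEEP"]
print(" ".join(apply_edit_tags(source, tags)))  # The doctor examined the patient
```

Because every token receives its tag in a single pass over the encoder output, nothing is generated token by token as in a Seq2Seq decoder, which is consistent with the inference-speed advantage the paper reports.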

Findings

01. Achieves near state-of-the-art performance on benchmark datasets.
02. Over 11 times faster inference than current leading systems.
03. Less dependence on large parallel corpora.

Abstract

Edit-based approaches have recently shown promising results on multiple monolingual sequence transduction tasks. In contrast to conventional sequence-to-sequence (Seq2Seq) models, which learn to generate text from scratch as they are trained on parallel corpora, these methods have proven to be much more effective since they are able to learn to make fast and accurate transformations while leveraging powerful pre-trained language models. Inspired by these ideas, we present TST, a simple and efficient Text Simplification system based on sequence Tagging, leveraging pre-trained Transformer-based encoders. Our system makes simplistic data augmentations and tweaks in training and inference on a pre-existing system, which makes it less reliant on large amounts of parallel training data, provides more control over the outputs and enables faster inference speeds. Our best model achieves near…

Code & Models

Repositories

grammarly/gector (PyTorch, official)

Taxonomy

Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling