MAPGN: MAsked Pointer-Generator Network for sequence-to-sequence pre-training

Mana Ihori; Naoki Makishima; Tomohiro Tanaka; Akihiko Takashima; Shota Orihashi; Ryo Masumura

arXiv:2102.07380 · cs.CL · February 17, 2021 · 1 citation

TL;DR

This paper introduces MAPGN, a self-supervised pre-training method for pointer-generator networks that enhances spoken-text normalization, especially when limited paired data is available.

Contribution

The paper proposes a novel self-supervised learning approach, MAPGN, specifically designed for pointer-generator networks to improve spoken-text normalization with limited data.

Findings

01. MAPGN outperforms conventional self-supervised methods in spoken-text normalization tasks.

02. Pre-training with MAPGN improves model performance with limited paired data.

03. The method effectively utilizes unpaired text data for better normalization results.

Abstract

This paper presents a self-supervised learning method for pointer-generator networks to improve spoken-text normalization. Spoken-text normalization, which converts spoken-style text into style-normalized text, is becoming an important technology for improving subsequent processing such as machine translation and summarization. The most successful spoken-text normalization method to date is sequence-to-sequence (seq2seq) mapping using pointer-generator networks, which have a mechanism for copying tokens from the input sequence. However, these models require a large amount of paired spoken-style and style-normalized text, and such a volume of data is difficult to prepare. To construct a spoken-text normalization model from limited paired data, we focus on self-supervised learning, which can use unpaired text data to improve seq2seq models. Unfortunately, conventional…
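
The copy mechanism mentioned in the abstract can be illustrated with a small sketch. It follows the commonly used pointer-generator formulation, in which the output distribution mixes a vocabulary distribution with attention weights over the source tokens. It is not the MAPGN pre-training objective, which the truncated abstract does not describe. The function name, the argument names, and the toy values are assumptions made for illustration only.

```python
from collections import defaultdict


def pointer_generator_distribution(p_gen, vocab_probs, attention, source_tokens):
    """Mix a generation distribution with a copy distribution (illustrative sketch).

    p_gen:         probability of generating from the vocabulary, in [0, 1]
    vocab_probs:   dict mapping vocabulary token -> probability (sums to 1)
    attention:     attention weights over source positions (sums to 1)
    source_tokens: source tokens, one per attention weight
    """
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have the same length")

    final = defaultdict(float)

    # Generation path: scale the vocabulary distribution by p_gen.
    for token, prob in vocab_probs.items():
        final[token] += p_gen * prob

    # Copy path: each source position contributes its attention weight,
    # scaled by (1 - p_gen). Repeated source tokens accumulate.
    for token, weight in zip(source_tokens, attention):
        final[token] += (1.0 - p_gen) * weight

    return dict(final)


if __name__ == "__main__":
    # Hypothetical toy values: a spoken-style input and a tiny vocabulary.
    source = ["um", "i", "wanna", "go"]
    attn = [0.1, 0.2, 0.6, 0.1]
    vocab = {"want": 0.7, "to": 0.2, "i": 0.1}

    dist = pointer_generator_distribution(0.5, vocab, attn, source)
    for token, prob in sorted(dist.items(), key=lambda kv: -kv[1]):
        print(f"{token}: {prob:.2f}")
```

Because both input distributions sum to 1, the mixture also sums to 1. Tokens such as "wanna" that are absent from the vocabulary still receive probability through the copy path, which is why this architecture suits tasks where much of the input is carried over unchanged.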

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence