Discriminative Self-training for Punctuation Prediction

Qian Chen; Wen Wang; Mengzhe Chen; Qinglin Zhang

arXiv:2104.10339 · cs.CL · September 2, 2021

TL;DR

This paper introduces a Discriminative Self-Training method with weighted loss and label smoothing to enhance punctuation prediction in speech transcripts, significantly outperforming existing models and establishing new state-of-the-art results.

Contribution

The paper presents a novel Discriminative Self-Training approach that effectively leverages unlabeled speech data for punctuation prediction, outperforming strong baselines and previous state-of-the-art models.

Findings

01 Achieves significant improvement over baselines on English and Chinese datasets.

02 Establishes a new state-of-the-art on the IWSLT2011 benchmark with a 1.3% absolute F1 gain.

03 Outperforms vanilla self-training and existing models like BERT, RoBERTa, and ELECTRA.

Abstract

Punctuation prediction for automatic speech recognition (ASR) output transcripts plays a crucial role in improving the readability of ASR transcripts and the performance of downstream natural language processing applications. However, achieving good performance on punctuation prediction often requires large amounts of labeled speech transcripts, which are expensive and laborious to obtain. In this paper, we propose a Discriminative Self-Training approach with weighted loss and discriminative label smoothing to exploit unlabeled speech transcripts. Experimental results on the English IWSLT2011 benchmark test set and an internal Chinese spoken language dataset demonstrate that the proposed approach achieves significant improvement in punctuation prediction accuracy over strong baselines including BERT, RoBERTa, and ELECTRA models. The proposed Discriminative Self-Training approach…
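
The abstract names a weighted loss and label smoothing applied to pseudo-labeled data, without giving formulas in the text shown here. The following is a minimal, generic sketch of how such a combined objective is commonly written, not the paper's actual formulation. The function names, the weight `alpha`, the smoothing value `eps`, and the choice to smooth only the pseudo-labels are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def smoothed_target(label, num_classes, eps):
    """Target distribution with label smoothing; eps = 0 gives a one-hot vector."""
    target = [eps / num_classes] * num_classes
    target[label] += 1.0 - eps
    return target

def cross_entropy(logits, target):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

def combined_loss(labeled, pseudo, num_classes, alpha=0.5, eps=0.1):
    """Hypothetical self-training objective (an assumption, not the paper's).

    labeled, pseudo: lists of (logits, label) pairs, one per token.
    Human-labeled tokens use hard targets; pseudo-labeled tokens use
    smoothed targets and are down-weighted by alpha.
    """
    gold = sum(
        cross_entropy(lg, smoothed_target(y, num_classes, 0.0)) for lg, y in labeled
    ) / max(len(labeled), 1)
    noisy = sum(
        cross_entropy(lg, smoothed_target(y, num_classes, eps)) for lg, y in pseudo
    ) / max(len(pseudo), 1)
    return gold + alpha * noisy

# Example with four illustrative punctuation classes (e.g. none, comma, period, question mark).
labeled_batch = [([2.0, 0.1, 0.1, 0.1], 0)]
pseudo_batch = [([0.2, 1.5, 0.1, 0.1], 1)]
loss = combined_loss(labeled_batch, pseudo_batch, num_classes=4)
```

Down-weighting and smoothing the pseudo-labeled term is one common way to limit the influence of noisy teacher predictions; the paper's own weighting scheme may differ.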

Taxonomy

Methods: Multi-Head Attention · Linear Layer · Dropout · Dense Connections · Attention Is All You Need · ELECTRA · Softmax · Linear Warmup With Linear Decay · WordPiece