Layer Pruning on Demand with Intermediate CTC

Jaesong Lee; Jingu Kang; Shinji Watanabe

arXiv:2106.09216·eess.AS·June 18, 2021

Open Access

TL;DR

This paper introduces a method for pruning layers of CTC-based end-to-end speech recognition models at run time. Model depth can be reduced on demand without retraining, which suits resource-constrained devices.

Contribution

It proposes a training and pruning approach that combines intermediate CTC and stochastic depth, so that model depth can be adapted at run time with little accuracy degradation.
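The interplay between stochastic depth during training and depth reduction at run time can be illustrated with a toy layer stack. This is a minimal sketch, not the authors' implementation: the class `PrunableEncoder`, the helper `make_layer`, the `skip_prob` parameter, and the choice to prune by dropping the topmost layers are all assumptions made for illustration, and the toy layers merely stand in for Transformer blocks.

```python
import random


def make_layer(scale):
    """Return a toy residual layer x -> x + scale * x (hypothetical stand-in for a Transformer block)."""
    def layer(x):
        return [v + scale * v for v in x]
    return layer


class PrunableEncoder:
    """Toy encoder whose depth can be chosen per call (illustrative only)."""

    def __init__(self, layers, skip_prob=0.0, seed=0):
        self.layers = list(layers)
        self.skip_prob = skip_prob  # assumed stochastic-depth skip probability
        self.rng = random.Random(seed)

    def forward(self, x, depth=None, training=False):
        """Apply the first `depth` layers; during training, skip layers at random."""
        depth = len(self.layers) if depth is None else depth
        used = 0
        for layer in self.layers[:depth]:
            if training and self.rng.random() < self.skip_prob:
                continue  # stochastic depth: this layer acts as the identity
            x = layer(x)
            used += 1
        return x, used


encoder = PrunableEncoder([make_layer(0.1) for _ in range(12)], skip_prob=0.5)
_, layers_full = encoder.forward([1.0])            # full depth at inference
_, layers_half = encoder.forward([1.0], depth=6)   # pruned to 6 layers on demand
print(layers_full, layers_half)
```

Because the same weights serve every depth, switching between the full and the pruned model in this sketch is only a change of the `depth` argument; no second model is stored or retrained.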

Findings

01

Pruned sub-models maintain accuracy comparable to individually trained models of the same depth.

02

Pruning improves the real-time factor from 0.005 to 0.002 on GPU.

03

Layer pruning can be performed on demand without additional fine-tuning.
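To put the real-time factor figures in the second finding into perspective, the short calculation below uses the common definition of RTF as processing time divided by audio duration. It is a sketch under the assumption that 0.005 refers to the unpruned model and 0.002 to a pruned one; the function name `real_time_factor` is hypothetical.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF = time spent recognizing / duration of the audio; lower is faster."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


rtf_full = 0.005    # reported value, assumed to be the unpruned model
rtf_pruned = 0.002  # reported value, assumed to be a pruned model

speedup = rtf_full / rtf_pruned                  # roughly 2.5x faster
seconds_per_hour_full = rtf_full * 3600.0        # processing time for one hour of audio
seconds_per_hour_pruned = rtf_pruned * 3600.0

print(f"speedup: about {speedup:.1f}x")
print(f"one hour of audio: about {seconds_per_hour_full:.1f} s vs {seconds_per_hour_pruned:.1f} s")
```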

Abstract

Deploying an end-to-end automatic speech recognition (ASR) model on mobile/embedded devices is a challenging task, since the device's computational power and energy consumption requirements change dynamically in practice. To overcome the issue, we present a training and pruning method for ASR based on connectionist temporal classification (CTC) that allows reduction of model depth at run-time without any extra fine-tuning. To achieve the goal, we adopt two regularization methods, intermediate CTC and stochastic depth, to train a model whose performance does not degrade much after pruning. We present an in-depth analysis of layer behaviors using singular vector canonical correlation analysis (SVCCA), and efficient strategies for finding layers which are safe to prune. Using the proposed method, we show that a Transformer-CTC model can be pruned to various depths on demand,…
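Intermediate CTC, as named in the abstract, attaches CTC losses to intermediate encoder layers in addition to the final one. The sketch below shows one plausible way to combine them into a single training objective. The interpolation form, the default `weight` of 0.3, and the function name `combined_loss` are assumptions for illustration; this listing does not state the values or layer positions the authors used.

```python
def combined_loss(final_ctc_loss, intermediate_ctc_losses, weight=0.3):
    """Interpolate the final-layer CTC loss with the mean intermediate CTC loss.

    `weight` is an assumed hyperparameter in [0, 1]; 0 disables intermediate CTC.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    if not intermediate_ctc_losses:
        return final_ctc_loss  # nothing to regularize with
    mean_intermediate = sum(intermediate_ctc_losses) / len(intermediate_ctc_losses)
    return (1.0 - weight) * final_ctc_loss + weight * mean_intermediate


# Example: one final-layer loss and two intermediate-layer losses (made-up numbers).
loss = combined_loss(2.0, [4.0, 6.0], weight=0.5)
print(loss)
```

Training the intermediate layers to produce usable CTC outputs is what would let a truncated model reuse them at run time, which is the property the abstract relies on for pruning without fine-tuning.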

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing

Methods: Pruning