Pruning as Regularization: Sensitivity-Aware One-Shot Pruning in ASR

Julian Irigoyen; Arthur Söhler; Andreas Søeborg Kirkedal

arXiv:2511.08092 · eess.AS · November 12, 2025

TL;DR

This paper demonstrates that one-shot magnitude pruning, applied without fine-tuning, acts as an implicit regularizer in ASR: removing architecture-specific redundancy improves generalization and permits more aggressive compression.

Contribution

The paper introduces a sensitivity-aware, component-wise pruning method, guided by gradient- and Fisher-based diagnostics, that reveals architectural asymmetries and makes one-shot pruning more effective in speech recognition models.
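A minimal sketch of what a Fisher-style sensitivity diagnostic can look like, assuming the common empirical-Fisher approximation (the mean of squared gradients). The exact scoring used in the paper is not given on this page, so the function names, the component names, and the toy gradient values below are all hypothetical and serve only to illustrate the idea of ranking components before pruning.

```python
def fisher_sensitivity(grad_samples):
    """Average squared gradient over all samples and parameters.

    grad_samples: list of flat gradient lists, one per batch.
    This is the diagonal empirical Fisher, averaged into a single score.
    """
    total, count = 0.0, 0
    for grads in grad_samples:
        for g in grads:
            total += g * g
            count += 1
    return total / count if count else 0.0


def rank_components(grads_by_component):
    """Return component names ordered from least to most sensitive."""
    scores = {name: fisher_sensitivity(samples)
              for name, samples in grads_by_component.items()}
    return sorted(scores, key=scores.get)


# Made-up toy gradients (NOT values from the paper), two batches per component.
toy_grads = {
    "decoder.ffn": [[0.9, -1.1], [1.0, 0.8]],
    "decoder.self_attn": [[0.1, -0.2], [0.05, 0.1]],
}
ranking = rank_components(toy_grads)
```

Under this sketch, components early in `ranking` would be the first candidates for aggressive pruning, while high-scoring components would be pruned lightly or left alone.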

Findings

01. Pruning 50% of decoder self-attention reduces WER by 2.38% absolute on LibriSpeech test-other, without fine-tuning.

02. Pruning the last four encoder layers at 50% improves WER by 1.72% absolute.

03. Sensitivity-aware pruning enables 40% sparsity with minimal accuracy loss.
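The abstract also reports the first two improvements in relative terms (20.44% and 14.8%). As a quick consistency check, and assuming relative reduction is defined as absolute reduction divided by baseline WER, both pairs should imply roughly the same unpruned baseline. The baseline itself is not stated on this page; the values computed below are inferred, not quoted.

```python
def implied_baseline(absolute_drop, relative_drop):
    """Baseline WER (in %) implied by an absolute and a relative reduction."""
    return absolute_drop / relative_drop


# Figures taken from the abstract: absolute points and relative fraction.
baseline_attn = implied_baseline(2.38, 0.2044)   # decoder self-attention result
baseline_enc = implied_baseline(1.72, 0.148)     # last-four-encoder-layers result

# Both estimates land near 11.6% WER, so the reported pairs agree with each other.
print(round(baseline_attn, 2), round(baseline_enc, 2))
```

The small gap between the two estimates is what one would expect from the relative figures being rounded.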

Abstract

We challenge the conventional view of neural network pruning as solely a compression technique, demonstrating that one-shot magnitude pruning serves as a powerful implicit regularizer for ASR. Using Whisper-small, we combine gradient- and Fisher-based sensitivity diagnostics with targeted, component-wise pruning. This reveals architectural asymmetries: decoder FFNs are pruning-fragile, whereas decoder self-attention and the last encoder layers contain redundancy that, when removed, improves generalization. Without fine-tuning, pruning 50% of decoder self-attention reduces WER by 2.38% absolute (20.44% relative) on LibriSpeech test-other; pruning the last four encoder layers at 50% instead yields a 1.72% absolute (14.8% relative) improvement. Gains persisted on Common Voice and TED-LIUM datasets. Beyond regularization benefits, our sensitivity-aware approach enables more aggressive…
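To make "one-shot magnitude pruning" and "component-wise" concrete, here is a small pure-Python sketch: within each targeted component, the fraction of weights with the smallest absolute value is set to zero in a single pass, with no retraining afterwards. This is an illustration under stated assumptions, not the authors' implementation; `magnitude_prune`, `prune_components`, the component names, and the toy weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of a 2-D weight list with the smallest-|w| entries zeroed.

    sparsity: fraction of entries to zero, in [0, 1].
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    # Collect every (row, col) position and order by weight magnitude.
    positions = [(i, j) for i, row in enumerate(weights) for j in range(len(row))]
    positions.sort(key=lambda p: abs(weights[p[0]][p[1]]))
    k = int(len(positions) * sparsity)  # number of weights to remove
    pruned = [list(row) for row in weights]  # leave the input untouched
    for i, j in positions[:k]:
        pruned[i][j] = 0.0
    return pruned


def prune_components(model, targets, sparsity):
    """Prune only the named components; copy all others unchanged."""
    return {
        name: magnitude_prune(w, sparsity) if name in targets
        else [list(row) for row in w]
        for name, w in model.items()
    }


# Toy "model": hypothetical component names mapped to tiny weight matrices.
toy_model = {
    "decoder.self_attn": [[0.9, -0.05, 0.3], [-0.7, 0.01, 0.2]],
    "decoder.ffn": [[0.4, -0.6], [0.02, 0.8]],
}
pruned_model = prune_components(toy_model, {"decoder.self_attn"}, sparsity=0.5)
```

Pruning per component rather than with one global threshold is the design choice that lets fragile parts (the abstract names decoder FFNs) be spared while redundant parts are pruned at 50%.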

Taxonomy

Topics: Speech Recognition and Synthesis · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning