DQLoRA: A Lightweight Domain-Aware Denoising ASR via Adapter-guided Distillation

Yiru Yang

arXiv:2507.10313·cs.SD·July 15, 2025

TL;DR

DQLoRA introduces a lightweight, adapter-guided distillation approach for robust speech recognition in low-resource and noisy environments, leveraging a frozen Whisper model as teacher and a Wav2Vec2 student with QLoRA adapters.

Contribution

The paper proposes a framework that combines adapter-guided distillation with a frozen teacher model for efficient speech recognition in low-resource settings.

Findings

01. Effective in noisy conditions

02. Maintains high recognition accuracy

03. Uses minimal additional parameters

Abstract

We present a demo of DQLoRA, an Adapter-Guided Distillation framework for robust speech recognition under low-resource and noisy conditions. Our method employs a frozen Whisper model as the teacher to provide semantic supervision, and a lightweight Wav2Vec2 student equipped with QLoRA-based Adapters. Training is conducted on the FLEURS dataset augmented with DNS-style noise. The student is optimized by jointly minimizing CTC loss and KL-based distillation loss, enabling efficient adaptation while preserving recognition accuracy.
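
The joint objective described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not say how the two terms are weighted or how teacher and student outputs are aligned. The sketch assumes a hypothetical weight `alpha`, a hypothetical softening `temperature`, and teacher and student logits that already share one vocabulary and one frame grid. The CTC loss is taken as a precomputed number rather than derived here.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the logits of one frame."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Mean per-frame KL(teacher || student) on softened distributions.

    Assumption: both inputs are lists of frames with matching shapes.
    """
    per_frame = [
        kl_divergence(softmax(t, temperature), softmax(s, temperature))
        for t, s in zip(teacher_logits, student_logits)
    ]
    return sum(per_frame) / len(per_frame)


def joint_loss(ctc_loss, teacher_logits, student_logits,
               alpha=0.5, temperature=2.0):
    """Hypothetical weighting: CTC term plus alpha times the KL term."""
    kd = distillation_loss(teacher_logits, student_logits, temperature)
    return ctc_loss + alpha * kd


# Toy example: two frames over a three-symbol vocabulary.
teacher = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]
student = [[1.5, 0.7, -0.5], [0.0, 1.0, 0.8]]
loss = joint_loss(1.2, teacher, student)
```

In this sketch only the student logits would receive gradients in a real training loop, since the teacher is frozen. Setting `alpha` to zero reduces the objective to the CTC term alone.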

Taxonomy

Topics: Advanced Chemical Sensor Technologies