HArnESS: Lightweight Distilled Arabic Speech Foundation Models

Vrunda N. Sukhadia; Shammur Absar Chowdhury

arXiv:2604.14186 · eess.AS · April 17, 2026

TL;DR

HArnESS is a family of lightweight, self-distilled Arabic speech models that improves on HuBERT and XLS-R on Arabic speech tasks while staying efficient enough for deployment in resource-constrained settings.

Contribution

The paper presents an Arabic-centric self-supervised speech model family trained from scratch with iterative self-distillation, and studies PCA-based compression of the teacher supervision signal to suit shallow and thin students, improving on HuBERT and XLS-R in performance and efficiency.

Findings

01. HArnESS outperforms HuBERT and XLS-R on Arabic tasks.

02. Compressed models retain competitive accuracy with reduced complexity.

03. PCA-based compression of the teacher supervision signal better matches the capacity of shallow and thin students.

Abstract

Large self-supervised speech (SSL) models achieve strong downstream performance, but their size limits deployment in resource-constrained settings. We present HArnESS, an Arabic-centric self-supervised speech model family trained from scratch with iterative self-distillation, together with lightweight student variants that offer strong accuracy-efficiency trade-offs on Automatic Speech Recognition (ASR), Dialect Identification (DID), and Speech Emotion Recognition (SER). Our approach begins with a large bilingual Arabic-English teacher and progressively distills its knowledge into compressed student models while preserving Arabic-relevant acoustic and paralinguistic representations. We further study PCA-based compression of the teacher supervision signal to better match the capacity of shallow and thin students. Compared with HuBERT and XLS-R, HArnESS consistently improves performance…
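
To make the idea of PCA-compressing a teacher supervision signal concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the feature dimensions, the synthetic `teacher_feats` array, the helper names `fit_pca` and `compress`, and the mean-squared-error regression loss are all assumptions chosen for illustration. The sketch fits PCA on frame-level teacher features, projects them to a lower dimension, and uses the projected features as targets that a narrower student could regress to.

```python
import numpy as np


def fit_pca(features: np.ndarray, n_components: int):
    """Fit PCA on (n_frames, dim) features via SVD.

    Returns the feature mean, the top principal directions (as rows),
    and the fraction of variance those directions explain.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2
    explained = variance[:n_components].sum() / variance.sum()
    return mean, vt[:n_components], explained


def compress(features: np.ndarray, mean: np.ndarray, components: np.ndarray):
    """Project features onto the retained principal directions."""
    return (features - mean) @ components.T


rng = np.random.default_rng(0)

# Hypothetical stand-in for teacher frame-level features: 500 frames of
# 64-dim vectors whose variance lies mostly in an 8-dim subspace.
latent = rng.normal(size=(500, 8))
mixing = rng.normal(size=(8, 64))
teacher_feats = latent @ mixing + 0.01 * rng.normal(size=(500, 64))

mean, components, explained = fit_pca(teacher_feats, n_components=8)
targets = compress(teacher_feats, mean, components)  # shape (500, 8)

# Hypothetical student output of matching width; a real student would be
# a trained network, here it is random noise just to evaluate the loss.
student_out = rng.normal(size=targets.shape)
loss = float(np.mean((student_out - targets) ** 2))

print(f"explained variance: {explained:.4f}, regression loss: {loss:.3f}")
```

The design point this illustrates is that a student with a small hidden width only has to predict the directions of the teacher signal that carry most of the variance, rather than the full teacher dimensionality. How the paper actually selects the number of components or applies the compressed signal is not stated in the text above.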
