A Model for Every User and Budget: Label-Free and Personalized Mixed-Precision Quantization

Edward Fish; Umberto Michieli; Mete Ozay

arXiv:2307.12659 · cs.SD · February 13, 2024

TL;DR

This paper introduces myQASR, a personalized mixed-precision quantization method for ASR models that adapts to individual users and constraints without fine-tuning, improving performance across diverse sub-domains.

Contribution

The paper presents a novel label-free, personalized quantization approach that automatically generates tailored schemes for different users and memory budgets in ASR models.

Findings

01 Improves ASR performance for specific genders, languages, and speakers.

02 Operates without fine-tuning using only unlabelled samples.

03 Adapts to any memory requirement with personalized quantization schemes.

Abstract

Recent advances in Automatic Speech Recognition (ASR) have produced large AI models that are impractical to deploy on mobile devices. Model quantization is effective at producing compressed general-purpose models; however, such models may only be deployed to a restricted sub-domain of interest. We show that ASR models can be personalized during quantization while relying on just a small set of unlabelled samples from the target domain. To this end, we propose myQASR, a mixed-precision quantization method that generates tailored quantization schemes for diverse users under any memory requirement with no fine-tuning. myQASR automatically evaluates the quantization sensitivity of network layers by analysing the full-precision activation values. We are then able to generate a personalized mixed-precision quantization scheme for any pre-determined memory budget. Results for…
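
As a rough illustration of the idea in the abstract (score each layer's sensitivity from full-precision activations on unlabelled samples, then pick per-layer bit-widths that fit a memory budget), here is a minimal Python sketch. It is not the authors' implementation: the sensitivity score (median absolute activation), the greedy reduction rule, the function names, and the default bit-width range are all assumptions made for illustration.

```python
from statistics import median


def layer_sensitivity(activations):
    """Hypothetical sensitivity score: median absolute full-precision activation.

    This choice of statistic is an assumption, not taken from the paper text above.
    """
    return median(abs(a) for a in activations)


def model_size_bits(param_counts, bits):
    """Total weight storage in bits for a given per-layer bit-width assignment."""
    return sum(n * b for n, b in zip(param_counts, bits))


def allocate_bits(activations_per_layer, param_counts, budget_bits,
                  start_bits=8, min_bits=2):
    """Greedy mixed-precision allocation (illustrative only).

    Every layer starts at `start_bits`. While the model exceeds the budget,
    the least sensitive layer that can still be reduced loses one bit.
    No labels and no fine-tuning are involved; only activation statistics.
    """
    scores = [layer_sensitivity(a) for a in activations_per_layer]
    bits = [start_bits] * len(param_counts)
    # Layer indices ordered from least to most sensitive.
    order = sorted(range(len(scores)), key=lambda i: scores[i])

    while model_size_bits(param_counts, bits) > budget_bits:
        reducible = [i for i in order if bits[i] > min_bits]
        if not reducible:
            raise ValueError("budget cannot be met within the allowed bit-widths")
        bits[reducible[0]] -= 1
    return bits


# Toy example: three layers with 100 parameters each and a 2000-bit budget.
# The activation values are made up; in practice they would be collected by
# running a few unlabelled utterances from the target user through the model.
toy_activations = [
    [0.1, -0.2, 0.05],   # small activations: treated as least sensitive here
    [1.0, -2.0, 1.5],    # large activations: treated as most sensitive here
    [0.5, 0.4, -0.6],
]
toy_params = [100, 100, 100]
toy_bits = allocate_bits(toy_activations, toy_params, budget_bits=2000)
```

Because the scores depend only on the activations of the samples supplied, running the same procedure on a different user's audio can yield a different bit-width assignment under the same budget, which is the sense in which such a scheme is personalized.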

Code & Models

Repositories

samsunglabs/myqasr (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing