SpidR-Adapt: A Universal Speech Representation Model for Few-Shot Adaptation

Mahi Luthra; Jiayi Shen; Maxime Poli; Angelo Ortiz; Yosuke Higuchi; Youssef Benchekroun; Martin Gleize; Charles-Eric Saint-James; Dongyan Lin; Phillip Rust; Angel Villar; Surya Parimi; Vanessa Stark; Rashel Moritz; Juan Pino; Yann LeCun; Emmanuel Dupoux

arXiv:2512.21204·cs.CL·April 21, 2026

TL;DR

SpidR-Adapt is a meta-learning-based speech representation model that adapts to new languages rapidly and with little data, significantly outperforming traditional methods in low-resource scenarios.

Contribution

The paper introduces a novel meta-learning framework with a scalable bi-level optimization approach for fast speech unit adaptation in low-resource settings.

Findings

1. Achieves rapid phonemic discriminability improvements with less than 1 hour of data.

2. Surpasses in-domain toplines in downstream language modeling tasks.

3. Provides 100x greater data efficiency than standard multi-task training.

Abstract

Human infants, with only a few hundred hours of speech exposure, acquire the basic units of new languages, highlighting a striking efficiency gap compared to data-hungry self-supervised speech models. To address this gap, this paper introduces SpidR-Adapt for rapid adaptation of speech units to new languages using minimal unlabeled data. We cast such low-resource speech representation learning as a meta-learning problem and construct a multi-task adaptive pre-training (MAdaPT) protocol, which formulates the adaptation process as a bi-level optimization framework. To enable scalable meta-training under this framework, we propose a novel heuristic solution, first-order bi-level optimization (FOBLO), which avoids heavy computation costs. Finally, we stabilize meta-training by using a robust initialization through interleaved supervision, which alternates self-supervised and supervised…
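As a rough illustration of the bi-level structure the abstract describes, the sketch below meta-trains an initialization on a toy problem: an inner level adapts to one task with a few gradient steps, and an outer level updates the shared initialization using only first-order gradients. It is not FOBLO. The abstract does not give FOBLO's update rule, so a generic first-order approximation (in the style of first-order MAML) is swapped in, and the toy regression tasks, function names, and hyperparameters are all assumptions made for illustration.

```python
import numpy as np


def loss_and_grad(w, x, y):
    """Mean squared error of the linear model y ~ w * x, and its gradient in w."""
    err = w * x - y
    return float(np.mean(err ** 2)), float(np.mean(2.0 * err * x))


def inner_adapt(w, x, y, lr=0.1, steps=5):
    """Inner level: a few gradient steps on one task's support data."""
    for _ in range(steps):
        _, g = loss_and_grad(w, x, y)
        w = w - lr * g
    return w


def first_order_meta_train(slopes, meta_lr=0.05, iters=200, seed=0):
    """Outer level: update the shared initialization w0 across toy tasks.

    Each toy task is a regression y = a * x with its own slope a (an
    assumption standing in for "one language"). The outer gradient is the
    query-loss gradient evaluated at the adapted weight, applied directly
    to w0; this ignores second-order terms through the inner loop.
    """
    rng = np.random.default_rng(seed)
    w0 = 0.0
    for _ in range(iters):
        meta_grad = 0.0
        for a in slopes:
            x_support = rng.normal(size=20)
            x_query = rng.normal(size=20)
            w_task = inner_adapt(w0, x_support, a * x_support)
            _, g = loss_and_grad(w_task, x_query, a * x_query)
            meta_grad += g / len(slopes)
        w0 -= meta_lr * meta_grad
    return w0


if __name__ == "__main__":
    w_init = first_order_meta_train([1.0, 2.0, 3.0])
    print("meta-learned initialization:", w_init)
```

With task slopes 1, 2, and 3, the learned initialization should settle near the middle of the task family, so that a handful of inner steps suffices for any one task. The first-order shortcut is what keeps the outer update cheap: no gradients are propagated back through the inner loop.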

Code & Models

Repositories

facebookresearch/spidr-adapt (GitHub)