# Heterogeneous Self-Supervised Acoustic Pre-Training with Local Constraints

**Authors:** Xiaodong Cui, A F M Saif, Brian Kingsbury, Tianyi Chen

arXiv: 2508.19990 · 2025-09-10

## TL;DR

This paper introduces a self-supervised pre-training method for speech recognition that handles heterogeneous data sources by adding per-source local constraints, formulated as a bilevel optimization problem, so that the pre-trained model adapts better during downstream supervised fine-tuning.

## Contribution

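Instead of only mixing all data and minimizing the averaged global loss, the paper imposes additional local constraints: starting from the shared model, $K$ steps of gradient descent on each data source should reach that source's local optimum. The problem is formulated as bilevel optimization and solved with a first-order approximation.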

## Key findings

- Significantly improves the adaptivity of the self-supervised pre-trained model in experiments on multi-domain and multilingual datasets
- Improves downstream supervised fine-tuning performance
- Relates the formulation to model-agnostic meta-learning (MAML)

## Abstract

Self-supervised pre-training using unlabeled data is widely used in automatic speech recognition. In this paper, we propose a new self-supervised pre-training approach to dealing with heterogeneous data. Instead of mixing all the data and minimizing the averaged global loss in the conventional way, we impose additional local constraints to ensure that the model optimizes each source of heterogeneous data to its local optimum after $K$-step gradient descent initialized from the model. We formulate this as a bilevel optimization problem, and use the first-order approximation method to solve the problem. We discuss its connection to model-agnostic meta learning. Experiments are carried out on self-supervised pre-training using multi-domain and multilingual datasets, demonstrating that the proposed approach can significantly improve the adaptivity of the self-supervised pre-trained model for the downstream supervised fine-tuning tasks.
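
The idea in the abstract can be illustrated on a toy problem. The sketch below is not the paper's implementation: it assumes each data source has a simple quadratic loss (a stand-in for a real self-supervised loss), and it uses a first-order MAML-style update, in which the gradient is evaluated at the $K$-step adapted parameters and second-order terms are dropped. The names `grad_loss`, `adapt`, `first_order_outer_step`, the `centers` data, and all step sizes are hypothetical.

```python
def grad_loss(theta, center):
    """Gradient of the toy per-source loss 0.5 * ||theta - center||^2."""
    return [t - c for t, c in zip(theta, center)]


def adapt(theta, center, inner_lr, k_steps):
    """Run K gradient-descent steps on one source, initialized from theta."""
    phi = list(theta)
    for _ in range(k_steps):
        g = grad_loss(phi, center)
        phi = [p - inner_lr * gi for p, gi in zip(phi, g)]
    return phi


def first_order_outer_step(theta, centers, inner_lr, outer_lr, k_steps):
    """One first-order update of the shared model.

    Each source's gradient is taken at its K-step adapted parameters and
    applied directly to theta (assumption: second-order terms are ignored).
    """
    grads = [grad_loss(adapt(theta, c, inner_lr, k_steps), c) for c in centers]
    mean_grad = [sum(col) / len(grads) for col in zip(*grads)]
    return [t - outer_lr * g for t, g in zip(theta, mean_grad)]


# Three hypothetical "data sources", each with its own optimum.
centers = [[2.0, 0.0], [-1.0, 3.0], [0.0, -3.0]]
theta = [0.0, 0.0]
for _ in range(100):
    theta = first_order_outer_step(
        theta, centers, inner_lr=0.1, outer_lr=0.5, k_steps=3
    )
```

With these quadratic losses, the shared parameters settle at the mean of the source optima, and a further $K$ steps of `adapt` from there lower each source's loss. Real self-supervised losses are non-convex, so this only shows the shape of the update, not its behavior on speech data.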

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.19990/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/2508.19990/full.md

---
Source: https://tomesphere.com/paper/2508.19990