Improving Low-Resource Dialect Classification Using Retrieval-based Voice Conversion

Lea Fischbach; Akbar Karimi; Caroline Kleen; Alfred Lameli; Lucie Flek

arXiv:2507.03641·cs.CL·July 8, 2025

TL;DR

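This paper uses Retrieval-based Voice Conversion (RVC) as a data augmentation method for low-resource German dialect classification. Converting all audio to a uniform target speaker reduces speaker-related variability, so the model can focus on dialect-specific features, which improves classification performance.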

Contribution

The paper presents a novel use of retrieval-based voice conversion as data augmentation for dialect identification, evaluated on a low-resource German dialect classification task.

Findings

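01 Used as a standalone augmentation method, RVC improves dialect classification performance.

02 Combining RVC with other augmentation methods, such as frequency masking and segment removal, yields further gains.

03 Converting samples to a uniform target speaker reduces speaker-related variability, which helps the model learn dialect-specific linguistic and phonetic features.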

Abstract

Deep learning models for dialect identification are often limited by the scarcity of dialectal data. To address this challenge, we propose to use Retrieval-based Voice Conversion (RVC) as an effective data augmentation method for a low-resource German dialect classification task. By converting audio samples to a uniform target speaker, RVC minimizes speaker-related variability, enabling models to focus on dialect-specific linguistic and phonetic features. Our experiments demonstrate that RVC enhances classification performance when utilized as a standalone augmentation method. Furthermore, combining RVC with other augmentation methods such as frequency masking and segment removal leads to additional performance gains, highlighting its potential for improving dialect classification in low-resource scenarios.
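
The abstract names frequency masking and segment removal as the augmentation methods combined with RVC. The sketch below is only an illustration of what these two operations commonly look like, not the authors' implementation. It assumes frequency masking is applied to a spectrogram of shape (frequency bins, frames) and segment removal to a 1-D waveform; the function names `frequency_mask` and `remove_segment` and the parameters `max_width` and `seg_len` are hypothetical. The RVC conversion step itself is not shown.

```python
import numpy as np


def frequency_mask(spec, max_width, rng):
    """Zero out one random band of consecutive frequency bins.

    Assumes `spec` has shape (n_freq_bins, n_frames) and
    max_width <= n_freq_bins. Returns a masked copy.
    """
    n_bins = spec.shape[0]
    # Band width is drawn uniformly from 0..max_width (inclusive).
    width = int(rng.integers(0, max_width + 1))
    # Start index is chosen so the band stays inside the spectrogram.
    start = int(rng.integers(0, n_bins - width + 1))
    out = spec.copy()
    out[start:start + width, :] = 0.0
    return out


def remove_segment(wave, seg_len, rng):
    """Cut one random contiguous segment of `seg_len` samples from a waveform."""
    if seg_len >= len(wave):
        raise ValueError("seg_len must be shorter than the waveform")
    start = int(rng.integers(0, len(wave) - seg_len + 1))
    # Join the audio before and after the removed segment.
    return np.concatenate([wave[:start], wave[start + seg_len:]])


# Usage on synthetic data (placeholder arrays, not real speech).
rng = np.random.default_rng(0)
spec = np.ones((80, 200))                  # e.g. 80 mel bins, 200 frames
masked = frequency_mask(spec, max_width=10, rng=rng)
wave = np.arange(16000, dtype=np.float32)  # e.g. 1 s at an assumed 16 kHz
shorter = remove_segment(wave, seg_len=1600, rng=rng)
```

Both functions return new arrays and leave their inputs unchanged, so an original sample and its augmented variants can sit side by side in a training set. How the paper parameterizes these operations or orders them relative to RVC is not stated in the abstract.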
