FedMLAC: Mutual Learning Driven Heterogeneous Federated Audio Classification

Jun Bai; Rajib Rana; Di Wu; Youyang Qu; Xiaohui Tao; Ji Zhang; Carlos Busso; Shivakumara Palaiahnakote

arXiv:2506.10207 · cs.SD · August 5, 2025

Open Access

TL;DR

FedMLAC is a federated learning framework for audio classification that handles data and model heterogeneity through mutual learning between a personalized local model and a shared Plug-in model, and defends against data poisoning with a pruning-based aggregation strategy.

Contribution

The paper introduces a unified mutual learning approach with personalized models and a pruning-based aggregation strategy to improve robustness and performance in heterogeneous federated audio classification.
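
The summary does not say how the pruning-based aggregation decides which updates to discard. Purely as an illustration of the general idea, the sketch below assumes a deviation-based rule: for each layer, the server measures how far every client's update lies from the coordinate-wise median, drops the most deviant fraction, and averages the rest. The function names, the `prune_ratio` parameter, and the median-distance criterion are all hypothetical and should not be read as the paper's actual procedure.

```python
import math
from statistics import median


def prune_and_average(client_layers, prune_ratio=0.25):
    """Average one layer's parameters after dropping the most deviant clients.

    client_layers: list of flat parameter lists, one per client.
    prune_ratio: assumed fraction of clients to discard (hypothetical knob).
    """
    n = len(client_layers)
    dim = len(client_layers[0])
    # Coordinate-wise median serves as a robust reference point.
    center = [median(c[i] for c in client_layers) for i in range(dim)]
    # L2 distance of each client's update from that reference.
    dist = [
        math.sqrt(sum((c[i] - center[i]) ** 2 for i in range(dim)))
        for c in client_layers
    ]
    n_keep = max(1, n - int(n * prune_ratio))
    keep = sorted(range(n), key=lambda k: dist[k])[:n_keep]
    return [sum(client_layers[k][i] for k in keep) / n_keep for i in range(dim)]


def aggregate(client_models, prune_ratio=0.25):
    """Apply the pruning rule independently to every named layer."""
    return {
        name: prune_and_average([m[name] for m in client_models], prune_ratio)
        for name in client_models[0]
    }


# Three similar updates and one outlier standing in for a poisoned client.
updates = [
    {"fc": [1.0, 1.0]},
    {"fc": [1.1, 0.9]},
    {"fc": [0.9, 1.1]},
    {"fc": [50.0, -50.0]},
]
global_plugin = aggregate(updates, prune_ratio=0.25)
```

Pruning per layer rather than per client means a client can contribute to some layers while being excluded from others; whether the paper makes the same choice cannot be determined from this summary.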

Findings

01. Outperforms state-of-the-art methods in accuracy.

02. Demonstrates robustness against noisy and poisoned data.

03. Effective in diverse speech and non-speech tasks.

Abstract

Federated Learning (FL) offers a privacy-preserving framework for training audio classification (AC) models across decentralized clients without sharing raw data. However, Federated Audio Classification (FedAC) faces three major challenges: data heterogeneity, model heterogeneity, and data poisoning, which degrade performance in real-world settings. While existing methods often address these issues separately, a unified and robust solution remains underexplored. We propose FedMLAC, a mutual learning-based FL framework that tackles all three challenges simultaneously. Each client maintains a personalized local AC model and a lightweight, globally shared Plug-in model. These models interact via bidirectional knowledge distillation, enabling global knowledge sharing while adapting to local data distributions, thus addressing both data and model heterogeneity. To counter data poisoning, we…
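
The bidirectional knowledge distillation between the local model and the Plug-in model can be sketched as a pair of losses. The abstract does not give the loss form, so this minimal example assumes the common deep-mutual-learning recipe: each model minimizes cross-entropy on the label plus a KL term that pulls it toward the other model's softened prediction. The names `mutual_learning_losses`, `alpha`, and `temperature` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label] + eps)


def mutual_learning_losses(local_logits, plugin_logits, label,
                           alpha=0.5, temperature=2.0):
    """Return (local_loss, plugin_loss) for one sample.

    alpha weights the distillation term and temperature softens the
    distributions; both are assumed hyperparameters, not values from the paper.
    """
    p_local = softmax(local_logits)
    p_plugin = softmax(plugin_logits)
    soft_local = softmax(local_logits, temperature)
    soft_plugin = softmax(plugin_logits, temperature)
    # Local model is the student here; the Plug-in prediction is the target.
    local_loss = ((1 - alpha) * cross_entropy(p_local, label)
                  + alpha * kl_divergence(soft_plugin, soft_local))
    # Roles swap: the Plug-in model learns from the local model.
    plugin_loss = ((1 - alpha) * cross_entropy(p_plugin, label)
                   + alpha * kl_divergence(soft_local, soft_plugin))
    return local_loss, plugin_loss


agree = mutual_learning_losses([2.0, 0.5, -1.0], [2.0, 0.5, -1.0], label=0)
disagree = mutual_learning_losses([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0], label=0)
```

When the two models agree, the KL terms vanish and each loss reduces to scaled cross-entropy; disagreement adds a penalty on both sides. In an actual training loop the target distribution in each KL term would typically be treated as a constant, so gradients flow only into the student.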

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Domain Adaptation and Few-Shot Learning