KAD: No More FAD! An Effective and Efficient Evaluation Metric for Audio Generation

Yoonjin Chung; Pilsun Eu; Junwon Lee; Keunwoo Choi; Juhan Nam; Ben Sangbae Chon

arXiv:2502.15602·cs.SD·March 11, 2025

TL;DR

KAD introduces a distribution-free, efficient, and perceptually aligned evaluation metric for audio generation, overcoming FAD's limitations and enabling reliable assessment with smaller samples and lower computational costs.

Contribution

The paper proposes KAD, a novel evaluation metric based on MMD that is unbiased, scalable, and better aligned with human perception, addressing key limitations of FAD.
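For reference, the standard unbiased estimator of squared MMD between a reference set $X = \{x_i\}_{i=1}^{m}$ and a generated set $Y = \{y_j\}_{j=1}^{n}$ of embeddings can be written as below. This is the textbook form; the kernel $k$, its bandwidth, and any rescaling that KAD applies on top are not stated on this page, so the symbols here are assumptions for illustration.

```latex
\widehat{\mathrm{MMD}}_u^2(X, Y)
  = \frac{1}{m(m-1)} \sum_{i \neq j} k(x_i, x_j)
  + \frac{1}{n(n-1)} \sum_{i \neq j} k(y_i, y_j)
  - \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k(x_i, y_j)
```

Excluding the diagonal terms $i = j$ in the first two sums is what makes the estimator unbiased, and no term requires fitting a Gaussian to either set.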

Findings

01. KAD converges faster with smaller sample sizes.

02. KAD has lower computational costs and scalable GPU acceleration.

03. KAD aligns more closely with human perceptual judgments.

Abstract

Although widely adopted for evaluating generated audio signals, the Fréchet Audio Distance (FAD) suffers from significant limitations, including reliance on Gaussian assumptions, sensitivity to sample size, and high computational complexity. As an alternative, we introduce the Kernel Audio Distance (KAD), a novel, distribution-free, unbiased, and computationally efficient metric based on Maximum Mean Discrepancy (MMD). Through analysis and empirical validation, we demonstrate KAD's advantages: (1) faster convergence with smaller sample sizes, enabling reliable evaluation with limited data; (2) lower computational cost, with scalable GPU acceleration; and (3) stronger alignment with human perceptual judgments. By leveraging advanced embeddings and characteristic kernels, KAD captures nuanced differences between real and generated audio. Open-sourced in the kadtk toolkit, KAD…
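To make the MMD idea in the abstract concrete, here is a minimal NumPy sketch of an unbiased squared-MMD estimate between two sets of embedding vectors. It is not the kadtk implementation: the Gaussian kernel, the fixed `bandwidth` value, the function names, and the random stand-in embeddings are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel matrix between rows of a and rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def mmd2_unbiased(x, y, bandwidth):
    """Unbiased estimate of squared MMD; needs at least 2 rows in each set."""
    m, n = len(x), len(y)
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    # Drop diagonal (i == j) terms from the within-set sums.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (m * (m - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
    term_xy = k_xy.mean()
    return term_xx + term_yy - 2.0 * term_xy

# Hypothetical stand-ins for audio embeddings (200 clips, 16 dimensions).
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(200, 16))
same_dist = rng.normal(0.0, 1.0, size=(200, 16))
shifted = rng.normal(1.0, 1.0, size=(200, 16))

bandwidth = 4.0  # assumed value, not taken from the paper
score_same = mmd2_unbiased(reference, same_dist, bandwidth)
score_shifted = mmd2_unbiased(reference, shifted, bandwidth)
print(f"same distribution: {score_same:.4f}")
print(f"shifted distribution: {score_shifted:.4f}")
```

The estimate for two samples from the same distribution should sit near zero (it can be slightly negative, since the estimator is unbiased rather than non-negative), while the shifted set should score clearly higher. The whole computation is a few matrix products, which is why this kind of estimator maps well onto GPU acceleration.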

Code & Models

Repositories

YoonjinXD/kadtk (PyTorch, official)
