SALMon: A Suite for Acoustic Language Model Evaluation

Gallil Maimon; Amit Roth; Yossi Adi

arXiv:2409.07437 · cs.SD · January 16, 2025

TL;DR

SALMon is a comprehensive evaluation suite for speech language models that assesses their ability to recognize and differentiate acoustic features like noise, emotion, and speaker identity, addressing a key gap in current benchmarks.

Contribution

We introduce SALMon, a novel, fast, and comprehensive benchmark suite for evaluating speech models on diverse acoustic aspects beyond spoken content.

Findings

01 Different models show varying strengths in acoustic feature recognition.

02 The benchmark reveals specific weaknesses in emotion and noise robustness.

03 Evaluation results guide future improvements in speech model development.

Abstract

Speech language models have recently demonstrated great potential as universal speech processing systems. Such models can capture the rich acoustic information present in audio signals beyond the spoken content, such as emotion and background noise. Despite this, evaluation benchmarks that assess awareness of a wide range of acoustic aspects are lacking. To help bridge this gap, we introduce SALMon, a novel evaluation suite encompassing background noise, emotion, speaker identity and room impulse response. The proposed benchmarks evaluate both the consistency of the inspected element and how well it matches the spoken text. We follow a modelling-based approach, measuring whether a model gives correct samples higher scores than incorrect ones. This approach makes the benchmark fast to compute even for large models. We evaluated several speech language models on SALMon,…
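The modelling-based metric described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pairwise_accuracy`, the toy sample identifiers, and the toy scores are all hypothetical. The sketch assumes each benchmark item is a pair of one correct and one incorrect sample, that the model exposes some scalar score per sample (for instance a log-likelihood), and that ties count as failures.

```python
from typing import Callable, Iterable, Tuple, TypeVar

Sample = TypeVar("Sample")


def pairwise_accuracy(
    pairs: Iterable[Tuple[Sample, Sample]],
    score: Callable[[Sample], float],
) -> float:
    """Fraction of (correct, incorrect) pairs in which the correct sample
    receives a strictly higher score than the incorrect one.

    `score` is a stand-in for whatever scalar the model assigns to a sample;
    treating it as a log-likelihood is an assumption of this sketch.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (correct, incorrect) pair")
    # Ties are counted as failures (an assumption, not taken from the paper).
    wins = sum(1 for correct, incorrect in pairs if score(correct) > score(incorrect))
    return wins / len(pairs)


# Hypothetical scores standing in for model outputs on four sample pairs.
toy_scores = {
    "a_correct": -1.2, "a_incorrect": -3.4,
    "b_correct": -2.0, "b_incorrect": -1.5,  # the model prefers the wrong sample here
    "c_correct": -0.7, "c_incorrect": -0.9,
    "d_correct": -4.1, "d_incorrect": -5.0,
}
toy_pairs = [(f"{k}_correct", f"{k}_incorrect") for k in "abcd"]

accuracy = pairwise_accuracy(toy_pairs, toy_scores.__getitem__)
print(accuracy)
```

Because each pair needs only two scoring passes and no generation, a metric of this shape stays cheap to compute as model size grows, which is consistent with the speed claim in the abstract.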

Code & Models

Repositories

slp-rl/salmon
PyTorch · Official

Datasets

slprl/SALMon
dataset · 146 dl

Taxonomy

Topics: Speech Recognition and Synthesis