Multi-agent Auditory Scene Analysis

Caleb Rascon; Luis Gato-Diaz; Eduardo García-Alarcón

arXiv:2507.02755·eess.AS·August 21, 2025

TL;DR

This paper introduces a multi-agent auditory scene analysis system that performs sound source localization, separation, and classification in parallel with feedback loops, reducing response time and error sensitivity for real-time applications.

Contribution

It proposes a novel multi-agent framework for auditory scene analysis that operates in parallel with feedback, improving robustness and efficiency over traditional linear approaches.

Findings

01

The multi-agent auditory scene analysis (MASA) system is robust against local errors.

02

It operates with low response time suitable for real-time applications.

03

The framework is open-source and customizable.

Abstract

Auditory scene analysis (ASA) aims to retrieve information from the acoustic environment by carrying out three main tasks: sound source location, separation, and classification. These tasks are traditionally executed with a linear data flow: the sound sources are first located; then, using their locations, each source is separated into its own audio stream; finally, information relevant to the application scenario (audio event detection, speaker identification, emotion classification, etc.) is extracted from each stream. However, running these tasks linearly increases the overall response time and makes the later tasks (separation and classification) highly sensitive to errors in the first task (location). Considerable effort and computational complexity have been invested in state-of-the-art techniques to make them as error-resistant as possible.…
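
The linear data flow described in the abstract can be illustrated with a toy sketch. It is not the paper's implementation: the function names, the `scene` layout, the 10-degree beamwidth, and the per-stage latencies are all hypothetical and chosen only to show two effects, that stage latencies add up and that a location error propagates to separation and classification.

```python
# Toy illustration only: every name and number below is hypothetical.
STAGE_LATENCY_MS = {"locate": 20, "separate": 30, "classify": 50}


def locate(scene, bias_deg=0.0):
    """Return one direction estimate per source, shifted by an optional error."""
    return [src["doa_deg"] + bias_deg for src in scene]


def separate(scene, doas_deg, beamwidth_deg=10.0):
    """Recover a source only if an estimate lies within the beamwidth of it."""
    streams = []
    for est in doas_deg:
        hits = [s for s in scene if abs(s["doa_deg"] - est) <= beamwidth_deg]
        streams.append(hits[0]["label"] if hits else None)
    return streams


def classify(streams):
    """Label each separated stream; a missing stream cannot be classified."""
    return [label if label is not None else "unknown" for label in streams]


def linear_asa(scene, bias_deg=0.0):
    """Run the three stages strictly one after another."""
    doas = locate(scene, bias_deg)
    streams = separate(scene, doas)
    labels = classify(streams)
    # In a linear flow the response time is the sum of all stage latencies.
    response_ms = sum(STAGE_LATENCY_MS.values())
    return labels, response_ms


scene = [
    {"doa_deg": -30.0, "label": "speech"},
    {"doa_deg": 40.0, "label": "alarm"},
]
```

With accurate location, `linear_asa(scene)` labels both sources correctly. With `bias_deg=25.0`, every estimate misses its source by more than the assumed beamwidth, so both streams come back as `"unknown"` even though the separation and classification stages themselves are unchanged.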
