Multi-agent Auditory Scene Analysis
Caleb Rascon, Luis Gato-Diaz, Eduardo García-Alarcón

TL;DR
This paper introduces a multi-agent auditory scene analysis system that performs sound source localization, separation, and classification in parallel with feedback loops, reducing response time and error sensitivity for real-time applications.
Contribution
It proposes a novel multi-agent framework for auditory scene analysis that operates in parallel with feedback, improving robustness and efficiency over traditional linear approaches.
Findings
The multi-agent auditory scene analysis (MASA) system is robust against local errors.
It operates with low response time suitable for real-time applications.
The framework is open-source and customizable.
Abstract
Auditory scene analysis (ASA) aims to retrieve information from the acoustic environment by carrying out three main tasks: sound source location, separation, and classification. These tasks are traditionally executed with a linear data flow: the sound sources are first located; then, using their location, each source is separated into its own audio stream; finally, information relevant to the application scenario (audio event detection, speaker identification, emotion classification, etc.) is extracted from each stream. However, running these tasks linearly increases the overall response time, while making the last tasks (separation and classification) highly sensitive to errors in the first task (location). A considerable amount of effort and computational complexity has been employed in the state-of-the-art to develop techniques that are as error-free as possible.…
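The response-time argument in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `Stage` class, both functions, and the latency values are hypothetical placeholders chosen for illustration, not measurements. The sketch assumes an idealized parallel setup in which each agent works concurrently on the most recent data, so the slowest agent bounds the per-frame response time, whereas in a linear data flow the stage latencies add up.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One ASA task with a hypothetical per-frame processing time."""
    name: str
    latency_ms: float  # illustrative placeholder, not a measured value


def linear_response_time(stages):
    # Linear data flow: each stage waits for the previous one to finish,
    # so the per-frame response time is the sum of all stage latencies.
    return sum(stage.latency_ms for stage in stages)


def parallel_response_time(stages):
    # Idealized parallel agents (an assumption of this sketch): all stages
    # run concurrently, so the slowest stage bounds the response time.
    return max(stage.latency_ms for stage in stages)


# Hypothetical latencies for the three tasks named in the abstract.
stages = [
    Stage("location", 20.0),
    Stage("separation", 35.0),
    Stage("classification", 15.0),
]

print("linear:", linear_response_time(stages), "ms")
print("parallel (idealized):", parallel_response_time(stages), "ms")
```

The sketch covers only the latency side of the argument. It does not model the feedback loops or the sensitivity of separation and classification to location errors.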
