L3DAS21 Challenge: Machine Learning for 3D Audio Signal Processing

Eric Guizzo; Riccardo F. Gramaccioni; Saeid Jamili; Christian Marinoni; Edoardo Massaro; Claudia Medaglia; Giuseppe Nachira; Leonardo Nucciarelli; Ludovica Paglialunga; Marco Pennese; Sveva Pepe; Enrico Rocchi; Aurelio Uncini; Danilo Comminiello

arXiv:2104.05499 · eess.AS · December 16, 2022

TL;DR

The L3DAS21 Challenge promotes research in 3D audio signal processing using a novel dual-mic Ambisonics setup, providing a dataset, baseline models, and a platform for advancing machine learning methods in speech enhancement and sound localization.

Contribution

Introduction of a dual-mic Ambisonics configuration for 3D audio tasks, along with a comprehensive dataset, baseline models, and a challenge framework to foster collaborative research.

Findings

01. First use of dual-mic Ambisonics for 3D audio tasks
02. Baseline models demonstrate effectiveness of proposed setup
03. Dataset and API facilitate research and benchmarking

Abstract

The L3DAS21 Challenge aims to encourage and foster collaborative research on machine learning for 3D audio signal processing, with a particular focus on 3D speech enhancement (SE) and 3D sound event localization and detection (SELD). Alongside the challenge, we release the L3DAS21 dataset, a 65-hour 3D audio corpus, accompanied by a Python API that facilitates data usage and the results submission stage. Machine learning approaches to 3D audio tasks are usually based on single-perspective Ambisonics recordings or on arrays of single-capsule microphones. We propose, instead, a novel multichannel audio configuration based on multiple-source and multiple-perspective Ambisonics recordings, performed with an array of two first-order Ambisonics microphones. To the best of our knowledge, this is the first time that a dual-mic Ambisonics configuration has been used for these tasks. We provide…
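
As a rough illustration of the dual-mic configuration described in the abstract: a first-order Ambisonics signal has four channels (W, X, Y, Z), so two such microphones yield eight channels when stacked along the channel axis. The sketch below is not taken from the L3DAS21 API. The function name `stack_dual_mic`, the 16 kHz sample rate, and the random placeholder signals are all assumptions made for illustration.

```python
import numpy as np

FOA_CHANNELS = 4  # first-order Ambisonics: W, X, Y, Z


def stack_dual_mic(mic_a, mic_b):
    """Concatenate two first-order Ambisonics recordings along the channel axis.

    Hypothetical helper: each input is expected to have shape (4, samples).
    """
    mic_a = np.asarray(mic_a, dtype=np.float32)
    mic_b = np.asarray(mic_b, dtype=np.float32)
    for name, sig in (("mic_a", mic_a), ("mic_b", mic_b)):
        if sig.ndim != 2 or sig.shape[0] != FOA_CHANNELS:
            raise ValueError(f"{name} must have shape (4, samples), got {sig.shape}")
    if mic_a.shape[1] != mic_b.shape[1]:
        raise ValueError("both recordings must have the same number of samples")
    # Result has shape (8, samples): channels 0-3 from mic A, 4-7 from mic B
    return np.concatenate([mic_a, mic_b], axis=0)


# Placeholder data: one second of noise at an assumed 16 kHz sample rate
rng = np.random.default_rng(0)
mic_a = rng.standard_normal((FOA_CHANNELS, 16000))
mic_b = rng.standard_normal((FOA_CHANNELS, 16000))

features = stack_dual_mic(mic_a, mic_b)
print(features.shape)  # (8, 16000)
```

A model for either task could then consume the eight-channel array directly or use only the first four channels for a single-microphone comparison.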

Code & Models

Repositories

l3das/L3DAS21
PyTorch · Official
