AVECL-UMONS database for audio-visual event classification and localization
Mathilde Brousmiche, Stéphane Dupont, Jean Rouat

TL;DR
The paper introduces the AVECL-UMONS dataset, a new audio-visual dataset for event classification and localization in office environments, featuring realistic recordings and multiple event classes.
Contribution
It provides a publicly accessible, multi-label dataset with 11 event classes recorded in real office settings for audio-visual event analysis.
Findings
Dataset contains 5.24 hours of recordings.
Includes 2662 unilabel and 2724 multilabel sequences.
Accessible online for research use.
Abstract
We introduce the AVECL-UMons dataset for audio-visual event classification and localization in the context of office environments. The audio-visual dataset is composed of 11 event classes recorded at several realistic positions in two different rooms. Two types of sequences are recorded according to the number of events in the sequence. The dataset comprises 2662 unilabel sequences and 2724 multilabel sequences corresponding to a total of 5.24 hours. The dataset is publicly accessible online: https://zenodo.org/record/3965492#.X09wsobgrCI.
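To put the figures in the abstract in perspective, the short sketch below adds up the sequence counts, derives an average sequence length, and shows one common way to represent a multilabel annotation over 11 classes. It is an illustration only: it assumes the 5.24 hours cover both sequence types together, the resulting mean is not a statement about any individual recording, and the `multi_hot` helper and its integer class indices are hypothetical, not the dataset's actual annotation format.

```python
# Figures quoted in the abstract.
N_UNILABEL = 2662
N_MULTILABEL = 2724
TOTAL_HOURS = 5.24
N_CLASSES = 11


def dataset_summary():
    """Return (total number of sequences, mean sequence length in seconds).

    Assumption: the 5.24 hours span unilabel and multilabel sequences combined.
    """
    total = N_UNILABEL + N_MULTILABEL
    mean_seconds = TOTAL_HOURS * 3600 / total
    return total, mean_seconds


def multi_hot(active_classes, n_classes=N_CLASSES):
    """Encode active class indices as a 0/1 vector of length n_classes.

    Hypothetical helper: class indices 0..10 are placeholders, since the
    abstract does not name the 11 event classes.
    """
    vector = [0] * n_classes
    for index in active_classes:
        if not 0 <= index < n_classes:
            raise ValueError(f"class index {index} out of range")
        vector[index] = 1
    return vector


if __name__ == "__main__":
    total, mean_seconds = dataset_summary()
    print(f"{total} sequences, about {mean_seconds:.1f} s each on average")
    # A multilabel sequence in which (placeholder) classes 0 and 3 both occur.
    print(multi_hot([0, 3]))
```

Under these assumptions the dataset holds 5386 sequences averaging roughly 3.5 seconds each. A unilabel sequence would map to a vector with a single 1, and a multilabel sequence to a vector with several.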
Taxonomy
Topics: Music and Audio Processing
