AVECL-UMONS database for audio-visual event classification and localization

Mathilde Brousmiche; Stéphane Dupont; Jean Rouat

arXiv:2011.01018·cs.IR·November 3, 2020·1 cites

Open Access

TL;DR

The paper introduces the AVECL-UMONS dataset, a new audio-visual dataset for event classification and localization in office environments, featuring realistic recordings and multiple event classes.

Contribution

It provides a publicly accessible, multi-label dataset with 11 event classes recorded in real office settings for audio-visual event analysis.

Findings

1. Dataset contains 5.24 hours of recordings.
2. Includes 2662 unilabel and 2724 multilabel sequences.
3. Accessible online for research use.

Abstract

We introduce the AVECL-UMons dataset for audio-visual event classification and localization in the context of office environments. The audio-visual dataset is composed of 11 event classes recorded at several realistic positions in two different rooms. Two types of sequences are recorded according to the number of events in the sequence. The dataset comprises 2662 unilabel sequences and 2724 multilabel sequences corresponding to a total of 5.24 hours. The dataset is publicly accessible online at https://zenodo.org/record/3965492#.X09wsobgrCI.
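The figures quoted in the abstract can be combined into a few derived quantities. The sketch below is illustrative arithmetic only: it uses the three numbers stated above, and the function name `dataset_summary` and its dictionary keys are hypothetical, not part of the dataset's tooling. The mean duration it reports is a simple average over all sequences, assuming the 5.24 hours covers both sequence types; actual sequence lengths may vary.

```python
# Figures quoted in the abstract.
UNILABEL_SEQUENCES = 2662
MULTILABEL_SEQUENCES = 2724
TOTAL_HOURS = 5.24


def dataset_summary(unilabel=UNILABEL_SEQUENCES,
                    multilabel=MULTILABEL_SEQUENCES,
                    hours=TOTAL_HOURS):
    """Derive summary quantities from the sequence counts and total duration.

    Hypothetical helper for illustration; not provided by the dataset authors.
    """
    total_sequences = unilabel + multilabel
    total_seconds = hours * 3600
    return {
        "total_sequences": total_sequences,
        "multilabel_fraction": multilabel / total_sequences,
        # Simple average; it says nothing about how durations are distributed.
        "mean_seconds_per_sequence": total_seconds / total_sequences,
    }


summary = dataset_summary()
print(summary)
```

Under these assumptions the two sequence types are almost evenly balanced, and the average sequence lasts roughly 3.5 seconds.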

Taxonomy

Topics: Music and Audio Processing