The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes

Björn W. Schuller; Anton Batliner; Shahin Amiriparian; Christian Bergler; Maurice Gerczuk; Natalie Holz; Pauline Larrouy-Maestri; Sebastian P. Bayerl; Korbinian Riedhammer; Adria Mallol-Ragolta; Maria Pateraki; Harry Coppock; Ivan Kiskin; Marianne Sinka; Stephen Roberts

arXiv:2205.06799 · cs.SD · May 16, 2022 · 1 citation

TL;DR

The ACM Multimedia 2022 Computational Paralinguistics Challenge introduces four novel sub-challenges focusing on vocalisations, speech stuttering, human activity recognition from smartwatch data, and mosquito detection, with baseline methods provided.

Contribution

This paper presents a research competition that addresses four paralinguistic and bioacoustic problems for the first time under well-defined conditions, and describes the baseline feature extraction and classification methods provided for each.

Findings

01. Baseline classifiers established for all sub-challenges.

02. Introduction of deep learning and end-to-end models for each task.

03. Comparison of traditional and deep feature extraction techniques.

Abstract

The ACM Multimedia 2022 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: in the Vocalisations and Stuttering Sub-Challenges, human non-verbal vocalisations and speech have to be classified; the Activity Sub-Challenge aims at beyond-audio human activity recognition from smartwatch sensor data; and in the Mosquitoes Sub-Challenge, mosquitoes need to be detected. We describe the Sub-Challenges, baseline feature extraction, and classifiers based on the usual ComParE and BoAW features, the auDeep toolkit, and deep feature extraction from pre-trained CNNs using the DeepSpectrum toolkit; in addition, we add end-to-end sequential modelling, and a log-mel-128-BNN.
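To make the BoAW (bag-of-audio-words) idea named in the abstract more concrete, the following is a minimal sketch and not the challenge's actual baseline pipeline. It assumes frame-level descriptors are already available as NumPy arrays, builds the codebook by random sampling of training frames, and applies log term-frequency weighting. The function names, codebook size, and random data are all hypothetical, chosen only for illustration.

```python
import numpy as np


def build_codebook(train_frames, size, seed=0):
    """Pick `size` training frames at random to serve as audio words (assumed strategy)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(train_frames), size=size, replace=False)
    return train_frames[idx]


def boaw_histogram(frames, codebook):
    """Assign each frame to its nearest audio word and count the assignments."""
    # Squared Euclidean distance between every frame and every codeword.
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(codebook)).astype(float)
    # Log term-frequency weighting compresses the range of the counts.
    return np.log1p(counts)


# Hypothetical data: 200 training frames and one 50-frame recording,
# each frame an 8-dimensional descriptor vector.
rng = np.random.default_rng(1)
train_frames = rng.normal(size=(200, 8))
recording = rng.normal(size=(50, 8))

codebook = build_codebook(train_frames, size=16)
features = boaw_histogram(recording, codebook)
print(features.shape)
```

Whatever the length of the recording, the result is a fixed-length vector with one entry per audio word, which is what allows a standard classifier to be trained on variable-length audio.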

Taxonomy

Topics: Speech and dialogue systems · Context-Aware Activity Recognition Systems · Speech Recognition and Synthesis