Analysis of Self-Supervised Learning and Dimensionality Reduction Methods in Clustering-Based Active Learning for Speech Emotion Recognition

Einari Vaaras; Manu Airaksinen; Okko Räsänen

arXiv:2206.10188 · cs.LG · June 22, 2022 · 1 citation

TL;DR

This paper explores combining contrastive predictive coding (CPC) with dimensionality reduction to improve clustering-based active learning for speech emotion recognition, and shows that low-dimensional features can still support effective active learning.

Contribution

The paper introduces an approach that integrates CPC with dimensionality reduction to improve clustering-based active learning in speech emotion recognition tasks.

Findings

01  CPC improves clustering-based active learning performance.

02  Low-dimensional features retain effective active learning performance.

03  Both local and global feature space topology are useful for active learning.

Abstract

When domain experts are needed to perform data annotation for complex machine-learning tasks, reducing annotation effort is crucial in order to cut down time and expenses. For cases when there are no annotations available, one approach is to utilize the structure of the feature space for clustering-based active learning (AL) methods. However, these methods are heavily dependent on how the samples are organized in the feature space and what distance metric is used. Unsupervised methods such as contrastive predictive coding (CPC) can potentially be used to learn organized feature spaces, but these methods typically create high-dimensional features which might be challenging for estimating data density. In this paper, we combine CPC and multiple dimensionality reduction methods in search of functioning practices for clustering-based AL. Our experiments for simulating speech emotion…
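The pipeline outlined in the abstract (learned features, then dimensionality reduction, then clustering to decide which samples to annotate) can be illustrated with a minimal sketch. It is not the authors' implementation, and every name in it is hypothetical. It assumes random Gaussian blobs as a stand-in for CPC features, a Gaussian random projection as a stand-in for the dimensionality reduction methods studied in the paper, and a plain k-medoids loop with Euclidean distance as the clustering step. The medoids are the samples that would be sent to an annotator.

```python
import math
import random


def make_toy_features(n_per_group=20, dim=32, n_groups=3, seed=0):
    """Hypothetical stand-in for CPC features: Gaussian blobs around random centers."""
    rng = random.Random(seed)
    features = []
    for _ in range(n_groups):
        center = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        for _ in range(n_per_group):
            features.append([c + rng.gauss(0.0, 1.0) for c in center])
    return features


def random_projection(features, out_dim, seed=0):
    """Reduce dimensionality with a Gaussian random matrix (an assumed stand-in)."""
    rng = random.Random(seed)
    in_dim = len(features[0])
    scale = 1.0 / math.sqrt(out_dim)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(out_dim)]
              for _ in range(in_dim)]
    return [[sum(x[i] * matrix[i][j] for i in range(in_dim))
             for j in range(out_dim)]
            for x in features]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def k_medoids(points, k, n_iter=20, seed=0):
    """Return the indices of k medoids; assumes all points are distinct."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    for _ in range(n_iter):
        # Assignment step: attach every sample to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for idx, point in enumerate(points):
            nearest = min(medoids, key=lambda m: euclidean(point, points[m]))
            clusters[nearest].append(idx)
        # Update step: the new medoid minimizes total distance within its cluster.
        new_medoids = []
        for medoid, members in clusters.items():
            if not members:
                new_medoids.append(medoid)
                continue
            best = min(members, key=lambda c: sum(
                euclidean(points[c], points[o]) for o in members))
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return sorted(medoids)


if __name__ == "__main__":
    features = make_toy_features()
    reduced = random_projection(features, out_dim=4)
    to_annotate = k_medoids(reduced, k=3)
    print("Samples to send to the annotator:", to_annotate)
```

In a real setting, the labels obtained for the medoids would typically be propagated to the other members of each cluster. Both the projection and the distance metric can be swapped out, which is the kind of choice the abstract says these methods depend on heavily.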

Code & Models

Repositories

SPEECHCOG/cpc_pytorch (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Text and Document Classification Technologies

Methods: InfoNCE · Contrastive Predictive Coding