TrustSER: On the Trustworthiness of Fine-tuning Pre-trained Speech Embeddings For Speech Emotion Recognition
Tiantian Feng, Rajat Hebbar, Shrikanth Narayanan

TL;DR
TrustSER is a framework for evaluating the trustworthiness of speech emotion recognition systems that use pre-trained embeddings, focusing on privacy, fairness, safety, and sustainability to facilitate real-world deployment.
Contribution
This paper introduces TrustSER, a framework for assessing the trustworthiness of SER systems built on pre-trained embeddings, covering privacy, fairness, safety (robustness to adversarial attacks), and sustainability (computational cost).
Findings
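Identifies trustworthiness issues in SER based on pre-trained embeddings: privacy breaches, unfair performance, vulnerability to adversarial attacks, and computational cost.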
Provides insights into privacy, fairness, and adversarial vulnerabilities.
Offers a publicly available evaluation framework for future research.
Abstract
Recent studies have explored the use of pre-trained embeddings for speech emotion recognition (SER), achieving comparable performance to conventional methods that rely on low-level knowledge-inspired acoustic features. These embeddings are often generated from models trained on large-scale speech datasets using self-supervised or weakly-supervised learning objectives. Despite the significant advancements made in SER through the use of pre-trained embeddings, there is a limited understanding of the trustworthiness of these methods, including privacy breaches, unfair performance, vulnerability to adversarial attacks, and computational cost, all of which may hinder the real-world deployment of these systems. In response, we introduce TrustSER, a general framework designed to evaluate the trustworthiness of SER systems using deep learning methods, with a focus on privacy, safety, fairness,…
Taxonomy
Topics: Speech Recognition and Synthesis · Adversarial Robustness in Machine Learning · Topic Modeling
