Human-Imitating Metrics for Training and Evaluating Privacy Preserving Emotion Recognition Models Using Sociolinguistic Knowledge
Mimansa Jaiswal, Emily Mower Provost

TL;DR
This paper introduces a novel, automatic metric based on sociolinguistic biases to evaluate how well privacy-preserving emotion recognition models align with human perceptions of privacy, addressing trust issues in black-box models.
Contribution
It proposes the first automatic, quantifiable metric for assessing privacy preservation in models, utilizing sociolinguistic biases and saliency explanations to reflect human perception.
Findings
Commonly used privacy-preserving methods often do not align with how humans perceive privacy.
The proposed metric correlates with human judgments and improves cross-corpus generalization.
Crowdsourcing experiments validate the metric's effectiveness in model evaluation.
Abstract
Privacy preservation is a crucial component of any real-world application. But in applications relying on machine learning backends, privacy is challenging because models often capture more than they were initially trained for, resulting in the potential leakage of sensitive information. In this paper, we propose an automatic and quantifiable metric that allows us to evaluate humans' perception of a model's ability to preserve privacy with respect to sensitive variables. We focus on saliency-based explanations, which highlight regions of the input text, to infer the internal workings of a black-box model. We use the degree to which differences in interpretation of general vs. privacy-preserving models correlate with sociolinguistic biases to inform metric design. We show how certain commonly used methods that seek to preserve privacy do not align with…
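To make the idea in the abstract more concrete, here is a minimal sketch of one way saliency explanations from a general model and a privacy-preserving model could be compared against a sociolinguistic lexicon. This is not the paper's metric: the function names, the toy saliency scores, and the one-word lexicon are all hypothetical and chosen only for illustration. The sketch assumes each model yields a per-token saliency score for the same input text.

```python
from typing import List, Set, Tuple

# (token, saliency score) pairs for one input text; scores are hypothetical.
Saliency = List[Tuple[str, float]]


def bias_saliency_mass(saliency: Saliency, biased_tokens: Set[str]) -> float:
    """Fraction of total absolute saliency that falls on lexicon tokens."""
    total = sum(abs(score) for _, score in saliency)
    if total == 0:
        return 0.0
    on_bias = sum(
        abs(score) for token, score in saliency if token.lower() in biased_tokens
    )
    return on_bias / total


def bias_reliance_drop(
    general: Saliency, private: Saliency, biased_tokens: Set[str]
) -> float:
    """How much less saliency mass the privacy-preserving model puts on
    lexicon tokens than the general model does (positive = less reliance)."""
    return bias_saliency_mass(general, biased_tokens) - bias_saliency_mass(
        private, biased_tokens
    )


# Hypothetical lexicon of tokens a reader might associate with a sensitive variable.
lexicon = {"absolutely"}

# Hypothetical saliency maps for the same sentence under the two models.
general_model = [("I", 1.0), ("absolutely", 4.0), ("adore", 3.0), ("this", 2.0)]
private_model = [("I", 3.0), ("absolutely", 1.0), ("adore", 4.0), ("this", 2.0)]

drop = bias_reliance_drop(general_model, private_model, lexicon)
print(f"drop in saliency mass on lexicon tokens: {drop:.2f}")
```

In this toy setup, a larger positive drop means the privacy-preserving model visibly leans less on tokens that a human reader would associate with the sensitive variable. How such a quantity relates to human judgments is what the paper evaluates, and the sketch makes no claim about that.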
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Mental Health via Writing · Sentiment Analysis and Opinion Mining
