Contrastive Clustering: Toward Unsupervised Bias Reduction for Emotion and Sentiment Classification
Jared Mowery

TL;DR
This paper introduces a contrastive clustering method to reduce bias in emotion and sentiment classifiers applied to COVID-19 social media data, improving the accuracy of public health analyses.
Contribution
It presents an unsupervised contrastive clustering algorithm that distinguishes causal tokens from correlational ones, effectively reducing bias in emotion and sentiment classification.
Findings
Contrastive clustering achieves an F1 score of 0.753 in distinguishing correlation from causation.
Bias reduction decreases overall anger and negative sentiment estimates by approximately 8-14%.
Debiasing improves classifier performance on bias-prone sentences by over 10%.
Abstract
Background: When neural network emotion and sentiment classifiers are used in public health informatics studies, biases present in the classifiers could produce inadvertently misleading results. Objective: This study assesses the impact of bias on COVID-19 topics, and demonstrates an automatic algorithm for reducing bias when applied to COVID-19 social media texts. This could help public health informatics studies produce more timely results during crises, with a reduced risk of misleading results. Methods: Emotion and sentiment classifiers were applied to COVID-19 data before and after debiasing the classifiers using unsupervised contrastive clustering. Contrastive clustering approximates the degree to which tokens exhibit a causal versus correlational relationship with emotion or sentiment, by contrasting the tokens' relative salience to topics versus emotions or sentiments.…
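The contrast described in the Methods sentence can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes each text already carries a topic label and an emotion label (in practice these might come from a topic model and a classifier), it substitutes a simple entropy-based concentration measure for "salience", and it omits the clustering step entirely. The names `concentration`, `contrast_scores`, and the sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def concentration(counts, n_labels):
    """Return 1 - normalized entropy of a token's spread over labels.

    0.0 means the token is spread evenly over all labels;
    1.0 means it appears under a single label only.
    (Assumed stand-in for "salience"; not taken from the paper.)
    """
    total = sum(counts.values())
    if total == 0 or n_labels < 2:
        return 0.0
    entropy = -sum(
        (c / total) * math.log(c / total) for c in counts.values() if c > 0
    )
    return 1.0 - entropy / math.log(n_labels)


def contrast_scores(docs):
    """Score each token by emotion concentration minus topic concentration.

    docs: iterable of (tokens, topic_label, emotion_label).
    A positive score suggests the token tracks emotion more than topic;
    a negative score suggests it mainly marks a topic and is only
    correlated with whatever emotion that topic tends to attract.
    """
    by_topic = defaultdict(Counter)
    by_emotion = defaultdict(Counter)
    topics, emotions = set(), set()
    for tokens, topic, emotion in docs:
        topics.add(topic)
        emotions.add(emotion)
        for tok in set(tokens):  # count each token once per text
            by_topic[tok][topic] += 1
            by_emotion[tok][emotion] += 1
    return {
        tok: concentration(by_emotion[tok], len(emotions))
        - concentration(by_topic[tok], len(topics))
        for tok in by_topic
    }


# Hypothetical toy data: (tokens, topic, emotion)
docs = [
    (["vaccine", "furious"], "vaccine", "anger"),
    (["vaccine", "mandate"], "vaccine", "anger"),
    (["vaccine", "grateful"], "vaccine", "joy"),
    (["lockdown", "furious"], "lockdown", "anger"),
    (["lockdown", "grateful"], "lockdown", "joy"),
    (["lockdown", "walk"], "lockdown", "joy"),
]
scores = contrast_scores(docs)
for tok, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{tok:10s} {score:+.3f}")
```

In this toy data, "furious" appears under both topics but only with anger, so it scores positive, while "vaccine" appears under one topic with mixed emotions, so it scores negative. A debiasing step could then down-weight or mask the negative-scoring tokens, though how the paper does this is not stated in the excerpt above.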
Taxonomy
Topics: Misinformation and Its Impacts · Computational and Text Analysis Methods · Sentiment Analysis and Opinion Mining
