Designing Evaluations of Machine Learning Models for Subjective Inference: The Case of Sentence Toxicity

Agathe Balayn; Alessandro Bozzon

arXiv:1911.02471·cs.LG·November 7, 2019·5 cites

TL;DR

This paper argues that machine learning models inferring subjective properties, such as sentence toxicity, should be evaluated with subjectivity and bias in mind, and proposes an initial list of specifications to guide the creation of evaluation datasets.

Contribution

It introduces a set of specifications for evaluating biases in ML models on subjective tasks, exemplified through sentence toxicity inference.

Findings

01 Proposes specifications for bias evaluation datasets

02 Highlights challenges in instantiating these specifications

03 Suggests future work for crowdsourcing dataset creation

Abstract

Machine Learning (ML) is increasingly applied in real-life scenarios, raising concerns about bias in automatic decision making. We focus on bias as a notion of opinion exclusion that stems from the direct application of traditional ML pipelines to infer subjective properties. We argue that such ML systems should be evaluated with subjectivity and bias in mind. Given that standards for creating evaluation benchmarks are still lacking, we propose an initial list of specifications to define before creating evaluation datasets, so that biases can later be evaluated accurately. With the example of a sentence toxicity inference system, we illustrate how the specifications support the analysis of biases related to subjectivity. We highlight difficulties in instantiating these specifications and list future work for the crowdsourcing community to help the creation of appropriate…
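
One way to picture bias as opinion exclusion is to look at what happens when several annotators' toxicity judgments for a sentence are collapsed into a single label. The sketch below is only an illustration under stated assumptions: the toy annotations, the use of majority voting as the aggregation step, and the helper names `majority_label` and `excluded_fraction` are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical toy data: binary toxicity labels (1 = toxic, 0 = not toxic)
# given by five annotators to each of three sentences.
annotations = {
    "s1": [1, 1, 1, 1, 1],  # full agreement
    "s2": [1, 0, 1, 0, 1],  # strong disagreement
    "s3": [0, 0, 0, 1, 0],  # one dissenting opinion
}


def majority_label(labels):
    """Return the most frequent label among the annotators."""
    return Counter(labels).most_common(1)[0][0]


def excluded_fraction(labels):
    """Fraction of annotators whose label differs from the majority label."""
    majority = majority_label(labels)
    return sum(1 for label in labels if label != majority) / len(labels)


# For each sentence: the aggregated label and the share of opinions it discards.
summary = {
    sentence_id: (majority_label(labels), excluded_fraction(labels))
    for sentence_id, labels in annotations.items()
}

for sentence_id, (label, excluded) in summary.items():
    print(f"{sentence_id}: majority={label}, excluded opinions={excluded:.0%}")
```

In this toy setting, a model that matches the majority label perfectly on `s2` would still disagree with two of the five annotators, which is the kind of exclusion an evaluation dataset built only on aggregated labels cannot reveal.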

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data