Automating Data Annotation under Strategic Human Agents: Risks and Potential Solutions

Tian Xie; Xueru Zhang

arXiv:2405.08027 · cs.LG · October 14, 2024

TL;DR

This paper explores the long-term effects of using ML models to annotate data in social domains with strategic human agents, highlighting risks to fairness and proposing solutions for stable retraining.

Contribution

The paper formalizes the interaction between strategic agents and a model that is retrained on its own annotations, analyzes how agents and model evolve over repeated retraining rounds, and proposes a refined retraining process to mitigate instability and fairness issues.

Findings

01. Agents become increasingly likely to receive positive decisions as the model is retrained.

02. The proportion of agents whose label is actually positive may nevertheless decrease in the long run.

03. Enforcing fairness constraints at every retraining round may not benefit disadvantaged groups in the long run.

Abstract

As machine learning (ML) models are increasingly used in social domains to make consequential decisions about humans, they often have the power to reshape data distributions. Humans, as strategic agents, continuously adapt their behaviors in response to the learning system. As populations change dynamically, ML systems may need frequent updates to ensure high performance. However, acquiring high-quality human-annotated samples can be highly challenging and even infeasible in social domains. A common practice to address this issue is using the model itself to annotate unlabeled data samples. This paper investigates the long-term impacts of retraining ML models on model-annotated samples that incorporate human strategic responses. We first formalize the interactions between strategic agents and the model and then analyze how they evolve under such dynamic interactions. We find…
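The retraining loop described in the abstract can be illustrated with a toy simulation. This is only a minimal sketch under strong assumptions and is not the authors' formal model or their released code: agents have a single feature in [0, 1], the ground truth for the initial human-annotated data is assumed to be P(y = 1 | x) = x, the learner is a simple midpoint-of-class-means threshold, and agents move to the threshold whenever it is within a fixed effort budget. All names (`best_response`, `fit_threshold`, `simulate`, `budget`) are hypothetical.

```python
import random


def best_response(x, theta, budget):
    """Hypothetical agent response: move to the threshold if it is within reach."""
    if x < theta and theta - x <= budget:
        return theta
    return x


def fit_threshold(samples):
    """Stand-in learner: threshold at the midpoint between the two class means."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample from each class")
    return 0.5 * (sum(pos) / len(pos) + sum(neg) / len(neg))


def simulate(rounds=5, n=500, budget=0.1, seed=0):
    """Retrain on model-annotated samples for a few rounds; return per-round stats."""
    rng = random.Random(seed)

    # Round 0: human-annotated data under the assumed ground truth P(y=1|x) = x.
    train = []
    for _ in range(n):
        x = rng.random()
        train.append((x, int(rng.random() < x)))
    theta = fit_threshold(train)

    history = []
    for _ in range(rounds):
        raw = [rng.random() for _ in range(n)]            # features before adapting
        shifted = [best_response(x, theta, budget) for x in raw]
        decisions = [int(x >= theta) for x in shifted]    # the model annotates agents
        history.append({
            "theta": theta,
            "accept_rate": sum(decisions) / n,
            "accept_rate_without_response": sum(x >= theta for x in raw) / n,
            "train_size": len(train),
        })
        # Model-annotated samples are appended and the model is retrained.
        train.extend(zip(shifted, decisions))
        theta = fit_threshold(train)
    return history


if __name__ == "__main__":
    for t, stats in enumerate(simulate(), start=1):
        print(t, stats)
```

Each record returned by `simulate()` holds the threshold in force during that round, the acceptance rate after agents respond, and the acceptance rate the same agents would have had without responding. How the threshold drifts depends entirely on the assumed learner and response rule, so the numbers should not be read as a reproduction of the paper's results.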

Code & Models

Repositories

osu-srml/automating-data-annotation-under-strategic-human-agents
none · Official

Videos

Automating Data Annotation under Strategic Human Agents: Risks and Potential Solutions · slideslive

Taxonomy

Topics: Data Quality and Management · Semantic Web and Ontologies · Big Data and Business Intelligence