Back to the Future: On Potential Histories in NLP
Zeerak Talat, Anne Lauscher

TL;DR
This paper explores the political and historical biases in NLP datasets and models, proposing a perspective inspired by historical fiction to better represent marginalized communities and challenge hegemonic narratives.
Contribution
It introduces a novel approach of viewing datasets and models through the lens of historical fiction to surface marginalized histories in NLP.
Findings
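Current datasets and contemporary machine learning methods are biased towards dominant and hegemonic histories.
Surfacing marginalized histories, illustrated with the example of neopronouns, improves how models represent marginalized communities.
Viewing datasets and models through the lens of historical fiction offers a new perspective for addressing societal biases in NLP.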
Abstract
Machine learning and NLP require the construction of datasets to train and fine-tune models. In this context, previous work has demonstrated the sensitivity of these data sets. For instance, potential societal biases in this data are likely to be encoded and to be amplified in the models we deploy. In this work, we draw from developments in the field of history and take a novel perspective on these problems: considering datasets and models through the lens of historical fiction surfaces their political nature, and affords re-configuring how we view the past, such that marginalized discourses are surfaced. Building on such insights, we argue that contemporary methods for machine learning are prejudiced towards dominant and hegemonic histories. Employing the example of neopronouns, we show that by surfacing marginalized histories within contemporary conditions, we can create models that…
Taxonomy
Topics: Computational and Text Analysis Methods
