Analyzing the Limits of Self-Supervision in Handling Bias in Language
Lisa Bauer, Karthik Gopalakrishnan, Spandana Gella, Yang Liu, Mohit Bansal, Dilek Hakkani-Tur

TL;DR
This paper evaluates how well large language models, prompted with natural language task descriptions, capture four bias-related tasks (diagnosis, identification, extraction, and rephrasing), revealing varying capabilities and limitations.
Contribution
It provides a comprehensive analysis of the effectiveness and limitations of self-supervised prompting methods in detecting and managing bias in language models.
Findings
Models perform variably across bias dimensions like gender and politics.
Prompting efficacy depends on task description class and decoding method.
Current self-supervision objectives have notable limitations in bias-related tasks.
Abstract
Prompting inputs with natural language task descriptions has emerged as a popular mechanism to elicit reasonably accurate outputs from large-scale generative language models with little to no in-context supervision. This also helps gain insight into how well language models capture the semantics of a wide range of downstream tasks purely from self-supervised pre-training on massive corpora of unlabeled text. Such models have naturally also been exposed to a lot of undesirable content like racist and sexist language and there is limited work on awareness of models along these dimensions. In this paper, we define and comprehensively evaluate how well such language models capture the semantics of four tasks for bias: diagnosis, identification, extraction and rephrasing. We define three broad classes of task descriptions for these tasks: statement, question, and completion, with numerous…
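To make the three task description classes concrete, the following is a minimal sketch of how a bias diagnosis prompt might be phrased as a statement, a question, or a completion. The template wording, the `TEMPLATES` dictionary, and the `build_prompt` helper are all hypothetical illustrations, not the paper's actual prompts or code.

```python
# Hypothetical prompt templates for the bias diagnosis task, one per
# task description class named in the abstract. The wording is an
# assumption made for illustration; it is not taken from the paper.
TEMPLATES = {
    # Statement: the task is described as an instruction.
    "statement": (
        "Decide whether the following sentence is biased.\n"
        "Sentence: {text}\n"
        "Answer:"
    ),
    # Question: the task is posed as a direct question.
    "question": (
        "Sentence: {text}\n"
        "Is the sentence above biased?\n"
        "Answer:"
    ),
    # Completion: the model is asked to continue an unfinished sentence.
    "completion": (
        "Sentence: {text}\n"
        "The sentence above is"
    ),
}


def build_prompt(task_class: str, text: str) -> str:
    """Fill the template for the given task description class."""
    if task_class not in TEMPLATES:
        raise ValueError(f"unknown task description class: {task_class!r}")
    return TEMPLATES[task_class].format(text=text)


if __name__ == "__main__":
    example = "An example input sentence."
    for name in TEMPLATES:
        print(f"--- {name} ---")
        print(build_prompt(name, example))
```

Each resulting string would be passed to a generative language model, and the generated continuation read off as the model's answer. Keeping the input sentence fixed while varying only the template is one way to isolate the effect of the task description class.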
Taxonomy
Topics: Text Readability and Simplification · Topic Modeling · Natural Language Processing Techniques
