Expert-Guided Extinction of Toxic Tokens for Debiased Generation

Xueyao Sun; Kaize Shi; Haoran Tang; Guandong Xu; Qing Li

arXiv:2405.19299·cs.CL·May 30, 2024

TL;DR

This paper introduces EXPOSED, a method that reduces social bias in large language model outputs by suppressing toxic tokens, without requiring an extensive unbiased corpus or multi-round prompting, improving fairness in generated outputs.

Contribution

EXPOSED is a new expert-guided approach that constructs a debiasing expert from toxic data to suppress harmful tokens in LLM outputs, avoiding extensive fine-tuning or prompt engineering.

Findings

01. Significantly reduces social bias in LLM outputs
02. Balances fairness and generation quality effectively
03. Outperforms existing baselines on fairness benchmarks

Abstract

Large language models (LLMs) can exhibit social bias during generation, especially when performing inference on toxic prompts. Controlling sensitive attributes in generation faces challenges in data distribution, generalizability, and efficiency. Specifically, fine-tuning and retrieval demand an extensive unbiased corpus, while direct prompting requires meticulously curated instructions to correct the output over multiple rounds of thought, which poses challenges for memory and inference latency. In this work, we propose the Expert-Guided Extinction of Toxic Tokens for Debiased Generation (EXPOSED) to eliminate undesired harmful outputs from LLMs without the aforementioned requirements. EXPOSED constructs a debiasing expert based on the abundant toxic corpus to expose and elicit potentially dangerous tokens. It then processes the output to the LLMs and constructs a fair distribution…
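The abstract is truncated before it states how the expert's output is combined with the LLM's, so the exact EXPOSED formulation is not reproduced here. As a rough illustration of the general idea of expert-guided token suppression, the sketch below lowers the probability of tokens that a toxic "expert" rates as likely, then renormalizes. The subtraction rule, the `alpha` weight, the function names, and the toy vocabulary are all assumptions made for illustration, not the authors' method.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def suppress_toxic_tokens(base_logits, expert_logits, alpha=1.0):
    """Return a next-token distribution with expert-flagged tokens damped.

    Hypothetical rule (an assumption, not taken from the paper):
    adjusted = log p_base - alpha * log p_expert, followed by renormalization.
    Tokens the toxic expert considers likely lose probability mass.
    """
    if len(base_logits) != len(expert_logits):
        raise ValueError("both models must score the same vocabulary")
    base_lp = log_softmax(base_logits)
    expert_lp = log_softmax(expert_logits)
    adjusted = [b - alpha * e for b, e in zip(base_lp, expert_lp)]
    return [math.exp(x) for x in log_softmax(adjusted)]


# Toy three-token vocabulary; the last entry stands in for a harmful token.
vocab = ["kind", "neutral", "<toxic>"]
base_logits = [1.0, 1.0, 1.5]       # base LLM slightly prefers the toxic token
expert_logits = [-2.0, -1.0, 3.0]   # toxic expert strongly prefers it

before = [math.exp(x) for x in log_softmax(base_logits)]
after = suppress_toxic_tokens(base_logits, expert_logits, alpha=1.0)

for token, p0, p1 in zip(vocab, before, after):
    print(f"{token:10s} before={p0:.3f} after={p1:.3f}")
```

With `alpha=0` the function returns the base distribution unchanged; larger values suppress expert-flagged tokens more aggressively, which is one way a trade-off between fairness and generation quality could be exposed as a single knob.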
