Embedding Arithmetic: A Lightweight, Tuning-Free Framework for Post-hoc Bias Mitigation in Text-to-Image Models
Venkatesh Thirugnana Sambandham, Torsten Schön

TL;DR
This paper presents Embedding Arithmetic, a tuning-free inference method for mitigating social bias in text-to-image models, preserving prompt semantics and visual context while offering controllable bias reduction.
Contribution
It introduces a novel embedding arithmetic approach that corrects bias without altering model weights, datasets, or prompts, and proposes a new metric for semantic preservation.
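The summary on this page does not give the paper's exact formulation, so the following is only a minimal sketch of what embedding arithmetic with a controllable strength could look like. It assumes the correction is a linear shift of the prompt embedding along the difference between two attribute embeddings. The function name `shift_embedding`, the `strength` parameter, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import numpy as np

def shift_embedding(prompt_emb, source_emb, target_emb, strength=1.0):
    """Move prompt_emb along the (target - source) direction.

    Assumption: a plain linear shift. The paper's actual operation may differ.
    strength = 0.0 leaves the prompt embedding unchanged.
    """
    prompt_emb = np.asarray(prompt_emb, dtype=float)
    direction = np.asarray(target_emb, dtype=float) - np.asarray(source_emb, dtype=float)
    return prompt_emb + strength * direction

# Toy 4-dimensional vectors standing in for real text-encoder outputs.
prompt = np.array([0.2, 0.5, -0.1, 0.3])
attr_a = np.array([1.0, 0.0, 0.0, 0.0])  # hypothetical over-represented attribute
attr_b = np.array([0.0, 1.0, 0.0, 0.0])  # hypothetical under-represented attribute

unchanged = shift_embedding(prompt, attr_a, attr_b, strength=0.0)
shifted = shift_embedding(prompt, attr_a, attr_b, strength=0.5)
```

In this sketch the model weights, the prompt text, and the training data are never touched. Only the conditioning vector passed to the generator changes, and `strength` plays the role of the adjustable mitigation parameter.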
Findings
The conditional embedding space forms a complex, entangled manifold rather than a grid of disentangled concepts.
The method outperforms baselines in diversity and concept coherence.
The paper proposes the Concept Coherence Score (CCS) for robust evaluation of semantic preservation.
Abstract
Modern text-to-image (T2I) models amplify harmful societal biases, challenging their ethical deployment. We introduce an inference-time method that reliably mitigates social bias while keeping prompt semantics and visual context (background, layout, and style) intact. This ensures context persistency and provides a controllable parameter to adjust mitigation strength, giving practitioners fine-grained control over fairness-coherence trade-offs. Using Embedding Arithmetic, we analyze how bias is structured in the embedding space and correct it without altering model weights, prompts, or datasets. Experiments on FLUX 1.0-Dev and Stable Diffusion 3.5-Large show that the conditional embedding space forms a complex, entangled manifold rather than a grid of disentangled concepts. To rigorously assess semantic preservation beyond the circularity and bias limitations of CLIP scores, we…
