SEM: Sparse Embedding Modulation for Post-Hoc Debiasing of Vision-Language Models

Quentin Guimard; Federico Bartsch; Simone Caldarella; Rahaf Aljundi; Elisa Ricci; Massimiliano Mancini

arXiv:2603.19028 · cs.CV · March 20, 2026

Open Access

TL;DR

This paper introduces SEM, a post-hoc, zero-shot debiasing method for vision-language models such as CLIP. It works in a sparse autoencoder latent space, where bias-relevant features can be separated from query-relevant ones and modulated while semantic content is preserved.

Contribution

SEM is the first approach to operate in a sparse autoencoder space for post-hoc debiasing, enabling precise bias removal while maintaining semantic fidelity.

Findings

01. SEM improves fairness in retrieval tasks across multiple datasets.

02. SEM enhances zero-shot classification fairness without degrading accuracy.

03. Sparse latent representations are effective for debiasing vision-language models.

Abstract

Models that bridge vision and language, such as CLIP, are key components of multimodal AI, yet their large-scale, uncurated training data introduce severe social and spurious biases. Existing post-hoc debiasing methods often operate directly in the dense CLIP embedding space, where bias and task-relevant information are highly entangled. This entanglement limits their ability to remove bias without degrading semantic fidelity. In this work, we propose Sparse Embedding Modulation (SEM), a post-hoc, zero-shot debiasing framework that operates in a Sparse Autoencoder (SAE) latent space. By decomposing CLIP text embeddings into disentangled features, SEM identifies and modulates bias-relevant neurons while preserving query-relevant ones. This enables more precise, non-linear interventions. Across four benchmark datasets and two CLIP backbones, SEM achieves substantial fairness gains in…
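The abstract's core idea is to encode an embedding into a sparse latent space, modulate the bias-relevant latents, and decode. A minimal NumPy sketch of that idea follows. It is an illustration under stated assumptions, not the authors' implementation: the ReLU autoencoder, the random weights, the simple rescaling rule, the residual add-back, and all names (`sae_encode`, `sae_decode`, `modulate_embedding`, `bias_idx`, `scale`) are hypothetical. How SEM actually identifies bias-relevant neurons and what modulation it applies is not specified in the text above.

```python
import numpy as np

def sae_encode(x, W_enc, b_enc):
    # Assumed ReLU encoder: produces non-negative, mostly sparse activations.
    return np.maximum(0.0, x @ W_enc + b_enc)

def sae_decode(z, W_dec, b_dec):
    # Assumed linear decoder back to the embedding space.
    return z @ W_dec + b_dec

def modulate_embedding(x, W_enc, b_enc, W_dec, b_dec, bias_idx, scale=0.0):
    """Rescale selected latents (hypothetical 'bias' neurons) and decode."""
    z = sae_encode(x, W_enc, b_enc)
    z_mod = z.copy()
    z_mod[..., bias_idx] *= scale  # scale=0 suppresses, scale=1 is a no-op
    # Add back what the autoencoder failed to reconstruct, so content not
    # captured by the latents is left untouched (an assumption of this sketch).
    residual = x - sae_decode(z, W_dec, b_dec)
    x_mod = sae_decode(z_mod, W_dec, b_dec) + residual
    # Re-normalise, since CLIP-style embeddings are compared by cosine similarity.
    norm = np.linalg.norm(x_mod, axis=-1, keepdims=True)
    return x_mod / np.maximum(norm, 1e-12)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, m = 16, 64  # embedding and latent sizes, chosen arbitrarily
    W_enc, b_enc = rng.normal(size=(d, m)), np.zeros(m)
    W_dec, b_dec = rng.normal(size=(m, d)), np.zeros(d)
    text_emb = rng.normal(size=(4, d))  # stand-in for CLIP text embeddings
    debiased = modulate_embedding(text_emb, W_enc, b_enc, W_dec, b_dec,
                                  bias_idx=[3, 10, 42], scale=0.0)
    print(debiased.shape)
```

Because the edit happens after the ReLU, the change to the embedding depends on which latents are active for a given input. That is one way to read the abstract's "non-linear interventions", in contrast to a fixed linear projection applied in the dense embedding space.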

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications