# Safe-LLaVA: A Privacy-Preserving Vision-Language Dataset and Benchmark for Biometric Safety

**Authors:** Younggun Kim, Sirnam Swetha, Fazil Kagdi, Mubarak Shah

arXiv: 2509.00192 · 2025-10-08

## TL;DR

This paper introduces Safe-LLaVA, a privacy-preserving vision-language dataset, and PRISM, a benchmark to evaluate biometric leakage in multimodal models, addressing privacy concerns in sensitive applications.

## Contribution

The paper presents Safe-LLaVA, described as the first privacy-preserving MLLM training dataset, to mitigate biometric leakage, and PRISM, a benchmark to evaluate that leakage in vision-language models.

## Key findings

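- An audit of the widely used LLaVA datasets uncovers extensive biometric leakage across both pretraining and instruction data.
- Fine-tuning a model on Safe-LLaVA substantially reduces biometric leakage.
- PRISM assesses two behaviors, refusal of biometric-related queries and implicit biometric leakage in general responses, and reveals leakage across MLLMs for different attributes.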

## Abstract

Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in vision-language tasks. However, these models often infer and reveal sensitive biometric attributes such as race, gender, age, body weight, and eye color, even when such information is not explicitly requested. This raises critical concerns, particularly in real-world applications and socially sensitive domains. Despite increasing awareness, no publicly available dataset or benchmark exists to comprehensively evaluate or mitigate biometric leakage in MLLMs. To address this gap, we introduce PRISM (Privacy-aware Evaluation of Responses in Sensitive Modalities), a new benchmark designed to assess MLLMs on two fronts: (1) refusal of biometric-related queries and (2) implicit biometric leakage in general responses, while maintaining semantic faithfulness. Further, we conduct a detailed audit of the widely used LLaVA datasets and uncover extensive biometric leakage across pretraining and instruction data. To address this, we present the Safe-LLaVA dataset, the first privacy-preserving MLLM training dataset, constructed by systematically removing explicit and implicit biometric information from the LLaVA dataset. Our evaluations on PRISM reveal biometric leakage across MLLMs for different attributes, highlighting detailed privacy violations. We also fine-tune a model on the Safe-LLaVA dataset and show that it substantially reduces biometric leakage. Together, Safe-LLaVA and PRISM set a new standard for privacy-aligned development and evaluation of MLLMs.
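
To make the two evaluation fronts named in the abstract concrete, here is a minimal sketch of how a refusal rate and an implicit-leakage rate could be computed over a set of model responses. It is not the PRISM scoring procedure, which this summary does not describe. The keyword lists, the function names (`is_refusal`, `leaked_attributes`, `refusal_rate`, `leakage_rate`), and the simple string-matching heuristic are all illustrative assumptions.

```python
import re

# Hypothetical keyword lists for illustration only; the paper's actual
# attribute taxonomy and scoring method are not given in this summary.
BIOMETRIC_TERMS = {
    "gender": ["man", "woman", "male", "female", "boy", "girl"],
    "age": ["young", "old", "elderly", "teenage", "middle-aged"],
    "eye color": ["blue eyes", "brown eyes", "green eyes"],
}
REFUSAL_MARKERS = ["i cannot", "i can't", "i am unable", "i'm unable", "i won't"]


def is_refusal(response: str) -> bool:
    """Return True if the response contains an assumed refusal phrase."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def leaked_attributes(response: str) -> set:
    """Return the attribute categories whose keywords appear as whole words."""
    text = response.lower()
    found = set()
    for attribute, terms in BIOMETRIC_TERMS.items():
        if any(re.search(r"\b" + re.escape(t) + r"\b", text) for t in terms):
            found.add(attribute)
    return found


def refusal_rate(responses: list) -> float:
    """Fraction of responses to biometric queries that are refusals."""
    if not responses:
        return 0.0
    return sum(is_refusal(r) for r in responses) / len(responses)


def leakage_rate(responses: list) -> float:
    """Fraction of general responses that mention any biometric keyword."""
    if not responses:
        return 0.0
    return sum(bool(leaked_attributes(r)) for r in responses) / len(responses)
```

In this sketch, `refusal_rate` would be applied to answers given to direct biometric questions, and `leakage_rate` to answers given to general prompts such as image descriptions. Keyword matching is a crude proxy: it misses paraphrases and can flag benign uses of a word, so a real evaluation would likely need a stronger judge.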

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2509.00192/full.md

## Figures

16 figures with captions in the complete paper: https://tomesphere.com/paper/2509.00192/full.md

## References

61 references — full list in the complete paper: https://tomesphere.com/paper/2509.00192/full.md

---
Source: https://tomesphere.com/paper/2509.00192