LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models
Lukas Helff, Felix Friedrich, Manuel Brack, Kristian Kersting, Patrick Schramowski

TL;DR
LlavaGuard is an open framework that uses vision-language models to moderate large-scale visual datasets and model outputs against customizable safety policies, and it reports higher accuracy than existing safeguards.
Contribution
The paper introduces LlavaGuard, a comprehensive, open-source VLM-based safety framework with a new safety dataset, advanced augmentation techniques, and versatile models for safety evaluation.
Findings
LlavaGuard outperforms existing safeguards in accuracy.
Models effectively handle diverse safety policies.
Demonstrated success in dataset annotation and content moderation.
Abstract
This paper introduces LlavaGuard, a suite of VLM-based vision safeguards that address the critical need for reliable guardrails in the era of large-scale data and models. To this end, we establish a novel open framework, describing a customizable safety taxonomy, data preprocessing, augmentation, and training setup. For teaching a VLM safeguard on safety, we further create a multimodal safety dataset with high-quality human expert annotations, where each image is labeled with a safety rating, category, and rationale. We also employ advanced augmentations to support context-specific assessments. The resulting LlavaGuard models, ranging from 0.5B to 7B, serve as a versatile tool for evaluating the safety compliance of visual content against flexible policies. In comprehensive experiments, LlavaGuard outperforms both state-of-the-art safeguards and VLMs in accuracy and in flexibly handling…
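As a rough illustration of the annotation format the abstract describes (a safety rating, a category, and a rationale per image), the record could be modeled as below. This is only a sketch: the JSON layout, the field names, the `safe`/`unsafe` label set, and the helpers `parse_annotation` and `keep_safe` are assumptions made for illustration and are not taken from the paper or its released code.

```python
import json
from dataclasses import dataclass

# Hypothetical label set: the abstract only says each image has "a safety rating".
ALLOWED_RATINGS = {"safe", "unsafe"}
REQUIRED_FIELDS = {"rating", "category", "rationale"}


@dataclass(frozen=True)
class SafetyAnnotation:
    """One per-image record: rating, category, and free-text rationale."""
    rating: str
    category: str
    rationale: str


def parse_annotation(raw: str) -> SafetyAnnotation:
    """Parse a JSON string into a validated SafetyAnnotation (assumed schema)."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("annotation must be a JSON object")
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    # Normalize the rating so "Unsafe" and "unsafe" compare equal.
    rating = str(data["rating"]).strip().lower()
    if rating not in ALLOWED_RATINGS:
        raise ValueError(f"unexpected rating: {data['rating']!r}")
    return SafetyAnnotation(
        rating=rating,
        category=str(data["category"]).strip(),
        rationale=str(data["rationale"]).strip(),
    )


def keep_safe(annotations):
    """Return only the annotations rated safe, e.g. for dataset filtering."""
    return [a for a in annotations if a.rating == "safe"]
```

Keeping the rationale next to the rating means a filtered dataset can still be audited later, since every decision carries its stated reason.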
Code & Models
- 🤗 AIML-TUDA/LlavaGuard-v1.0-7B · model · 51 dl · ♡ 11
- 🤗 AIML-TUDA/LlavaGuard-v1.0-13B · model · 13 dl · ♡ 4
- 🤗 AIML-TUDA/LlavaGuard-v1.0-34B · model · 10 dl · ♡ 8
- 🤗 AIML-TUDA/LlavaGuard-v1.1-7B-hf · model · 56 dl · ♡ 4
- 🤗 AIML-TUDA/LlavaGuard-v1.0-13B-hf · model · 14 dl · ♡ 2
- 🤗 AIML-TUDA/LlavaGuard-v1.0-7B-hf · model · 53 dl · ♡ 4
- 🤗 AIML-TUDA/LlavaGuard-v1.1-13B-hf · model · 9 dl · ♡ 3
- 🤗 AIML-TUDA/LlavaGuard-v1.2-7B-OV · model · 110 dl · ♡ 3
- 🤗 AIML-TUDA/LlavaGuard-v1.2-7B-OV-hf · model · 1.2k dl · ♡ 5
- 🤗 AIML-TUDA/LlavaGuard-v1.2-0.5B-OV · model · 169 dl · ♡ 2
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Retinal Imaging and Analysis · Image and Object Detection Techniques
Methods: Attention Is All You Need · Softmax · Layer Normalization · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Multi-Head Attention · Position-Wise Feed-Forward Layer
