SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings

Weikai Lu; Hao Peng; Huiping Zhuang; Cen Chen; Ziqian Zeng

arXiv:2502.12562 · cs.CL · June 4, 2025

TL;DR

SEA introduces a method to enhance multimodal large language model security by synthesizing embeddings for additional modalities, enabling effective safety alignment with minimal resource requirements.

Contribution

The paper proposes Synthetic Embedding augmented safety Alignment (SEA), a novel approach that optimizes embeddings to facilitate multimodal safety alignment using only textual data.

Findings

1. SEA synthesizes a high-quality embedding within 24 seconds on a single RTX3090 GPU.

2. SEA significantly improves MLLM security against threats from additional modalities.

3. High attack success rates on the VA-SafetyBench benchmark confirm the security challenges facing MLLMs.

Abstract

Multimodal Large Language Models (MLLMs) have serious security vulnerabilities. While safety alignment using multimodal datasets consisting of text and data of additional modalities can effectively enhance MLLM security, these datasets are costly to construct. Existing low-resource security alignment methods, including textual alignment, have been found to struggle with the security risks posed by additional modalities. To address this, we propose Synthetic Embedding augmented safety Alignment (SEA), which optimizes embeddings of the additional modality through gradient updates to expand textual datasets. This enables multimodal safety alignment training even when only textual data is available. Extensive experiments on image, video, and audio-based MLLMs demonstrate that SEA can synthesize a high-quality embedding on a single RTX3090 GPU within 24 seconds. SEA significantly improves the…
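The core idea in the abstract, optimizing an input embedding through gradient updates while the model stays fixed, can be illustrated with a toy sketch. This is not the authors' implementation: the frozen "model" here is a hypothetical 3x4 linear layer, the objective is assumed to be cross-entropy toward a chosen target class, and the names `synthesize_embedding`, `target_probability`, and `FROZEN_WEIGHTS` are made up for illustration. A real MLLM would replace the linear layer, and the actual SEA objective may differ.

```python
import math

# Hypothetical frozen "model": one linear layer mapping a 4-dim embedding
# to 3 class logits. These weights are never updated.
FROZEN_WEIGHTS = [
    [1.0, 0.5, 0.0, -0.5],
    [-0.5, 1.0, 0.5, 0.0],
    [0.0, -0.5, 1.0, 0.5],
]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(weights, emb):
    """Class probabilities of the frozen linear model for one embedding."""
    logits = [sum(w * e for w, e in zip(row, emb)) for row in weights]
    return softmax(logits)


def synthesize_embedding(weights, target, steps=200, lr=0.5):
    """Gradient descent on the embedding only (assumed cross-entropy loss)."""
    dim = len(weights[0])
    emb = [0.0] * dim  # deterministic starting point
    for _ in range(steps):
        probs = forward(weights, emb)
        # d(loss)/d(logits) for cross-entropy is (probs - one_hot(target)).
        err = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
        # Chain rule through the linear layer: grad = W^T @ err.
        grad = [
            sum(err[i] * weights[i][j] for i in range(len(weights)))
            for j in range(dim)
        ]
        emb = [e - lr * g for e, g in zip(emb, grad)]
    return emb


def target_probability(weights, emb, target):
    """Probability the frozen model assigns to the target class."""
    return forward(weights, emb)[target]


emb = synthesize_embedding(FROZEN_WEIGHTS, target=2)
p = target_probability(FROZEN_WEIGHTS, emb, target=2)
```

After the loop, `p` is the probability the frozen toy model assigns to the target class given the synthesized embedding. The point of the design is that only `emb` changes, which is why such a procedure needs no data from the additional modality.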

Code & Models

Repositories

zeronlp/sea (PyTorch, official)

Videos

SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings · Underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques