Odysseus: Jailbreaking Commercial Multimodal LLM-integrated Systems via Dual Steganography

Songze Li; Jiameng Cheng; Yiming Li; Xiaojun Jia; Dacheng Tao

arXiv:2512.20168 · cs.CR · December 24, 2025

TL;DR

Odysseus uses dual steganography to covertly embed malicious content in images, bypassing the safety filters of commercial multimodal LLM-integrated systems and exposing a security vulnerability in them.

Contribution

This paper presents Odysseus, a jailbreak approach that uses dual steganography to evade the safety filters of multimodal LLM-integrated systems, revealing a critical security blind spot.

Findings

01. Achieves up to a 99% success rate in bypassing safety filters

02. Reveals the limitations of current defenses that rely on malicious content being explicitly visible

03. Demonstrates effectiveness across multiple real-world MLLM-integrated systems

Abstract

By integrating language understanding with perceptual modalities such as images, multimodal large language models (MLLMs) constitute a critical substrate for modern AI systems, particularly intelligent agents operating in open and interactive environments. However, their increasing accessibility also raises heightened risks of misuse, such as generating harmful or unsafe content. To mitigate these risks, alignment techniques are commonly applied to align model behavior with human values. Despite these efforts, recent studies have shown that jailbreak attacks can circumvent alignment and elicit unsafe outputs. Currently, most existing jailbreak methods are tailored for open-source models and exhibit limited effectiveness against commercial MLLM-integrated systems, which often employ additional filters. These filters can detect and prevent malicious input and output content, significantly…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Authorship Attribution and Profiling · Topic Modeling