Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language Models
Junjie Xiong, Changjia Zhu, Shuhang Lin, Chong Zhang, Yongfeng Zhang, Yao Liu, Lingyao Li

TL;DR
This paper uncovers security vulnerabilities in Large Language Models caused by malicious font injections in external web resources, enabling hidden prompts and data leaks that bypass safety measures.
Contribution
The paper systematically investigates font-based adversarial prompts in external resources and demonstrates that they can bypass LLM safety mechanisms.
Findings
Malicious font injections can bypass safety filters.
Hidden prompts in external resources can trigger sensitive data leakage through MCP-enabled tools.
Success varies with prompt design and data sensitivity.
Abstract
Large Language Models (LLMs) are increasingly equipped with real-time web search capabilities and integrated with protocols like the Model Context Protocol (MCP). This extension could introduce new security vulnerabilities. We present a systematic investigation of LLM vulnerabilities to hidden adversarial prompts through malicious font injection in external resources like webpages, where attackers manipulate code-to-glyph mapping to inject deceptive content that is invisible to users. We evaluate two critical attack scenarios: (1) "malicious content relay" and (2) "sensitive data leakage" through MCP-enabled tools. Our experiments reveal that indirect prompts with injected malicious fonts can bypass LLM safety mechanisms through external resources, achieving varying success rates based on data sensitivity and prompt design. Our research underscores the urgent need for enhanced security…
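The code-to-glyph mapping mentioned in the abstract can be pictured with a toy model: a font is treated as a lookup from each character code to the shape drawn on screen. The sketch below is only an illustration under that assumption. The names `render`, `diverges`, and `REMAPPED_FONT` are hypothetical and do not come from the paper, and a real font stores this mapping in binary tables rather than a Python dict. It shows why the text a model reads (the underlying characters) can differ from the text a person sees (the rendered glyphs), and how comparing the two flags a mismatch.

```python
# Toy model of code-to-glyph mapping. All names here are hypothetical
# illustrations, not the paper's implementation.

def render(text: str, font: dict) -> str:
    """Return what a viewer would see: each character drawn with the glyph
    the font assigns to it (unmapped characters render as themselves)."""
    return "".join(font.get(ch, ch) for ch in text)

def diverges(text: str, font: dict) -> bool:
    """True when the rendered text differs from the underlying characters."""
    return render(text, font) != text

# An ordinary font: every character renders as itself.
NORMAL_FONT = {}

# A remapped font (assumed example): 'a', 'b', 'c' are drawn as 'x', 'y', 'z'.
REMAPPED_FONT = {"a": "x", "b": "y", "c": "z"}

underlying = "abc"                           # what a text-based reader receives
displayed = render(underlying, REMAPPED_FONT)  # what a human sees on screen

print(underlying, "->", displayed)
print("mismatch:", diverges(underlying, REMAPPED_FONT))
```

A check of this kind, comparing rendered output against the raw characters, is one possible way to surface such a mismatch; whether it matches any defense the paper discusses is not stated in this summary.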
Taxonomy
Topics: Web Application Security Vulnerabilities · Spam and Phishing Detection · Adversarial Robustness in Machine Learning
