Impact of Non-Standard Unicode Characters on Security and Comprehension in Large Language Models
Johan S Daniel, Anand Pal

TL;DR
This paper investigates how non-standard Unicode characters affect the security and comprehension of large language models, revealing increased vulnerabilities and suggesting improvements in training data to mitigate risks.
Contribution
The paper provides a comparative analysis of fifteen models' vulnerabilities to Unicode-based manipulations, highlights the effect on safety mechanisms, and proposes including non-standard Unicode in training data.
Findings
Non-standard Unicode reduces guardrail effectiveness.
Models become more vulnerable to content policy breaches.
Inclusion of non-standard Unicode in training can improve model robustness.
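To make the findings above concrete, the sketch below shows one way a prompt can be rewritten in non-standard Unicode while looking almost identical to a human reader. This summary does not say which character sets the paper used, so the choice of the Mathematical Bold letters (starting at U+1D400) is an assumption, and the helper names to_math_bold and normalize_prompt are hypothetical. The NFKC step is a standard Unicode normalization shown only to illustrate that the styled text and the plain text are different code-point sequences; it is not presented as the paper's mitigation.

```python
import unicodedata

# Start of the bold Latin letters in the Mathematical Alphanumeric Symbols block.
# Assumption: the paper's exact character sets are not given in this summary.
BOLD_UPPER_A = 0x1D400  # MATHEMATICAL BOLD CAPITAL A
BOLD_LOWER_A = 0x1D41A  # MATHEMATICAL BOLD SMALL A


def to_math_bold(text: str) -> str:
    """Map ASCII letters to their Mathematical Bold look-alikes (hypothetical helper)."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(BOLD_UPPER_A + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(BOLD_LOWER_A + ord(ch) - ord("a")))
        else:
            out.append(ch)  # digits, spaces, punctuation are left unchanged
    return "".join(out)


def normalize_prompt(text: str) -> str:
    """Fold compatibility characters back to their plain forms with NFKC."""
    return unicodedata.normalize("NFKC", text)


styled = to_math_bold("Hello, world")
print(styled == "Hello, world")                    # the code points differ
print(normalize_prompt(styled) == "Hello, world")  # NFKC recovers the ASCII text
```

A tokenizer that has seen few such code points in training would likely split the styled string very differently from the plain one, which is one plausible reading of why guardrails tuned on ordinary text could be less effective.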
Abstract
The advancement of large language models has significantly improved natural language processing. However, jailbreaks (prompt injections that cause an LLM to follow instructions contrary to its intended use), hallucinations (generating incorrect or misleading information), and comprehension errors remain prevalent. In this report, we present a comparative analysis of fifteen distinct models, each given a standardized test of 38 queries and scored on three metrics: jailbreaks, hallucinations, and comprehension errors. Each model is assessed by the total number of occurrences of each. Our work exposes these models' inherent vulnerabilities and challenges the notion that they have human-level language comprehension. We have empirically analysed the impact of non-standard Unicode…
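The scoring described in the abstract (a per-model count of jailbreaks, hallucinations, and comprehension errors over a fixed query set) can be sketched as a simple tally. This is a minimal illustration under assumptions: the QueryResult record, the metric labels, and the tally function are hypothetical names, and how the authors actually labelled each response is not stated in this summary.

```python
from collections import Counter
from dataclasses import dataclass

# The three metrics named in the abstract; the string labels are assumptions.
METRICS = ("jailbreak", "hallucination", "comprehension_error")


@dataclass(frozen=True)
class QueryResult:
    """One labelled response (hypothetical record layout)."""
    model: str
    query_id: int
    failures: frozenset  # subset of METRICS observed for this response


def tally(results):
    """Return {model: Counter} with total occurrences of each metric."""
    totals = {}
    for r in results:
        # Start every model with explicit zeros so all three metrics are reported.
        counts = totals.setdefault(r.model, Counter({m: 0 for m in METRICS}))
        for failure in r.failures:
            if failure not in METRICS:
                raise ValueError(f"unknown metric: {failure}")
            counts[failure] += 1
    return totals
```

Because a single response can exhibit more than one failure, the sketch stores a set of labels per query instead of a single label; the per-model totals are then directly comparable as long as every model answers the same queries.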
Taxonomy
Topics: Advanced Malware Detection Techniques
Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Multi-Head Attention · Residual Connection · Byte Pair Encoding · Label Smoothing · Adam · Absolute Position Encodings · Dropout
