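Évaluation empirique de la sécurisation et de l'alignement de ChatGPT et Gemini : analyse comparative des vulnérabilités par expérimentations de jailbreaks (Empirical evaluation of the security and alignment of ChatGPT and Gemini: a comparative analysis of vulnerabilities through jailbreak experiments)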
Rafaël Nouailles (GdR)

TL;DR
This paper compares the security and alignment of ChatGPT and Gemini LLMs, analyzing their vulnerabilities to jailbreak attacks through experimental evaluations and proposing a taxonomy of jailbreak techniques.
Contribution
It provides a novel comparative analysis of two major LLMs' vulnerabilities and introduces a taxonomy of jailbreak methods based on experimental results.
Findings
Gemini shows different vulnerability patterns than ChatGPT.
Jailbreak techniques vary in effectiveness across models.
The study highlights specific security weaknesses in both models.
Abstract
Large language models (LLMs) are transforming digital usage, particularly in text generation, image creation, information retrieval and code development. ChatGPT, launched by OpenAI in November 2022, quickly became a reference, prompting the emergence of competitors such as Google's Gemini. However, these technological advances raise new cybersecurity challenges, including prompt injection attacks, the circumvention of regulatory measures (jailbreaking), the spread of misinformation (hallucinations) and risks associated with deepfakes. This paper presents a comparative analysis of the security and alignment levels of ChatGPT and Gemini, together with a taxonomy of jailbreak techniques supported by experiments.
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education
