REVEAL: Multi-turn Evaluation of Image-Input Harms for Vision LLM
Madhur Jindal, Saurabh Deshpande

TL;DR
The paper introduces REVEAL, a framework for evaluating multi-turn image-input harms in vision LLMs, and uses it to expose vulnerabilities and safety differences across models.
Contribution
It presents a scalable, automated evaluation pipeline specifically designed for multi-turn image-input safety assessment in vision LLMs, addressing limitations of prior single-turn, text-only frameworks.
Findings
Multi-turn interactions produce higher defect rates in VLLMs than single-turn evaluations.
GPT-4o has the best safety-usability balance among evaluated models.
Misinformation detection remains a critical challenge.
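The findings above are stated in terms of defect rates. This summary does not give the paper's exact metric definition, so the following is only a minimal sketch, assuming a defect rate is the share of judged conversations flagged as harmful for a given model and setting. The `JudgedConversation` record, the `defect_rate` helper, the model name, and the toy records are all hypothetical placeholders, not data or code from the paper.

```python
from dataclasses import dataclass


@dataclass
class JudgedConversation:
    """One evaluated conversation (hypothetical record layout)."""
    model: str
    setting: str   # "single_turn" or "multi_turn"
    harmful: bool  # verdict from some harm evaluator (assumed to exist)


def defect_rate(records, model, setting):
    """Fraction of a model's conversations in one setting judged harmful."""
    subset = [r for r in records if r.model == model and r.setting == setting]
    if not subset:
        raise ValueError(f"no records for {model!r} in setting {setting!r}")
    return sum(r.harmful for r in subset) / len(subset)


# Toy placeholder verdicts, NOT results from the paper.
records = [
    JudgedConversation("model_a", "single_turn", False),
    JudgedConversation("model_a", "single_turn", False),
    JudgedConversation("model_a", "single_turn", True),
    JudgedConversation("model_a", "single_turn", False),
    JudgedConversation("model_a", "multi_turn", True),
    JudgedConversation("model_a", "multi_turn", False),
    JudgedConversation("model_a", "multi_turn", True),
    JudgedConversation("model_a", "multi_turn", False),
]

single = defect_rate(records, "model_a", "single_turn")
multi = defect_rate(records, "model_a", "multi_turn")
print(f"single-turn: {single:.2f}, multi-turn: {multi:.2f}")
```

Computing the rate per model and per setting keeps the single-turn versus multi-turn comparison explicit; the paper's own pipeline may aggregate or weight verdicts differently.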
Abstract
Vision Large Language Models (VLLMs) represent a significant advancement in artificial intelligence by integrating image-processing capabilities with textual understanding, thereby enhancing user interactions and expanding application domains. However, their increased complexity introduces novel safety and ethical challenges, particularly in multi-modal and multi-turn conversations. Traditional safety evaluation frameworks, designed for text-based, single-turn interactions, are inadequate for addressing these complexities. To bridge this gap, we introduce the REVEAL (Responsible Evaluation of Vision-Enabled AI LLMs) Framework, a scalable and automated pipeline for evaluating image-input harms in VLLMs. REVEAL includes automated image mining, synthetic adversarial data generation, multi-turn conversational expansion using crescendo attack strategies, and comprehensive harm assessment…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Hate Speech and Cyberbullying Detection · Multimodal Machine Learning Applications
