Hijacking Context in Large Multi-modal Models
Joonhyun Jeong

TL;DR
This paper reveals a vulnerability in large multi-modal models where irrelevant or misleading contexts can bias their outputs, and proposes a pre-filtering approach using GPT-4V to mitigate this issue.
Contribution
The study introduces a novel pre-filtering method with GPT-4V to detect and remove irrelevant contexts, enhancing the robustness of multi-modal models against hijacked inputs.
Findings
Pre-filtering with GPT-4V reduces bias from irrelevant contexts.
Replacing hijacked contexts with correlated ones improves response coherence.
The pre-filtering approach builds on GPT-4V's robustness to distribution shift within the contexts.
Abstract
Recently, Large Multi-modal Models (LMMs) have demonstrated their ability to understand the visual contents of images given instructions regarding the images. Built upon Large Language Models (LLMs), LMMs also inherit their abilities and characteristics, such as in-context learning, where a coherent sequence of images and texts is given as the input prompt. However, we identify a new limitation of off-the-shelf LMMs: a small fraction of incoherent images or text descriptions misleads LMMs into generating biased output about the hijacked context rather than the originally intended context. To address this, we propose a pre-filtering method that removes irrelevant contexts via GPT-4V, based on its robustness towards distribution shift within the contexts. We further investigate whether replacing the hijacked visual and textual contexts with the correlated ones via GPT-4V and…
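The pre-filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `ContextItem`, `prefilter_contexts`, and `keyword_judge` are hypothetical names introduced here, and the toy keyword-overlap judge merely stands in for GPT-4V, which plays the relevance-judging role in the paper. The sketch assumes each in-context example is an image reference paired with a text description and that the judge returns a simple keep/drop decision.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ContextItem:
    """One in-context example: an image reference plus its text description."""
    image_id: str
    text: str


# Hypothetical judge signature: True when the item is coherent with the query.
# The paper uses GPT-4V for this role; any callable can be plugged in here.
RelevanceJudge = Callable[[ContextItem, str], bool]


def prefilter_contexts(
    contexts: List[ContextItem], query: str, judge: RelevanceJudge
) -> List[ContextItem]:
    """Keep only the items the judge marks as relevant, preserving their order."""
    return [item for item in contexts if judge(item, query)]


def keyword_judge(item: ContextItem, query: str) -> bool:
    """Toy stand-in for a multi-modal judge: relevant if the text shares a word with the query."""
    query_words = set(query.lower().split())
    return bool(query_words & set(item.text.lower().split()))


contexts = [
    ContextItem("img_001", "a dog running on the beach"),
    ContextItem("img_002", "a dog catching a frisbee"),
    ContextItem("img_003", "stock prices fell sharply today"),  # incoherent (hijacking) context
]
filtered = prefilter_contexts(contexts, "dog playing outdoors", keyword_judge)
print([item.image_id for item in filtered])
```

Passing the judge as a callable keeps the filtering logic independent of any particular model, so a real multi-modal judge could be swapped in without changing `prefilter_contexts`.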
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques
