TL;DR
This paper introduces a novel adversarial-guided dual-injection framework for embedding verifiable ownership triggers into multimodal large language models, aiding in intellectual property protection.
Contribution
It proposes a dual-injection method to generate ownership triggers that are effective in fine-tuned models and robust against model modifications.
Findings
The framework effectively embeds ownership triggers in MLLMs.
Triggers elicit ownership responses only in fine-tuned derivatives and remain inert in non-derivative models.
Enhanced robustness against model fine-tuning and domain shifts.
Abstract
With the rapid deployment of multimodal large language models (MLLMs), disputes regarding model ownership have become increasingly frequent, raising significant concerns about intellectual property protection. In this paper, we propose a framework for generating copyright triggers for MLLMs, enabling model publishers to embed verifiable ownership information into the model. The goal is to construct trigger images that elicit ownership-related textual responses exclusively in fine-tuned derivatives, while remaining inert in other non-derivative models. Our method constructs a tracking trigger image by treating the image as a learnable tensor, performing adversarial optimization with dual-injection of ownership-relevant semantic information. The first injection is achieved by enforcing textual consistency between the output of an auxiliary MLLM and a predefined ownership-relevant target…
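The idea of treating the trigger image as a learnable tensor and optimizing it adversarially toward a target output can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: a tiny linear classifier stands in for the auxiliary MLLM, a single target class stands in for the ownership-relevant target text, and the signed-gradient update with an `eps`-bounded projection is a common adversarial-optimization recipe, not necessarily the update rule used in the paper. Only a single injection is sketched, since the abstract is truncated before the second one is described.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def target_loss_and_grad(image, weights, target):
    """Cross-entropy of the target class and its gradient w.r.t. the image."""
    # Toy surrogate model (hypothetical): logits are a linear function of pixels.
    logits = [sum(w * x for w, x in zip(row, image)) for row in weights]
    probs = softmax(logits)
    loss = -math.log(probs[target])
    # d(loss)/d(logit_k) = p_k - 1[k == target]; chain through the linear layer.
    dlogits = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]
    grad = [sum(dlogits[k] * weights[k][i] for k in range(len(weights)))
            for i in range(len(image))]
    return loss, grad

def optimize_trigger(base, weights, target, eps=0.1, step=0.01, iters=200):
    """Signed-gradient descent on the image, kept within eps of the base image."""
    image = list(base)
    for _ in range(iters):
        _, grad = target_loss_and_grad(image, weights, target)
        for i, g in enumerate(grad):
            sign = (g > 0) - (g < 0)
            x = image[i] - step * sign                     # lower the target loss
            x = min(max(x, base[i] - eps), base[i] + eps)  # project into the eps-ball
            image[i] = min(max(x, 0.0), 1.0)               # keep a valid pixel range
    return image

# Toy setup: a 16-pixel "image" and a 4-class surrogate model (both hypothetical).
rng = random.Random(0)
base = [rng.random() for _ in range(16)]
weights = [[rng.uniform(-1, 1) for _ in range(16)] for _ in range(4)]
target = 2

loss_before, _ = target_loss_and_grad(base, weights, target)
trigger = optimize_trigger(base, weights, target)
loss_after, _ = target_loss_and_grad(trigger, weights, target)
print(f"target loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

In a realistic setting the surrogate would be replaced by an actual MLLM and the loss by a text-level objective over the target response, with gradients obtained through automatic differentiation rather than the hand-derived formula used here.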
