Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit

Qizhou Chen; Taolin Zhang; Chengyu Wang; Xiaofeng He; Dakan Wang; Tingting Liu

arXiv:2408.09916 · cs.CV · January 24, 2025

TL;DR

This paper introduces VisEdit, a method that corrects knowledge errors in vision-language models by editing their visual representations. The method builds on an attribution analysis of how visual representations affect predictions, and it outperforms existing approaches in benchmark evaluations.

Contribution

The paper applies model editing to VLLMs, an area the authors describe as not extensively studied, by developing VisEdit, which edits intermediate visual representations to correct knowledge without retraining.

Findings

01 VisEdit outperforms state-of-the-art baselines in VLLM editing tasks.

02 Visual representations in mid-to-later layers significantly influence predictions.

03 Attribution methods effectively identify regions relevant to knowledge edits.

Abstract

Model editing aims to correct outdated or erroneous knowledge in large models without costly retraining. Recent research discovered that the mid-layer representation of the subject's final token in a prompt has a strong influence on factual predictions, and developed Large Language Model (LLM) editing techniques based on this observation. However, for Vision-LLMs (VLLMs), how visual representations impact the predictions from a decoder-only language model remains largely unexplored. To the best of our knowledge, model editing for VLLMs has not been extensively studied in the literature. In this work, we employ the contribution allocation and noise perturbation methods to measure the contributions of visual representations for token predictions. Our attribution analysis shows that visual representations in mid-to-later layers that are highly relevant to the prompt contribute…

Code & Models

Repositories

qizhou000/visedit (PyTorch, official)

Videos

Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit · underline

Taxonomy

Topics: Semantic Web and Ontologies