VACoDe: Visual Augmented Contrastive Decoding

Sihyeon Kim; Boryeong Cho; Sangmin Bae; Sumyeong Ahn; Se-Young Yun

arXiv:2408.05337·cs.CV·August 13, 2024

TL;DR

VACoDe is a novel method that adaptively selects the most contrasting image augmentation to improve the accuracy of vision-language models without extra training or external data.

Contribution

The paper introduces VACoDe, which uses multiple image augmentations and a softmax distance metric to enhance contrastive decoding in vision-language models.
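The idea of picking the most contrasting augmentation can be illustrated with a small sketch. This is not the authors' implementation: the summary only mentions "a softmax distance metric", so the L2 distance between softmax outputs, the `(1 + alpha) * original - alpha * contrast` logit combination, the `alpha` value, and the augmentation names `flip` and `noise` are all assumptions made for illustration. The logits are toy values standing in for a model's next-token scores.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def l2_distance(p, q):
    """Euclidean distance between two probability vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def select_augmentation(orig_logits, aug_logits):
    """Return the name of the augmentation whose softmax output lies
    farthest from the softmax output for the original image.

    aug_logits maps a (hypothetical) augmentation name to the logits
    produced when the model sees that augmented image.
    """
    p = softmax(orig_logits)
    return max(
        aug_logits,
        key=lambda name: l2_distance(p, softmax(aug_logits[name])),
    )


def contrastive_logits(orig_logits, contrast_logits, alpha=1.0):
    """Assumed contrastive-decoding combination: boost the original
    logits and subtract the logits from the contrasting input."""
    return [
        (1 + alpha) * o - alpha * c
        for o, c in zip(orig_logits, contrast_logits)
    ]


# Toy next-token logits over a 3-token vocabulary.
orig = [2.0, 1.0, 0.0]
augs = {
    "flip": [2.0, 1.0, 0.0],   # identical output, so no contrast
    "noise": [0.0, 1.0, 2.0],  # reversed preferences, so high contrast
}

chosen = select_augmentation(orig, augs)
adjusted = contrastive_logits(orig, augs[chosen])
```

In this toy case the selection step picks `noise`, because `flip` leaves the output distribution unchanged and therefore offers nothing to contrast against. A real decoder would repeat this at each generation step with logits from an actual vision-language model.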

Findings

1. Outperforms previous contrastive decoding methods.
2. Improves output quality across various vision-language tasks.
3. Applies universally, with no additional training or external data required.

Abstract

Despite the astonishing performance of recent Large Vision-Language Models (LVLMs), these models often generate inaccurate responses. To address this issue, previous studies have focused on mitigating hallucinations by employing contrastive decoding (CD) with augmented images, which amplifies the contrast with the original image. However, these methods have limitations, including reliance on a single augmentation, which is restrictive for certain tasks, as well as the high cost of using external knowledge. In this study, we address these limitations by exploring how to utilize multiple image augmentations. Through extensive experiments, we observed that different augmentations produce varying levels of contrast depending on the task. Based on this observation, we introduce a novel method called VACoDe, Visual Augmented Contrastive Decoding. This method adaptively selects the…

Taxonomy

Topics: Data Visualization and Analytics

Methods: Softmax