Automated Multi-level Preference for MLLMs

Mengxi Zhang; Wenhao Wu; Yu Lu; Yuxin Song; Kang Rong; Huanjin Yao; Jianbo Zhao; Fanglong Liu; Yifan Sun; Haocheng Feng; Jingdong Wang

arXiv:2405.11165 · cs.CV · May 30, 2024

TL;DR

This paper introduces the Automated Multi-level Preference (AMP) framework, which trains multimodal large language models (MLLMs) on automatically generated multi-level preferences using a new preference optimization algorithm, with the aim of reducing hallucinations and improving response quality.

Contribution

The paper proposes a multi-level preference learning approach for MLLMs, together with an automated dataset pipeline and a direct preference optimization algorithm.

Findings

01 Improved MLLM performance on hallucination benchmarks.

02 Effective reduction of hallucinations in multimodal responses.

03 Better discernment of subtle differences between responses.

Abstract

Current multimodal Large Language Models (MLLMs) suffer from "hallucination", occasionally generating responses that are not grounded in the input images. To tackle this challenge, one promising path is to utilize reinforcement learning from human feedback (RLHF), which steers MLLMs towards learning superior responses while avoiding inferior ones. We rethink the common practice of using binary preferences (i.e., superior, inferior), and find that adopting multi-level preferences (e.g., superior, medium, inferior) is better for two benefits: 1) It narrows the gap between adjacent levels, thereby encouraging MLLMs to discern subtle differences. 2) It further integrates cross-level comparisons (beyond adjacent-level comparisons), thus providing a broader range of comparisons with hallucination examples. To verify our viewpoint, we present the Automated Multi-level Preference (AMP)…
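The distinction the abstract draws between adjacent-level and cross-level comparisons can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `preference_pairs` and the representation of responses as a best-to-worst list are assumptions made purely for illustration.

```python
from itertools import combinations


def preference_pairs(ranked_responses):
    """Split all (better, worse) pairs of a ranked list into two groups.

    `ranked_responses` is assumed to be ordered from best to worst.
    This helper is hypothetical and only illustrates the idea of
    adjacent-level versus cross-level comparisons.
    """
    adjacent, cross = [], []
    for i, j in combinations(range(len(ranked_responses)), 2):
        pair = (ranked_responses[i], ranked_responses[j])
        if j - i == 1:
            adjacent.append(pair)  # neighbouring levels: small quality gap
        else:
            cross.append(pair)     # levels further apart: larger quality gap
    return adjacent, cross


levels = ["superior", "medium", "inferior"]
adjacent, cross = preference_pairs(levels)
print(adjacent)  # [('superior', 'medium'), ('medium', 'inferior')]
print(cross)     # [('superior', 'inferior')]
```

With binary preferences only the single (superior, inferior) pair exists. With three levels the same ranking yields two adjacent pairs plus one cross-level pair, and in general K levels yield K(K-1)/2 pairs, of which K-1 are adjacent.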

Code & Models

Repositories

takomc/amp (PyTorch, official)

Videos

Automated Multi-level Preference for MLLMs · SlidesLive

Taxonomy

Topics: Speech and dialogue systems · Semantic Web and Ontologies · Natural Language Processing Techniques