Value-Spectrum: Quantifying Preferences of Vision-Language Models via Value Decomposition in Social Media Contexts
Jingxuan Li, Yuning Yang, Shengqi Yang, Linfan Zhang, Ying Nian Wu

TL;DR
This paper introduces Value-Spectrum, a benchmark that evaluates the value preferences of vision-language models (VLMs) by posing value-based questions over a database of more than 50,000 short social media videos.
Contribution
It presents a novel VQA benchmark based on Schwartz's value dimensions and a pipeline for simulating video browsing with diverse social media content.
Findings
VLMs show significant variation in handling value-oriented content.
VLMs can adopt specific personas when prompted.
Value-Spectrum effectively tracks VLM preferences in value-based tasks.
Abstract
Recent progress in Vision-Language Models (VLMs) has broadened the scope of multimodal applications. However, evaluations often remain limited to functional tasks, neglecting abstract dimensions such as personality traits and human values. To address this gap, we introduce Value-Spectrum, a novel Visual Question Answering (VQA) benchmark aimed at assessing VLMs based on Schwartz's value dimensions, which capture core human values guiding people's preferences and actions. We design a VLM agent pipeline to simulate video browsing and construct a vector database comprising over 50,000 short videos from TikTok, YouTube Shorts, and Instagram Reels. These videos span multiple months and cover diverse topics, including family, health, hobbies, society, and technology. Benchmarking on Value-Spectrum highlights notable variations in how VLMs handle value-oriented content. Beyond identifying…
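The abstract does not say how per-value preferences are scored, so the following is only a minimal sketch under stated assumptions: each video is tagged with one of Schwartz's ten basic values, and the VLM gives a like/dislike answer per video. The function name `preference_rates`, the `(value, liked)` response format, and the like-rate metric are all hypothetical illustrations, not the authors' method.

```python
from collections import defaultdict

# Schwartz's ten basic values. The benchmark's exact labels may differ;
# this list is taken from Schwartz's theory, not from the paper's code.
SCHWARTZ_VALUES = [
    "Self-Direction", "Stimulation", "Hedonism", "Achievement", "Power",
    "Security", "Conformity", "Tradition", "Benevolence", "Universalism",
]


def preference_rates(responses):
    """Return the fraction of 'liked' answers per value dimension.

    `responses` is an iterable of (value, liked) pairs, where `value` is
    one of SCHWARTZ_VALUES and `liked` is a bool parsed from a model's
    answer. This format is an assumption made for illustration.
    """
    counts = defaultdict(lambda: [0, 0])  # value -> [liked count, total count]
    for value, liked in responses:
        if value not in SCHWARTZ_VALUES:
            raise ValueError(f"unknown value dimension: {value!r}")
        counts[value][0] += int(liked)
        counts[value][1] += 1
    return {value: liked / total for value, (liked, total) in counts.items()}


if __name__ == "__main__":
    # Hypothetical answers for three videos.
    demo = [("Hedonism", True), ("Hedonism", False), ("Security", True)]
    print(preference_rates(demo))
```

Comparing the resulting per-value rates across models is one simple way to surface the kind of variation the abstract describes; values with no responses are simply absent from the result.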
Taxonomy
Topics: Visual Attention and Saliency Detection
Methods: Sparse Evolutionary Training · ADaptive gradient method with the OPTimal convergence rate · Focus
