Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models
Yongsheng Yu, Jiebo Luo

TL;DR
This paper investigates using large multimodal models for demographic inference, introducing a benchmark and a Chain-of-Thought prompting method to improve zero-shot performance and interpretability in diverse, real-world scenarios.
Contribution
It introduces a new benchmark for demographic inference with LMMs and proposes a Chain-of-Thought prompting approach to enhance their accuracy and interpretability.
Findings
LMMs show advantages in zero-shot learning and interpretability, but are prone to off-target predictions.
Chain-of-Thought prompting reduces off-target predictions.
LMMs handle uncurated 'in-the-wild' inputs effectively.
Abstract
Conventional demographic inference methods have predominantly operated under the supervision of accurately labeled data, yet struggle to adapt to shifting social landscapes and diverse cultural contexts, leading to narrow specialization and limited accuracy in applications. Recently, the emergence of large multimodal models (LMMs) has shown transformative potential across various research tasks, such as visual comprehension and description. In this study, we explore the application of LMMs to demographic inference and introduce a benchmark for both quantitative and qualitative evaluation. Our findings indicate that LMMs possess advantages in zero-shot learning, interpretability, and handling uncurated 'in-the-wild' inputs, albeit with a propensity for off-target predictions. To enhance LMM performance and achieve comparability with supervised learning baselines, we propose a…
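The excerpt above does not reproduce the authors' prompt, so the following is only a minimal sketch of what a chain-of-thought prompt with an off-target check might look like. The function names `build_cot_prompt` and `parse_answer`, the three-step wording, and the age-group labels are all hypothetical illustrations, not the paper's method. No model is called; the code only builds the prompt text and checks whether a reply stays inside the allowed label set.

```python
from typing import Optional, Sequence


def build_cot_prompt(attribute: str, labels: Sequence[str]) -> str:
    """Compose a step-by-step prompt that ends with a constrained answer line.

    Hypothetical wording: the paper's actual prompt is not shown in the source.
    """
    options = ", ".join(labels)
    return (
        f"Task: estimate the {attribute} of the person in the image.\n"
        "Step 1: Describe the visible cues relevant to this attribute.\n"
        "Step 2: Explain how those cues support or rule out each option.\n"
        f"Step 3: Finish with one line 'Answer: <label>' using only: {options}."
    )


def parse_answer(response: str, labels: Sequence[str]) -> Optional[str]:
    """Return the matched label, or None when the reply is off-target."""
    lookup = {label.lower(): label for label in labels}
    # Scan from the end so the final 'Answer:' line takes precedence.
    for line in reversed(response.strip().splitlines()):
        stripped = line.strip()
        if stripped.lower().startswith("answer:"):
            candidate = stripped.split(":", 1)[1].strip().strip(".").lower()
            return lookup.get(candidate)
    return None


# Illustrative label set (an assumption, not taken from the paper).
AGE_GROUPS = ["0-17", "18-39", "40-64", "65+"]
prompt = build_cot_prompt("age group", AGE_GROUPS)
```

A reply whose final line is `Answer: 18-39.` would parse to `18-39`, while a reply that never states an allowed label returns `None`, which a caller could count as an off-target prediction or use to trigger a retry.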
Taxonomy
Topics: Mental Health Research Topics
