Mobilizing Waldo: Evaluating Multimodal AI for Public Mobilization

Manuel Cebrian, Petter Holme, and Niccolo Pescetelli

arXiv:2412.14210·cs.HC·December 20, 2024

Open Access

TL;DR

This paper introduces a framework that uses "Where's Waldo?" images to ethically evaluate multimodal LLMs in social influence and mobilization scenarios. It finds that the models generate creative strategies but struggle to identify individuals and assess social dynamics.

Contribution

It presents a controlled, replicable testing environment for assessing multimodal LLMs' social understanding and strategic capabilities in public mobilization contexts.

Findings

01. Models generate creative strategies and vivid descriptions.
02. Models struggle to accurately identify individuals.
03. Models cannot reliably assess social dynamics.

Abstract

Advancements in multimodal Large Language Models (LLMs), such as OpenAI's GPT-4o, offer significant potential for mediating human interactions across various contexts. However, their use in areas such as persuasion, influence, and recruitment raises ethical and security concerns. To evaluate these models ethically in public influence and persuasion scenarios, we developed a prompting strategy using "Where's Waldo?" images as proxies for complex, crowded gatherings. This approach provides a controlled, replicable environment to assess the model's ability to process intricate visual information, interpret social dynamics, and propose engagement strategies while avoiding privacy concerns. By positioning Waldo as a hypothetical agent tasked with face-to-face mobilization, we analyzed the model's performance in identifying key individuals and formulating mobilization tactics. Our results…

Taxonomy

Topics: Smart Cities and Technologies