Evaluating Theory of (an uncertain) Mind: Predicting the Uncertain Beliefs of Others in Conversation Forecasting

Anthony Sicilia; Malihe Alikhani

arXiv:2409.14986·cs.CL·September 24, 2024

TL;DR

This paper introduces new tasks for language models to predict and quantify the uncertainty of others' beliefs in dialogue, emphasizing the complexity of modeling uncertain mental states.

Contribution

It presents a suite of conversation forecasting tasks that challenge LMs to predict the uncertain beliefs of interlocutors, and it experiments with re-scaling methods, variance reduction strategies, and demographic context.

Findings

01 LMs explain up to 7% of the variance in the uncertainty of others.

02 The tasks remain difficult for current LMs, leaving room for improvement.

03 Experiments cover three dialogue corpora (social, negotiation, task-oriented) and eight LMs.

Abstract

Typically, when evaluating Theory of Mind, we consider the beliefs of others to be binary: held or not held. But what if someone is unsure about their own beliefs? How can we quantify this uncertainty? We propose a new suite of tasks, challenging language models (LMs) to model the uncertainty of others in dialogue. We design these tasks around conversation forecasting, wherein an agent forecasts an unobserved outcome to a conversation. Uniquely, we view interlocutors themselves as forecasters, asking an LM to predict the uncertainty of the interlocutors (a probability). We experiment with re-scaling methods, variance reduction strategies, and demographic context, for this regression task, conducting experiments on three dialogue corpora (social, negotiation, task-oriented) with eight LMs. While LMs can explain up to 7% variance in the uncertainty of others, we highlight the difficulty…
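The "variance explained" figure in the abstract can be read as a coefficient of determination, R^2 = 1 - SS_res / SS_tot, computed between the uncertainty an interlocutor reports and the uncertainty the LM predicts for them. The sketch below is a minimal illustration under that assumption; the abstract does not state the exact metric, and the function name and sample values are hypothetical, not taken from the paper.

```python
from typing import Sequence


def variance_explained(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Return R^2 = 1 - SS_res / SS_tot for paired probability estimates.

    `actual` holds the probabilities interlocutors assign to an outcome;
    `predicted` holds the LM's guesses of those probabilities.
    (Assumption: the paper's "variance explained" is this standard R^2.)
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    mean_actual = sum(actual) / len(actual)
    # Total variation of the interlocutors' uncertainty around its mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    # Variation left unexplained by the LM's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))

    if ss_tot == 0:
        raise ValueError("actual values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    interlocutor_beliefs = [0.9, 0.4, 0.7, 0.2]
    lm_predictions = [0.8, 0.5, 0.6, 0.4]
    print(f"R^2 = {variance_explained(interlocutor_beliefs, lm_predictions):.3f}")
```

Under this reading, a model that always predicts the mean belief scores 0, a perfect predictor scores 1, and a score of 0.07 corresponds to the 7% reported above.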

Taxonomy

Topics: Forecasting Techniques and Applications