Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models
Nunzio Lore, Sepehr Ilami, Babak Heydari

TL;DR
This paper demonstrates that fine-tuning smaller language models with data from larger models significantly enhances their strategic Theory of Mind capabilities, reducing performance gaps and improving out-of-sample generalization in social and game-theoretic contexts.
Contribution
The authors introduce a fine-tuning approach that transfers strategic social reasoning from large to small language models, improving their performance in Theory of Mind tasks.
Findings
Averaged across games, fine-tuning improved the small model's alignment with the larger model's behavior by 46%, where 100% would mean indistinguishable behavior.
Performance improvements extended to out-of-sample scenarios.
Fine-tuning reduced the gap between small and large model behavior.
Abstract
As the performance of larger, newer Large Language Models continues to improve for strategic Theory of Mind (ToM) tasks, the demand for these state-of-the-art models increases commensurately. However, their deployment is costly both in terms of processing power and time. In this paper, we investigate the feasibility of creating smaller, highly-performing specialized algorithms by way of fine-tuning. To do this, we first present a large pre-trained model with 20 unique scenarios that combine different social contexts with games of varying social dilemmas, record its answers, and use them for Q&A fine-tuning on a smaller model of the same family. Our focus is on in-context game-theoretic decision-making, the same domain within which human interaction occurs and that requires both a theory of mind (or a semblance thereof) and an understanding of social dynamics. The smaller model is…
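The data-collection step described in the abstract (cross social contexts with games, query the large model, record its answers as Q&A pairs) could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the context and game names, the prompt wording, the 5 x 4 split used to reach 20 scenarios, and the `stub_large_model` function are all assumptions made for the example.

```python
import json
from itertools import product

# Hypothetical labels: the abstract does not name the contexts or the games,
# and the 5 x 4 split is only one way to arrive at 20 scenarios.
CONTEXTS = ["business", "diplomacy", "friendship", "team", "neutral"]
GAMES = ["prisoners_dilemma", "stag_hunt", "snowdrift", "harmony"]


def build_prompts(contexts, games):
    """Cross every social context with every game: one prompt per scenario."""
    return [
        {
            "context": c,
            "game": g,
            "prompt": f"Context: {c}. Game: {g}. Which action do you choose, and why?",
        }
        for c, g in product(contexts, games)
    ]


def to_qa_records(prompts, ask_large_model):
    """Record the large model's answer to each prompt as a Q&A pair."""
    return [
        {"question": p["prompt"], "answer": ask_large_model(p["prompt"])}
        for p in prompts
    ]


def stub_large_model(prompt):
    # Placeholder standing in for a real call to the large model.
    return "cooperate"


prompts = build_prompts(CONTEXTS, GAMES)
records = to_qa_records(prompts, stub_large_model)

# One JSON object per line, a common layout for fine-tuning datasets.
jsonl = "\n".join(json.dumps(r) for r in records)
```

In practice `stub_large_model` would be replaced by a call to the larger model, and the resulting JSONL file would feed whatever fine-tuning tool is used for the smaller model.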
Taxonomy
Topics: Topic Modeling
Methods: Focus
