TeamLLM: Exploring the Capabilities of LLMs for Multimodal Group Interaction Prediction
Diana Romero, Xin Gao, Daniel Khalkhali, Salma Elmalaki

TL;DR
This paper explores the use of Large Language Models (LLMs) for predicting group coordination in collaborative Mixed Reality environments from multimodal sensor data, reporting large gains over LSTM baselines on linguistically grounded behaviors and clear limits on spatial and visual attention tasks.
Contribution
It introduces a hierarchical encoding of multimodal sensor data as natural language and evaluates LLM adaptation methods for group behavior prediction, establishing new benchmarks and guidelines.
Findings
LLMs outperform LSTM baselines by 3.2× in linguistically grounded behavior prediction.
Fine-tuning achieves 96% accuracy in conversation prediction with sub-35 ms latency.
Text-based LLMs succeed in turn-taking prediction but struggle with spatial and visual attention tasks.
Abstract
Predicting group behavior, how individuals coordinate, communicate, and interact during collaborative tasks, is essential for designing systems that can support team performance through real-time prediction and realistic simulation of collaborative scenarios. Large Language Models (LLMs) have shown promise for processing sensor data for human-activity recognition (HAR), yet their capabilities for team dynamics or group-level multimodal sensing remain unexplored. This paper investigates whether LLMs can predict group coordination patterns from multimodal sensor data in collaborative Mixed Reality (MR) environments. We encode hierarchical context -- individual behavioral profiles, group structural properties, and temporal activity context -- as natural language and evaluate three LLM adaptation paradigms (zero-shot, few-shot, and supervised fine-tuning) against statistical baselines. Our…
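As a rough illustration of what encoding hierarchical context as natural language could look like, the sketch below renders the three levels named in the abstract (individual profiles, group structure, temporal activity context) into a text prompt. It is not the authors' implementation: the `MemberProfile` fields, the sentence templates, and the `build_prompt` helper are all hypothetical, chosen only to show how a zero-shot prompt (no examples) and a few-shot prompt (labeled examples prepended) might differ.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class MemberProfile:
    # Hypothetical per-person features; the paper's actual feature set is not shown here.
    name: str
    speaking_ratio: float  # share of the window spent talking, in [0, 1]
    gaze_target: str       # what the person mostly looked at


def encode_individual(p: MemberProfile) -> str:
    """Render one person's sensor summary as a sentence."""
    return (f"{p.name} spoke {p.speaking_ratio:.0%} of the time "
            f"and mostly looked at {p.gaze_target}.")


def encode_context(members: List[MemberProfile],
                   group_summary: str,
                   activity: str) -> str:
    """Combine the three context levels into one text block."""
    lines = ["Individual profiles:"]
    lines += [f"- {encode_individual(m)}" for m in members]
    lines.append(f"Group structure: {group_summary}")
    lines.append(f"Current activity: {activity}")
    return "\n".join(lines)


def build_prompt(context: str,
                 question: str,
                 examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Zero-shot when `examples` is empty; few-shot otherwise."""
    parts = [f"{ex_context}\nQuestion: {question}\nAnswer: {ex_answer}"
             for ex_context, ex_answer in examples]
    # The final block is left open for the model to complete.
    parts.append(f"{context}\nQuestion: {question}\nAnswer:")
    return "\n\n".join(parts)


members = [
    MemberProfile("P1", 0.4, "the shared model"),
    MemberProfile("P2", 0.1, "P1"),
]
ctx = encode_context(members,
                     "two members standing close together",
                     "assembling the model")
zero_shot = build_prompt(ctx, "Who speaks next?")
few_shot = build_prompt(ctx, "Who speaks next?", examples=[(ctx, "P2")])
```

Supervised fine-tuning would reuse the same text encoding, with the open `Answer:` slot filled by the ground-truth label during training.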
