M-CALLM: Multi-level Context Aware LLM Framework for Group Interaction Prediction
Diana Romero, Xin Gao, Daniel Khalkhali, Salma Elmalaki

TL;DR
This paper introduces M-CALLM, a hierarchical LLM framework that turns multimodal sensor data into multi-level context to predict group interactions in collaborative mixed reality, reaching 96% accuracy on conversation prediction at sub-35ms latency.
Contribution
The paper presents M-CALLM, a novel framework that encodes multimodal sensor data into hierarchical context for LLM-based group interaction prediction, surpassing statistical models in accuracy.
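As a rough illustration of what encoding sensor-derived features as hierarchical natural-language context might look like, here is a minimal Python sketch. It is not the authors' implementation: the class names, the feature fields (`speaking_ratio`, `gaze_target`, `dominant_speaker`), and the prompt wording are all hypothetical, chosen only to mirror the three levels named in the abstract (individual, group, temporal).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndividualProfile:
    # All fields are hypothetical stand-ins for sensor-derived features.
    participant_id: str
    speaking_ratio: float  # fraction of the window spent talking
    gaze_target: str       # what the participant mostly looked at


@dataclass
class GroupContext:
    # Hypothetical group-level summary.
    size: int
    dominant_speaker: str


def build_context(profiles: List[IndividualProfile],
                  group: GroupContext,
                  recent_events: List[str]) -> str:
    """Render individual, group, and temporal context as one text prompt."""
    lines = ["Individual level:"]
    for p in profiles:
        lines.append(
            f"- {p.participant_id} spoke {p.speaking_ratio:.0%} of the window "
            f"and mostly looked at {p.gaze_target}."
        )
    lines.append("Group level:")
    lines.append(
        f"- {group.size} participants; {group.dominant_speaker} "
        "is the most active speaker."
    )
    lines.append("Temporal level:")
    for event in recent_events:
        lines.append(f"- {event}")
    lines.append("Question: Will the group be in conversation in the next window?")
    return "\n".join(lines)


prompt = build_context(
    [IndividualProfile("P1", 0.5, "the shared object"),
     IndividualProfile("P2", 0.25, "P1")],
    GroupContext(size=2, dominant_speaker="P1"),
    ["Previous window: conversation", "Two windows ago: silence"],
)
print(prompt)
```

The resulting string could serve as the input for zero-shot prompting, as the query in a few-shot prompt, or as a training example for supervised fine-tuning; which features and phrasing the paper actually uses is not stated in this summary.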
Findings
Achieves 96% accuracy in conversation prediction, 3.2x better than LSTM baselines.
Maintains sub-35ms latency for real-time prediction.
Shows an 83% performance degradation in simulation mode, exposing a limitation of autoregressive forecasting.
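The two evaluation modes differ in what history the predictor sees. The Python sketch below is a toy illustration under stated assumptions, not the paper's evaluation code: `repeat_last` is a hypothetical stand-in for an LLM call, and the function names and `warmup` parameter are invented for the example. In intervention mode every step conditions on the observed history; in simulation mode the model's own predictions are fed back in, so an early mistake can carry forward.

```python
from typing import Callable, List, Sequence

# A predictor maps the history so far to the next label (hypothetical interface).
Predictor = Callable[[Sequence[str]], str]


def intervention_rollout(predict: Predictor,
                         observed: Sequence[str],
                         warmup: int) -> List[str]:
    """Real-time style: each prediction conditions on ground-truth history."""
    return [predict(observed[:t]) for t in range(warmup, len(observed))]


def simulation_rollout(predict: Predictor,
                       observed: Sequence[str],
                       warmup: int) -> List[str]:
    """Autoregressive style: after warmup, predictions are fed back as history."""
    history = list(observed[:warmup])
    predictions = []
    for _ in range(warmup, len(observed)):
        next_label = predict(history)
        predictions.append(next_label)
        history.append(next_label)  # the model never sees later ground truth
    return predictions


def repeat_last(history: Sequence[str]) -> str:
    """Toy predictor standing in for an LLM: repeat the most recent label."""
    return history[-1]


observed = ["talk", "talk", "silent", "talk"]
intervention = intervention_rollout(repeat_last, observed, warmup=1)
simulation = simulation_rollout(repeat_last, observed, warmup=1)
print(intervention)
print(simulation)
```

With this toy predictor, the intervention rollout picks up the observed switch to "silent" one step late, while the simulation rollout never sees it at all. This only shows the structural difference between the modes; it makes no claim about why the paper measures the degradation it reports.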
Abstract
This paper explores how large language models can leverage multi-level contextual information to predict group coordination patterns in collaborative mixed reality environments. We demonstrate that encoding individual behavioral profiles, group structural properties, and temporal dynamics as natural language enables LLMs to break through the performance ceiling of statistical models. We build M-CALLM, a framework that transforms multimodal sensor streams into hierarchical context for LLM-based prediction, and evaluate three paradigms (zero-shot prompting, few-shot learning, and supervised fine-tuning) against statistical baselines across intervention mode (real-time prediction) and simulation mode (autoregressive forecasting). Head-to-head comparison on 16 groups (64 participants, ~25 hours) demonstrates that context-aware LLMs achieve 96% accuracy for conversation prediction, a 3.2x…
Taxonomy
Topics: Multimodal Machine Learning Applications · Mobile Crowdsensing and Crowdsourcing · Social Robot Interaction and HRI
