Assessing the feasibility of Large Language Models for detecting micro-behaviors in team interactions during space missions

Ankush Raut; Projna Paromita; Sydney Begerowski; Suzanne Bell; Theodora Chaspari

arXiv:2506.22679·cs.CL·July 1, 2025

Open Access

TL;DR

This study evaluates the potential of large language models to detect micro-behaviors in team conversations during simulated space missions, highlighting the superior performance of instruction fine-tuned decoder-only models over encoder-only models.

Contribution

The paper shows that instruction fine-tuned decoder-only LLMs outperform encoder-only models at identifying subtle micro-behaviors in team dialogue.

Findings

01

The instruction fine-tuned, decoder-only Llama-3.1 achieved a macro F1-score of 44% for 3-way classification.

02

Encoder-only models like RoBERTa struggled with underrepresented micro-behaviors.

03

Fine-tuning improves micro-behavior detection in team communication analysis.

Abstract

We explore the feasibility of large language models (LLMs) in detecting subtle expressions of micro-behaviors in team conversations using transcripts collected during simulated space missions. Specifically, we examine zero-shot classification, fine-tuning, and paraphrase-augmented fine-tuning with encoder-only sequence classification LLMs, as well as few-shot text generation with decoder-only causal language modeling LLMs, to predict the micro-behavior associated with each conversational turn (i.e., dialogue). Our findings indicate that encoder-only LLMs, such as RoBERTa and DistilBERT, struggled to detect underrepresented micro-behaviors, particularly discouraging speech, even with weighted fine-tuning. In contrast, the instruction fine-tuned version of Llama-3.1, a decoder-only LLM, demonstrated superior performance, with the best models achieving macro F1-scores of 44% for 3-way…
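The abstract reports macro F1-scores and notes that underrepresented micro-behaviors were hard to detect. As a minimal sketch of why that metric is sensitive to rare classes, the snippet below computes macro F1 by hand on a toy example. The label names, the class proportions, and the helper names `macro_f1` and `accuracy` are hypothetical illustrations, not the paper's label set, data, or evaluation code.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores (each class counts equally)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of turns whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical 3-way label set and an imbalanced toy sample (assumptions).
LABELS = ["neutral", "encouraging", "discouraging"]
y_true = ["neutral"] * 8 + ["encouraging"] + ["discouraging"]

# A degenerate classifier that always predicts the majority class.
y_pred = ["neutral"] * 10

print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", round(macro_f1(y_true, y_pred, LABELS), 3))
```

In this toy case the majority-class predictor looks reasonable on accuracy but scores poorly on macro F1, because the two rare classes each contribute an F1 of zero to the average. That is the usual motivation for reporting macro F1 on imbalanced label sets.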

Taxonomy

Topics: Spaceflight effects on biology · Neurobiology of Language and Bilingualism · Language Development and Disorders