# A Multimodal Adaptive Framework for Social Interaction with the MiRo-E Robot

**Authors:** Yufeng Yang, Pei Shan Yap, Sobanawartiny Wijeakumar, Aly Magassouba, Nikhil Deshpande

PMC · DOI: 10.3390/s26041209 · 2026-02-12

## TL;DR

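This paper introduces an adaptive interaction framework for the MiRo-E robot that combines real-time user engagement estimation with a fine-tuned large language model, so that the robot's speech and body language stay consistent and interactions feel more natural and engaging.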

## Contribution

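The novel contribution is an adaptive framework that couples a coordinated nonverbal interaction system, based on real-time emotion expression, with a fine-tuned large language model for verbal communication, implemented on the zoomorphic MiRo-E platform.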

## Key findings

- Adapting interactions based on user engagement significantly improves user experience.
- The MiRo-E platform effectively integrates verbal and nonverbal communication for social HRI.
- The framework enhances task completion rates, engagement, and perceived naturalness in user studies.

## Abstract

This study explores how robots can interact with people in a more engaging way. By combining real-time user engagement estimation with advanced language models, the system allows robots to respond consistently through both speech and body language. Tests show that this approach makes interactions feel more natural while improving user engagement and task success.

What are the main findings?

- Adapting the interaction based on user engagement significantly enhances user experience.
- The MiRo-E social HRI platform lends itself well to integrating verbal and nonverbal HRI.

What are the implications of the main findings?

- Enhancing perceived naturalness is an important goal in social human–robot interaction.
- Generative AI and multimodality offer a credible pathway to achieving this goal.

Adaptivity is a key component of social human–robot interaction (HRI) towards achieving more natural and human-like interactions. Current interactive systems tend to rely on preset and repetitive verbal communication and isolated nonverbal interactions, which results in unappealing engagement. This study proposes an integrated framework that combines a coordinated nonverbal interaction system based on real-time emotion expression with a fine-tuned large language model-based verbal communication system, resulting in more engaging and context-aware interaction. The design utilises the MiRo-E as the zoomorphic social interaction platform, with the aim of enhancing the consistency across verbal and nonverbal modalities and improving user engagement through adaptive and emotionally aligned responses. To evaluate the effectiveness of the approach, a user study was conducted with tasks designed to assess user engagement, task performance, and the perceived naturalness of interaction. Task performance metrics and subjective questionnaire responses indicate that the framework significantly enhances user experience, improving task completion rates, engagement, and perceived naturalness.
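To make the idea of engagement-driven, cross-modal consistency concrete, the following is a minimal sketch and not the authors' implementation. It assumes an engagement score in the range 0 to 1 is already available, and it derives both the nonverbal behaviour and the language-model prompt from one shared emotion label. The thresholds, emotion labels, gesture names, and the `plan_response` function are all hypothetical; the abstract does not specify any of them, and no real MiRo-E or LLM API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InteractionPlan:
    emotion: str     # emotion label shared by both modalities
    gesture: str     # nonverbal behaviour the robot would perform
    llm_prompt: str  # prompt that would be passed to a language model


# Hypothetical policy: (minimum engagement, emotion, gesture), highest first.
# These values are illustrative assumptions, not taken from the paper.
POLICY = [
    (0.66, "happy", "wag_tail"),
    (0.33, "curious", "tilt_head"),
    (0.0, "concerned", "approach_user"),
]


def plan_response(engagement: float, user_utterance: str) -> InteractionPlan:
    """Pick one emotion from the engagement score and reuse it for both
    the gesture and the verbal prompt, keeping the modalities aligned."""
    score = min(max(engagement, 0.0), 1.0)  # clamp to [0, 1]
    for threshold, emotion, gesture in POLICY:
        if score >= threshold:
            break  # the final 0.0 threshold always matches
    prompt = (
        f"Reply in a {emotion} tone to the user, "
        f"who said: {user_utterance!r}"
    )
    return InteractionPlan(emotion, gesture, prompt)
```

Choosing the emotion once and feeding it to both outputs is one simple way to avoid the mismatch between speech and body language that the abstract identifies in systems with isolated nonverbal behaviour.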

## Full-text entities

- **Diseases:** anxiety (MESH:D001007), injury to (MESH:D014947), dementia (MESH:D003704)
- **Species:** Mus musculus (house mouse, species) [taxon 10090], Homo sapiens (human, species) [taxon 9606], Canis lupus familiaris (dog, subspecies) [taxon 9615], Oryctolagus cuniculus (domestic rabbit, species) [taxon 9986]

## Figures

17 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12944184/full.md

---
Source: https://tomesphere.com/paper/PMC12944184