# A Multimodal Agentic AI Framework for Intuitive Human–Robot Collaboration

**Authors:** Xiaoyun Liang, Jiannan Cai

PMC · DOI: 10.3390/s26061958 · Sensors (Basel, Switzerland) · 2026-03-20

## TL;DR

This paper introduces a multimodal agentic AI framework that makes human–robot collaboration more intuitive by combining natural user interfaces (speech and gaze) with AI agents built on large language models.

## Contribution

The contribution is a multimodal agentic AI framework that integrates natural user interfaces with LLM-based task reasoning for intuitive human–robot collaboration, demonstrated on a mobile manipulation robot.

## Key findings

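- The framework lets users instruct robots in spoken plain language, combined with gaze to indicate objects precisely.
- The system offloads robot motion planning from the user by understanding context and reasoning about task decomposition.
- A proof-of-concept study with seven participants on a human–robot wood assembly task measured completion time, intervention rate, and NASA TLX workload, and summarized qualitative insights on practical applications.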

## Abstract

Widespread acceptance of collaborative robots in human-involved scenarios requires accessible and intuitive interfaces for lay workers and non-expert users. Existing interfaces often rely on users to plan and issue low-level commands, necessitating extensive knowledge of robot control. This study proposes a multimodal agentic AI framework integrating natural user interfaces (NUIs) to foster effortless human-like partnerships in human–robot collaboration (HRC), enhancing intuitiveness and operational efficiency. First, it allows users to instruct robots verbally in plain language, coupled with gaze to indicate objects precisely. Second, it offloads users’ workload for robot motion planning by understanding context and reasoning about task decomposition. Third, coordinating with AI agents built on large language models (LLMs), the system interprets users’ requests effectively and provides feedback to establish transparent communication. This proof-of-concept study included experiments to demonstrate a practical implementation of the agentic AI framework on a mobile manipulation robot in the collaborative task of human–robot wood assembly. Seven participants were recruited to interact with this AI-integrated agentic robotic system. Task performance and user experience were measured in terms of completion time, intervention rate, and the NASA TLX workload survey, and insights on practical applications were summarized through a qualitative analysis. This study highlights the potential of NUIs and agentic AI-embodied robots to overcome existing HRC barriers and contributes to improving HRC intuitiveness and efficiency.
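To make the speech-plus-gaze idea in the abstract concrete, here is a minimal sketch of how a spoken request might be grounded to an object through gaze and then decomposed into robot steps. It is not the authors' implementation. The names `SceneObject`, `resolve_gaze_target`, and `plan_steps`, the 2D coordinates, the distance threshold, and the step strings are all hypothetical. A simple keyword match stands in for the LLM agents described in the paper.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SceneObject:
    name: str
    position: Tuple[float, float]  # hypothetical 2D workspace coordinates (metres)


def resolve_gaze_target(
    gaze_point: Tuple[float, float],
    objects: List[SceneObject],
    max_distance: float = 0.15,  # assumed tolerance, not taken from the paper
) -> Optional[SceneObject]:
    """Return the object nearest the gaze point, or None if none is close enough."""
    if not objects:
        return None
    nearest = min(objects, key=lambda o: math.dist(gaze_point, o.position))
    if math.dist(gaze_point, nearest.position) > max_distance:
        return None
    return nearest


def plan_steps(
    utterance: str,
    gaze_point: Tuple[float, float],
    objects: List[SceneObject],
) -> List[str]:
    """Ground the utterance with gaze, then expand it into primitive steps.

    The keyword matching below is a stand-in for LLM-based task reasoning.
    """
    target = resolve_gaze_target(gaze_point, objects)
    if target is None:
        # Feedback to the user instead of guessing the referent.
        return ["ask_user('Which object do you mean?')"]

    text = utterance.lower()
    if "hold" in text:
        return [
            f"move_to({target.name})",
            f"grasp({target.name})",
            "hold_steady()",
        ]
    if "bring" in text or "hand" in text:
        return [
            f"move_to({target.name})",
            f"grasp({target.name})",
            "move_to(user)",
            f"release({target.name})",
        ]
    return [f"ask_user('What should I do with {target.name}?')"]
```

For example, with a `short_plank` at `(0.40, 0.10)` and a `long_plank` at `(0.90, 0.30)`, calling `plan_steps("Bring me that one", (0.42, 0.12), objects)` resolves the gaze to `short_plank` and returns a four-step fetch sequence. If the gaze point is far from every object, the function returns a clarification request, a simplified version of the feedback loop the abstract mentions.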

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13030326/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13030326/full.md

## References

45 references — full list in the complete paper: https://tomesphere.com/paper/PMC13030326/full.md

---
Source: https://tomesphere.com/paper/PMC13030326