Open-Ended Goal Inference through Actions and Language for Human-Robot Collaboration

Debasmita Ghose; Oz Gitelson; Marynel Vazquez; Brian Scassellati

arXiv:2512.04453·cs.RO·December 5, 2025

TL;DR

This paper introduces BALI, a goal inference method for human-robot collaboration that combines language and action cues, asks clarifying questions only when the expected information gain outweighs the cost of interruption, and produces more stable goal predictions in collaborative cooking tasks.

Contribution

BALI is a novel approach that integrates language and action cues for goal inference, allowing robots to handle unbounded and novel goals in collaborative tasks.

Findings

01. BALI achieves more stable goal predictions than baselines.

02. BALI makes significantly fewer mistakes in goal inference.

03. BALI effectively combines language and action cues for improved inference.

Abstract

To collaborate with humans, robots must infer goals that are often ambiguous, difficult to articulate, or not drawn from a fixed set. Prior approaches restrict inference to a predefined goal set, rely only on observed actions, or depend exclusively on explicit instructions, making them brittle in real-world interactions. We present BALI (Bidirectional Action-Language Inference) for goal prediction, a method that integrates natural language preferences with observed human actions in a receding-horizon planning tree. BALI combines language and action cues from the human, asks clarifying questions only when the expected information gain from the answer outweighs the cost of interruption, and selects supportive actions that align with inferred goals. We evaluate the approach in collaborative cooking tasks, where goals may be novel to the robot and unbounded. Compared to baselines, BALI…
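
The question-asking rule in the abstract (ask only when the expected information gain from the answer outweighs the cost of interruption) can be illustrated with a small sketch. This is not the authors' implementation: it assumes a small finite set of candidate goals, a known answer model, entropy measured in bits, and an interruption cost expressed in the same units. The paper itself targets goals that are not drawn from a fixed set, and all names here (`entropy`, `posterior`, `expected_information_gain`, `should_ask`) are hypothetical.

```python
import math


def entropy(belief):
    """Shannon entropy in bits of a dict mapping goal -> probability."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)


def posterior(belief, likelihood):
    """Bayes update; likelihood maps goal -> P(observation | goal)."""
    unnorm = {g: p * likelihood[g] for g, p in belief.items()}
    total = sum(unnorm.values())
    return {g: v / total for g, v in unnorm.items()}


def expected_information_gain(belief, answer_model):
    """Expected entropy reduction from asking one question.

    answer_model maps each possible answer -> {goal: P(answer | goal)}.
    """
    gain = entropy(belief)
    for likelihood in answer_model.values():
        p_answer = sum(belief[g] * likelihood[g] for g in belief)
        if p_answer > 0:
            gain -= p_answer * entropy(posterior(belief, likelihood))
    return gain


def should_ask(belief, answer_model, interruption_cost):
    """Ask only if the expected gain exceeds the (assumed) cost in bits."""
    return expected_information_gain(belief, answer_model) > interruption_cost


# Hypothetical question "Are you making a salad?" with a perfectly
# informative yes/no answer.
answer_model = {
    "yes": {"salad": 1.0, "soup": 0.0},
    "no": {"salad": 0.0, "soup": 1.0},
}
uncertain = {"salad": 0.5, "soup": 0.5}
confident = {"salad": 0.99, "soup": 0.01}

print(should_ask(uncertain, answer_model, interruption_cost=0.5))
print(should_ask(confident, answer_model, interruption_cost=0.5))
```

With the uncertain belief the question is worth a full bit, so it clears the assumed 0.5-bit cost. With the confident belief the answer is nearly predictable, so the sketch stays silent. The same `posterior` update could fold in action observations by supplying a likelihood of the observed action under each goal.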

Taxonomy

Topics: Multimodal Machine Learning Applications · Social Robot Interaction and HRI · Robot Manipulation and Learning