# Accurately and Efficiently Interpreting Human-Robot Instructions of Varying Granularities

**Authors:** Dilip Arumugam, Siddharth Karamcheti, Nakul Gopalan, Lawson L.S. Wong, and Stefanie Tellex

arXiv: 1704.06616 · 2018-06-20

## TL;DR

This paper presents a hierarchical language grounding method enabling robots to interpret and execute human commands at varying levels of abstraction, improving accuracy and efficiency in task planning and execution.

## Contribution

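The work grounds natural language commands to all the tasks and subtasks available in a hierarchical planning framework. The resulting model infers the abstraction level of a command while grounding it, and planning over the hierarchy reduces the robot's response time.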

## Key findings

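- Responds to a command within one second on 90% of the tasks, while baselines take over twenty seconds on half the tasks
- Improves grounding accuracy by simultaneously inferring the abstraction level of the language used
- Demonstrates a real, physical robot grounding commands at multiple levels of abstraction and planning different subtasks within the same hierarchy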

## Abstract

Humans can ground natural language commands to tasks at both abstract and fine-grained levels of specificity. For instance, a human forklift operator can be instructed to perform a high-level action, like "grab a pallet" or a low-level action like "tilt back a little bit." While robots are also capable of grounding language commands to tasks, previous methods implicitly assume that all commands and tasks reside at a single, fixed level of abstraction. Additionally, methods that do not use multiple levels of abstraction encounter inefficient planning and execution times as they solve tasks at a single level of abstraction with large, intractable state-action spaces closely resembling real world complexity. In this work, by grounding commands to all the tasks or subtasks available in a hierarchical planning framework, we arrive at a model capable of interpreting language at multiple levels of specificity ranging from coarse to more granular. We show that the accuracy of the grounding procedure is improved when simultaneously inferring the degree of abstraction in language used to communicate the task. Leveraging hierarchy also improves efficiency: our proposed approach enables a robot to respond to a command within one second on 90% of our tasks, while baselines take over twenty seconds on half the tasks. Finally, we demonstrate that a real, physical robot can ground commands at multiple levels of abstraction allowing it to efficiently plan different subtasks within the same planning hierarchy.
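The idea of choosing an abstraction level and a task together can be illustrated with a small sketch. This is not the paper's model: the hierarchy, the task names, the keyword lists, and the overlap-counting scorer are all hypothetical stand-ins for a learned grounding model, chosen only so the example runs on its own. It scores every task at every level of a toy hierarchy and returns the best (level, task) pair.

```python
from typing import Dict, List, Tuple

# Hypothetical hierarchy: abstraction level -> candidate tasks, each described
# by a few illustrative keywords. None of these names come from the paper.
HIERARCHY: Dict[str, Dict[str, List[str]]] = {
    "high": {
        "grab_pallet": ["grab", "pallet", "pick"],
        "deliver_pallet": ["deliver", "pallet", "take"],
    },
    "low": {
        "tilt_back": ["tilt", "back", "little"],
        "move_forward": ["move", "forward", "drive"],
    },
}


def tokenize(command: str) -> List[str]:
    """Lowercase the command and strip basic punctuation from each token."""
    return [tok.strip(".,!?\"'") for tok in command.lower().split()]


def score(tokens: List[str], keywords: List[str]) -> int:
    """Toy stand-in for a learned grounding model: count keyword overlap."""
    return sum(1 for tok in tokens if tok in keywords)


def ground(command: str) -> Tuple[str, str]:
    """Return the (level, task) pair with the highest score over all levels."""
    tokens = tokenize(command)
    candidates = [
        (score(tokens, keywords), level, task)
        for level, tasks in HIERARCHY.items()
        for task, keywords in tasks.items()
    ]
    # Ties resolve to the first candidate in iteration order.
    _, level, task = max(candidates, key=lambda c: c[0])
    return level, task


print(ground("grab a pallet"))
print(ground("tilt back a little bit"))
```

Because the search runs over every level at once, the abstraction level falls out of the same decision as the task, rather than being fixed in advance. A planner could then work only on the subtask at the selected level, which is the intuition behind the efficiency claim in the abstract.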

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1704.06616/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1704.06616/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/1704.06616/full.md

---
Source: https://tomesphere.com/paper/1704.06616