Multi-Level Compositional Reasoning for Interactive Instruction Following

Suvaansh Bhambri; Byeonghwi Kim; Jonghyun Choi

arXiv:2308.09387 · cs.RO · March 14, 2024

TL;DR

This paper introduces a multi-level reasoning framework for robotic agents to perform complex domestic tasks by breaking down instructions into subgoals, improving efficiency without relying on rule-based planning.

Contribution

It presents the MCR-Agent, a three-level policy system that infers subgoals, controls navigation, and executes manipulation, advancing interactive instruction understanding.

Findings

01. Achieves a 2.03% improvement in path-length-weighted success rate (PLWSR), an efficiency metric, on unseen tasks.

02. Generates human-interpretable subgoals for complex tasks.

03. Does not rely on rule-based planning or semantic spatial memory.
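
The PLWSR figure is easier to interpret with the metric spelled out. The following is a minimal sketch, assuming PLWSR follows the path-length-weighted score used by the ALFRED benchmark: a success score scaled by the ratio of the expert's path length to the longer of the expert's and the agent's path lengths. The function name and the example numbers are illustrative and do not come from the paper; how per-episode scores are aggregated into the reported figure is not covered here.

```python
def path_length_weighted(success: float, expert_len: int, agent_len: int) -> float:
    """Scale a per-episode success score by path efficiency.

    Assumed definition: success * L_expert / max(L_expert, L_agent).
    An agent that takes twice as many steps as the expert keeps
    only half of its success score.
    """
    if expert_len <= 0:
        raise ValueError("expert_len must be positive")
    return success * expert_len / max(expert_len, agent_len)


# Hypothetical episodes: (success, expert path length, agent path length)
efficient = path_length_weighted(1.0, expert_len=10, agent_len=10)
wasteful = path_length_weighted(1.0, expert_len=10, agent_len=20)
failed = path_length_weighted(0.0, expert_len=10, agent_len=12)
```

Under this definition, an agent that succeeds but wanders is penalized, while a shorter-than-expert path is capped at the plain success score.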

Abstract

Robotic agents that perform domestic chores from natural language directives must master the complex job of navigating an environment and interacting with the objects in it. The tasks given to the agents are often composite and therefore challenging, as completing them requires reasoning about multiple subtasks, e.g., bring a cup of coffee. To address the challenge, we propose to divide and conquer it by breaking the task into multiple subgoals and attending to them individually for better navigation and interaction. We call it the Multi-level Compositional Reasoning Agent (MCR-Agent). Specifically, we learn a three-level action policy. At the highest level, we infer a sequence of human-interpretable subgoals to be executed based on language instructions by a high-level policy composition controller. At the middle level, we discriminatively control the agent's navigation by a master…
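
The three-level structure described in the abstract can be pictured as a simple dispatch loop. This is a toy sketch only, not the authors' implementation: every name here (`Subgoal`, `compose_subgoals`, `master_policy`, the action strings) is hypothetical, the learned policies are replaced by hard-coded stubs, and the assumption that the middle level chooses between a navigation routine and an interaction routine is an illustrative reading of the truncated text.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Subgoal:
    kind: str    # hypothetical label, e.g. "navigate", "pickup", "put"
    target: str  # object the subgoal refers to


def compose_subgoals(instruction: str) -> List[Subgoal]:
    """Top level: stand-in for a learned subgoal predictor (hard-coded here)."""
    return [
        Subgoal("navigate", "mug"),
        Subgoal("pickup", "mug"),
        Subgoal("navigate", "coffeemachine"),
        Subgoal("put", "coffeemachine"),
    ]


def navigation_policy(sg: Subgoal) -> List[str]:
    """Stub for a navigation routine; a real one would emit many movement steps."""
    return [f"MoveTowards({sg.target})"]


def interaction_policy(sg: Subgoal) -> List[str]:
    """Bottom level: stub for a manipulation routine."""
    return [f"{sg.kind.capitalize()}({sg.target})"]


def master_policy(sg: Subgoal) -> Callable[[Subgoal], List[str]]:
    """Middle level: pick which routine handles the current subgoal."""
    return navigation_policy if sg.kind == "navigate" else interaction_policy


def run(instruction: str) -> List[str]:
    actions: List[str] = []
    for sg in compose_subgoals(instruction):
        actions.extend(master_policy(sg)(sg))
    return actions


plan = run("bring a cup of coffee")
```

The point of the sketch is the separation of concerns: the subgoal list stays readable by a person, and each subgoal is handed to a routine specialized for it.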

Code & Models

Repositories

yonseivnl/mcr-agent
PyTorch · Official

Videos

Multi-level Compositional Reasoning for Interactive Instruction Following· underline

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling