Learning to Reason and Navigate: Parameter Efficient Action Planning with Large Language Models
Bahram Mohammadi, Ehsan Abbasnejad, Yuankai Qi, Qi Wu, Anton Van Den Hengel, and Javen Qinfeng Shi

TL;DR
This paper introduces PEAP-LLM, a parameter-efficient, two-module approach using large language models for goal-oriented navigation in complex indoor environments, significantly improving performance on the REVERIE task.
Contribution
It presents a novel two-stage fine-tuning method for LLMs and a modular action planning framework for embodied navigation without pre-exploration.
Findings
PEAP-LLM outperforms previous state-of-the-art on REVERIE.
The two-stage fine-tuning improves instruction quality and environmental adaptability.
Modular LLM-based planning enhances navigation success in complex scenarios.
Abstract
The remote embodied referring expression (REVERIE) task requires an agent to navigate through complex indoor environments and localize a remote object specified by a high-level instruction, such as "bring me a spoon", without pre-exploration. An efficient navigation plan is therefore essential for success. This paper proposes a novel parameter-efficient action planner using large language models (PEAP-LLM) to generate a single-step instruction at each location. The proposed model consists of two modules: an LLM goal planner (LGP) and a LoRA action planner (LAP). First, LGP extracts the goal-oriented plan, including the target object and room, from the REVERIE instruction. Then, LAP generates a single-step instruction, taking the goal-oriented plan, the high-level instruction, and the current visual observation as input. PEAP-LLM enables the embodied agent to interact with LAP as the path planner…
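The two-module flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `GoalPlan`, `run_episode`, `toy_goal_planner`, and `toy_action_planner` are hypothetical, the visual observation is stood in for by a text caption, and calling the goal planner once per episode is an assumption based on the abstract's wording. The toy planners are simple rules, not language models.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class GoalPlan:
    """Goal-oriented plan: the target object and the room it is expected in."""
    target_object: str
    target_room: str


def run_episode(
    instruction: str,
    goal_planner: Callable[[str], GoalPlan],              # stands in for LGP
    action_planner: Callable[[GoalPlan, str, str], str],  # stands in for LAP
    observe: Callable[[], str],   # returns the current observation (text here)
    act: Callable[[str], bool],   # executes a step; True means the agent stopped
    max_steps: int = 10,
) -> Tuple[GoalPlan, List[str]]:
    # Assumption: the goal plan is extracted once, before navigation starts.
    plan = goal_planner(instruction)
    issued: List[str] = []
    for _ in range(max_steps):
        observation = observe()
        # One single-step instruction per location, conditioned on the plan,
        # the high-level instruction, and the current observation.
        step_instruction = action_planner(plan, instruction, observation)
        issued.append(step_instruction)
        if act(step_instruction):
            break
    return plan, issued


# Hypothetical rule-based stand-ins so the sketch runs without any model.
def toy_goal_planner(instruction: str) -> GoalPlan:
    return GoalPlan(target_object="spoon", target_room="kitchen")


def toy_action_planner(plan: GoalPlan, instruction: str, observation: str) -> str:
    if plan.target_room in observation:
        return f"stop and find the {plan.target_object}"
    return f"walk toward the {plan.target_room}"


views = iter(["a hallway", "a kitchen with a counter"])
plan, steps = run_episode(
    "bring me a spoon",
    toy_goal_planner,
    toy_action_planner,
    observe=lambda: next(views),
    act=lambda s: s.startswith("stop"),
)
```

In this toy run the planner issues a movement instruction from the hallway and a stop instruction once the kitchen is observed. Passing the two planners in as callables keeps the loop independent of how each module is realized.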
Taxonomy
Topics: Multimodal Machine Learning Applications · Speech and dialogue systems · Social Robot Interaction and HRI
Methods: Shrink and Fine-Tune · Direct Preference Optimization
