iMotion-LLM: Instruction-Conditioned Trajectory Generation

Abdulwahab Felemban; Nussair Hroub; Jian Ding; Eslam Abdelrahman; Xiaoqian Shen; Abduallah Mohamed; Mohamed Elhoseiny

arXiv:2406.06211 · cs.CV · December 8, 2025

TL;DR

iMotion-LLM is a novel large language model integrated with trajectory prediction modules that generates safe, feasible, and instruction-aligned driving trajectories based on textual commands, advancing interactive and interpretable autonomous driving systems.

Contribution

It introduces a new multimodal framework combining LLMs with trajectory prediction, along with two datasets for instruction-based trajectory generation in autonomous driving.

Findings

01. Achieves 84% accuracy in direction feasibility detection.

02. Achieves 96% accuracy in safety evaluation of instructions.

03. Demonstrates effective context-aware trajectory generation.

Abstract

We introduce iMotion-LLM, a large language model (LLM) integrated with trajectory prediction modules for interactive motion generation. Unlike conventional approaches, it generates feasible, safety-aligned trajectories based on textual instructions, enabling adaptable and context-aware driving behavior. It combines an encoder-decoder multimodal trajectory prediction model with a pre-trained LLM fine-tuned using LoRA, projecting scene features into the LLM input space and mapping special tokens to a trajectory decoder for text-based interaction and interpretable driving. To support this framework, we introduce two datasets: 1) InstructWaymo, an extension of the Waymo Open Motion Dataset with direction-based motion instructions, and 2) Open-Vocabulary InstructNuPlan, which features safety-aligned instruction-caption pairs and corresponding safe trajectory scenarios. Our experiments…
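The data flow described in the abstract (scene features projected into the LLM input space, then a special token's output mapped to a trajectory decoder) can be illustrated with a minimal sketch. This is not the authors' implementation: the dimensions, the names `in_proj`, `out_proj`, `traj_head`, and `stand_in_llm`, and the use of single linear layers are all assumptions made for illustration, and the LoRA-fine-tuned LLM is replaced by a placeholder that simply averages its inputs.

```python
import random

def linear(x, weight, bias):
    """Apply y = W x + b to one vector; weight is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def init_linear(rng, in_dim, out_dim):
    """Small random weights and zero bias for an (in_dim -> out_dim) layer."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
              for _ in range(out_dim)]
    return weight, [0.0] * out_dim

# Hypothetical sizes, chosen only to keep the example small.
SCENE_DIM = 8   # width of scene-encoder features
LLM_DIM = 16    # width of the LLM embedding space
HORIZON = 5     # number of future (x, y) waypoints

rng = random.Random(0)
in_proj = init_linear(rng, SCENE_DIM, LLM_DIM)        # scene -> LLM input space
out_proj = init_linear(rng, LLM_DIM, SCENE_DIM)       # special token -> decoder space
traj_head = init_linear(rng, SCENE_DIM, HORIZON * 2)  # stand-in trajectory decoder

def stand_in_llm(token_embeddings):
    """Placeholder for the LLM (assumption): returns the mean embedding
    as the hidden state read out at the special trajectory token."""
    n = len(token_embeddings)
    return [sum(column) / n for column in zip(*token_embeddings)]

def generate_trajectory(scene_tokens, instruction_embeddings):
    # 1) Project each scene feature vector into the LLM input space.
    projected = [linear(tok, *in_proj) for tok in scene_tokens]
    # 2) Run the (placeholder) LLM over scene and instruction tokens.
    hidden = stand_in_llm(projected + instruction_embeddings)
    # 3) Map the special-token state back to the decoder's space.
    decoder_query = linear(hidden, *out_proj)
    # 4) Decode a flat vector and reshape it into (x, y) waypoints.
    flat = linear(decoder_query, *traj_head)
    return [(flat[2 * i], flat[2 * i + 1]) for i in range(HORIZON)]

# Random stand-ins for encoder outputs and embedded instruction text.
scene_tokens = [[rng.uniform(-1, 1) for _ in range(SCENE_DIM)] for _ in range(4)]
instruction_embeddings = [[rng.uniform(-1, 1) for _ in range(LLM_DIM)]
                          for _ in range(3)]
trajectory = generate_trajectory(scene_tokens, instruction_embeddings)
```

In this sketch the two projections are the only components that bridge the trajectory model and the language model, which is the role the abstract assigns to them; everything else about the real model is left unspecified here.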

Taxonomy

Topics: Human Motion and Animation · Human Pose and Action Recognition

Methods: ALIGN