AlignBot: Aligning VLM-powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots
Zhaxizhuoma Zhaxizhuoma, Pengan Chen, Ziniu Wu, Jiawei Sun, Dong Wang, Peng Zhou, Nieqing Cao, Yan Ding, Bin Zhao, Xuelong Li

TL;DR
AlignBot is a framework that pairs a fine-tuned LLaVA-7B adapter with GPT-4o so that household robot task plans follow user reminders, reaching an 86.8% success rate on real-world household tasks.
Contribution
The paper introduces AlignBot, a novel fine-tuning approach that aligns VLM-powered task planning with user reminders using a dynamic retrieval mechanism and structured prompts.
Findings
Achieved 86.8% success rate in real-world household tasks.
Outperformed baseline GPT-4o by 65% in task success.
Demonstrated effectiveness with a multimodal dataset of over 1,500 entries.
Abstract
This paper presents AlignBot, a novel framework designed to optimize VLM-powered customized task planning for household robots by effectively aligning with user reminders. In domestic settings, aligning task planning with user reminders poses significant challenges due to the limited quantity, diversity, and multimodal nature of the reminders. To address these challenges, AlignBot employs a fine-tuned LLaVA-7B model that functions as an adapter for GPT-4o. This adapter model internalizes diverse forms of user reminders (such as personalized preferences, corrective guidance, and contextual assistance) into structured, instruction-formatted cues that prompt GPT-4o to generate customized task plans. Additionally, AlignBot integrates a dynamic retrieval mechanism that selects task-relevant historical successes as prompts for GPT-4o, further enhancing task planning accuracy. To validate the…
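The prompting flow described in the abstract (cues derived from reminders, plus retrieved past successes, assembled into a prompt for the planner) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not say how retrieval is scored, so a bag-of-words cosine similarity stands in for it; the function names, the record layout, and the prompt wording are hypothetical; and the LLaVA-7B adapter and the GPT-4o call are omitted, with the cues hard-coded instead.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_successes(task, history, k=2):
    """Return the k past successes whose task text is most similar to `task`.

    Stand-in for the paper's retrieval mechanism, whose scoring is not
    specified in the abstract.
    """
    query = bag_of_words(task)
    ranked = sorted(
        history,
        key=lambda h: cosine(query, bag_of_words(h["task"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(task, cues, examples):
    """Assemble a plain-text planner prompt (hypothetical layout)."""
    lines = ["Task: " + task, "Cues from user reminders:"]
    lines += ["- " + cue for cue in cues]
    lines.append("Relevant past successes:")
    for ex in examples:
        lines.append("- " + ex["task"] + " -> " + "; ".join(ex["plan"]))
    lines.append("Produce a step-by-step plan that respects every cue.")
    return "\n".join(lines)


# Hypothetical history of successful plans.
history = [
    {"task": "put the apple in the fridge",
     "plan": ["pick up apple", "open fridge", "place apple in fridge"]},
    {"task": "water the plants on the balcony",
     "plan": ["fill watering can", "go to balcony", "water plants"]},
    {"task": "put the milk in the fridge",
     "plan": ["pick up milk", "open fridge", "place milk in fridge"]},
]

# Hard-coded cues standing in for the adapter model's output.
cues = ["Fruit goes in the bottom drawer.", "Close the fridge door afterwards."]

task = "put the orange in the fridge"
examples = retrieve_successes(task, history, k=2)
prompt = build_prompt(task, cues, examples)
print(prompt)
```

With this toy history, the two fridge-related plans score higher than the balcony task, so only they are included in the prompt. A real system would likely replace the token-count similarity with learned embeddings, but the selection and assembly steps would keep the same shape.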
Taxonomy
Topics: Transportation and Mobility Innovations · Advanced Manufacturing and Logistics Optimization
Methods: Adapter
