CoFineLLM: Conformal Finetuning of LLMs for Language-Instructed Robot Planning
Jun Wang, Yevgeniy Vorobeychik, Yiannis Kantaros

TL;DR
CoFineLLM is a finetuning framework that makes LLM-based robot planners more reliable by training them with awareness of conformal prediction, which reduces unnecessary requests for human intervention.
Contribution
It introduces the first conformal prediction (CP)-aware finetuning method for LLMs, explicitly minimizing prediction set size and intervention frequency in language-instructed robot planning.
Findings
Reduces prediction set size compared to baselines.
Lowers help request rates in planning tasks.
Demonstrates robustness in out-of-distribution scenarios.
Abstract
Large Language Models (LLMs) have recently emerged as planners for language-instructed agents, generating sequences of actions to accomplish natural language tasks. However, their reliability remains a challenge, especially in long-horizon tasks, since they often produce overconfident yet wrong outputs. Conformal Prediction (CP) has been leveraged to address this issue by wrapping LLM outputs into prediction sets that contain the correct action with a user-defined confidence. When the prediction set is a singleton, the planner executes that action; otherwise, it requests help from a user. This has led to LLM-based planners that can ensure plan correctness with a user-defined probability. However, as LLMs are trained in an uncertainty-agnostic manner, without awareness of prediction sets, they tend to produce unnecessarily large sets, particularly at higher confidence levels, resulting…
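The execute-or-ask rule described in the abstract can be illustrated with a minimal sketch of standard split conformal prediction. This is not the paper's implementation: the function names, the nonconformity score (one minus the probability the model assigns to an action), and all numbers below are assumptions chosen for illustration.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Return the ceil((n+1)(1-alpha))-th smallest calibration score.

    cal_scores: nonconformity scores of the *correct* actions on held-out
    calibration data (assumed here to be 1 - model probability).
    alpha: allowed error rate, so the target coverage is 1 - alpha.
    """
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: every action is kept.
        return float("inf")
    return sorted(cal_scores)[k - 1]


def prediction_set(action_probs, qhat):
    """Keep every candidate action whose score 1 - p is at most qhat."""
    return [a for a, p in action_probs.items() if 1.0 - p <= qhat]


def decide(action_probs, qhat):
    """Execute when the set is a singleton; otherwise request help."""
    pset = prediction_set(action_probs, qhat)
    if len(pset) == 1:
        return ("execute", pset[0])
    # Empty or multi-element sets both fall back to the user.
    return ("ask_for_help", pset)


if __name__ == "__main__":
    # Hypothetical calibration scores and action probabilities.
    qhat = conformal_threshold([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7], alpha=0.25)
    confident = {"pick up cup": 0.85, "pick up bowl": 0.10, "open drawer": 0.05}
    ambiguous = {"pick up cup": 0.45, "pick up bowl": 0.45, "open drawer": 0.10}
    print(decide(confident, qhat))
    print(decide(ambiguous, qhat))
```

In this sketch, a model that spreads probability across several actions yields a larger set and therefore a help request, which is the behavior that CP-aware finetuning aims to make less frequent.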
Taxonomy
Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Robot Manipulation and Learning
