CoFineLLM: Conformal Finetuning of LLMs for Language-Instructed Robot Planning

Jun Wang; Yevgeniy Vorobeychik; Yiannis Kantaros

arXiv:2511.06575 · cs.RO · November 11, 2025

Open Access

TL;DR

CoFineLLM is a finetuning framework that makes LLM-based robot planners aware of conformal prediction, so that they produce smaller prediction sets and request human help less often.

Contribution

It introduces the first conformal prediction (CP)-aware finetuning method for LLMs, explicitly minimizing prediction set size and intervention frequency in language-instructed robot planning.

Findings

01. Reduces prediction set size compared to baselines.

02. Lowers help-request rates in planning tasks.

03. Demonstrates robustness in out-of-distribution scenarios.

Abstract

Large Language Models (LLMs) have recently emerged as planners for language-instructed agents, generating sequences of actions to accomplish natural language tasks. However, their reliability remains a challenge, especially in long-horizon tasks, since they often produce overconfident yet wrong outputs. Conformal Prediction (CP) has been leveraged to address this issue by wrapping LLM outputs into prediction sets that contain the correct action with a user-defined confidence. When the prediction set is a singleton, the planner executes that action; otherwise, it requests help from a user. This has led to LLM-based planners that can ensure plan correctness with a user-defined probability. However, as LLMs are trained in an uncertainty-agnostic manner, without awareness of prediction sets, they tend to produce unnecessarily large sets, particularly at higher confidence levels, resulting…
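
The prediction-set mechanism described in the abstract can be illustrated with a minimal sketch of standard split conformal prediction for a single planning step. This is not the paper's implementation: the function names, the nonconformity score `1 - p`, and the dictionary of action probabilities are all assumptions made for illustration, and the finetuning step that CoFineLLM adds is not shown.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Calibrate a score threshold for target coverage 1 - alpha.

    cal_scores: nonconformity scores of the correct action on held-out
    calibration examples (assumed here to be 1 - p_correct).
    Returns the ceil((n + 1) * (1 - alpha))-th smallest score.
    """
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: keep every action.
        return float("inf")
    return sorted(cal_scores)[k - 1]


def prediction_set(action_probs, q_hat):
    """Keep every action whose score 1 - p does not exceed the threshold."""
    return {a for a, p in action_probs.items() if 1.0 - p <= q_hat}


def decide(action_probs, q_hat):
    """Execute a singleton prediction set; otherwise request help."""
    actions = prediction_set(action_probs, q_hat)
    if len(actions) == 1:
        return ("execute", next(iter(actions)))
    return ("ask_help", actions)
```

With a calibrated threshold `q_hat`, a confident distribution such as `{"pick_cup": 0.9, "pick_bowl": 0.07, "pick_plate": 0.03}` tends to yield a singleton set and is executed, while a split such as `{"pick_cup": 0.45, "pick_bowl": 0.45, "pick_plate": 0.10}` yields a larger set and triggers a help request. Lowering `alpha` (higher confidence) raises the threshold and enlarges the sets, which is the effect the abstract says uncertainty-agnostic training makes worse.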

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Robot Manipulation and Learning