Bootstrapping Human-Like Planning via LLMs
David Porfirio, Vincent Hsiao, Morgan Fine-Morris, Leslie Smith, and Laura M. Hiatt

TL;DR
This paper explores combining natural language and drag-and-drop interfaces for robot programming by using large language models to generate human-like action sequences, and compares how models of different sizes perform at this task.
Contribution
It introduces an LLM-based pipeline that converts natural language into human-like robot action sequences, bridging intuitive and precise programming methods.
Findings
Larger models outperform smaller ones in generating human-like actions
Smaller models still achieve satisfactory performance
The approach enables more accessible robot task specification
Abstract
Robot end users increasingly require accessible means of specifying tasks for robots to perform. Two common end-user programming paradigms include drag-and-drop interfaces and natural language programming. Although natural language interfaces harness an intuitive form of human communication, drag-and-drop interfaces enable users to meticulously and precisely dictate the key actions of the robot's task. In this paper, we investigate the degree to which both approaches can be combined. Specifically, we construct a large language model (LLM)-based pipeline that accepts natural language as input and produces human-like action sequences as output, specified at a level of granularity that a human would produce. We then compare these generated action sequences to another dataset of hand-specified action sequences. Although our results reveal that larger models tend to outperform smaller ones…
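The abstract does not say which metric is used to compare generated action sequences against the hand-specified ones. As a purely illustrative sketch, one plausible choice is a normalized edit distance over action tokens. The function names, the action strings, and the choice of Levenshtein distance below are all assumptions for illustration, not the authors' method.

```python
# Illustrative only: the metric and the action names are assumptions,
# not taken from the paper.

def edit_distance(a, b):
    """Levenshtein distance between two sequences of action tokens."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # deleting the first i tokens of a
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def sequence_similarity(generated, reference):
    """Similarity in [0, 1]; 1.0 means the two sequences are identical."""
    longest = max(len(generated), len(reference))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(generated, reference) / longest


# Hypothetical action sequences for a "bring the cup to the table" task.
generated = ["move_to(kitchen)", "pick_up(cup)", "move_to(table)", "put_down(cup)"]
reference = ["move_to(kitchen)", "pick_up(cup)", "put_down(cup)"]

print(sequence_similarity(generated, reference))
```

A score like this rewards matching both the content and the granularity of a human-written sequence, since extra or missing steps each count as one edit.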
Taxonomy
Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Artificial Intelligence in Games
