RoboCoder: Robotic Learning from Basic Skills to General Tasks with Large Language Models
Jingyao Li, Pengguang Chen, Sitong Wu, Chuanyang Zheng, Hong Xu, Jiaya Jia

TL;DR
This paper introduces RoboCoder, a comprehensive benchmark and autonomous learning framework that enhances robotic generalization by leveraging large language models and real-time feedback to learn complex tasks from basic skills.
Contribution
It presents a new benchmark with diverse tasks and a novel adaptive learning framework that significantly improves robot generalization capabilities using LLMs.
Findings
GPT-4 achieved a 47% pass rate in three-shot scenarios
The RoboCoder framework yielded a 36% relative improvement in performance
The benchmark includes 80 tasks across 7 entities
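The findings above mix a pass rate with a relative improvement, so a brief sketch of how the two quantities are related may help. The function names and the outcome lists below are hypothetical and purely illustrative; they are not the paper's evaluation code or its data, and the baseline against which the reported relative improvement was measured is not stated here.

```python
def pass_rate(outcomes):
    """Fraction of attempts that passed (outcomes is a list of booleans)."""
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    return sum(outcomes) / len(outcomes)


def relative_improvement(baseline, improved):
    """Gain of `improved` over `baseline`, expressed as a fraction of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Hypothetical outcomes, NOT taken from the paper.
baseline_outcomes = [True] * 10 + [False] * 10   # 10 of 20 attempts pass
improved_outcomes = [True] * 14 + [False] * 6    # 14 of 20 attempts pass

base = pass_rate(baseline_outcomes)
new = pass_rate(improved_outcomes)

absolute_gain = new - base                       # difference in pass-rate points
relative_gain = relative_improvement(base, new)  # gain as a fraction of the baseline

print(f"baseline={base:.2f} improved={new:.2f} "
      f"absolute={absolute_gain:.2f} relative={relative_gain:.2f}")
```

In this toy case the absolute gain is 20 percentage points while the relative gain is 40%, which is why a relative figure should be read together with its baseline.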
Abstract
The emergence of Large Language Models (LLMs) has improved the prospects for robotic tasks. However, existing benchmarks are still limited to single tasks with limited generalization capabilities. In this work, we introduce a comprehensive benchmark and an autonomous learning framework, RoboCoder, aimed at enhancing the generalization capabilities of robots in complex environments. Unlike traditional methods that focus on single-task learning, our research emphasizes the development of a general-purpose robotic coding algorithm that enables robots to leverage basic skills to tackle increasingly complex tasks. The newly proposed benchmark consists of 80 manually designed tasks across 7 distinct entities, testing the models' ability to learn from minimal initial mastery. Initial testing revealed that even advanced models like GPT-4 could only achieve a 47% pass rate in three-shot scenarios…
Taxonomy
Topics: Robotics and Automated Systems · AI in Service Interactions · Natural Language Processing Techniques
Methods: Attention Is All You Need · Softmax · Focus · Layer Normalization · Linear Layer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection
