Can Large Language Models Invent Algorithms to Improve Themselves?: Algorithm Discovery for Recursive Self-Improvement through Reinforcement Learning
Yoichi Ishibashi, Taro Yano, Masafumi Oyamada

TL;DR
This paper introduces a framework where Large Language Models autonomously discover and refine their own algorithms, leading to improved performance on benchmarks and out-of-domain tasks, surpassing human-designed methods.
Contribution
The paper presents a novel self-improvement framework enabling LLMs to invent and optimize their own algorithms, demonstrating significant performance gains without human intervention.
Findings
Discovered algorithms outperform existing human-designed algorithms.
Models show strong generalization to out-of-domain tasks.
Performance on mathematical reasoning benchmarks improved, with the seed model's GSM8k score rising by 6%.
Abstract
Large Language Models (LLMs) have achieved remarkable capabilities, yet their improvement methods remain fundamentally constrained by human design. We present Self-Developing, a framework that enables LLMs to autonomously discover, implement, and refine their own improvement algorithms. Our approach employs an iterative cycle where a seed model generates algorithmic candidates as executable code, evaluates their effectiveness, and uses Direct Preference Optimization to recursively improve increasingly sophisticated improvement strategies. We demonstrate this framework through model merging, a practical technique for combining specialized models. Self-Developing successfully discovered novel merging algorithms that outperform existing human-designed algorithms. On mathematical reasoning benchmarks, the autonomously discovered algorithms improve the seed model's GSM8k performance by 6%…
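The cycle the abstract describes (generate candidate merging algorithms, score them, turn the ranking into preference data for Direct Preference Optimization) can be illustrated with a toy sketch. This is not the authors' implementation: the candidates are hand-written linear interpolations rather than LLM-generated code, models are reduced to small dictionaries of floats, and `linear_merge`, `toy_score`, `preference_pairs`, and the example values are hypothetical names and data chosen for illustration. The `dpo_loss` function is the standard per-pair DPO objective, written out for scalar log-probabilities.

```python
import math
from typing import Callable, Dict, List, Tuple

Weights = Dict[str, float]
MergeFn = Callable[[Weights, Weights], Weights]


def linear_merge(alpha: float) -> MergeFn:
    """Hypothetical candidate: interpolate two models parameter by parameter."""
    def merge(a: Weights, b: Weights) -> Weights:
        return {k: alpha * a[k] + (1.0 - alpha) * b[k] for k in a}
    return merge


def toy_score(w: Weights, target: Weights) -> float:
    """Stand-in for benchmark accuracy: negative squared distance to a target."""
    return -sum((w[k] - target[k]) ** 2 for k in w)


def preference_pairs(scored: List[Tuple[str, float]]) -> List[Tuple[str, str]]:
    """Pair the best-scoring candidate (chosen) with the worst (rejected)."""
    ranked = sorted(scored, key=lambda s: s[1], reverse=True)
    best, worst = ranked[0], ranked[-1]
    return [(best[0], worst[0])] if best[0] != worst[0] else []


def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_chosen: float, ref_rejected: float, beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair, given sequence log-probs."""
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))


# Assumed toy data: two "specialized models" and an ideal merged model.
model_a: Weights = {"w": 0.0, "b": 2.0}
model_b: Weights = {"w": 4.0, "b": 0.0}
target: Weights = {"w": 1.0, "b": 1.5}

# Step 1: candidate algorithms (an LLM would emit these as code in the paper).
candidates = {f"alpha={a}": linear_merge(a) for a in (0.25, 0.5, 0.75)}

# Step 2: run each candidate and score the merged model.
scored = [(name, toy_score(fn(model_a, model_b), target))
          for name, fn in candidates.items()]

# Step 3: build (chosen, rejected) pairs that a DPO trainer could consume.
pairs = preference_pairs(scored)
```

In this toy setup `pairs` holds a single pair that prefers the `alpha=0.75` candidate over `alpha=0.25`. In a real pipeline the pair elements would be the generated algorithm texts, and the log-probabilities passed to `dpo_loss` would come from the policy and a frozen reference model.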
Taxonomy
Topics: Topic Modeling
