LLaMoCo: Instruction Tuning of Large Language Models for Optimization Code Generation
Zeyuan Ma, Hongshu Guo, Jiacheng Chen, Guojun Peng, Zhiguang Cao, Yining Ma, Yue-Jiao Gong

TL;DR
LLaMoCo is an instruction-tuning framework that adapts large language models to solve optimization problems by generating optimization code from a problem description. It uses a two-phase learning strategy: a contrastive learning-based warm-up followed by instruction tuning.
Contribution
The paper introduces LLaMoCo, an instruction-tuning approach with a contrastive learning-based warm-up phase, designed to adapt LLMs to solve optimization problems by generating code.
Findings
Fine-tuned CodeGen (350M) outperforms GPT-4 Turbo in optimization tasks.
The two-phase learning strategy improves convergence during model fine-tuning.
LLaMoCo achieves superior results on both synthetic and real-world problem sets.
Abstract
Recent research explores optimization using large language models (LLMs) by either iteratively seeking next-step solutions from LLMs or directly prompting LLMs for an optimizer. However, these approaches exhibit inherent limitations, including low operational efficiency, high sensitivity to prompt design, and a lack of domain-specific knowledge. We introduce LLaMoCo, the first instruction-tuning framework designed to adapt LLMs for solving optimization problems in a code-to-code manner. Specifically, we establish a comprehensive instruction set containing well-described problem prompts and effective optimization codes. We then develop a novel two-phase learning strategy that incorporates a contrastive learning-based warm-up procedure before the instruction-tuning phase to enhance the convergence behavior during model fine-tuning. The experiment results demonstrate that a CodeGen (350M)…
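The contrastive warm-up mentioned in the abstract can be illustrated with a generic margin-based pairwise loss. This is a minimal sketch under stated assumptions, not the paper's actual objective: the cosine distance, the margin value, and the names `cosine_distance`, `contrastive_loss`, and `warmup_loss` are all hypothetical choices made for illustration. The idea shown is that prompt embeddings whose desired optimizer code is the same are pulled together, while those with different targets are pushed at least a margin apart.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def contrastive_loss(anchor, other, same_target, margin=0.5):
    """Generic pairwise contrastive loss (an assumption, not the paper's formula).

    same_target=True  -> penalize any distance between the two embeddings.
    same_target=False -> penalize only distances smaller than the margin.
    """
    d = cosine_distance(anchor, other)
    if same_target:
        return d
    return max(0.0, margin - d)


def warmup_loss(pairs, margin=0.5):
    """Average the pairwise loss over (anchor, other, same_target) triples."""
    losses = [contrastive_loss(a, b, same, margin) for a, b, same in pairs]
    return sum(losses) / len(losses)
```

In a two-phase schedule of the kind the abstract describes, a loss like this would be minimized for some initial steps before switching to the usual next-token instruction-tuning loss; how long the warm-up runs and which embeddings it operates on are details this sketch does not attempt to reproduce.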
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Natural Language Processing Techniques
Methods: Attention Is All You Need · Sparse Evolutionary Training · Linear Layer · Byte Pair Encoding · Multi-Head Attention · Layer Normalization · Dropout · Softmax · Dense Connections · Label Smoothing
