A Survey on Large Language Models for Code Generation

Juyong Jiang; Fan Wang; Jiasi Shen; Sungju Kim; Sunghun Kim

arXiv:2406.00515·cs.CL·October 28, 2025·56 cites

TL;DR

This survey comprehensively reviews recent progress, challenges, and benchmarks in large language models for code generation, highlighting advancements, ethical considerations, and practical applications in software development.

Contribution

It provides a systematic literature review, introduces a taxonomy, and offers empirical comparisons across multiple benchmarks for LLMs in code generation.

Findings

01. Progressive improvements in LLM capabilities for code generation.
02. Identification of key challenges and opportunities.
03. Benchmark performance analysis across different tasks.

Abstract

Large Language Models (LLMs) have achieved remarkable advances across diverse code-related tasks; models specialized for such tasks are known as Code LLMs. Progress has been especially notable in code generation, the task of producing source code from natural language descriptions. This burgeoning field has captured significant interest from both academic researchers and industry professionals because of its practical significance in software development, e.g., GitHub Copilot. Despite the active exploration of LLMs for a variety of code tasks, from the perspective of natural language processing (NLP), software engineering (SE), or both, there is a noticeable absence of a comprehensive and up-to-date literature review dedicated to LLMs for code generation. In this survey, we aim to bridge this gap by providing a systematic literature review that serves as a valuable reference for researchers investigating the cutting-edge progress in LLMs…

Code & Models

Models

juyongjiang/CodeUp-Llama-3-8B (model, 45 downloads, 5 likes)

Taxonomy

Topics: Natural Language Processing Techniques