Code LLMs: A Taxonomy-based Survey

Nishat Raihan; Christian Newman; Marcos Zampieri

arXiv:2412.08291·cs.CL·December 12, 2024

TL;DR

This survey systematically categorizes large language models used in coding tasks, analyzing their architectures, methodologies, and applications to provide a comprehensive understanding of their current state and future prospects.

Contribution

The survey introduces a taxonomy-based framework for classifying LLMs used in coding tasks, unifying related concepts to clarify their methodologies and applications.

Findings

01. Provides a comprehensive taxonomy of LLMs in coding tasks
02. Analyzes current methodologies and architectures
03. Discusses future directions and limitations

Abstract

Large language models (LLMs) have demonstrated remarkable capabilities across various NLP tasks and have recently expanded their impact to coding tasks, bridging the gap between natural languages (NL) and programming languages (PL). This taxonomy-based survey provides a comprehensive analysis of LLMs in the NL-PL domain, investigating how these models are utilized in coding tasks and examining their methodologies, architectures, and training processes. We propose a taxonomy-based framework that categorizes relevant concepts, providing a unified classification system to facilitate a deeper understanding of this rapidly evolving field. This survey offers insights into the current state and future directions of LLMs in coding tasks, including their applications and limitations.

Taxonomy

Topics: Natural Language Processing Techniques · Translation Studies and Practices