Code LLMs: A Taxonomy-based Survey
Nishat Raihan, Christian Newman, Marcos Zampieri

TL;DR
This survey systematically categorizes large language models used in coding tasks, analyzing their architectures, methodologies, and applications to provide a comprehensive understanding of their current state and future prospects.
Contribution
The survey introduces a taxonomy-based framework for classifying LLMs used in coding tasks, unifying the relevant concepts into a single classification system that makes their methodologies and applications easier to understand.
Findings
Provides a comprehensive taxonomy of LLMs in coding tasks
Analyzes current methodologies and architectures
Discusses future directions and limitations
Abstract
Large language models (LLMs) have demonstrated remarkable capabilities across various NLP tasks and have recently expanded their impact to coding tasks, bridging the gap between natural languages (NL) and programming languages (PL). This taxonomy-based survey provides a comprehensive analysis of LLMs in the NL-PL domain, investigating how these models are utilized in coding tasks and examining their methodologies, architectures, and training processes. We propose a taxonomy-based framework that categorizes relevant concepts, providing a unified classification system to facilitate a deeper understanding of this rapidly evolving field. This survey offers insights into the current state and future directions of LLMs in coding tasks, including their applications and limitations.
Taxonomy
Topics: Natural Language Processing Techniques · Translation Studies and Practices
