Do Code LLMs Understand Design Patterns?

Zhenyu Pan; Xuefeng Song; Yunkun Wang; Rongyu Cao; Binhua Li; Yongbin Li; Han Liu

arXiv:2501.04835·cs.SE·January 10, 2025

TL;DR

This paper investigates whether Code Large Language Models understand software design patterns by empirically evaluating their recognition, comprehension, and generation capabilities, revealing significant biases that impact their reliability.

Contribution

It provides the first comprehensive empirical analysis of Code LLMs' understanding of design patterns, highlighting biases affecting downstream software development tasks.

Findings

01. Code LLMs often fail to recognize design patterns accurately.
02. Biases in models influence code generation quality.
03. Understanding of design patterns by LLMs is limited.

Abstract

Code Large Language Models (LLMs) demonstrate great versatility in adapting to various downstream tasks, including code generation and completion, as well as bug detection and fixing. However, Code LLMs often fail to capture existing coding standards, leading to the generation of code that conflicts with the required design patterns for a given project. As a result, developers must post-process to adapt the generated code to the project's design norms. In this work, we empirically investigate the biases of Code LLMs in software development. Through carefully designed experiments, we assess the models' understanding of design patterns across recognition, comprehension, and generation. Our findings reveal that biases in Code LLMs significantly affect the reliability of downstream tasks.

Taxonomy

Topics: Digital Rights Management and Security · Natural Language Processing Techniques · Law, AI, and Intellectual Property