Mechanistic Understanding of Language Models in Syntactic Code Completion

Samuel Miller; Daking Rai; Ziyu Yao

arXiv:2502.18499·cs.SE·February 27, 2025

Open Access

TL;DR

This paper investigates the internal decision-making of CodeLlama-7b, a code language model, on a syntactic code completion task, revealing layer-specific roles and attention head functions that influence its performance.

Contribution

The paper presents one of the first mechanistic interpretability analyses of Code LMs, focusing on how CodeLlama-7b performs a syntactic completion task, specifically closing parentheses.

Findings

01

The model cannot confidently predict the correct closing-parenthesis label until its middle-later layers.

02

Multi-head attention is particularly important for performance.

03

Attention heads track parentheses count, affecting accuracy.
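
To make the quantity in the third finding concrete, here is a minimal sketch of what a "parentheses count" could mean for a code prefix. It is not taken from the paper: the function names `open_paren_depth` and `expected_closing` are hypothetical, the prompt and label format used by the authors are not given on this page, and the counter is deliberately naive (it does not skip parentheses inside string literals or comments).

```python
def open_paren_depth(prefix: str) -> int:
    """Count '(' characters in `prefix` that have not yet been closed.

    Naive illustration only: parentheses inside strings or comments
    are counted like any other.
    """
    depth = 0
    for ch in prefix:
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                raise ValueError("unbalanced ')' in prefix")
            depth -= 1
    return depth


def expected_closing(prefix: str) -> str:
    """Return the run of ')' that would balance `prefix` (assumed label format)."""
    return ")" * open_paren_depth(prefix)


if __name__ == "__main__":
    prefix = "print(str(len(items"
    print(open_paren_depth(prefix))   # 3
    print(expected_closing(prefix))   # )))
```

A model that completes such a prefix correctly has to recover the same information implicitly: how many parentheses were opened and how many of them have already been closed.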

Abstract

Recently, language models (LMs) have shown impressive proficiency in code generation tasks, especially when fine-tuned on code-specific datasets, commonly known as Code LMs. However, our understanding of the internal decision-making processes of Code LMs, such as how they use their (syntactic or semantic) knowledge, remains limited, which could lead to unintended harm as they are increasingly used in real life. This motivates us to conduct one of the first Mechanistic Interpretability works to understand how Code LMs perform a syntactic completion task, specifically the closing parenthesis task, on the CodeLlama-7b model (Roziere et al. 2023). Our findings reveal that the model requires middle-later layers until it can confidently predict the correct label for the closing parenthesis task. Additionally, we identify that while both multi-head attention (MHA) and feed-forward (FF)…

Taxonomy

Topics: Topic Modeling · Software Engineering Research · Natural Language Processing Techniques

Methods: Softmax · Linear Layer · Attention Is All You Need · Multi-Head Attention