LLM attribution analysis across different fine-tuning strategies and model scales for automated code compliance

Jack Wei Lun Shi; Minghao Dang; Wawan Solihin; Justin K.W. Yeoh

arXiv:2604.15589 · cs.CL · April 20, 2026

TL;DR

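This study examines how fine-tuning strategy (full fine-tuning, LoRA, quantized LoRA) and model size shape the interpretive behavior of large language models in automated code compliance. Full fine-tuning produces more focused attribution patterns than parameter-efficient methods, and larger models learn to prioritize numerical constraints and rule identifiers.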

Contribution

The paper applies a perturbation-based attribution analysis to compare interpretive behavior across fine-tuning methods and model scales, opening up models that prior work treated as black boxes.
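As a rough illustration of what a perturbation-based attribution looks like in general, the sketch below masks one token at a time and records how much a score drops. This is a generic occlusion scheme, not the authors' procedure, which this page does not describe. The names `occlusion_attribution` and `toy_score`, the `[MASK]` placeholder, and the example clause are all hypothetical, and `toy_score` is only a stand-in for a real fine-tuned model.

```python
from typing import Callable, List, Tuple


def occlusion_attribution(
    tokens: List[str],
    score_fn: Callable[[List[str]], float],
    mask_token: str = "[MASK]",  # assumed placeholder, not from the paper
) -> List[Tuple[str, float]]:
    """Attribute a score to each token by masking it and measuring the drop."""
    baseline = score_fn(tokens)
    attributions = []
    for i, tok in enumerate(tokens):
        # Perturb the input: replace exactly one token with the mask.
        perturbed = tokens[:i] + [mask_token] + tokens[i + 1:]
        attributions.append((tok, baseline - score_fn(perturbed)))
    return attributions


def toy_score(tokens: List[str]) -> float:
    """Hypothetical stand-in for a model's output score.

    It rewards tokens containing digits (numbers, rule identifiers)
    and a couple of obligation words. A real study would query the
    fine-tuned LLM here instead.
    """
    score = 0.0
    for tok in tokens:
        if any(ch.isdigit() for ch in tok):
            score += 1.0
        elif tok.lower() in {"shall", "minimum"}:
            score += 0.5
    return score


# Made-up building-code style clause, pre-split on whitespace.
clause = "Clause 4.2.1 : corridor width shall be minimum 1200 mm".split()
scores = occlusion_attribution(clause, toy_score)

for token, value in scores:
    print(f"{token:>10}  {value:.2f}")
```

With this toy scorer, the numeric tokens (`4.2.1`, `1200`) receive the largest attributions. That mirrors the kind of pattern the findings describe, but only because the scorer was written to behave that way.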

Findings

01. Full fine-tuning (FFT) yields attribution patterns that are statistically different from, and more focused than, those of parameter-efficient fine-tuning.

02. Larger models develop specific interpretive strategies, such as prioritizing numerical constraints and rule identifiers.

03. Gains in semantic similarity plateau for models larger than 7B parameters.

Abstract

Existing research on large language models (LLMs) for automated code compliance has primarily focused on performance, treating the models as black boxes and overlooking how training decisions affect their interpretive behavior. This paper addresses this gap by employing a perturbation-based attribution analysis to compare the interpretive behaviors of LLMs across fine-tuning strategies, namely full fine-tuning (FFT), low-rank adaptation (LoRA), and quantized LoRA fine-tuning, and across model scales, that is, varying LLM parameter sizes. Our results show that FFT produces attribution patterns that are statistically different from, and more focused than, those from parameter-efficient fine-tuning methods. Furthermore, we find that as model scale increases, LLMs develop specific interpretive strategies such as prioritizing numerical constraints and rule identifiers in…
