code_transformed: The Influence of Large Language Models on Code
Yuliang Xu, Siming Huang, Mingmeng Geng, Yao Wan, Xuanhua Shi, Dongping Chen

TL;DR
This study analyzes how large language models influence coding styles by examining over 20,000 GitHub repositories, revealing measurable trends such as increased use of snake_case in Python and changes in code complexity and maintainability.
Contribution
The study provides the first large-scale empirical evidence of LLMs' impact on real-world programming style, focusing on naming conventions, complexity, maintainability, and similarity in code.
Findings
The share of snake_case function names in Python code rose from 40.7% (Q1 2023) to 49.8%.
Measurable shifts in code complexity and maintainability that align with characteristics of LLM-generated code.
Evidence that LLMs influence coding style in practice.
Abstract
Coding remains one of the most fundamental modes of interaction between humans and machines. With the rapid advancement of Large Language Models (LLMs), code generation capabilities have begun to significantly reshape programming practices. This development prompts a central question: Have LLMs transformed code style, and how can such transformation be characterized? In this paper, we present a pioneering study that investigates the impact of LLMs on code style, with a focus on naming conventions, complexity, maintainability, and similarity. By analyzing code from over 20,000 GitHub repositories linked to arXiv papers published between 2020 and 2025, we identify measurable trends in the evolution of coding style that align with characteristics of LLM-generated code. For instance, the proportion of snake_case function names in Python code increased from 40.7% in Q1 2023 to 49.8% in Q3…
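As an illustration of the naming-convention metric mentioned in the abstract, the share of snake_case function names in a Python file could be estimated roughly as follows. This is a minimal sketch, not the authors' pipeline: the helper names (`function_names`, `is_snake_case`, `snake_case_share`) are hypothetical, and the definition of snake_case used here (lowercase words joined by at least one underscore, with leading and trailing underscores ignored) is an assumption that may differ from the paper's.

```python
import ast
import re

# Assumed definition: lowercase/digit words joined by at least one underscore.
# Single-word names such as "main" are NOT counted as snake_case here.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)+")


def function_names(source: str) -> list[str]:
    """Collect the names of all (sync and async) function definitions."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]


def is_snake_case(name: str) -> bool:
    """Check a name against the assumed snake_case pattern."""
    core = name.strip("_")  # ignore privacy/dunder underscores
    return SNAKE_CASE.fullmatch(core) is not None


def snake_case_share(source: str) -> float:
    """Fraction of function names in `source` that are snake_case."""
    names = function_names(source)
    if not names:
        return 0.0
    return sum(is_snake_case(n) for n in names) / len(names)


SAMPLE = '''
def load_data(): pass
def loadData(): pass
def main(): pass
class Runner:
    def _parse_args(self): pass
'''

if __name__ == "__main__":
    print(f"snake_case share: {snake_case_share(SAMPLE):.1%}")
```

Aggregating such per-file shares by repository and by quarter would give a time series comparable in spirit to the 40.7% and 49.8% figures quoted above, although the exact numbers depend on how snake_case and the sampling unit are defined.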
Taxonomy
Topics: Natural Language Processing Techniques · Computational Physics and Python Applications
Methods: Focus · ALIGN
