Can LLMs Replace Humans During Code Chunking?

Christopher Glasz; Emily Escamilla; Eric O. Scott; Anand Patel; Jacob Zimmer; Colin Diggs; Michael Doyle; Scott Rosen; Nitin Naik; Justin F. Brunelle; Samruddhi Thaker; Parthav Poudel; Arun Sridharan; Amit Madan; Doug Wendt; William Macke; Thomas Schill

arXiv:2506.19897 · cs.SE · June 26, 2025

TL;DR

This paper explores the use of large language models to automate code chunking in legacy government software, demonstrating that LLMs can effectively replace humans in partitioning large codebases for documentation and modernization.

Contribution

The paper introduces novel code-chunking methods tailored for legacy languages and empirically evaluates their effectiveness across multiple LLMs, highlighting improvements in documentation quality.

Findings

1. LLMs can select partition points similar to those chosen by human experts.
2. Chunking methods significantly influence documentation quality.
3. LLM-generated partitions improve the factual accuracy and usefulness of comments.

Abstract

Large language models (LLMs) have become essential tools in computer science, especially for tasks involving code understanding and generation. However, existing work does not address many of the unique challenges presented by code written for government applications. In particular, government enterprise software is often written in legacy languages like MUMPS or assembly language code (ALC) and the overall token lengths of these systems exceed the context window size for current commercially available LLMs. Additionally, LLMs are primarily trained on modern software languages and have undergone limited testing with legacy languages, making their ability to understand legacy languages unknown and, hence, an area for empirical study. This paper examines the application of LLMs in the modernization of legacy government code written in ALC and MUMPS, addressing the challenges of input…

Taxonomy

Methods: LLaMA