Hierarchical Repository-Level Code Summarization for Business Applications Using Local LLMs
Nilesh Dhulshette, Sapan Shah, Vinay Kulkarni

TL;DR
This paper introduces a hierarchical repository-level code summarization method for business applications, leveraging local LLMs and domain-aware prompts to improve understanding of large codebases.
Contribution
It presents a novel two-step hierarchical approach that summarizes large code artifacts by combining syntax analysis, local LLMs, and business context grounding.
Findings
Hierarchical summarization improves coverage of large codebases.
Business-context grounding enhances summary relevance.
Evaluation on a business support system (BSS) from the telecommunications domain indicates the approach is effective.
Abstract
In large-scale software development, understanding the functionality and intent behind complex codebases is critical for effective development and maintenance. While code summarization has been widely studied, existing methods primarily focus on smaller code units, such as functions, and struggle with larger code artifacts like files and packages. Additionally, current summarization models tend to emphasize low-level implementation details, often overlooking the domain and business context that are crucial for real-world applications. This paper proposes a two-step hierarchical approach for repository-level code summarization, tailored to business applications. First, smaller code units such as functions and variables are identified using syntax analysis and summarized with local LLMs. These summaries are then aggregated to generate higher-level file and package summaries. To ensure the…
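The two-step flow described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the excerpt does not state the target programming language, the model, or the prompts, so Python's standard `ast` module stands in for the syntax analysis, the `llm` callable is a hypothetical placeholder for a local model, and the `domain` string is an assumed, simplified stand-in for business-context grounding.

```python
import ast
from typing import Callable, Dict

# Hypothetical interface for a local LLM: prompt text in, summary text out.
Summarizer = Callable[[str], str]


def stub_llm(prompt: str) -> str:
    """Not a real model: echoes the instruction line so the data flow is visible."""
    return "[summary of: " + prompt.splitlines()[1] + "]"


def summarize_units(source: str, llm: Summarizer, domain: str) -> Dict[str, str]:
    """Step 1: find top-level functions and variables by parsing, then summarize each."""
    tree = ast.parse(source)
    summaries: Dict[str, str] = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            name = node.name
        elif isinstance(node, ast.Assign) and isinstance(node.targets[0], ast.Name):
            name = node.targets[0].id
        else:
            continue  # other statements are skipped in this sketch
        code = ast.get_source_segment(source, node)
        # Assumed prompt wording; the domain line is the simplified grounding.
        summaries[name] = llm(f"Domain: {domain}\nSummarize this code unit:\n{code}")
    return summaries


def summarize_file(path: str, source: str, llm: Summarizer, domain: str) -> str:
    """Step 2a: aggregate unit summaries into a file-level summary."""
    units = summarize_units(source, llm, domain)
    listing = "\n".join(f"- {name}: {text}" for name, text in units.items())
    return llm(f"Domain: {domain}\nSummarize file {path} from its unit summaries:\n{listing}")


def summarize_package(name: str, files: Dict[str, str], llm: Summarizer, domain: str) -> str:
    """Step 2b: aggregate file summaries into a package-level summary."""
    file_summaries = {p: summarize_file(p, src, llm, domain) for p, src in files.items()}
    listing = "\n".join(f"- {p}: {text}" for p, text in file_summaries.items())
    return llm(f"Domain: {domain}\nSummarize package {name} from its file summaries:\n{listing}")
```

Calling `summarize_package("billing", {"billing.py": source_text}, stub_llm, "telecom billing")` walks the hierarchy bottom-up. The model only ever sees one code unit or one list of lower-level summaries at a time, which is what lets the approach cover artifacts too large to fit in a single prompt.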
Taxonomy
Topics: Web Data Mining and Analysis · Service-Oriented Architecture and Web Services · Semantic Web and Ontologies
