Context Matters: Evaluating Context Strategies for Automated ADR Generation Using LLMs
Aviral Gupta, Rudra Dhar, Daniel Feitosa, Karthik Vaidhyanathan

TL;DR
This study evaluates how different context presentation strategies influence the quality of automated Architecture Decision Record (ADR) generation using Large Language Models, emphasizing the importance of context engineering over model size.
Contribution
The paper systematically compares five context selection strategies for ADR generation with LLMs, shows that small recency windows are effective, and provides practical guidelines.
Findings
Context-aware prompting improves ADR generation fidelity.
A recency window of 3-5 records balances quality and efficiency.
Retrieval-based context offers limited benefits in typical workflows.
Abstract
Architecture Decision Records (ADRs) play a critical role in preserving the rationale behind system design, yet their creation and maintenance are often neglected due to the associated authoring overhead. This paper investigates whether Large Language Models (LLMs) can mitigate this burden and, more importantly, how different strategies for presenting historical ADRs as context influence generation quality. We curate and validate a large corpus of sequential ADRs drawn from 750 open-source repositories and systematically evaluate five context selection strategies (no context, All-history, First-K, Last-K, and RAFG) across multiple model families. Our results show that context-aware prompting substantially improves ADR generation fidelity, with a small recency window (typically 3-5 prior records) providing the best balance between quality and efficiency. Retrieval-based context selection…
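The non-retrieval strategies named in the abstract can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the authors' implementation: the function names `select_context` and `build_prompt`, the strategy labels, the prompt wording, and the default `k=3` (taken from the low end of the reported 3-5 window) are all hypothetical. The retrieval-based strategy (RAFG) is left out because this summary does not describe how it selects records.

```python
from typing import List


def select_context(history: List[str], strategy: str, k: int = 3) -> List[str]:
    """Pick which prior ADRs to show the model, in their original order.

    `history` is assumed to be ordered oldest to newest. The strategy labels
    are hypothetical names for the paper's no-context, All-history, First-K,
    and Last-K settings.
    """
    if strategy == "none":
        return []
    if strategy == "all":
        return list(history)
    if strategy == "first_k":
        return history[:k]
    if strategy == "last_k":
        # Guard k == 0: history[-0:] would return the whole list.
        return history[-k:] if k > 0 else []
    raise ValueError(f"unknown strategy: {strategy}")


def build_prompt(history: List[str], decision_context: str,
                 strategy: str = "last_k", k: int = 3) -> str:
    """Assemble a prompt from the selected ADRs plus the new decision context.

    The prompt layout is an illustrative assumption, not the paper's template.
    """
    selected = select_context(history, strategy, k)
    parts = [f"Prior ADR {i}:\n{adr}" for i, adr in enumerate(selected, 1)]
    parts.append(f"Context for the new decision:\n{decision_context}")
    parts.append("Write the next ADR.")
    return "\n\n".join(parts)
```

For example, `build_prompt(adrs, "Choose a message broker", strategy="last_k", k=3)` would include only the three most recent records from `adrs`, which keeps the prompt short as a repository's decision history grows.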
