Deterministic vs. Probabilistic Summarisation: An Empirical Trade-off Study in Design Pattern Centric Java Code
Najam Nazar, Christoph Treude

TL;DR
This study empirically compares deterministic and probabilistic code summarisation methods for Java code, revealing a trade-off between semantic depth and reproducibility.
Contribution
It provides a controlled empirical analysis of these paradigms using design-pattern-centric Java code, highlighting their respective strengths and limitations.
Findings
Probabilistic (LLM-based) summaries achieve better semantic alignment and contextual coverage than deterministic ones.
Deterministic (heuristic-based) approaches produce more concise and reproducible summaries.
LLM outputs vary, but the overall trends are consistent.
Abstract
Background: Automated code summarisation supports program comprehension and documentation, yet the relative strengths and limitations of deterministic (heuristic-based) and probabilistic (LLM-based) pipelines remain unclear. Aims: This paper presents a controlled empirical comparison of these paradigms for intent-oriented design-pattern code summarisation. Method: Using design-pattern-centric Java code as a structured testbed (150 files from three open-source repositories covering nine patterns), we compare a rule-based natural language generation (NLG) pipeline, a Software Word Usage Model (SWUM)-based approach, and a probabilistic pipeline based on the Mixtral LLM. Summaries are evaluated against human references using BERTScore and cosine similarity, complemented by rubric-based judgements produced by Llama 3 across five dimensions: accuracy, conciseness, adequacy, code-context…
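To make the similarity-based part of the evaluation concrete, the following is a minimal sketch of scoring a generated summary against a human reference with cosine similarity. It assumes plain term-frequency vectors over lowercased word tokens. The abstract does not say which text representation the authors use, so this may differ from their setup. The function names and both example summaries are hypothetical and are not taken from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(candidate, reference):
    """Cosine similarity between term-frequency vectors of two summaries.

    Assumption: bag-of-words counts stand in for whatever representation
    the study actually uses. Returns 0.0 if either text has no tokens.
    """
    a = Counter(tokenize(candidate))
    b = Counter(tokenize(reference))
    # Only tokens present in both summaries contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical summaries for a Singleton-style class (illustration only).
    reference = "Provides a single shared instance of the configuration manager."
    candidate = "Returns the single shared configuration manager instance."
    print(round(cosine_similarity(candidate, reference), 3))
```

A lexical score of this kind only rewards shared wording. Embedding-based metrics such as BERTScore, which the study also reports, are designed to credit paraphrases that a bag-of-words comparison would miss.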
