Combining Large Language Models with Static Analyzers for Code Review Generation
Imen Jaoua, Oussama Ben Sghaier, Houari Sahraoui

TL;DR
This paper presents a hybrid approach combining knowledge-based systems and language models to improve automated code review quality, relevance, and completeness.
Contribution
It introduces a novel hybrid method integrating rule-based and learning-based systems at multiple pipeline stages for better code review generation.
Findings
Hybrid strategies improve review relevance and completeness.
Combining knowledge-based systems (KBS) and learning-based systems (LBS) outperforms standalone systems.
Empirical results show enhanced review quality.
Abstract
Code review is a crucial but often complex, subjective, and time-consuming activity in software development. Over the past decades, significant efforts have been made to automate this process. Early approaches focused on knowledge-based systems (KBS) that apply rule-based mechanisms to detect code issues, providing precise feedback but struggling with complex, context-dependent cases. More recent work has shifted toward fine-tuning pre-trained language models for code review, enabling broader issue coverage but often at the expense of precision. In this paper, we propose a hybrid approach that combines the strengths of KBS and learning-based systems (LBS) to generate high-quality, comprehensive code reviews. Our method integrates knowledge at three distinct stages of the language model pipeline: during data preparation (Data-Augmented Training, DAT), at inference (Retrieval-Augmented…
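The first two integration points named in the abstract, data preparation and inference, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `toy_static_analyzer` is a hypothetical two-rule stand-in for a real rule-based analyzer, and `augment_training_set` and `build_augmented_prompt` are hypothetical helper names. The paper's actual analyzers, models, and prompt format are not described in this excerpt, so this is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Finding:
    """One issue reported by the rule-based side (KBS)."""
    line: int
    rule: str
    message: str


def toy_static_analyzer(code: str) -> List[Finding]:
    """Hypothetical stand-in for a static analyzer: two hard-coded rules."""
    findings = []
    for number, text in enumerate(code.splitlines(), start=1):
        stripped = text.strip()
        if stripped.startswith("except:"):
            findings.append(Finding(number, "bare-except",
                                    "Catch a specific exception instead of a bare 'except:'."))
        if "== None" in stripped:
            findings.append(Finding(number, "none-comparison",
                                    "Use 'is None' instead of '== None'."))
    return findings


def augment_training_set(human_pairs: List[Tuple[str, str]],
                         snippets: List[str]) -> List[Tuple[str, str]]:
    """Data-preparation idea (assumed form): add (code, review) pairs whose
    review text comes from analyzer findings to the human-written pairs."""
    augmented = list(human_pairs)
    for code in snippets:
        for finding in toy_static_analyzer(code):
            augmented.append((code, f"Line {finding.line}: {finding.message}"))
    return augmented


def build_augmented_prompt(code: str, findings: List[Finding]) -> str:
    """Inference-time idea (assumed form): put analyzer findings in the
    prompt so a language model can use them when writing its review."""
    if findings:
        notes = "\n".join(f"- [{f.rule}] line {f.line}: {f.message}" for f in findings)
    else:
        notes = "- (no analyzer findings)"
    return (
        "Review the following code.\n"
        f"Static analyzer findings:\n{notes}\n"
        f"Code:\n{code}"
    )


sample = "try:\n    x = f()\nexcept:\n    pass\nif x == None:\n    print('none')\n"
sample_findings = toy_static_analyzer(sample)
training_pairs = augment_training_set([("a = 1", "Looks fine.")], [sample])
prompt = build_augmented_prompt(sample, sample_findings)
```

In this sketch the language model itself is left out on purpose: `training_pairs` is what a fine-tuning step would consume, and `prompt` is what would be sent to a model at inference. Swapping the toy rules for a real analyzer would only change `toy_static_analyzer`.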
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Software Reliability and Analysis Research
