DocChecker: Bootstrapping Code Large Language Model for Detecting and Resolving Code-Comment Inconsistencies
Anh T. V. Dau, Jin L. C. Guo, Nghi D. Q. Bui

TL;DR
DocChecker is a deep learning-based tool that detects and corrects code-comment inconsistencies, outperforming existing models in accuracy and BLEU scores, thereby improving code documentation reliability.
Contribution
This paper introduces DocChecker, a novel deep learning approach for detecting and resolving code-comment inconsistencies, surpassing prior heuristic-based methods and existing large language models.
Findings
Achieves 72.3% accuracy on the Inconsistency Code-Comment Detection (ICCD) task
Attains 33.64 BLEU-4 on code summarization
Outperforms GPT 3.5 and CodeLlama in experiments
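For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how a classification accuracy and a sentence-level BLEU-4 score can be computed. It is not the paper's evaluation script. The function names `bleu4` and `accuracy`, the whitespace tokenization, and the label encoding (1 = inconsistent, 0 = consistent) are assumptions made for illustration. Benchmark scripts typically apply smoothing and report BLEU on a 0-100 scale, so this unsmoothed version would not reproduce the reported numbers exactly.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(reference, candidate):
    """Unsmoothed sentence-level BLEU-4 against one reference, in [0, 1]."""
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, any zero precision zeroes the score
        log_precisions.append(math.log(matched / total))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / 4)


def accuracy(gold, predicted):
    """Fraction of code-comment pairs whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)
```

Under these assumptions, a generated comment identical to its reference scores 1.0, and a candidate sharing no 4-gram with the reference scores 0.0, which is why smoothed variants are preferred for short comments.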
Abstract
Comments within source code are essential for developers to comprehend the code's purpose and ensure its correct usage. However, as codebases evolve, maintaining an accurate alignment between the comments and the code becomes increasingly challenging. Despite growing interest in automated solutions for detecting and correcting differences between code and its accompanying comments, current methods rely primarily on heuristic rules. In contrast, this paper presents DocChecker, a tool powered by deep learning. DocChecker is adept at identifying inconsistencies between code and comments, and it can also generate synthetic comments. This capability enables the tool to detect and correct instances where comments do not accurately reflect their corresponding code segments. We demonstrate the effectiveness of DocChecker using the Just-In-Time and CodeXGLUE datasets in different…
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Software Engineering Techniques and Practices
