Analyzing Code Comments to Boost Program Comprehension
Yusuke Shinyama, Yoshitaka Arahori, Katsuhiko Gondow

TL;DR
This paper presents a decision-tree classifier that identifies explanatory comments in source code, focusing on local comments inside methods and functions, to aid program comprehension.
Contribution
It introduces eleven comment categories and a classifier achieving 60% precision and 80% recall for identifying explanatory comments in Java and Python projects.
Findings
Preconditional and postconditional comments are most common.
English comments exhibit consistent grammatical structures.
The method analyzed 2,000 GitHub projects.
Abstract
We aim to find source code comments that help programmers understand a nontrivial part of the code. One example is a comment explaining that a zero is assigned in order to "clear" a buffer. Such comments are invaluable to programmers, and identifying them correctly would be of great help. Toward this goal, we developed a method to discover explanatory code comments in source code. We first propose eleven distinct categories of code comments. We then developed a decision-tree-based classifier that can identify explanatory comments with 60% precision and 80% recall. We analyzed 2,000 GitHub projects written in two languages: Java and Python. This task is novel in that it focuses on a microscopic comment (a "local comment") within a method or function, in contrast to prior efforts that focused on API- or method-level comments. We also investigated how different category…
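To make the notion of a "local comment" concrete, the following is a minimal Python sketch that lists comments located inside function bodies, as opposed to module-level comments or docstrings. It is an illustration only, not the authors' extraction pipeline: the function name `local_comments` and the sample source are hypothetical, and the classification step (the eleven categories and the decision tree) is not reproduced here.

```python
import ast
import io
import tokenize


def local_comments(source: str) -> list[tuple[int, str, str]]:
    """Return (line, function_name, text) for comments inside function bodies.

    Hypothetical helper for illustration; requires Python 3.9+.
    """
    tree = ast.parse(source)

    # Record the line span of every function. A comment counts as "local"
    # here if it lies strictly after the `def` line and no later than the
    # last statement of the function. (Simplification: comments inside a
    # multi-line signature are also counted, and trailing comments after
    # the last statement are missed.)
    spans = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            spans.append((node.lineno + 1, node.end_lineno, node.name))

    results = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT:
            continue  # docstrings are STRING tokens, so they are skipped
        line = tok.start[0]
        enclosing = [s for s in spans if s[0] <= line <= s[1]]
        if enclosing:
            # For nested functions, attribute the comment to the innermost one.
            _, _, name = min(enclosing, key=lambda s: s[1] - s[0])
            results.append((line, name, tok.string.lstrip("#").strip()))
    return results


SAMPLE = '''\
# Module-level comment.
def clear(buf):
    """Reset the buffer."""
    # Assign zero to "clear" the buffer.
    buf[0] = 0
    return buf
'''

for line, func, text in local_comments(SAMPLE):
    print(f"{func}:{line}: {text}")
```

On the sample, only the comment inside `clear` is reported; the module-level comment and the docstring are ignored. A classifier like the one described in the abstract would then operate on comments gathered this way.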
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Software Engineering Techniques and Practices
