Enhancing Code Annotation Reliability: Generative AI's Role in Comment Quality Assessment Models

Seetharam Killivalavan; Durairaj Thenmozhi

arXiv:2410.22323·cs.SE·October 30, 2024

TL;DR

This paper demonstrates how integrating generative AI to produce additional labeled code comment data significantly improves the performance of code comment quality assessment models, advancing software engineering tools.

Contribution

The study introduces a novel approach of using generative AI to augment training data, leading to notable performance improvements in code comment classification models.

Findings

01  5.78 percentage-point precision increase in the SVM model (0.79 to 0.8478)

02  2.17 percentage-point recall increase in the ANN model (0.731 to 0.7527)

03  Enhanced model accuracy with generated data

Abstract

This paper explores a novel method for enhancing binary classification models that assess code comment quality, leveraging Generative Artificial Intelligence to elevate model performance. By integrating 1,437 newly generated code-comment pairs, labeled as "Useful" or "Not Useful" and sourced from various GitHub repositories, into an existing C-language dataset of 9,048 pairs, we demonstrate substantial model improvements. Using an advanced Large Language Model, our approach yields a 5.78-percentage-point precision increase in the Support Vector Machine (SVM) model, improving from 0.79 to 0.8478, and a 2.17-percentage-point recall boost in the Artificial Neural Network (ANN) model, rising from 0.731 to 0.7527. These results underscore Generative AI's value in advancing code comment classification models, offering significant potential for enhanced accuracy in software development and quality control. This study…
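The gains quoted in the abstract are differences between the before and after scores, so they read most naturally as percentage points rather than relative changes. The short sketch below reproduces that arithmetic from the figures in the abstract and also derives the size of the combined dataset. The helper name `gain` and the variable names are illustrative assumptions, not taken from the paper, and the relative gains it prints are derived here rather than reported by the authors.

```python
# Figures taken from the abstract; names below are hypothetical.
ORIGINAL_PAIRS = 9_048   # existing C-language code-comment pairs
GENERATED_PAIRS = 1_437  # pairs added via generative AI


def gain(before, after):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute_pp = (after - before) * 100
    relative_pct = (after - before) / before * 100
    return round(absolute_pp, 2), round(relative_pct, 2)


# Size of the augmented dataset and the share contributed by generated pairs.
total_pairs = ORIGINAL_PAIRS + GENERATED_PAIRS
generated_share = round(GENERATED_PAIRS / total_pairs * 100, 1)

# SVM precision: 0.79 -> 0.8478; ANN recall: 0.731 -> 0.7527.
svm_precision_gain = gain(0.79, 0.8478)
ann_recall_gain = gain(0.731, 0.7527)

print(f"combined dataset: {total_pairs} pairs ({generated_share}% generated)")
print(f"SVM precision gain (pp, relative %): {svm_precision_gain}")
print(f"ANN recall gain (pp, relative %): {ann_recall_gain}")
```

Read this way, the 5.78 and 2.17 figures match the absolute differences between the reported scores, while the relative improvements over the baselines come out somewhat larger.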

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Software Engineering Research