Automatic Detection of Five API Documentation Smells: Practitioners' Perspectives
Junaed Younus Khan, Md. Tawkat Islam Khondaker, Gias Uddin, and Anindya Iqbal

TL;DR
This paper identifies five common API documentation smells, builds a benchmark dataset of them, and trains machine learning classifiers, including BERT, to detect the smells automatically, with the aim of improving API usability.
Contribution
It introduces a catalog of five API documentation smells, a benchmark dataset, and machine learning models for automatic detection, filling a research gap in documentation quality assurance.
Findings
Developed a catalog of five API documentation smells.
Created a benchmark dataset of 1,000 documentation units.
Achieved high F1-scores (0.75-0.97) with BERT classifiers.
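For readers unfamiliar with the metric, the F1-score reported above is the harmonic mean of precision and recall. The following is a minimal sketch of computing it for one binary smell label; the function name and the toy labels are hypothetical illustrations, not the paper's code or data.

```python
def f1_score(y_true, y_pred):
    """Binary F1: harmonic mean of precision and recall (1 = smell present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correct positive predictions: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for six documentation units (not from the paper's dataset).
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because a documentation unit may exhibit more than one smell, one plausible setup is to compute this score separately for each smell, which would be consistent with results being reported as a range.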
Abstract
The learning and usage of an API is supported by official documentation. Like source code, API documentation is itself a software product. Several research results show that bad design in API documentation can make the reuse of API features difficult. Indeed, similar to code smells or code antipatterns, poorly designed API documentation can also exhibit 'smells'. Such documentation smells can be described as bad documentation styles that do not necessarily produce incorrect documentation but nevertheless make the documentation difficult to understand and use properly. Recent research on API documentation has focused on finding content inaccuracies in API documentation and on complementing API documentation with external resources (e.g., crowd-shared code examples). We are aware of no research that has focused on the automatic detection of API documentation smells. This paper makes two…
