A More Accurate Model for Finding Tutorial Segments Explaining APIs
He Jiang, Jingxuan Zhang, Xiaochen Li, Zhilei Ren, David Lo

TL;DR
This paper introduces a novel, more accurate model for locating tutorial segments that explain specific APIs, combining domain-specific features, co-occurrence indicators, API extensions, and semantic similarity to improve retrieval precision.
Contribution
The study presents a new model that integrates co-occurrence APIs, API extensions, and Word2Vec to better identify tutorial segments that explain APIs, outperforming previous methods.
Findings
Achieves up to 90% accuracy in finding API explanation segments.
Improves F-measure by up to 30% over previous models.
Effective across multiple tutorial datasets.
Abstract
Developers often rely on third-party libraries to implement functionality, so they frequently use Application Programming Interfaces (APIs). When facing an unfamiliar API, developers tend to consult tutorials as learning resources. Unfortunately, the segments explaining a specific API are scattered across tutorials, so finding the relevant segments remains a challenging problem. In this study, we propose a more accurate model to find the exact tutorial fragments explaining APIs. This new model consists of a text classifier with domain-specific features. More specifically, we discover two important indicators to complement traditional text-based features, namely co-occurrence APIs and knowledge-based API extensions. In addition, we incorporate a Word2Vec-based semantic similarity metric to enhance the new model. Extensive experiments over two publicly available tutorial…
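To make the feature categories named in the abstract concrete, here is a minimal Python sketch of how a tutorial segment might be turned into classifier features. It is an illustration under stated assumptions, not the authors' implementation: the function `segment_features`, its parameters, the feature names, and the tiny `TOY_VECTORS` table (standing in for a trained Word2Vec model) are all hypothetical.

```python
import math
import re

# Hypothetical toy embeddings standing in for a trained Word2Vec model.
TOY_VECTORS = {
    "date": [1.0, 0.0, 0.0],
    "time": [0.9, 0.1, 0.0],
    "format": [0.0, 1.0, 0.0],
    "parse": [0.1, 0.9, 0.0],
    "thread": [0.0, 0.0, 1.0],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def mean_vector(tokens):
    """Average the vectors of known tokens; None if no token is known."""
    vecs = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; 0.0 when either vector is missing or zero."""
    if a is None or b is None:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment_features(segment, api, cooccurring_apis, extension_terms,
                     api_description):
    """Build an illustrative feature dict for one (segment, API) pair."""
    lowered = segment.lower()
    tokens = tokenize(segment)
    token_set = set(tokens)
    return {
        # Text-based signal: is the API name mentioned at all?
        "mentions_api": int(api.lower() in lowered),
        # Co-occurrence signal: APIs assumed to appear alongside the target.
        "cooccurrence_count": sum(
            1 for other in cooccurring_apis if other.lower() in lowered
        ),
        # API-extension signal: assumed related terms (e.g. method names).
        "extension_count": sum(
            1 for term in extension_terms if term.lower() in token_set
        ),
        # Semantic signal: embedding similarity to an API description.
        "semantic_similarity": cosine(
            mean_vector(tokens), mean_vector(tokenize(api_description))
        ),
    }


if __name__ == "__main__":
    features = segment_features(
        segment="Use DateTimeFormatter to parse a date; LocalDate holds the result.",
        api="DateTimeFormatter",
        cooccurring_apis=["LocalDate", "ZonedDateTime"],
        extension_terms=["parse", "print"],
        api_description="format and parse date time",
    )
    print(features)
```

A feature dictionary like this could then be fed to any standard text classifier. The substring and token matching used here is deliberately naive, and in practice the similarity term would come from vectors trained on a relevant corpus rather than a hand-written table.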
