An Unsupervised Approach for Discovering Relevant Tutorial Fragments for APIs
He Jiang, Jingxuan Zhang, Zhilei Ren, Tao Zhang

TL;DR
This paper introduces FRAPT, an unsupervised method combining PageRank and topic modeling to effectively identify relevant API tutorial fragments, reducing manual effort and improving accuracy over existing supervised approaches.
Contribution
The study presents a novel unsupervised framework, FRAPT, that addresses pronoun resolution, non-explanatory fragment detection, and relevance scoring for API tutorial fragments.
Findings
FRAPT outperforms state-of-the-art methods by 8.77% and 12.32% in F-Measure.
Key components of FRAPT are validated through extensive experiments.
The approach effectively addresses pronoun resolution and non-explanatory fragment detection.
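The summary above reports improvements in F-Measure but does not say how the metric is computed. As a minimal sketch, assuming the standard balanced F-Measure (the harmonic mean of precision and recall), it could be calculated as follows. The function name `f_measure` and the example inputs are illustrative and not taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-Measure: harmonic mean of precision and recall.

    Assumption: the paper uses the standard F1 definition; this summary
    does not confirm the exact variant.
    """
    if precision + recall == 0:
        # No relevant fragments found and none expected to be found.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative values only, not results from the paper.
example = f_measure(0.8, 0.6)
```

Because the harmonic mean penalizes imbalance, a method cannot score well by maximizing precision or recall alone.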
Abstract
Developers increasingly rely on API tutorials to facilitate software development. However, discovering the tutorial fragments that explain an unfamiliar API remains challenging. Existing supervised approaches carry the heavy burden of manually preparing corpus-specific annotated data and features. In this study, we propose a novel unsupervised approach, Fragment Recommender for APIs with PageRank and Topic model (FRAPT). FRAPT addresses the two main challenges of this task and effectively determines relevant tutorial fragments for APIs. In FRAPT, a Fragment Parser identifies APIs in tutorial fragments and replaces ambiguous pronouns and variables with related ontologies and API names, addressing the pronoun and variable resolution challenge. Then, a Fragment Filter employs a set of non-explanatory detection rules to remove…
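To make the PageRank component more concrete, here is a minimal sketch of ranking text fragments by running weighted PageRank over a similarity graph. It is not the authors' implementation. The graph construction (cosine similarity over bag-of-words vectors), the damping factor, and the names `cosine`, `pagerank`, and `rank_fragments` are all assumptions made for illustration, and the topic-model part of FRAPT is omitted.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pagerank(weights, damping=0.85, iters=50):
    """Weighted PageRank by power iteration over an adjacency matrix."""
    n = len(weights)
    scores = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iters):
        # Nodes without outgoing weight spread their score uniformly.
        dangling = sum(scores[i] for i in range(n) if out_sum[i] == 0)
        new_scores = []
        for j in range(n):
            inflow = sum(
                scores[i] * weights[i][j] / out_sum[i]
                for i in range(n)
                if out_sum[i] > 0
            )
            new_scores.append(
                (1 - damping) / n + damping * (inflow + dangling / n)
            )
        scores = new_scores
    return scores


def rank_fragments(fragments):
    """Score fragments by centrality in a text-similarity graph (assumed design)."""
    bags = [Counter(f.lower().split()) for f in fragments]
    n = len(bags)
    weights = [
        [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    return pagerank(weights)


# Hypothetical fragments: the first two share vocabulary, the third does not.
scores = rank_fragments([
    "joda time DateTime parses dates",
    "DateTime formats dates in joda time",
    "click the download button",
])
```

In this toy setup, fragments that share vocabulary with many others accumulate score, while an isolated fragment receives only the baseline share. A real system would need API-aware tokenization and a principled way to combine these scores with topic-model evidence.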
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Web Data Mining and Analysis
