FACOS: Finding API Relevant Contents on Stack Overflow with Semantic and Syntactic Analysis
Kien Luong, Mohammad Hadi, Ferdian Thung, Fatemeh Fard, and David Lo

TL;DR
FACOS identifies Stack Overflow discussions that are relevant to a given API method by combining semantic and syntactic analysis of a thread's text and code, improving F1-score by 13.9% over the previous state-of-the-art approach.
Contribution
The paper introduces FACOS, a new method that integrates syntactic scoring with a fine-tuned CodeBERT model to improve API content retrieval accuracy.
Findings
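FACOS achieves a 13.9% higher F1-score than the previous state-of-the-art approach.
It combines a syntactic word-based score with a semantic score from a model fine-tuned from CodeBERT.
It draws on both the text paragraphs and the code snippets of a thread to judge whether the thread refers to a given API method.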
Abstract
Collecting API examples, usages, and mentions relevant to a specific API method from discussions on venues such as Stack Overflow is not a trivial problem. It requires effort to correctly recognize whether a discussion refers to the API method that developers or tools are searching for. The content of the thread, which consists of both text paragraphs describing the involvement of the API method in the discussion and code snippets containing the API invocation, may refer to the given API method. Leveraging this observation, we develop FACOS, a context-specific algorithm to capture the semantic and syntactic information of the paragraphs and code snippets in a discussion. FACOS combines a syntactic word-based score with a score from a predictive model fine-tuned from CodeBERT. FACOS beats the state-of-the-art approach by 13.9% in terms of F1-score.
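The abstract describes FACOS as combining a syntactic word-based score with a score from a fine-tuned model. The sketch below illustrates that general idea only; it is not the authors' implementation. The token-overlap scoring rule, the weighted-sum combination, the `weight` parameter, and all function names are assumptions made for illustration, and the semantic model is left as a caller-supplied function rather than an actual CodeBERT classifier.

```python
import re
from typing import Callable, Set


def tokenize(text: str) -> Set[str]:
    """Split camelCase, lowercase, and break on non-alphanumeric characters."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return {tok for tok in re.split(r"[^A-Za-z0-9]+", spaced.lower()) if tok}


def syntactic_score(api_name: str, thread_text: str) -> float:
    """Fraction of the API name's tokens that also occur in the thread.

    Assumption: a simple token-overlap ratio stands in for the paper's
    syntactic word-based score, whose exact definition is not given here.
    """
    api_tokens = tokenize(api_name)
    if not api_tokens:
        return 0.0
    return len(api_tokens & tokenize(thread_text)) / len(api_tokens)


def combined_score(
    api_name: str,
    thread_text: str,
    semantic_model: Callable[[str, str], float],
    weight: float = 0.5,
) -> float:
    """Weighted sum of the syntactic and semantic scores.

    Assumption: `semantic_model` is a hypothetical callable returning a
    relevance probability in [0, 1], e.g. from a fine-tuned classifier.
    The weighted sum itself is also an illustrative choice.
    """
    syn = syntactic_score(api_name, thread_text)
    sem = semantic_model(api_name, thread_text)
    return weight * syn + (1.0 - weight) * sem


if __name__ == "__main__":
    # Placeholder semantic model that always returns 0.5.
    stub_model = lambda api, text: 0.5
    thread = "I call list.add(x) on a java ArrayList"
    print(combined_score("java.util.List.add", thread, stub_model))
```

In practice the placeholder would be replaced by a model that scores the API name against the thread's paragraphs and code snippets; the weighting between the two scores would then need to be tuned on labeled data.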
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Web Data Mining and Analysis
Methods: CodeBERT
