FACOS: Finding API Relevant Contents on Stack Overflow with Semantic and Syntactic Analysis

Kien Luong; Mohammad Hadi; Ferdian Thung; Fatemeh Fard; and David Lo

arXiv:2111.07238 · cs.SE · November 16, 2021

Open Access

TL;DR

FACOS identifies Stack Overflow discussions that are relevant to a given API method by combining semantic and syntactic analysis of their text and code, and it outperforms the state-of-the-art approach by 13.9% in terms of F1-score.

Contribution

The paper introduces FACOS, a new method that integrates syntactic scoring with a fine-tuned CodeBERT model to improve API content retrieval accuracy.

Findings

01 FACOS outperforms the state-of-the-art approach by 13.9% in terms of F1-score.

02 FACOS combines a syntactic word-based score with a semantic score from a predictive model fine-tuned from CodeBERT.

03 FACOS captures API relevance in threads that mix text paragraphs and code snippets.
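For readers unfamiliar with the metric, F1-score is the harmonic mean of precision and recall. The sketch below shows how it is computed from classification counts. The function name `f1_score` and any counts passed to it are illustrative assumptions, not figures from the paper, and the source does not say whether the 13.9% gain is absolute or relative.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall, or 0.0 when undefined."""
    predicted = true_pos + false_pos   # threads labelled as relevant
    actual = true_pos + false_neg      # threads that truly are relevant
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is dominated by the smaller value, a method scores well only when it both finds most relevant threads and avoids labelling irrelevant ones.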

Abstract

Collecting API examples, usages, and mentions relevant to a specific API method from discussions on venues such as Stack Overflow is not a trivial problem. It takes effort to correctly recognize whether a discussion refers to the API method that developers or tools are searching for. The content of the thread, which consists of both text paragraphs describing the involvement of the API method in the discussion and code snippets containing the API invocation, may refer to the given API method. Leveraging this observation, we develop FACOS, a context-specific algorithm to capture the semantic and syntactic information of the paragraphs and code snippets in a discussion. FACOS combines a syntactic word-based score with a score from a predictive model fine-tuned from CodeBERT. FACOS beats the state-of-the-art approach by 13.9% in terms of F1-score.
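The idea of pairing a word-based score with a model score can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not specify how either score is computed or how the two are combined. The token-overlap scoring, the weighted sum, the `weight` parameter, and all function names are assumptions, and the semantic model is a caller-supplied function standing in for a classifier fine-tuned from CodeBERT.

```python
import re
from typing import Callable, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase word tokens, splitting on punctuation and camelCase."""
    # Insert a space at lower-to-upper boundaries: "readLine" -> "read Line".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    return {tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", spaced)}


def syntactic_score(api_name: str, thread_text: str) -> float:
    """Assumed scoring: share of the API name's tokens found in the thread."""
    api_tokens = tokenize(api_name)
    if not api_tokens:
        return 0.0
    return len(api_tokens & tokenize(thread_text)) / len(api_tokens)


def combined_score(
    api_name: str,
    thread_text: str,
    semantic_model: Callable[[str, str], float],
    weight: float = 0.5,
) -> float:
    """Assumed combination: weighted sum of syntactic and semantic scores.

    semantic_model is a placeholder for a fine-tuned classifier that
    returns a relevance probability in [0, 1].
    """
    syn = syntactic_score(api_name, thread_text)
    sem = semantic_model(api_name, thread_text)
    return weight * syn + (1.0 - weight) * sem
```

A thread could then be labelled relevant when `combined_score` exceeds a threshold tuned on held-out data. Both the threshold and the weight are hypothetical knobs introduced for this sketch.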

Taxonomy

Topics: Software Engineering Research · Topic Modeling · Web Data Mining and Analysis

Methods: CodeBERT