Deep API Learning Revisited

James Martin; Jin L.C. Guo

arXiv:2205.01254·cs.SE·May 4, 2022

TL;DR

This paper compares two deep learning methods, an RNN Encoder-Decoder and the Transformer-based CodeBERT, for predicting API usage sequences from natural language queries. It highlights the impact of data cleaning on measured performance and the superior performance of CodeBERT.

Contribution

It reproduces prior RNN-based API sequence prediction results and demonstrates that CodeBERT significantly outperforms previous methods on Python APIs.

Findings

01 Data cleaning reduces model performance.

02 CodeBERT outperforms RNN-based methods.

03 Pretraining on source code enhances API sequence prediction.

Abstract

Understanding the correct API usage sequences is one of the most important tasks for programmers when they work with unfamiliar libraries. However, programmers often encounter obstacles to finding the appropriate information due to either poor quality of API documentation or ineffective query-based searching strategy. To help solve this issue, researchers have proposed various methods to suggest the sequence of APIs given natural language queries representing the information needs from programmers. Among such efforts, Gu et al. adopted a deep learning method, in particular an RNN Encoder-Decoder architecture, to perform this task and obtained promising results on common APIs in Java. In this work, we aim to reproduce their results and apply the same methods for APIs in Python. Additionally, we compare the performance with a more recent Transformer-based method, i.e., CodeBERT, for the…
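To make the task framing concrete, the following is a minimal sketch of how a natural language query can be paired with a target API sequence, with one simple cleaning step applied. It is not the authors' pipeline: the `Example` type, the `parse_api_sequence` and `deduplicate` helpers, the toy queries, and the choice of duplicate removal as the cleaning step are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Example:
    """One (query, API sequence) pair; a hypothetical record layout."""
    query: str
    api_sequence: Tuple[str, ...]


def parse_api_sequence(text: str) -> Tuple[str, ...]:
    """Split a whitespace-separated API sequence into ordered tokens."""
    return tuple(text.split())


def deduplicate(examples: List[Example]) -> List[Example]:
    """Drop pairs that repeat an earlier pair, ignoring query case and
    surrounding whitespace. This is an assumed cleaning step, chosen only
    to illustrate the idea of cleaning paired data."""
    seen = set()
    unique = []
    for ex in examples:
        key = (ex.query.strip().lower(), ex.api_sequence)
        if key not in seen:
            seen.add(key)
            unique.append(ex)
    return unique


# Toy data, invented for this sketch.
read_file = "builtins.open io.TextIOWrapper.readline io.TextIOWrapper.close"
raw = [
    Example("read a text file line by line", parse_api_sequence(read_file)),
    Example("Read a text file line by line ", parse_api_sequence(read_file)),
    Example("parse a json string", parse_api_sequence("json.loads")),
]

cleaned = deduplicate(raw)
for ex in cleaned:
    print(ex.query, "->", " ".join(ex.api_sequence))
```

In this framing, a sequence-to-sequence model receives `query` as input and is trained to emit the tokens of `api_sequence` in order.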

Code & Models

Repositories

hapsby/deepapirevisited (official)

Taxonomy

Topics: Software Engineering Research · Advanced Malware Detection Techniques · Software System Performance and Reliability