Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit
Yao Wan, Yang He, Zhangqian Bi, Jianguo Zhang, Hongyu Zhang, Yulei Sui, Guandong Xu, Hai Jin, Philip S. Yu

TL;DR
This paper surveys deep learning for code intelligence, benchmarks several state-of-the-art neural models, and provides an open-source toolkit for rapid prototyping of deep-learning-based code intelligence models.
Contribution
It offers a detailed literature review, benchmarks state-of-the-art models, and releases an open-source toolkit and dataset for rapid prototyping and evaluation.
Findings
Benchmark results of neural models for code tasks
Open-source toolkit for code intelligence research
Public dataset for model evaluation
Abstract
Code intelligence leverages machine learning techniques to extract knowledge from extensive code corpora, with the aim of developing intelligent tools to improve the quality and productivity of computer programming. A thriving research community already focuses on code intelligence, with efforts spanning software engineering, machine learning, data mining, natural language processing, and programming languages. In this paper, we conduct a comprehensive literature review on deep learning for code intelligence, from the aspects of code representation learning, deep learning techniques, and application tasks. We also benchmark several state-of-the-art neural models for code intelligence, and provide an open-source toolkit tailored for the rapid prototyping of deep-learning-based code intelligence models. In particular, we inspect the existing code intelligence…
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Software Reliability and Analysis Research
