Empirical Study of Transformers for Source Code

Nadezhda Chirkova; Sergey Troshin

arXiv:2010.07987 · cs.LG · June 25, 2021

TL;DR

This paper empirically evaluates how Transformers utilize syntactic information in source code across tasks like code completion, function naming, and bug fixing, highlighting best practices for leveraging syntax to enhance model performance.

Contribution

It provides a comprehensive empirical comparison of syntax-capturing Transformer modifications across multiple source code tasks within a unified framework.

Findings

01. Transformers can effectively use syntactic information for source code tasks.

02. Certain syntax-capturing modifications outperform others in specific tasks.

03. Best practices for incorporating syntax improve Transformer performance.

Abstract

Initially developed for natural language processing (NLP), Transformers are now widely used for source code processing, due to the format similarity between source code and text. In contrast to natural language, source code is strictly structured, i.e., it follows the syntax of the programming language. Several recent works develop Transformer modifications for capturing syntactic information in source code. The drawback of these works is that they do not compare to each other and consider different tasks. In this work, we conduct a thorough empirical study of the capabilities of Transformers to utilize syntactic information in different tasks. We consider three tasks (code completion, function naming and bug fixing) and re-implement different syntax-capturing modifications in a unified framework. We show that Transformers are able to make meaningful predictions based purely on…
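To make "syntactic information" concrete, here is a minimal sketch of one common way to expose program syntax to a sequence model: parse the code into an abstract syntax tree (AST) and linearize it depth-first into (node type, value) pairs. This illustration uses Python's standard `ast` module and is not taken from the paper's preprocessing pipeline. The helper name `linearize` and the `<empty>` placeholder are assumptions made for this example.

```python
import ast
from typing import List, Tuple

EMPTY = "<empty>"  # hypothetical placeholder for nodes that carry no value


def linearize(source: str) -> List[Tuple[str, str]]:
    """Return a pre-order (depth-first) list of (node type, value) pairs."""
    pairs: List[Tuple[str, str]] = []

    def visit(node: ast.AST) -> None:
        # Pick a value for nodes that name something or hold a literal.
        if isinstance(node, ast.Name):
            value = node.id
        elif isinstance(node, ast.arg):
            value = node.arg
        elif isinstance(node, ast.FunctionDef):
            value = node.name
        elif isinstance(node, ast.Constant):
            value = repr(node.value)
        else:
            value = EMPTY
        pairs.append((type(node).__name__, value))
        # Recurse into children in the order the ast module reports them.
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return pairs


pairs = linearize("def add(a, b):\n    return a + b\n")
for node_type, value in pairs:
    print(node_type, value)
```

Dropping the value column leaves a purely syntactic view of the program, while keeping it retains identifiers and literals alongside the structure.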

Code & Models

Repositories

bayesgroup/code_transformers
PyTorch · Official

Taxonomy

Methods

Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Softmax · Layer Normalization · Dense Connections · Multi-Head Attention · Label Smoothing