Can Machines Read Coding Manuals Yet? -- A Benchmark for Building Better Language Models for Code Understanding

Ibrahim Abdelaziz; Julian Dolby; Jamie McCusker; and Kavitha Srinivas

arXiv:2109.07452·cs.CL·September 16, 2021

TL;DR

This paper introduces BLANCA, a benchmark suite for evaluating language models on textual artifacts about code. It exposes the limitations of current models on these tasks and shows that fine-tuning and multi-task training improve their performance.

Contribution

The paper presents BLANCA, which the authors believe to be the first systematic set of downstream tasks for assessing language models on code documentation and forum discussions, and demonstrates that fine-tuning enhances model performance on these tasks.

Findings

01. Fine-tuning significantly improves model performance on BLANCA tasks.

02. Multi-task training over BLANCA tasks leads to better code understanding models.

03. Current state-of-the-art models still have room for improvement on code-related text understanding.

Abstract

Code understanding is an increasingly important application of Artificial Intelligence. A fundamental aspect of understanding code is understanding text about code, e.g., documentation and forum discussions. Pre-trained language models (e.g., BERT) are a popular approach for various NLP tasks, and there are now a variety of benchmarks, such as GLUE, to help improve the development of such models for natural language understanding. However, little is known about how well such models work on textual artifacts about code, and we are unaware of any systematic set of downstream tasks for such an evaluation. In this paper, we derive a set of benchmarks (BLANCA - Benchmarks for LANguage models on Coding Artifacts) that assess code understanding based on tasks such as predicting the best answer to a question in a forum post, finding related forum posts, or predicting classes related in a…
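To make one of the tasks named in the abstract concrete (predicting the best answer to a forum question), here is a minimal sketch of how such a ranking task could be scored. It is not the BLANCA implementation: the `embed` function is a toy bag-of-words stand-in for a language model encoder, the question and answers are invented, and the use of reciprocal rank as the score is an assumption made for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a language model encoder (assumption): word counts."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_answers(question, answers):
    """Return answer indices sorted from most to least similar to the question."""
    q = embed(question)
    scores = [cosine(q, embed(a)) for a in answers]
    return sorted(range(len(answers)), key=lambda i: scores[i], reverse=True)


def reciprocal_rank(ranking, best_index):
    """1 / (1-based position of the known best answer in the ranking)."""
    return 1.0 / (ranking.index(best_index) + 1)


# Hypothetical forum post: index 1 plays the role of the accepted answer.
question = "How do I merge two pandas dataframes on a column?"
answers = [
    "Use a list comprehension to filter the values.",
    "Call merge on the two dataframes and pass the column name with on.",
    "Restart the kernel and try again.",
]

ranking = rank_answers(question, answers)
rr = reciprocal_rank(ranking, best_index=1)
print(ranking, rr)
```

Replacing `embed` with a pre-trained or fine-tuned encoder and averaging the reciprocal rank over many posts would give one plausible way to compare models on this kind of task.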

Code & Models

Repositories

wala/blanca (official)

Videos

Can Machines Read Coding Manuals Yet? – A Benchmark for Building Better Language Models for Code Understanding · Underline

Taxonomy

Topics: Software Engineering Research · Text Readability and Simplification · Natural Language Processing Techniques