Out of style: Misadventures with LLMs and code style transfer

Karl Munson; Chih-Kai Ting; Serenity Wade; Anish Savla; Julian Dolby; Kiran Kate; Kavitha Srinivas

arXiv:2406.10320·cs.SE·June 18, 2024

TL;DR

This paper evaluates the ability of large pre-trained code language models to perform automated code style transfer, revealing their limitations in understanding and modifying code styles accurately.

Contribution

It introduces CSB (Code Style Benchmark), a benchmark suite of code style transfer tasks across five categories, and uses it to assess how well existing pre-trained code language models perform those tasks, highlighting their current shortcomings.

Findings

01. Language models failed to perform style transfer tasks accurately.

02. Models struggled with tasks requiring deep code understanding.

03. The paper provides large-scale corpora for future research.

Abstract

Like text, programs have styles, and certain programming styles are more desirable than others for program readability, maintainability, and performance. Code style transfer, however, is difficult to automate except for trivial style guidelines such as limits on line length. Inspired by the success of using language models for text style transfer, we investigate if code language models can perform code style transfer. Code style transfer, unlike text transfer, has rigorous requirements: the system needs to identify lines of code to change, change them correctly, and leave the rest of the program untouched. We designed CSB (Code Style Benchmark), a benchmark suite of code style transfer tasks across five categories including converting for-loops to list comprehensions, eliminating duplication in code, adding decorators to methods, etc. We then used these tests to see if large pre-trained…
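
To make one of the task categories named in the abstract concrete, here is a minimal sketch of a for-loop rewritten as a list comprehension, with a check that the rewrite preserves behavior. The function names, the sample inputs, and the `same_behavior` helper are illustrative assumptions and are not taken from CSB or from the paper's evaluation harness.

```python
def squares_of_evens_loop(numbers):
    """Original style: explicit for-loop that appends to a list."""
    result = []
    for n in numbers:
        if n % 2 == 0:
            result.append(n * n)
    return result


def squares_of_evens_comprehension(numbers):
    """Target style: the same logic written as a list comprehension."""
    return [n * n for n in numbers if n % 2 == 0]


def same_behavior(original, transformed, inputs):
    """Hypothetical helper: True if both versions agree on every sample input."""
    return all(original(x) == transformed(x) for x in inputs)


# Hypothetical sample inputs; a real benchmark would use its own test cases.
samples = [[], [1, 2, 3, 4], [-2, 0, 7]]
print(same_behavior(squares_of_evens_loop, squares_of_evens_comprehension, samples))
```

A successful style transfer in this sense changes only the loop body into the comprehension and leaves the function's observable behavior, and any surrounding code, unchanged.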

Taxonomy

Topics: Natural Language Processing Techniques