Exploring and Evaluating Personalized Models for Code Generation
Andrei Zlotchevski, Dawn Drain, Alexey Svyatkovskiy, Colin Clement, Neel Sundaresan, Michele Tufano

TL;DR
This paper investigates personalizing transformer models for code generation, comparing custom fine-tuning, lightweight fine-tuning, and prefix tuning as ways to improve project-specific performance at different computational costs.
Contribution
It introduces and evaluates three personalization techniques for transformer-based code generation, analyzing their trade-offs in performance and computational cost.
Findings
Custom fine-tuning yields the highest accuracy but is computationally intensive.
Lightweight fine-tuning offers a balance between performance and efficiency.
Prefix tuning provides a cost-effective approach with competitive results.
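The cost differences among the three techniques come largely from how many parameters each one updates. The following is a minimal sketch of that idea, not the authors' implementation. The toy parameter groups, their sizes, the prefix size, and the choice of which groups lightweight fine-tuning unfreezes are all hypothetical assumptions made for illustration. The summary above does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamGroup:
    name: str
    size: int  # number of scalar parameters in this group


# Hypothetical toy model: group names and sizes are made up for illustration.
TOY_MODEL = [
    ParamGroup("embeddings", 1000),
    ParamGroup("encoder", 6000),
    ParamGroup("decoder", 6000),
    ParamGroup("output_layer", 1000),
]

# Hypothetical extra trainable vectors that prefix tuning adds to a frozen model.
PREFIX = ParamGroup("prefix", 200)


def trainable_groups(strategy: str) -> list:
    """Return the parameter groups that a given strategy would update."""
    if strategy == "custom":
        # Custom fine-tuning: every model parameter is trainable.
        return list(TOY_MODEL)
    if strategy == "lightweight":
        # Lightweight fine-tuning: most parameters stay frozen.
        # Assumption: only the embeddings and the output layer are unfrozen.
        return [g for g in TOY_MODEL if g.name in {"embeddings", "output_layer"}]
    if strategy == "prefix":
        # Prefix tuning: the model is frozen; only the prefix is trained.
        return [PREFIX]
    raise ValueError(f"unknown strategy: {strategy}")


def trainable_fraction(strategy: str) -> float:
    """Fraction of all parameters (model plus any prefix) that are updated."""
    total = sum(g.size for g in TOY_MODEL)
    if strategy == "prefix":
        total += PREFIX.size
    trainable = sum(g.size for g in trainable_groups(strategy))
    return trainable / total


if __name__ == "__main__":
    for name in ("custom", "lightweight", "prefix"):
        print(f"{name:12s} trainable fraction: {trainable_fraction(name):.4f}")
```

Under these made-up sizes the ordering of trainable fractions matches the cost ordering described in the findings. The actual ratios depend on the real model and prefix sizes.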
Abstract
Large Transformer models achieved the state-of-the-art status for Natural Language Understanding tasks and are increasingly becoming the baseline model architecture for modeling source code. Transformers are usually pre-trained on large unsupervised corpora, learning token representations and transformations relevant to modeling generally available text, and are then fine-tuned on a particular downstream task of interest. While fine-tuning is a tried-and-true method for adapting a model to a new domain -- for example, question-answering on a given topic -- generalization remains an on-going challenge. In this paper, we explore and evaluate transformer model fine-tuning for personalization. In the context of generating unit tests for Java methods, we evaluate learning to personalize to a specific software project using several personalization techniques. We consider three key approaches:…
Taxonomy
Methods: Attention Is All You Need · Linear Layer · Label Smoothing · Absolute Position Encodings · Layer Normalization · Dropout · Dense Connections · Adam · Position-Wise Feed-Forward Layer · Multi-Head Attention
