Optimizing Large Language Model Hyperparameters for Code Generation

Chetan Arora; Ahnaf Ibn Sayeed; Sherlock Licorish; Fanyu Wang; Christoph Treude

arXiv:2408.10577 · cs.SE · August 21, 2024 · 3 cites

Open Access

TL;DR

This paper systematically explores how hyperparameters like temperature, top_p, frequency penalty, and presence penalty affect the quality of code generated by large language models across multiple Python tasks, providing insights for optimal settings.

Contribution

It presents an exhaustive analysis of hyperparameter impacts on LLM code generation performance, offering practical guidelines for tuning models effectively.

Findings

01 Optimal performance with temperature below 0.5

02 Best results with top_p below 0.75

03 Frequency penalty between -1 and 1.5
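As a rough illustration, the three ranges listed above can be turned into a simple check on a candidate configuration. This is only a sketch: the function name `within_reported_ranges` is hypothetical, and treating every bound as exclusive is an assumption, since the findings do not say whether the endpoints are included. Presence penalty is left out because no range for it appears in the findings above.

```python
def within_reported_ranges(temperature: float,
                           top_p: float,
                           frequency_penalty: float) -> bool:
    """Return True if the settings fall inside the ranges listed in the findings.

    Assumption: all bounds are treated as exclusive.
    """
    return (
        temperature < 0.5                    # finding 01
        and top_p < 0.75                     # finding 02
        and -1.0 < frequency_penalty < 1.5   # finding 03
    )


# A conservative configuration passes; a high-temperature one does not.
print(within_reported_ranges(0.2, 0.5, 0.0))
print(within_reported_ranges(1.2, 0.5, 0.0))
```

A check like this could be used to flag configurations that fall outside the reported ranges before spending API calls on them.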

Abstract

Large Language Models (LLMs), such as GPT models, are increasingly used in software engineering for various tasks, such as code generation, requirements management, and debugging. While automating these tasks has garnered significant attention, a systematic study on the impact of varying hyperparameters on code generation outcomes remains unexplored. This study aims to assess LLMs' code generation performance by exhaustively exploring the impact of various hyperparameters. Hyperparameters for LLMs are adjustable settings that affect the model's behaviour and performance. Specifically, we investigated how changes to the hyperparameters: temperature, top probability (top_p), frequency penalty, and presence penalty affect code generation outcomes. We systematically adjusted all hyperparameters together, exploring every possible combination by making small increments to each hyperparameter…
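The exhaustive exploration described in the abstract amounts to a grid search over the four hyperparameters. The sketch below shows one way such a grid could be enumerated. It is not the authors' code: the helper names `frange` and `build_grid` are hypothetical, the step size of 0.5 is chosen only to keep the example small, and the value ranges are the ones accepted by the OpenAI chat completions API (temperature 0 to 2, top_p 0 to 1, both penalties -2 to 2), which is an assumption about the setup rather than something stated in the text above.

```python
from itertools import product


def frange(start: float, stop: float, step: float) -> list[float]:
    """Inclusive list of evenly spaced floats, rounded to avoid drift."""
    n = int(round((stop - start) / step))
    return [round(start + i * step, 10) for i in range(n + 1)]


def build_grid(step: float = 0.5) -> list[dict[str, float]]:
    """Enumerate every combination of the four hyperparameters.

    Assumption: ranges follow the OpenAI API limits; the step is illustrative.
    """
    axes = {
        "temperature": frange(0.0, 2.0, step),
        "top_p": frange(0.0, 1.0, step),
        "frequency_penalty": frange(-2.0, 2.0, step),
        "presence_penalty": frange(-2.0, 2.0, step),
    }
    keys = tuple(axes)
    return [dict(zip(keys, combo)) for combo in product(*axes.values())]


grid = build_grid(step=0.5)
print(len(grid))  # 5 * 3 * 9 * 9 = 1215 combinations
print(grid[0])
```

Each dictionary in `grid` could then be passed as keyword arguments to a generation call and the resulting code scored against the task's tests. The grid grows multiplicatively as the step shrinks, so small increments across four axes quickly become expensive.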

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems