Go-UT-Bench: A Fine-Tuning Dataset for LLM-Based Unit Test Generation in Go

Yashshi Pipalani; Hritik Raj; Rajat Ghosh; Vaishnavi Bhargava; Debojyoti Dutta

arXiv:2511.10868·cs.LG·November 17, 2025

TL;DR

This paper introduces Go-UT-Bench, a dataset of paired Golang code and unit tests for improving LLM performance on unit test generation, addressing training data imbalance in low-resource languages.

Contribution

The paper provides a new benchmark dataset for fine-tuning LLMs on unit test generation in Go and shows that fine-tuned models outperform their base counterparts.

Findings

01

Fine-tuned models outperform their base counterparts on more than 75% of benchmark tasks.

02

The dataset contains 5264 code-test pairs drawn from 10 permissively licensed Golang repositories.

03

The dataset is effective as fine-tuning data across two LLM families: mixture-of-experts and dense decoders.

Abstract

Training data imbalance poses a major challenge for code LLMs. Most available data heavily overrepresents raw open-source code while underrepresenting broader software engineering tasks, especially in low-resource languages like Golang. As a result, models excel at code autocompletion but struggle with real-world developer workflows such as unit test generation. To address this gap, we introduce Go-UT-Bench, a benchmark dataset of 5264 pairs of code and unit tests, drawn from 10 permissively licensed Golang repositories spanning diverse domains. We evaluate its effectiveness as a fine-tuning dataset across two LLM families: mixture-of-experts and dense decoders. Our results show that fine-tuned models outperform their base counterparts on more than 75% of benchmark tasks.
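The headline result is a per-task comparison between each fine-tuned model and its base counterpart. The following is a minimal sketch of how such a win rate could be computed, assuming each benchmark task yields one scalar score per model. The abstract does not state the metrics used, so the function name WinRate, the tie-handling rule, and the example scores are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// WinRate returns the fraction of tasks on which the fine-tuned score is
// strictly higher than the base score. Ties count as non-wins.
// Hypothetical helper: both slices are assumed to be aligned by task index.
func WinRate(base, tuned []float64) (float64, error) {
	if len(base) != len(tuned) {
		return 0, errors.New("score slices must have equal length")
	}
	if len(base) == 0 {
		return 0, errors.New("no tasks to compare")
	}
	wins := 0
	for i := range base {
		if tuned[i] > base[i] {
			wins++
		}
	}
	return float64(wins) / float64(len(base)), nil
}

func main() {
	// Made-up scores for four tasks, for illustration only.
	base := []float64{0.10, 0.50, 0.30, 0.40}
	tuned := []float64{0.20, 0.60, 0.40, 0.10}

	rate, err := WinRate(base, tuned)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("fine-tuned wins on %.0f%% of tasks\n", rate*100)
}
```

Counting ties as non-wins is the conservative choice here; a different rule would change the reported fraction whenever two models score identically on a task.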

Taxonomy

Topics: Software Testing and Debugging Techniques · Software Engineering Research · Logic, programming, and type systems