Testing Framework Migration with Large Language Models
Altino Alves, João Eduardo Montandon, Andre Hora

TL;DR
This paper explores using Large Language Models such as GPT-4o and Claude Sonnet 4 to automate the migration of Python test suites from unittest to Pytest, with the aim of reducing manual effort and speeding up test modernization.
Contribution
It introduces a dataset of real-world test migrations and evaluates LLMs' effectiveness in automating this process with detailed prompting strategies.
Findings
Approximately 48.5% of LLM-generated migrations passed tests
Claude Sonnet 4 tends to preserve legacy unittest features
GPT-4o favors more extensive test transformations
Abstract
Python developers rely on two major testing frameworks: unittest and Pytest. While Pytest offers simpler assertions, reusable fixtures, and better interoperability, migrating existing suites from unittest remains a manual and time-consuming process. Automating this migration could substantially reduce effort and accelerate test modernization. In this paper, we investigate the capability of Large Language Models (LLMs) to automate test framework migrations from unittest to Pytest. We evaluate GPT-4o and Claude Sonnet 4 under three prompting strategies (Zero-shot, One-shot, and Chain-of-Thought) and two temperature settings (0.0 and 1.0). To support this analysis, we first introduce a curated dataset of real-world migrations extracted from the top 100 Python open-source projects. Next, we actually execute the LLM-generated test…
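To make the kind of migration discussed in the abstract concrete, here is a minimal illustrative sketch. It is not taken from the paper's dataset or from any LLM output; the `Stack` class, the `make_stack` helper, and the test names are hypothetical. It shows the same test written first in unittest style and then in the plain-function, bare-assert style that Pytest collects.

```python
import unittest


class Stack:
    """Tiny class under test (hypothetical, for illustration only)."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)


# --- Before: unittest style ---
# Tests live in a TestCase subclass, share state through setUp,
# and use assert* methods inherited from unittest.TestCase.
class TestStackUnittest(unittest.TestCase):
    def setUp(self):
        self.stack = Stack()

    def test_push_then_pop(self):
        self.stack.push(1)
        self.assertEqual(self.stack.pop(), 1)
        self.assertEqual(len(self.stack), 0)


# --- After: Pytest style ---
# Pytest collects plain functions named test_* and reports on bare asserts.
# Assumption: in a real suite, make_stack would typically be written as a
# @pytest.fixture; it is a plain helper here so the sketch runs without
# Pytest installed.
def make_stack():
    return Stack()


def test_push_then_pop():
    stack = make_stack()
    stack.push(1)
    assert stack.pop() == 1
    assert len(stack) == 0
```

Pytest can also run `unittest.TestCase` classes unchanged, so a migration that leaves parts of the class-based structure in place can still produce a passing suite.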
Taxonomy
Topics: Software Testing and Debugging Techniques · Scientific Computing and Data Management · Software Engineering Research
