Finetuning LLMs for Automatic Form Interaction on Web-Browser in Selenium Testing Framework

Nguyen-Khang Le; Hiep Nguyen; Ngoc-Minh Nguyen; Son T. Luu; Trung Vo; Quan Minh Bui; Shoshin Nomura; Le-Minh Nguyen

arXiv:2511.15168·cs.SE·November 21, 2025

TL;DR

This paper presents a method for fine-tuning large language models to generate Selenium scripts for web form testing, an underexplored task for which it also contributes new datasets and evaluation metrics.

Contribution

The paper introduces an approach for training LLMs to generate Selenium test scripts for web form interaction, together with curated synthetic and human-annotated datasets and a set of evaluation metrics.

Findings

01

The proposed method outperforms GPT-4o and other LLMs in syntax correctness and input coverage.

02

The approach achieves higher script executability and form coverage than the baseline models across diverse real-world scenarios.

03

Empirical results demonstrate significant improvements over baseline models.

Abstract

Automated web application testing is a critical component of modern software development, with frameworks like Selenium widely adopted for validating functionality through browser automation. Among the essential aspects of such testing is the ability to interact with and validate web forms, a task that requires syntactically correct, executable scripts with high coverage of input fields. Despite its importance, this task remains underexplored in the context of large language models (LLMs), and no public benchmark or dataset exists to evaluate LLMs on form interaction generation systematically. This paper introduces a novel method for training LLMs to generate high-quality test cases in Selenium, specifically targeting form interaction testing. We curate both synthetic and human-annotated datasets for training and evaluation, covering diverse real-world forms and testing scenarios. We…
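To make the evaluation criteria named in the abstract more concrete, the sketch below shows one way syntax correctness and input-field coverage could be checked for a generated Selenium script. It is an illustration only: the paper's exact metric definitions are not given on this page, so the definitions here (parse success via `ast.parse`, and the fraction of visible form fields targeted by a `By.ID` locator) are assumptions. The form markup, the field ids, the `https://example.com/signup` URL, and the helper names `is_syntactically_valid` and `input_coverage` are all hypothetical.

```python
import ast
import re
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect the id of every visible input, select, and textarea element."""

    FIELD_TAGS = {"input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.field_ids = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.FIELD_TAGS:
            return
        attr_map = dict(attrs)
        # Hidden inputs are not user-facing, so they are left out (assumption).
        if attr_map.get("type") == "hidden":
            return
        field_id = attr_map.get("id")
        if field_id:
            self.field_ids.append(field_id)


def is_syntactically_valid(script: str) -> bool:
    """Assumed proxy for syntax correctness: the script parses as Python."""
    try:
        ast.parse(script)
        return True
    except SyntaxError:
        return False


def input_coverage(form_html: str, script: str) -> float:
    """Assumed proxy for input coverage: share of fields hit by a By.ID locator."""
    collector = FormFieldCollector()
    collector.feed(form_html)
    if not collector.field_ids:
        return 0.0
    targeted = set(re.findall(r'By\.ID,\s*["\']([^"\']+)["\']', script))
    covered = [fid for fid in collector.field_ids if fid in targeted]
    return len(covered) / len(collector.field_ids)


# Hypothetical form under test.
FORM_HTML = """
<form id="signup">
  <input type="text" id="username">
  <input type="email" id="email">
  <input type="password" id="password">
  <select id="country"><option>JP</option></select>
  <input type="hidden" id="csrf" value="x">
</form>
"""

# Hypothetical model output. It is only parsed here, never executed,
# so Selenium does not need to be installed to run this sketch.
GENERATED_SCRIPT = '''
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/signup")
driver.find_element(By.ID, "username").send_keys("alice")
driver.find_element(By.ID, "email").send_keys("alice@example.com")
driver.find_element(By.ID, "password").send_keys("s3cret!")
driver.quit()
'''

syntax_ok = is_syntactically_valid(GENERATED_SCRIPT)
coverage = input_coverage(FORM_HTML, GENERATED_SCRIPT)
print(f"syntax_ok={syntax_ok} coverage={coverage:.2f}")
```

In this example the script never touches the `country` dropdown, so three of the four visible fields are covered. Executability, the third criterion in the abstract, would additionally require running the script against a live browser session, which this static sketch deliberately leaves out.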

Taxonomy

Topics: Software Testing and Debugging Techniques · Software Engineering Research · Software System Performance and Reliability