Unveiling Assumptions: Exploring the Decisions of AI Chatbots and Human Testers
Francisco Gomes de Oliveira Neto

TL;DR
This paper investigates how LLM-based chatbots and human testers make testing decisions, focusing on their assumptions and their preferences for test scenario diversity, and reveals similarities and differences in their decision-making processes.
Contribution
The paper compares the decision-making behavior of LLM chatbots and human testers in software testing, highlighting where their test scenario selections align and where they diverge.
Findings
Most testers prefer diverse test scenarios (96%).
Some chatbots mirror human preferences; others do not.
Initial insights support enhancing collaboration between testers and chatbots.
Abstract
The integration of Large Language Models (LLMs) and chatbots introduces new challenges and opportunities for decision-making in software testing. Decision-making relies on a variety of information, including code, requirements specifications, and other software artifacts that are often unclear or exist solely in the developer's mind. To fill in the gaps left by unclear information, we often rely on assumptions, intuition, or previous experiences to make decisions. This paper explores the potential of LLM-based chatbots such as Bard, Copilot, and ChatGPT to support software testers in test decisions such as prioritizing test cases effectively. We investigate whether LLM-based chatbots and human testers share similar "assumptions" or intuition in prohibitive testing scenarios where exhaustive execution of test cases is often impractical. Preliminary results from a survey of 127 testers…
