AgentA/B: Automated and Scalable Web A/B Testing with Interactive LLM Agents
Yuxuan Lu, Ting-Yao Hsu, Hansu Gu, Limeng Cui, Yaochen Xie, William Headden, Bingsheng Yao, Akash Veeragouni, Jiapeng Liu, Sreyashi Nag, Jessie Wang, Dakuo Wang

TL;DR
This paper introduces AgentA/B, a system using autonomous LLM agents to simulate user interactions for scalable, automated web A/B testing, reducing reliance on live traffic and speeding up evaluation processes.
Contribution
The paper presents a novel LLM-based system that automates and scales web A/B testing by simulating diverse user behaviors interactively on real webpages.
Findings
AgentA/B can emulate human shopping behaviors.
The system enables scalable A/B testing with 1,000 LLM agents.
AgentA/B reduces dependence on live user traffic.
Abstract
A/B testing is a widely adopted method for evaluating UI/UX design decisions in modern web applications. Yet traditional A/B testing remains constrained by its dependence on large-scale live traffic from human participants and by the long wait for test results. Through formative interviews with six experienced industry practitioners, we identified critical bottlenecks in current A/B testing workflows. In response, we present AgentA/B, a novel system that leverages Large Language Model-based autonomous agents (LLM Agents) to automatically simulate user interaction behaviors on real webpages. AgentA/B enables scalable deployment of LLM agents with diverse personas, each capable of navigating dynamic webpages and executing multi-step interactions such as searching, clicking, filtering, and purchasing. In a demonstrative controlled experiment,…
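The persona-driven simulation loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Persona`, `stub_policy`, `run_session`, `run_ab_test`, and `purchase_rate` are hypothetical, the purchase probabilities are made-up placeholders, and `stub_policy` is a random stand-in for what would be an LLM call acting on a real webpage.

```python
import random
from dataclasses import dataclass


@dataclass
class Persona:
    """Hypothetical persona: a label plus a cap on interaction steps."""
    name: str
    patience: int  # maximum number of actions before the agent gives up


def stub_policy(persona, variant, step, rng):
    """Random placeholder for an LLM agent choosing its next action.

    The per-variant purchase probabilities are illustrative assumptions,
    not values from the paper.
    """
    p_buy = 0.10 if variant == "A" else 0.15
    if rng.random() < p_buy:
        return "purchase"
    return rng.choice(("search", "click", "filter"))


def run_session(persona, variant, rng):
    """Run one multi-step session; stop early once a purchase happens."""
    trace = []
    for step in range(persona.patience):
        action = stub_policy(persona, variant, step, rng)
        trace.append(action)
        if action == "purchase":
            break
    return trace


def run_ab_test(personas, seed=0):
    """Alternate personas between arms A and B and collect action traces."""
    rng = random.Random(seed)  # seeded so repeated runs are reproducible
    results = {"A": [], "B": []}
    for i, persona in enumerate(personas):
        variant = "A" if i % 2 == 0 else "B"
        results[variant].append(run_session(persona, variant, rng))
    return results


def purchase_rate(traces):
    """Fraction of sessions that contain a purchase action."""
    if not traces:
        return 0.0
    return sum("purchase" in t for t in traces) / len(traces)


if __name__ == "__main__":
    personas = [Persona(name=f"persona-{i}", patience=5) for i in range(1000)]
    results = run_ab_test(personas, seed=0)
    for variant, traces in results.items():
        print(variant, len(traces), round(purchase_rate(traces), 3))
```

In a real setup the stub would be replaced by a model call that reads the live page state, and the comparison between arms would use a proper statistical test instead of raw rates.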
Taxonomy
Topics: Digital Rights Management and Security · Multi-Agent Systems and Negotiation · Semantic Web and Ontologies
