Cybernaut: Towards Reliable Web Automation
Ankur Tomar, Hengyue Liang, Indranil Bhattacharya, Natalia Larios, Francesco Carbone

TL;DR
Cybernaut is a framework that improves the reliability and consistency of AI-driven web automation in complex enterprise environments. It does so through standard operating procedure (SOP) generation, high-precision element recognition, and a new performance metric.
Contribution
The paper introduces a framework that combines SOP generation, precise recognition of DOM elements, and a new performance metric to address the challenges of automating complex internal web interfaces.
Findings
23.2% improvement in task success rate
84.7% accuracy in identifying consistent execution patterns
Effective in enterprise-scale web automation
Abstract
The emergence of AI-driven web automation through Large Language Models (LLMs) offers unprecedented opportunities for optimizing digital workflows. However, deploying such systems in real-world industry environments presents four core challenges: (1) ensuring consistent execution, (2) accurately identifying critical HTML elements, (3) reaching human-like accuracy in order to automate operations at scale, and (4) the lack of comprehensive benchmarking data on internal web applications. Existing solutions are primarily tailored for well-designed, consumer-facing websites (e.g., Amazon.com, Apple.com) and fall short in addressing the complexity of poorly designed internal web interfaces. To address these limitations, we present Cybernaut, a novel framework to ensure high execution consistency in web automation agents designed for robust enterprise use. Our contributions are threefold:…
