CI4A: Semantic Component Interfaces for Agents Empowering Web Automation
Zhi Qiu, Jiazheng Sun, Chenxiao Xia, Jun Zheng, Xin Peng

TL;DR
This paper introduces CI4A, a semantic interface for web agents that simplifies interaction with UI components, leading to improved performance and efficiency in web automation tasks.
Contribution
We propose CI4A, a novel semantic encapsulation mechanism for UI components, integrated into Ant Design, that enables more effective agent interaction and outperforms prior approaches.
Findings
Achieved a new state-of-the-art success rate of 86.3%.
Significantly improved execution efficiency.
Enhanced agent flexibility through a dynamic action space.
Abstract
While Large Language Models demonstrate remarkable proficiency in high-level semantic planning, they remain limited in handling fine-grained, low-level web component manipulations. To address this limitation, extensive research has focused on enhancing model grounding capabilities through techniques such as Reinforcement Learning. However, rather than compelling agents to adapt to human-centric interfaces, we propose constructing interaction interfaces specifically optimized for agents. This paper introduces Component Interface for Agent (CI4A), a semantic encapsulation mechanism that abstracts the complex interaction logic of UI components into a set of unified tool primitives accessible to agents. We implemented CI4A within Ant Design, an industrial-grade front-end framework, covering 23 categories of commonly used UI components. Furthermore, we developed a hybrid agent featuring an…
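To make the idea of "unified tool primitives" concrete, here is a minimal TypeScript sketch of how a component might expose one semantic tool in place of several low-level UI steps, and how an agent's action space could be derived from whatever components are currently mounted. Every name in it (`Tool`, `ToySelect`, `ToolRegistry`, the `city.select` tool) is hypothetical and chosen for illustration; none of it is taken from the CI4A implementation or from the Ant Design API, and the dropdown is an in-memory stand-in rather than a real widget.

```typescript
// Hypothetical sketch; none of these names come from the CI4A paper or Ant Design.

type ToolArgs = Record<string, string>;

interface Tool {
  name: string;                        // identifier the agent calls
  description: string;                 // natural-language hint for the agent
  invoke: (args: ToolArgs) => string;  // runs the action, returns an observation
}

// Toy stand-in for a dropdown widget with internal low-level state.
class ToySelect {
  id: string;
  options: string[];
  value: string | null = null;
  lowLevelSteps: string[] = [];        // records the UI steps the tool hides

  constructor(id: string, options: string[]) {
    this.id = id;
    this.options = options;
  }

  // One semantic primitive replaces "open dropdown, find option, click option".
  tools(): Tool[] {
    return [{
      name: this.id + ".select",
      description: "Choose one of: " + this.options.join(", "),
      invoke: (args: ToolArgs): string => {
        const label = args["label"] || "";
        if (this.options.indexOf(label) === -1) {
          return "error: unknown option " + label;
        }
        this.lowLevelSteps.push("open dropdown", "click " + label);
        this.value = label;
        return "ok: " + this.id + " = " + label;
      },
    }];
  }
}

// Registry of mounted components; the action space is derived from it.
class ToolRegistry {
  private mounted: ToySelect[] = [];

  mount(c: ToySelect): void {
    this.mounted.push(c);
  }

  unmount(id: string): void {
    this.mounted = this.mounted.filter((c) => c.id !== id);
  }

  // Action space = tools of whatever is currently on the page.
  actionSpace(): Tool[] {
    const tools: Tool[] = [];
    for (const c of this.mounted) {
      for (const t of c.tools()) tools.push(t);
    }
    return tools;
  }

  // Dispatch a tool call by name, as an agent would.
  call(name: string, args: ToolArgs): string {
    for (const t of this.actionSpace()) {
      if (t.name === name) return t.invoke(args);
    }
    return "error: no such tool " + name;
  }
}
```

In this sketch the agent never sees the "open dropdown" and "click option" steps; it issues a single `city.select` call with a `label` argument. The set of callable tools grows and shrinks as components are mounted and unmounted, which is one plausible reading of a dynamic action space.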
Taxonomy
Topics: AI-based Problem Solving and Planning · Artificial Intelligence in Games · Reinforcement Learning in Robotics
