PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC
Haowei Liu, Xi Zhang, Haiyang Xu, Yuyang Wanyan, Junyang Wang, Ming Yan, Ji Zhang, Chunfeng Yuan, Changsheng Xu, Weiming Hu, Fei Huang

TL;DR
This paper introduces PC-Agent, a hierarchical multi-agent framework for automating complex tasks on PCs, featuring an active perception module and a multi-level decision architecture, significantly improving task success rates.
Contribution
The paper presents a novel hierarchical multi-agent system with an active perception module and a new benchmark, enhancing complex PC task automation beyond existing methods.
Findings
Achieves a 32% absolute improvement in task success rate over previous state-of-the-art methods on the PC-Eval benchmark.
Effectively decomposes complex instructions into manageable subtasks.
Demonstrates improved perception and decision-making in PC environments.
Abstract
In the field of MLLM-based GUI agents, compared to smartphones, the PC scenario not only features a more complex interactive environment, but also involves more intricate intra- and inter-app workflows. To address these issues, we propose a hierarchical agent framework named PC-Agent. Specifically, from the perception perspective, we devise an Active Perception Module (APM) to overcome the inadequate abilities of current MLLMs in perceiving screenshot content. From the decision-making perspective, to handle complex user instructions and interdependent subtasks more effectively, we propose a hierarchical multi-agent collaboration architecture that decomposes decision-making processes into Instruction-Subtask-Action levels. Within this architecture, three agents (i.e., Manager, Progress and Decision) are set up for instruction decomposition, progress tracking and step-by-step…
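The Instruction-Subtask-Action split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`Subtask`, `ProgressTracker`, `manager_decompose`, `decision_step`, `run`) is hypothetical, and the MLLM calls are replaced by trivial stubs so the control flow can be run as-is.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Subtask:
    description: str
    done: bool = False


@dataclass
class ProgressTracker:
    """Hypothetical stand-in for the Progress agent: logs each action per subtask."""
    history: List[str] = field(default_factory=list)

    def update(self, subtask: Subtask, action: str) -> None:
        self.history.append(f"{subtask.description} -> {action}")

    def summary(self) -> str:
        return "; ".join(self.history)


def manager_decompose(instruction: str) -> List[Subtask]:
    """Hypothetical stand-in for the Manager agent.

    Assumption: splits on the word 'then' instead of querying an MLLM.
    """
    parts = [p.strip() for p in instruction.split(" then ")]
    return [Subtask(p) for p in parts if p]


def decision_step(subtask: Subtask, progress: str) -> str:
    """Hypothetical stand-in for the Decision agent.

    A real agent would look at the screen and the progress summary;
    this stub just returns a placeholder action string.
    """
    return f"execute({subtask.description!r})"


def run(instruction: str, max_steps_per_subtask: int = 3) -> ProgressTracker:
    """Instruction level -> subtask level -> action level."""
    tracker = ProgressTracker()
    for subtask in manager_decompose(instruction):
        for _ in range(max_steps_per_subtask):
            action = decision_step(subtask, tracker.summary())
            tracker.update(subtask, action)
            subtask.done = True  # assumption: the stub completes a subtask in one action
            if subtask.done:
                break
    return tracker


if __name__ == "__main__":
    result = run("open Notepad then type hello then save the file")
    for entry in result.history:
        print(entry)
```

The point of the layering is that the action-level stub only ever sees one subtask plus a progress summary, rather than the full user instruction.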
Taxonomy
Topics: Business Process Modeling and Analysis · Scheduling and Optimization Algorithms · Simulation Techniques and Applications
Methods: Sparse Evolutionary Training
