OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
Zhiyong Wu, Chengcheng Han, Zichen Ding, Zhenmin Weng, Zhoumianze Liu, Shunyu Yao, Tao Yu, Lingpeng Kong

TL;DR
OS-Copilot introduces a framework for creating generalist computer agents capable of interacting with diverse OS elements, demonstrating self-improvement and strong performance on general AI benchmarks and specific applications.
Contribution
The paper presents OS-Copilot, a novel framework enabling the development of versatile, self-improving computer agents that can handle a wide range of OS interactions and tasks.
Findings
FRIDAY outperforms previous methods by 35% on the GAIA benchmark.
FRIDAY demonstrates effective self-improvement on Excel and PowerPoint.
OS-Copilot provides a scalable infrastructure for general-purpose computer agents.
Abstract
Autonomous interaction with the computer has been a longstanding challenge with great potential, and the recent proliferation of large language models (LLMs) has markedly accelerated progress in building digital agents. However, most of these agents are designed to interact with a narrow domain, such as a specific software or website. This narrow focus constrains their applicability for general computer tasks. To this end, we introduce OS-Copilot, a framework to build generalist agents capable of interfacing with comprehensive elements in an operating system (OS), including the web, code terminals, files, multimedia, and various third-party applications. We use OS-Copilot to create FRIDAY, a self-improving embodied agent for automating general computer tasks. On GAIA, a general AI assistants benchmark, FRIDAY outperforms previous methods by 35%, showcasing strong generalization to…
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Semantic Web and Ontologies
