Infant Agent: A Tool-Integrated, Logic-Driven Agent with Cost-Effective API Usage

Bin Lei; Yuchen Li; Yiming Zeng; Tao Ren; Yi Luo; Tianyu Shi; Zitian Gao; Zeyu Hu; Weitai Kang; Qiuwu Chen

arXiv:2411.01114·cs.AI·November 5, 2024

Open Access

TL;DR

The Infant Agent enhances large language models' ability to solve complex real-world problems and logic tasks by integrating task-aware functions, hierarchical management, and memory retrieval, significantly improving accuracy and reducing API costs.

Contribution

This paper introduces the Infant Agent, a novel framework that enables LLMs to perform extended reasoning and complex tasks more efficiently and cost-effectively.

Findings

01. GPT-4o accuracy on SWE-bench-lite increased from 0.33% to 30%.

02. GPT-4o accuracy on AIME-2024 increased from 13.3% to 37%.

03. The framework reduces API costs while improving reasoning capabilities.
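
To put the two accuracy findings above on a common footing, the short sketch below restates each before/after pair as an absolute gain in percentage points and as a ratio. Only the four percentages come from the listing; the helper name `improvement` and the dictionary layout are illustrative assumptions, not anything defined by the paper.

```python
# Illustrative arithmetic only: the percentages are the ones reported in the
# findings; the function and variable names are hypothetical.

def improvement(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, after/before ratio)."""
    absolute_gain = after_pct - before_pct
    ratio = after_pct / before_pct
    return absolute_gain, ratio


# Reported GPT-4o accuracy before and after using the Infant Agent.
reported = {
    "SWE-bench-lite": (0.33, 30.0),
    "AIME-2024": (13.3, 37.0),
}

for benchmark, (before, after) in reported.items():
    gain, ratio = improvement(before, after)
    print(f"{benchmark}: +{gain:.2f} points, {ratio:.1f}x the baseline")
```

Under these figures the SWE-bench-lite gain is about 29.7 points (roughly 91 times the baseline), while the AIME-2024 gain is about 23.7 points (roughly 2.8 times the baseline). The large ratio on SWE-bench-lite mostly reflects how close to zero the baseline is.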

Abstract

Despite the impressive capabilities of large language models (LLMs), they currently exhibit two primary limitations. (I) They struggle to autonomously solve real-world engineering problems. (II) They remain challenged in reasoning through complex logic problems. To address these challenges, we developed the Infant Agent, integrating task-aware functions, operators, a hierarchical management system, and a memory retrieval mechanism. Together, these components enable large language models to sustain extended reasoning processes and handle complex, multi-step tasks efficiently, all while significantly reducing API costs. Using the Infant Agent, GPT-4o's accuracy on the SWE-bench-lite dataset rises from 0.33% to 30%, and in the AIME-2024…

Taxonomy

Topics: Multi-Agent Systems and Negotiation