effGen: Enabling Small Language Models as Capable Autonomous Agents
Gaurav Srivastava, Aafiya Hussain, Chi Wang, Yingyan Celine Lin, Xuan Wang

TL;DR
effGen is an open-source framework that lets small language models act as capable autonomous agents by optimizing tool use, task decomposition, routing, and memory management. It outperforms LangChain, AutoGen, and Smolagents on 13 benchmarks.
Contribution
effGen introduces four techniques tailored to small language models: prompt-optimized tool calling, dependency-aware task decomposition, complexity-based routing, and a unified memory system.
Findings
effGen outperforms LangChain, AutoGen, and Smolagents on 13 benchmarks.
It achieves higher success rates, faster execution, and lower memory usage than these baselines.
Prompt optimization benefits smaller models more, whereas routing benefits larger models more.
Abstract
Most language model agentic systems today are built and optimized for large language models (e.g., GPT, Claude, Gemini) accessed via API calls. While powerful, this approach has several limitations, including high token costs and privacy concerns for sensitive applications. We introduce effGen, an open-source agentic framework optimized for small language models (SLMs) that enables effective, efficient, and secure local deployment (pip install effgen). effGen makes four major contributions: (1) enhanced tool-calling with prompt optimization that compresses contexts by 70-80% while preserving task semantics, (2) intelligent task decomposition that breaks complex queries into parallel or sequential subtasks based on dependencies, (3) complexity-based routing that uses five factors to make pre-execution decisions, and (4) a unified memory system combining short-term, long-term, and…
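The abstract's second contribution, splitting a query into parallel or sequential subtasks based on dependencies, can be illustrated with a small scheduling sketch. This is not effGen's API or its actual algorithm: the function name `schedule_waves`, the dictionary format, and the example subtasks are all hypothetical, and the technique shown is ordinary layered topological sorting. Subtasks in the same wave have no unmet dependencies and could run in parallel, while the waves themselves run in sequence.

```python
def schedule_waves(dependencies):
    """Group subtasks into ordered waves (hypothetical helper, not part of effGen).

    dependencies maps each subtask name to the set of subtasks it depends on.
    Returns a list of waves; each wave is a sorted list of subtasks whose
    prerequisites were all completed in earlier waves.
    """
    # Copy the input so the caller's sets are not mutated.
    remaining = {task: set(deps) for task, deps in dependencies.items()}
    waves = []
    while remaining:
        # A subtask is ready once all of its prerequisites have been scheduled.
        ready = sorted(task for task, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("cyclic or unknown dependency among subtasks")
        waves.append(ready)
        for task in ready:
            del remaining[task]
        # Mark the scheduled subtasks as satisfied for everything still waiting.
        for deps in remaining.values():
            deps.difference_update(ready)
    return waves


# Illustrative plan: two independent fetches, then two dependent steps.
plan = {
    "fetch_sales": set(),
    "fetch_costs": set(),
    "compute_margin": {"fetch_sales", "fetch_costs"},
    "write_report": {"compute_margin"},
}
waves = schedule_waves(plan)
```

In this example the two fetch subtasks land in the first wave and could be dispatched concurrently, while `compute_margin` and `write_report` each wait for the wave before them.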
Taxonomy
Topics: Big Data and Digital Economy · Topic Modeling · Multimodal Machine Learning Applications
