MemTool: Optimizing Short-Term Memory Management for Dynamic Tool Calling in LLM Agent Multi-Turn Conversations

Elias Lumer; Anmol Gulati; Vamse Kumar Subbiah; Pradeep Honaganahalli Basavaraju; James A. Burke

arXiv:2507.21428·cs.CL·July 30, 2025

TL;DR

MemTool introduces a flexible short-term memory framework for LLM agents, enhancing multi-turn tool management and improving task accuracy across various models and interaction modes.

Contribution

This work presents MemTool, a short-term memory management framework with three operational modes (Autonomous Agent, Workflow, and Hybrid) that lets LLM agents manage their tool context dynamically during multi-turn conversations.

Findings

01 Reasoning LLMs achieve high tool-removal efficiency in Autonomous Mode (90-94%)

02 Medium-sized models show lower tool-removal efficiency (0-60%)

03 Workflow and Hybrid modes effectively manage tool removal and task completion

Abstract

Large Language Model (LLM) agents have shown significant autonomous capabilities in dynamically searching and incorporating relevant tools or Model Context Protocol (MCP) servers for individual queries. However, fixed context windows limit effectiveness in multi-turn interactions requiring repeated, independent tool usage. We introduce MemTool, a short-term memory framework enabling LLM agents to dynamically manage tools or MCP server contexts across multi-turn conversations. MemTool offers three agentic architectures: 1) Autonomous Agent Mode, granting full tool management autonomy, 2) Workflow Mode, providing deterministic control without autonomy, and 3) Hybrid Mode, combining autonomous and deterministic control. Evaluating each MemTool mode across 13+ LLMs on the ScaleMCP benchmark, we conducted experiments over 100 consecutive user interactions, measuring tool removal ratios…
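
The idea of deterministic tool management under a fixed context budget can be illustrated with a minimal sketch. This is not the paper's implementation: the `ToolMemory` class, the `workflow_turn` function, the `limit` parameter, and the keyword-based relevance rule are all hypothetical names and simplifications chosen for illustration. The sketch assumes a Workflow-style turn in which stale tools are pruned before new ones are added, with no decision left to the agent.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToolMemory:
    """Hypothetical short-term store of tool names held in the context window."""

    limit: int  # assumed cap on how many tools fit in context
    tools: list[str] = field(default_factory=list)

    def add(self, name: str) -> bool:
        """Add a tool if there is room; return False when the budget is full."""
        if name in self.tools:
            return True
        if len(self.tools) >= self.limit:
            return False
        self.tools.append(name)
        return True

    def prune(self, is_relevant: Callable[[str], bool]) -> list[str]:
        """Drop every tool the predicate rejects and return the removed names."""
        removed = [t for t in self.tools if not is_relevant(t)]
        self.tools = [t for t in self.tools if is_relevant(t)]
        return removed


def workflow_turn(memory: ToolMemory, query: str, candidates: list[str]) -> list[str]:
    """One deterministic turn: prune stale tools, then add the new candidates.

    The relevance rule is a toy assumption: a tool counts as relevant only if
    its name appears as a word in the current query.
    """
    words = set(query.lower().split())
    removed = memory.prune(lambda tool: tool in words)
    for name in candidates:
        memory.add(name)
    return removed


if __name__ == "__main__":
    memory = ToolMemory(limit=2)
    workflow_turn(memory, "get weather in paris", ["weather"])
    removed = workflow_turn(memory, "convert currency to euros", ["currency"])
    print("removed:", removed, "in context:", memory.tools)
```

In an autonomous variant, the agent itself would decide when to call `prune` and `add`, whereas here the surrounding pipeline makes both decisions on every turn. A real system would replace the keyword rule with a retrieval or LLM-based relevance step.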
