Tool Unlearning for Tool-Augmented LLMs
Jiali Cheng, Hadi Amiri

TL;DR
This paper introduces ToolDelete, a novel method for unlearning specific tools from tool-augmented large language models, addressing security, privacy, and deprecation issues while maintaining overall model performance.
Contribution
The paper presents the first approach for unlearning tools in LLMs, including a new evaluation metric and extensive experiments demonstrating effectiveness.
Findings
ToolDelete effectively unlearns selected tools.
It preserves knowledge of non-deleted tools.
It maintains performance on general tasks.
Abstract
Tool-augmented large language models (LLMs) are often trained on datasets of query-response pairs, which embed the ability to use tools or APIs directly into the parametric knowledge of LLMs. Tool-augmented LLMs need the ability to forget learned tools due to security vulnerabilities, privacy regulations, or tool deprecations. However, "tool unlearning" has not been investigated in unlearning literature. We introduce this novel task, which requires addressing distinct challenges compared to traditional unlearning: knowledge removal rather than forgetting individual samples, the high cost of optimizing LLMs, and the need for principled evaluation metrics. To bridge these gaps, we propose ToolDelete, the first approach for unlearning tools from tool-augmented LLMs. It implements three key properties to address the above challenges for effective tool unlearning and introduces a new…
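The abstract distinguishes removing a tool from forgetting individual samples. The sketch below illustrates only that problem setup, not ToolDelete itself: it assumes each query-response pair carries a label naming the tool it demonstrates, so a tool-level request selects every example for that tool. The field names, the tool names, and the helper `split_by_tool` are hypothetical and do not come from the paper.

```python
def split_by_tool(examples, tools_to_forget):
    """Partition examples into (forget, retain) lists.

    An example goes to the forget list when the tool it demonstrates
    is in tools_to_forget; otherwise it goes to the retain list.
    """
    forget, retain = [], []
    for example in examples:
        if example["tool"] in tools_to_forget:
            forget.append(example)
        else:
            retain.append(example)
    return forget, retain


# Hypothetical tool-use training data: one record per query-response pair.
examples = [
    {"tool": "weather_api", "query": "Forecast for Paris?", "response": "call weather_api(city='Paris')"},
    {"tool": "calendar_api", "query": "Book a meeting at 3pm.", "response": "call calendar_api(time='15:00')"},
    {"tool": "weather_api", "query": "Is it raining in Oslo?", "response": "call weather_api(city='Oslo')"},
]

# A tool-level request names the tool, not individual samples.
forget, retain = split_by_tool(examples, tools_to_forget={"weather_api"})
```

Here `forget` holds both `weather_api` examples and `retain` holds the `calendar_api` example. How the model is then updated so that it loses the forgotten tool while keeping the retained ones is the part the paper addresses, and this sketch does not attempt it.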
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Open Education and E-Learning · Digital Rights Management and Security
