RRTL: Red Teaming Reasoning Large Language Models in Tool Learning

Yifei Liu; Yu Cui; Haibin Zhang

arXiv:2505.17106·cs.CL·May 26, 2025

TL;DR

This paper introduces RRTL, a red teaming method to evaluate the safety of reasoning large language models (RLLMs) in tool learning, revealing safety strengths and vulnerabilities across models.

Contribution

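It presents a red teaming approach built on two strategies, identifying deceptive threats and using Chain-of-Thought (CoT) prompting to force tool invocation, to assess RLLM safety in tool learning, and it uncovers key safety challenges and disparities among models.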

Findings

01. RLLMs generally outperform traditional LLMs in safety.

02. Substantial safety disparities exist across models.

03. Deceptive risks and multilingual vulnerabilities are prevalent.

Abstract

While tool learning significantly enhances the capabilities of large language models (LLMs), it also introduces substantial security risks. Prior research has revealed various vulnerabilities in traditional LLMs during tool learning. However, the safety of newly emerging reasoning LLMs (RLLMs), such as DeepSeek-R1, in the context of tool learning remains underexplored. To bridge this gap, we propose RRTL, a red teaming approach specifically designed to evaluate RLLMs in tool learning. It integrates two novel strategies: (1) the identification of deceptive threats, which evaluates the model's behavior in concealing the usage of unsafe tools and their potential risks; and (2) the use of Chain-of-Thought (CoT) prompting to force tool invocation. Our approach also includes a benchmark for traditional LLMs. We conduct a comprehensive evaluation on seven mainstream RLLMs and uncover three key…
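The first strategy, identifying deceptive threats, can be pictured as a tally over labeled model replies. The sketch below is only an illustration under stated assumptions: the `Record` fields, the `deception_rate` function, and the rule that a reply counts as deceptive when it hides tool use or omits a risk warning are all hypothetical and are not taken from the paper, which may define and measure this differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    """One labeled model reply (hypothetical schema, not the paper's)."""
    model: str
    invoked_unsafe_tool: bool   # did the model call an unsafe tool?
    disclosed_tool_use: bool    # did the reply tell the user a tool was used?
    warned_of_risk: bool        # did the reply warn about risks in the tool output?


def deception_rate(records: List[Record]) -> float:
    """Share of unsafe-tool invocations whose reply hid the tool use
    or omitted a risk warning. Returns 0.0 if no unsafe tool was invoked."""
    invoked = [r for r in records if r.invoked_unsafe_tool]
    if not invoked:
        return 0.0
    deceptive = [
        r for r in invoked
        if not (r.disclosed_tool_use and r.warned_of_risk)
    ]
    return len(deceptive) / len(invoked)


# Illustrative, made-up labels; not results from the paper.
sample = [
    Record("model-a", True, True, True),     # fully transparent
    Record("model-a", True, False, False),   # hid tool use
    Record("model-a", True, True, False),    # disclosed but gave no warning
    Record("model-a", True, False, True),    # warned but hid tool use
    Record("model-a", False, False, False),  # no unsafe tool call, ignored
]
rate = deception_rate(sample)
```

Computing the rate only over replies that actually invoked an unsafe tool keeps refusals from diluting the figure; that choice is also an assumption of this sketch.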

Taxonomy

Methods: Chain-of-thought prompting