InData: Towards Secure Multi-Step, Tool-Based Data Analysis
Karthikeyan K, Raghuveer Thirukovalluru, Bhuwan Dhingra, David Edwin Carlson

TL;DR
This paper introduces InData, a dataset for evaluating how well large language models carry out multi-step data analysis when they may only access data through secure, predefined tools. Results show that current models still struggle on the more complex tasks.
Contribution
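The paper presents InData, a dataset for assessing LLMs' multi-step, tool-based reasoning. It targets two problems: the security risk of letting agents generate and execute code directly on sensitive data, and the focus of existing tool-use benchmarks on tool selection and simple execution rather than compositional reasoning.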
Findings
Large models perform well on Easy tasks (97.3%)
Performance drops to 69.6% on Hard tasks, 27.7 percentage points below the Easy-task score
Current LLMs lack robust multi-step tool-based reasoning ability
Abstract
Large language model agents for data analysis typically generate and execute code directly on databases. However, when applied to sensitive data, this approach poses significant security risks. To address this issue, we propose a security-motivated alternative: restrict LLMs from direct code generation and data access, and require them to interact with data exclusively through a predefined set of secure, verified tools. Although recent tool-use benchmarks exist, they primarily target tool selection and simple execution rather than the compositional, multi-step reasoning needed for complex data analysis. To reduce this gap, we introduce Indirect Data Engagement (InData), a dataset designed to assess LLMs' multi-step tool-based reasoning ability. InData includes data analysis questions at three difficulty levels--Easy, Medium, and Hard--capturing increasing reasoning complexity. We…
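The tool-only setup described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tool names (`filter_eq`, `select_column`, `average`), the plan format, and the toy table are hypothetical and are not taken from InData's actual tool set. The model would emit only a plan of tool calls, and a trusted executor would validate each tool name against an allowlist and run the steps on data the model never sees directly.

```python
from statistics import mean

# Hypothetical in-memory table standing in for a sensitive database.
ROWS = [
    {"region": "north", "sales": 120},
    {"region": "south", "sales": 80},
    {"region": "north", "sales": 200},
    {"region": "south", "sales": 100},
]


# Hypothetical verified tools; each is a small, auditable function.
def filter_eq(rows, column, value):
    """Keep only rows whose `column` equals `value`."""
    return [r for r in rows if r[column] == value]


def select_column(rows, column):
    """Project a list of rows onto a single column."""
    return [r[column] for r in rows]


def average(values):
    """Return the arithmetic mean of a list of numbers."""
    return mean(values)


# Allowlist: the only operations a plan may invoke.
TOOLS = {
    "filter_eq": filter_eq,
    "select_column": select_column,
    "average": average,
}


def run_plan(plan, rows):
    """Execute a multi-step plan, feeding each step's output to the next.

    Any tool name outside the allowlist is rejected, so the plan cannot
    run arbitrary code against the data.
    """
    current = rows
    for step in plan:
        name = step["tool"]
        if name not in TOOLS:
            raise ValueError(f"tool not allowed: {name}")
        current = TOOLS[name](current, **step.get("args", {}))
    return current


# A three-step plan a model might produce for the question
# "What is the average sales figure in the north region?"
PLAN = [
    {"tool": "filter_eq", "args": {"column": "region", "value": "north"}},
    {"tool": "select_column", "args": {"column": "sales"}},
    {"tool": "average"},
]

result = run_plan(PLAN, ROWS)  # mean of 120 and 200
```

In this sketch the question is answered only by composing three tool calls in the right order, which is the kind of multi-step composition the abstract contrasts with single-step tool selection.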
Taxonomy
Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning · Security and Verification in Computing
