Sola-Visibility-ISPM: Benchmarking Agentic AI for Identity Security Posture Management Visibility
Gal Engelberg, Konstantin Koutsyi, Leon Goldberg, Reuven Elezra, Idan Pinto, Tal Moalem, Shmuel Cohen, Yoni Weintrob

TL;DR
This paper introduces the first benchmark for evaluating agentic AI systems on Identity Security Posture Management (ISPM) visibility tasks, reports strong agent performance on real enterprise data, and provides a foundation for future identity security tools.
Contribution
The paper presents the Sola Visibility ISPM Benchmark and the Sola AI Agent, enabling standardized evaluation of agentic AI on enterprise identity security tasks.
Findings
Agentic AI achieves 0.84 expert accuracy on benchmark questions.
Performance is highest on AWS hygiene tasks, at 0.94 accuracy.
The benchmark covers real enterprise data across AWS, Okta, and Google Workspace.
Abstract
Identity Security Posture Management (ISPM) is a core challenge for modern enterprises operating across cloud and SaaS environments. Answering basic ISPM visibility questions, such as understanding identity inventory and configuration hygiene, requires interpreting complex identity data, motivating growing interest in agentic AI systems. Despite this interest, there is currently no standardized way to evaluate how well such systems perform ISPM visibility tasks on real enterprise data. We introduce the Sola Visibility ISPM Benchmark, the first benchmark designed to evaluate agentic AI systems on foundational ISPM visibility tasks using a live, production-grade identity environment spanning AWS, Okta, and Google Workspace. The benchmark focuses on identity inventory and hygiene questions and is accompanied by the Sola AI Agent, a tool-using agent that translates natural-language queries…
Taxonomy
Topics: Software System Performance and Reliability · Data Quality and Management · Big Data and Digital Economy
