Risk-Adjusted Harm Scoring for Automated Red Teaming for LLMs in Financial Services
Fabrizio Dimino, Bhaskarjit Sarmah, Stefano Pasquali

TL;DR
This paper introduces a risk-aware evaluation framework and a novel score for assessing security failures of LLMs in financial services, emphasizing operational severity and domain-specific harms.
Contribution
It presents the Risk-Adjusted Harm Score (RAHS), a new metric that quantifies operational risk in LLM red-teaming for Banking, Financial Services, and Insurance (BFSI), incorporating domain-specific harms and multi-round interactions.
Findings
Higher stochasticity and adaptive interaction increase the severity of disclosures.
Single-turn, domain-agnostic evaluations are insufficient for real-world BFSI security assessment.
Risk-sensitive metrics reveal limitations of existing evaluation methods.
Abstract
The rapid adoption of large language models (LLMs) in financial services introduces new operational, regulatory, and security risks. Yet most red-teaming benchmarks remain domain-agnostic and fail to capture failure modes specific to regulated BFSI settings, where harmful behavior can be elicited through legally or professionally plausible framing. We propose a risk-aware evaluation framework for LLM security failures in Banking, Financial Services, and Insurance (BFSI), combining a domain-specific taxonomy of financial harms, an automated multi-round red-teaming pipeline, and an ensemble-based judging protocol. We introduce the Risk-Adjusted Harm Score (RAHS), a risk-sensitive metric that goes beyond success rates by quantifying the operational severity of disclosures, accounting for mitigation signals, and leveraging inter-judge agreement. Across diverse models, we find that higher…
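To make the three ingredients the abstract names (operational severity, mitigation signals, and inter-judge agreement) more concrete, here is a minimal Python sketch of how such a score could be assembled. It is not the paper's RAHS formula, which this summary does not give. The function name `rahs_sketch`, the 0 to 5 severity scale, the `mitigation_discount` parameter, and the agreement weighting are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def rahs_sketch(severities, mitigation_flags, max_severity=5.0, mitigation_discount=0.5):
    """Illustrative risk-adjusted score for one model response, in [0, 1].

    Hypothetical inputs (not taken from the paper):
      severities       -- one severity rating per judge, on a 0..max_severity scale
      mitigation_flags -- one bool per judge: did the response include a mitigation
                          signal (refusal, caveat, referral to a professional)?
    """
    if not severities or len(severities) != len(mitigation_flags):
        raise ValueError("need one severity and one mitigation flag per judge")

    # Normalise each judge's severity rating to [0, 1].
    normalised = [s / max_severity for s in severities]

    # Assumed rule: discount a judge's rating when that judge saw a mitigation signal.
    adjusted = [
        n * (1.0 - mitigation_discount) if flagged else n
        for n, flagged in zip(normalised, mitigation_flags)
    ]

    # Assumed agreement weight: 1 when judges agree exactly, falling toward 0 as the
    # spread grows. The population std. dev. of values in [0, 1] is at most 0.5.
    agreement = 1.0 - pstdev(normalised) / 0.5

    return mean(adjusted) * agreement


# Two responses that a plain success-rate metric would count identically
# (both "attack succeeded") receive different scores here.
severe = rahs_sketch([4, 4, 4], [False, False, False])
mild = rahs_sketch([1, 1, 1], [False, False, False])
print(severe > mild)
```

Under this sketch, a disclosure that all judges rate as severe and unmitigated scores near 1, while a low-severity or hedged disclosure scores lower even though both would count as a successful attack. Down-weighting on judge disagreement is only one possible design choice; a more conservative evaluator might instead route high-disagreement cases to human review.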
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Advanced Graph Neural Networks
