Swiss-Bench 003: Evaluating LLM Reliability and Adversarial Security for Swiss Regulatory Contexts

Fatih Uenal

arXiv:2604.05872 · cs.CR · April 8, 2026

TL;DR

This paper introduces Swiss-Bench 003, an evaluation framework for assessing the reliability and adversarial security of LLMs in Swiss financial and regulatory contexts. It benchmarks ten frontier models on 808 Swiss-specific items in four languages (German, French, Italian, English).

Contribution

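It extends the HAAS (Helvetic AI Assessment Score) from six to eight dimensions by adding D7 (Self-Graded Reliability Proxy) and D8 (Adversarial Security), and benchmarks ten models on seven Swiss-adapted benchmarks targeting FINMA Guidance 08/2024, the revised Federal Act on Data Protection (nDSG), and the OWASP Top 10 for LLMs.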

Findings

01

Self-graded reliability scores (73-94%) are higher than security scores (20-61%).

02

System prompt leakage resistance varies from 24.8% to 88.2%.

03

PII extraction defense remains weak at 14-42%.
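
To make the resistance percentages above concrete, here is a minimal sketch of one way a leakage-resistance rate could be computed. This is an illustrative assumption, not the paper's scoring method, which this summary does not describe. The function `leakage_resistance`, the canary string, and the sample responses are all hypothetical.

```python
def leakage_resistance(responses, canary):
    """Return the percentage of responses that do NOT contain the canary.

    Assumption: a unique canary string is planted in the system prompt, and a
    response counts as a leak if it reproduces that string (case-insensitive).
    """
    if not responses:
        raise ValueError("need at least one response")
    canary_lower = canary.lower()
    resisted = sum(1 for r in responses if canary_lower not in r.lower())
    return 100.0 * resisted / len(responses)


# Hypothetical canary and model responses, for illustration only.
CANARY = "ZRH-CANARY-7731"
responses = [
    "I cannot share my instructions.",
    "My system prompt says: ZRH-CANARY-7731 ...",
    "Ich kann diese Information nicht weitergeben.",
    "Sure, here it is: zrh-canary-7731",
]

rate = leakage_resistance(responses, CANARY)
print(f"Leakage resistance: {rate:.1f}%")
```

Exact substring matching is a deliberately simple choice: it misses paraphrased or translated leaks, so a real evaluation would likely need a stricter judge.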

Abstract

The deployment of large language models (LLMs) in Swiss financial and regulatory contexts demands empirical evidence of both production reliability and adversarial security, dimensions not jointly operationalized in existing Swiss-focused evaluation frameworks. This paper introduces Swiss-Bench 003 (SBP-003), extending the HAAS (Helvetic AI Assessment Score) from six to eight dimensions by adding D7 (Self-Graded Reliability Proxy) and D8 (Adversarial Security). I evaluate ten frontier models across 808 Swiss-specific items in four languages (German, French, Italian, English), comprising seven Swiss-adapted benchmarks (Swiss TruthfulQA, Swiss IFEval, Swiss SimpleQA, Swiss NIAH, Swiss PII-Scope, System Prompt Leakage, and Swiss German Comprehension) targeting FINMA Guidance 08/2024, the revised Federal Act on Data Protection (nDSG), and OWASP Top 10 for LLMs. Self-graded D7 scores…
