OrgForge-IT: A Verifiable Synthetic Benchmark for LLM-Based Insider Threat Detection
Jeffrey Flynt

TL;DR
OrgForge-IT is a verifiable synthetic benchmark for insider threat detection. A deterministic simulation engine maintains the ground truth, so LLM-based detectors can be evaluated on realistic detection scenarios against consistent labels.
Contribution
The paper presents a verifiable synthetic benchmark whose architecture guarantees cross-artifact consistency. It addresses the limitations of existing static datasets such as CERT and supports evaluation across multiple threat classes and detection scenarios.
Findings
Triage and verdict accuracy dissociate: models with similar triage performance differ in verdict accuracy.
False-positive rates significantly impact verdict accuracy and model noise resilience.
Victim attribution distinguishes threat detection tiers and informs response strategies.
Abstract
Synthetic insider threat benchmarks face a consistency problem: corpora generated without an external factual constraint cannot rule out cross-artifact contradictions. The CERT dataset -- the field's canonical benchmark -- is also static, lacks cross-surface correlation scenarios, and predates the LLM era. We present OrgForge-IT, a verifiable synthetic benchmark in which a deterministic simulation engine maintains ground truth and language models generate only surface prose, making cross-artifact consistency an architectural guarantee. The corpus spans 51 simulated days, 2,904 telemetry records at a 96.4% noise rate, and four detection scenarios designed to defeat single-surface and single-day triage strategies across three threat classes and eight injectable behaviors. A ten-model leaderboard reveals several findings: (1) triage and verdict accuracy dissociate - eight models achieve…
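The split the abstract describes, a deterministic engine that owns the facts and a language model that writes only surface prose, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Event` fields, the `simulate` and `render_prose` functions, the action names, and the 0.2 injection probability are hypothetical and are not taken from OrgForge-IT. The template in `render_prose` stands in for an LLM call.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    """One ground-truth record (hypothetical schema, not the paper's)."""
    day: int
    actor: str
    action: str
    is_threat: bool


def simulate(seed: int, days: int, actors: List[str], threat_actor: str) -> List[Event]:
    """Deterministic engine: the same seed always yields the same events."""
    rng = random.Random(seed)  # local RNG, so no global state leaks in
    events = []
    for day in range(1, days + 1):
        for actor in actors:
            # Assumed injection rule: the threat actor misbehaves on some days.
            if actor == threat_actor and rng.random() < 0.2:
                events.append(Event(day, actor, "bulk_download", True))
            else:
                events.append(Event(day, actor, "login", False))
    return events


def render_prose(event: Event) -> str:
    """Stand-in for the language model: it may rephrase, but every fact
    (day, actor, action) comes from the engine's event, never from the model."""
    return f"On day {event.day}, {event.actor} performed {event.action}."


def noise_rate(events: List[Event]) -> float:
    """Fraction of records that are benign, assuming that is what 'noise' means."""
    benign = sum(1 for e in events if not e.is_threat)
    return benign / len(events)


if __name__ == "__main__":
    corpus = simulate(seed=7, days=51, actors=["alice", "bob", "mallory"],
                      threat_actor="mallory")
    print(render_prose(corpus[0]))
    print(f"noise rate: {noise_rate(corpus):.3f}")
```

Because the labels live in the engine's output rather than in generated text, two artifacts that describe the same event cannot disagree about who did what on which day. If the reported 96.4% noise rate is read as the benign fraction of the 2,904 telemetry records, then roughly 3.6% of them, on the order of a hundred records, would carry signal; that reading is an assumption, not a figure stated in the abstract.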
Taxonomy
Topics: Network Security and Intrusion Detection · Adversarial Robustness in Machine Learning · Information and Cyber Security
