OrgForge: A Multi-Agent Simulation Framework for Verifiable Synthetic Corporate Corpora
Jeffrey Flynt

TL;DR
OrgForge is an open-source multi-agent simulation framework that creates verifiable synthetic organizational data by modeling processes rather than documents, improving fidelity over LLM-generated corpora.
Contribution
It introduces a deterministic simulation engine that enforces a physics-cognition boundary, producing traceable organizational documents free of hallucination artifacts.
Findings
Achieves a 0.46 improvement in prose-to-ground-truth fidelity over LLM baselines.
Simulates organizational processes that produce cross-artifact causal cascades.
Identifies a hallucination failure mode in chained LLM document generation.
Abstract
Building and evaluating enterprise AI systems requires synthetic organizational corpora that are internally consistent, temporally structured, and cross-artifact traceable. Existing corpora either carry legal constraints or inherit hallucination artifacts from the generating LLMs, silently corrupting results when timestamps or facts contradict across documents and reinforcing those errors during training. We present OrgForge, an open-source multi-agent simulation framework that enforces a strict physics-cognition boundary: a deterministic Python engine maintains a SimEvent ground-truth bus while LLMs generate only surface prose. OrgForge simulates the organizational processes that produce documents, not the documents themselves. Engineers leave mid-sprint, triggering incident handoffs and CRM ownership lapses. Knowledge gaps emerge when under-documented systems break and recover through…
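The physics-cognition boundary described in the abstract can be illustrated with a minimal sketch. Only the name SimEvent comes from the abstract; its fields, the EventBus class, the engine logic, and the consistency check are hypothetical stand-ins, and a plain template function takes the place of an LLM call so the example runs offline. This is not OrgForge's actual code.

```python
from dataclasses import dataclass, field
import random


@dataclass(frozen=True)
class SimEvent:
    # Field names are illustrative assumptions, not OrgForge's schema.
    tick: int
    kind: str
    actor: str
    facts: dict = field(default_factory=dict)


class EventBus:
    """Append-only log acting as the single source of ground truth."""

    def __init__(self):
        self._events = []

    def emit(self, event):
        self._events.append(event)
        return event

    def history(self):
        return tuple(self._events)


def run_engine(bus, seed, engineers):
    """'Physics': the same seed always yields the same event sequence."""
    rng = random.Random(seed)
    leaver = rng.choice(engineers)
    remaining = [e for e in engineers if e != leaver]
    new_owner = rng.choice(remaining)
    bus.emit(SimEvent(1, "departure", leaver, {"sprint": 7}))
    bus.emit(SimEvent(2, "incident_handoff", new_owner,
                      {"previous_owner": leaver, "incident": "INC-1"}))


def template_writer(event):
    """'Cognition' stand-in: phrases facts the engine has already fixed."""
    details = ", ".join(f"{k}={v}" for k, v in sorted(event.facts.items()))
    return f"[t={event.tick}] {event.actor}: {event.kind} ({details})"


def is_consistent(event, prose):
    """True if the prose mentions the actor and every fact value."""
    return event.actor in prose and all(
        str(v) in prose for v in event.facts.values()
    )


bus = EventBus()
run_engine(bus, seed=42, engineers=["ana", "bo", "cy"])
for ev in bus.history():
    text = template_writer(ev)
    print(text, "| consistent:", is_consistent(ev, text))
```

In this sketch the writer never chooses who left or who took over the incident. It only phrases facts already recorded on the bus, so any generated text can be checked against the event that produced it.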
