The Six Sigma Agent: Achieving Enterprise-Grade Reliability in LLM Systems Through Consensus-Driven Decomposed Execution

Khush Patel; Siva Surendira; Jithin George; Shreyas Kapale

arXiv:2601.22290 · cs.AI · February 2, 2026

Open Access

TL;DR

The paper introduces the Six Sigma Agent, an architecture that improves the reliability of large language models in enterprise settings through task decomposition, parallel sampling, and consensus voting. It reports a 14,700x reliability improvement and an 80% cost reduction.

Contribution

The paper presents an architecture that combines task decomposition, micro-agent sampling, and consensus voting, supported by theoretical error bounds and practical validation of enterprise-grade reliability.
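
As an illustration of the consensus-voting component, here is a minimal Python sketch. It is not the authors' implementation: the function names `normalize` and `consensus_vote` are hypothetical, and exact matching after whitespace and case normalization stands in for whatever clustering method the paper actually uses.

```python
from collections import Counter


def normalize(output: str) -> str:
    # Assumption: outputs that differ only in case or whitespace
    # belong to the same cluster. The paper's clustering may differ.
    return " ".join(output.lower().split())


def consensus_vote(outputs: list[str]) -> tuple[str, float]:
    """Return the first output of the largest cluster and its vote share."""
    if not outputs:
        raise ValueError("at least one output is required")
    keys = [normalize(o) for o in outputs]
    counts = Counter(keys)
    # most_common(1) yields the cluster key with the most votes.
    winner_key, votes = counts.most_common(1)[0]
    # Return the first original output that falls in the winning cluster.
    winner = next(o for o, k in zip(outputs, keys) if k == winner_key)
    return winner, votes / len(outputs)


if __name__ == "__main__":
    answer, share = consensus_vote(["42", " 42 ", "41", "42", "43"])
    print(answer, share)
```

The returned vote share is one plausible signal a caller could inspect before deciding whether to sample more outputs, although how the paper's dynamic scaling makes that decision is not specified in this summary.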

Findings

01. Error falls exponentially as the number of samples per task increases.

02. Consensus voting across 5 agents reduces error to 0.11%, even with cheaper models that have a 5% per-action error rate.

03. Demonstrates a 14,700x reliability improvement and an 80% cost reduction.

Abstract

Large Language Models demonstrate remarkable capabilities yet remain fundamentally probabilistic, presenting critical reliability challenges for enterprise deployment. We introduce the Six Sigma Agent, a novel architecture that achieves enterprise-grade reliability through three synergistic components: (1) task decomposition into a dependency tree of atomic actions; (2) micro-agent sampling where each task is executed n times in parallel across diverse LLMs to generate independent outputs; and (3) consensus voting with dynamic scaling, clustering outputs and selecting the answer from the winning cluster with maximum votes. We prove that sampling n independent outputs with error rate p achieves system error O(p^{ceil(n/2)}), enabling exponential reliability gains. Even using cheaper models with 5% per-action error, consensus voting with 5 agents reduces error to 0.11%; dynamic scaling to…
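
The error figure quoted in the abstract can be sanity-checked with a short Python sketch. This is a simplified model, not the paper's proof: it assumes each of the n outputs is wrong independently with probability p, that all wrong outputs agree with each other (a worst case for voting), and that the vote fails when at least ceil(n/2) outputs are wrong. The function name `majority_error` is hypothetical.

```python
import math


def majority_error(p: float, n: int) -> float:
    """Probability that at least ceil(n/2) of n independent outputs are wrong.

    Assumes a per-output error rate p and that wrong outputs all agree,
    so a wrong cluster of that size wins (or ties, for even n) the vote.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if n < 1:
        raise ValueError("n must be at least 1")
    threshold = math.ceil(n / 2)
    # Binomial tail: sum over k wrong outputs, k = threshold .. n.
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(threshold, n + 1)
    )


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, majority_error(0.05, n))
```

Under these assumptions, p = 0.05 and n = 5 give an error of about 0.00116, in line with the roughly 0.11% figure quoted above. The leading term of the sum is C(5,3) p^3, which matches the O(p^{ceil(n/2)}) scaling stated in the abstract.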

Taxonomy

Topics: Big Data and Digital Economy · Software System Performance and Reliability · Ethics and Social Impacts of AI