Generative Artificial Intelligence Reproducibility and Consensus
Edward Kim, Isamu Isozaki, Naomi Sirkin, Michael Robson

TL;DR
This paper investigates the reproducibility and verification of generative AI outputs using large-scale comparisons, consensus methods, and empirical analysis to enhance trust, transparency, and scientific rigor in AI research.
Contribution
It introduces a comprehensive methodology for verifying generative AI results through large-scale comparisons, consensus techniques, and analysis of stochasticity sources, advancing reproducibility standards.
Findings
Over 99.89% probability of detecting perceptual collisions in images
Achieved 100% consensus in large language models using greedy and beam search methods
Provided empirical bounds for verification error and stochasticity sources
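The consensus finding for greedy and beam search can be illustrated with a small sketch. It assumes each node reports its generated token IDs as a tuple and that consensus means a strict majority of nodes returning byte-identical output. The function names `majority_output` and `consensus_rate` are hypothetical and are not taken from the paper.

```python
from collections import Counter

def majority_output(outputs):
    """Return (winner, agreed) for one prompt.

    outputs: list of hashable results (e.g. tuples of token IDs), one per node.
    agreed is True only when a strict majority of nodes returned the winner.
    """
    if not outputs:
        raise ValueError("need at least one output")
    winner, count = Counter(outputs).most_common(1)[0]
    return winner, count * 2 > len(outputs)

def consensus_rate(batches):
    """Fraction of prompts for which a strict majority of nodes agreed exactly."""
    if not batches:
        raise ValueError("need at least one batch")
    agreed = sum(1 for outputs in batches if majority_output(outputs)[1])
    return agreed / len(batches)
```

Under deterministic decoding, every node would be expected to return the same tuple, so `consensus_rate` would be 1.0. Sampling-based decoding would generally lower it unless seeds and numerics are also controlled.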
Abstract
We performed a billion locality sensitive hash comparisons between artificially generated data samples to answer a critical question: can we reproduce the results of generative AI models? Reproducibility is one of the pillars of scientific research for verifiability, benchmarking, trust, and transparency. Furthermore, we take this research to the next level by verifying the "correctness" of generative AI output in a non-deterministic, trustless, decentralized network. We generate millions of data samples from a variety of open source diffusion and large language models and describe the procedures and trade-offs between generating more versus less deterministic output. Additionally, we analyze the outputs to provide empirical evidence of different parameterizations of tolerance and error bounds for verification. For our results, we show that with a majority vote between three…
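A rough sketch of how hash-based verification with a tolerance might look is given here. This summary does not say which locality sensitive hash the paper uses, so the sketch swaps in a simple average-hash-style perceptual hash over grayscale pixel values as a stand-in. The `tolerance` parameter, the strict-majority rule, and all function names are assumptions made for illustration.

```python
def average_hash(pixels):
    """Stand-in perceptual hash: one bit per pixel, set when the pixel exceeds the mean.

    pixels: flat, non-empty sequence of grayscale values.
    """
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]

def hamming(a, b):
    """Number of bit positions at which two equal-length hashes differ."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return sum(x != y for x, y in zip(a, b))

def verify_by_majority(hashes, tolerance):
    """Return (accepted, index) for a set of hashes from independent nodes.

    A hash is accepted when a strict majority of all hashes (itself included)
    lie within `tolerance` bits of it; index points at the first such hash.
    """
    n = len(hashes)
    for i, h in enumerate(hashes):
        close = sum(1 for other in hashes if hamming(h, other) <= tolerance)
        if close * 2 > n:
            return True, i
    return False, None
```

With three nodes, two outputs whose hashes fall within the tolerance are enough to accept a result. A larger tolerance admits more numerical noise between nodes but also raises the chance of accepting a genuinely different output, which is the trade-off the abstract refers to.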
Taxonomy
Topics: Scientific Computing and Data Management · Topic Modeling · Artificial Intelligence in Healthcare and Education
