Reactor Mk.1 performances: MMLU, HumanEval and BBH test results
TJ Dunham, Henry Syahputra

TL;DR
Reactor Mk.1, a large language model with fewer than 100 billion parameters built on the Lychee AI engine, reports top-tier scores on the MMLU, HumanEval, and BBH benchmarks, surpassing several prominent models.
Contribution
This paper introduces Reactor Mk.1, showcasing its high efficiency and performance, and provides benchmark results that position it as a leading AI model.
Findings
Achieved 92% on MMLU dataset
Scored 91% on HumanEval dataset
Reached 88% on BBH dataset
Abstract
The paper presents the performance results of Reactor Mk.1, ARC's flagship large language model, obtained through a benchmarking analysis. The model uses the Lychee AI engine and has fewer than 100 billion parameters, combining efficiency with potency. Reactor Mk.1 outperformed models such as GPT-4o, Claude Opus, and Llama 3, scoring 92% on the MMLU dataset, 91% on the HumanEval dataset, and 88% on the BBH dataset. It excels at both handling difficult tasks and reasoning, establishing itself as a prominent solution among current cutting-edge AI technology.
Taxonomy
Topics: Nuclear reactor physics and engineering · Nuclear and radioactivity studies · Nuclear Engineering Thermal-Hydraulics
Methods: LLaMA
