TL;DR
Confucius3-Math is a 14-billion-parameter open-source language model optimized for Chinese K-12 mathematics education. It achieves state-of-the-art mathematical reasoning performance while running efficiently on a single consumer-grade GPU.
Contribution
The paper introduces a lightweight, high-performance reasoning LLM tailored for Chinese K-12 math education, trained with novel reinforcement learning techniques and released as open source.
Findings
Achieves SOTA performance on Chinese K-12 math tasks
Runs efficiently on a single consumer GPU
Introduces new RL training techniques for stability and performance
Abstract
We introduce Confucius3-Math, an open-source large language model with 14B parameters that (1) runs efficiently on a single consumer-grade GPU and (2) achieves SOTA performance on a range of mathematical reasoning tasks, outperforming many significantly larger models. As part of our mission of enhancing education and knowledge dissemination with AI, Confucius3-Math is specifically dedicated to mathematics learning for Chinese K-12 students and educators. Built via post-training with large-scale reinforcement learning (RL), Confucius3-Math aligns with the national curriculum and excels at solving mainstream Chinese K-12 mathematical problems at low cost. In this report we share our development recipe, the challenges we encountered, and the techniques we developed to overcome them. In particular, we introduce three technical innovations: Targeted Entropy Regularization,…
