TL;DR
Jupiter-N is a hybrid reasoning model post-trained from Nemotron 3 Super to improve agentic capability, UK cultural alignment, and Welsh language support while preserving the base model's abilities. The model weights and datasets are publicly released.
Contribution
Introduces a reproducible post-training framework for sovereign models that adds cultural alignment and language support without degrading the base model's original capabilities.
Findings
Jupiter-N outperforms Nemotron on Welsh language tasks (+18 on ARC-Easy, +5.25 on MMLU-Lite).
It also improves on terminal use (+9.1 on Terminal Bench 2) and instruction following (+4.4 on IFBench).
All model weights and datasets are publicly available.
Abstract
We present Jupiter-N, a hybrid reasoning model post-trained from Nemotron 3 Super, a fully open-source 120-billion-parameter LLM. We target three objectives: (1) agentic capability via uncertainty-curated trajectories; (2) UK cultural alignment via synthetic data grounded in cultural norms; and (3) Welsh language support via parallel corpora and LLM-translated Welsh conversations. Our data curation strategy carefully preserves the base model's capabilities: using our Forget-Me-Not framework, we mix on-policy synthetic replay with off-policy task data to mitigate catastrophic forgetting, and include a mixture of reasoning and non-reasoning traces to maintain Nemotron's hybrid reasoning ability. Jupiter-N achieves standout gains over Nemotron in Welsh (+18 on ARC-Easy, +5.25 on MMLU-Lite), terminal use (+9.1 on Terminal Bench 2), and instruction following (+4.4 on IFBench), while retaining…
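The replay idea in the abstract can be illustrated with a minimal sketch. It is not the Forget-Me-Not implementation, whose details the abstract does not give: the function name `mix_replay_with_task`, the `replay_fraction` parameter, and the dictionary record format are all hypothetical, and the 25% ratio is chosen only for illustration. The sketch assumes that replay examples (presumably responses sampled from the base model itself) are simply sampled and shuffled into the off-policy task data at a fixed proportion.

```python
import random


def mix_replay_with_task(task_examples, replay_examples, replay_fraction, seed=0):
    """Return a shuffled training list in which roughly `replay_fraction`
    of the examples are replay examples and the rest are task examples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not 0.0 <= replay_fraction < 1.0:
        raise ValueError("replay_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Number of replay examples needed so that they form `replay_fraction`
    # of the final mix, given that all task examples are kept.
    n_replay = round(len(task_examples) * replay_fraction / (1.0 - replay_fraction))
    # Never request more replay examples than are available.
    n_replay = min(n_replay, len(replay_examples))
    mixed = list(task_examples) + rng.sample(list(replay_examples), n_replay)
    rng.shuffle(mixed)
    return mixed


# Illustrative data: 75 task records and a pool of 100 replay records.
task = [{"source": "task", "id": i} for i in range(75)]
replay = [{"source": "replay", "id": i} for i in range(100)]
mixed = mix_replay_with_task(task, replay, replay_fraction=0.25)
```

With these inputs the mix keeps all 75 task records and draws 25 replay records, so replay makes up a quarter of the 100 training examples. A real pipeline would also need to balance reasoning and non-reasoning traces, which this sketch leaves out.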
