TetraJet-v2: Accurate NVFP4 Training for Large Language Models with Oscillation Suppression and Outlier Control

Yuxiang Chen; Yifan Liu; Xiaoming Xu; Pengle Zhang; Michael Beyer; Martin Rapp; Jun Zhu; Jianfei Chen

arXiv:2510.27527·cs.LG·May 12, 2026

TL;DR

TetraJet-v2 introduces a novel 4-bit fully-quantized training method for large language models, effectively addressing oscillation and outliers to enable efficient low-precision training with minimal performance loss.

Contribution

The paper presents TetraJet-v2, a comprehensive 4-bit training approach that includes new algorithms for quantization, oscillation suppression, and outlier control, advancing low-precision LLM training.

Findings

01 Outperforms prior FP4 methods on models up to 370M parameters.

02 Reduces performance gap to BF16 by 51.3% on average.

03 Achieves 1.67x speedup over FP8 training.

Abstract

Training Large Language Models (LLMs) is prohibitively expensive, driving interest in low-precision fully-quantized training (FQT). While novel 4-bit formats like NVFP4 offer substantial efficiency gains, achieving near-lossless training at such low precision remains challenging. We introduce TetraJet-v2, an end-to-end 4-bit FQT method that leverages NVFP4 for activations, weights, and gradients in all linear layers. We identify two critical issues hindering low-precision LLM training: weight oscillation and outliers. To address these, we propose: 1) an unbiased double-block quantization method for NVFP4 linear layers with practically optimal convergence in LLM training, 2) OsciReset, the first effective algorithm to suppress the weight oscillation bottleneck in LLMs, and 3) OutControl, a mixed-precision algorithm to retain outlier accuracy. TetraJet-v2 outperforms prior methods on FP4…
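To make "unbiased" block quantization concrete, here is a minimal Python sketch of stochastic rounding onto the FP4 (E2M1) value grid with one scale per 16-element block. It is not the paper's double-block method: the function names `quantize_block_sr` and `quantize_sr` are hypothetical, the block scale is kept as an ordinary float (NVFP4 stores it in FP8 alongside a per-tensor scale), and the input length is assumed to be a multiple of the block size.

```python
import random

# Magnitudes representable in E2M1 (FP4); the sign is handled separately.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK = 16  # NVFP4 shares one scale factor across 16 elements


def quantize_block_sr(block, rng):
    """Stochastically round one block onto a scaled E2M1 grid (illustrative)."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return [0.0] * len(block)
    # Assumption: plain float scale mapping the block maximum to 6.0.
    scale = amax / E2M1_GRID[-1]
    out = []
    for v in block:
        y = min(abs(v) / scale, E2M1_GRID[-1])
        # Index of the largest grid point that is <= y (never the last one).
        i = max(k for k, g in enumerate(E2M1_GRID[:-1]) if g <= y)
        lo, hi = E2M1_GRID[i], E2M1_GRID[i + 1]
        # Rounding up with this probability makes E[q] equal to y.
        p_up = (y - lo) / (hi - lo)
        q = hi if rng.random() < p_up else lo
        out.append((q if v >= 0 else -q) * scale)
    return out


def quantize_sr(x, rng):
    """Apply block-wise stochastic rounding to a flat list of floats."""
    out = []
    for start in range(0, len(x), BLOCK):
        out.extend(quantize_block_sr(x[start:start + BLOCK], rng))
    return out
```

Averaging many independent calls to `quantize_sr` on the same input converges to that input, which is the unbiasedness property; a deterministic round-to-nearest would instead leave a fixed per-element error.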

Code & Models

Repositories

thu-ml/TetraJet-v2-NVFP4Training (GitHub)
