To FP8 and Back Again: Quantifying Reduced Precision Effects on LLM Training Stability