Optimizer-Model Consistency: Full Finetuning with the Same Optimizer as Pretraining Forgets Less