Understanding AdamW through Proximal Methods and Scale-Freeness