Training Dynamics of a 1.7B LLaMa Model: A Data-Efficient Approach