Scaling Law for Language Models Training Considering Batch Size