Towards On-Policy SFT: Distribution Discriminant Theory and its Applications in LLM Training