LoopRPT: Reinforcement Pre-Training for Looped Language Models