Extending RLVR to Open-Ended Tasks via Verifiable Multiple-Choice Reformulation