LMR-BENCH: Evaluating LLM Agent's Ability on Reproducing Language Modeling Research