SciAgentGym: Benchmarking Multi-Step Scientific Tool-use in LLM Agents