CubeBench: Diagnosing Interactive, Long-Horizon Spatial Reasoning Under Partial Observations