ESPIRE: A Diagnostic Benchmark for Embodied Spatial Reasoning of Vision-Language Models