VLM-SubtleBench: How Far Are VLMs from Human-Level Subtle Comparative Reasoning?