PointArena: Probing Multimodal Grounding Through Language-Guided Pointing