Can Vision-Language Models Count? A Synthetic Benchmark and Analysis of Attention-Based Interventions