Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy