CHIRP: A Fine-Grained Benchmark for Open-Ended Response Evaluation in Vision-Language Models