Judging the Judges: Can Large Vision-Language Models Fairly Evaluate Chart Comprehension and Reasoning?