Loading paper
More Than the Final Answer: Improving Visual Extraction and Logical Consistency in Vision-Language Models | Tomesphere