SketchVL: Policy Optimization via Fine-Grained Credit Assignment for Chart Understanding and More
Muye Huang, Lingling Zhang, Yifei Li, Yaqiang Wu, Jun Liu

TL;DR
SketchVL introduces a fine-grained reinforcement learning approach that uses intermediate reasoning annotations, significantly improving automated chart understanding through better credit assignment during training.
Contribution
It proposes SketchVL, a novel multimodal language model trained with FinePO, a new RL algorithm for fine-grained credit assignment at each reasoning step.
Findings
Achieves a 7.23% performance improvement over base models.
Effectively aligns step-level reasoning with FinePRM scores.
Demonstrates robustness across chart, natural image, and math datasets.
Abstract
Charts are high-density visual carriers of complex data and a medium for information extraction and analysis. Due to the need for precise and complex visual reasoning, automated chart understanding poses a significant challenge to existing Multimodal Large Language Models (MLLMs). Many MLLMs trained with reinforcement learning (RL) face the challenge of credit assignment. Their advantage estimation, typically performed at the trajectory level, cannot distinguish between correct and incorrect reasoning steps within a single generated response. To address this limitation, we introduce SketchVL, a novel MLLM that is optimized with FinePO, a new RL algorithm designed for fine-grained credit assignment within each trajectory. SketchVL's methodology involves drawing its intermediate reasoning steps as markers on the image and feeding the annotated image back to itself, creating a robust,…
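To make the credit-assignment problem in the abstract concrete, here is a minimal sketch contrasting a trajectory-level advantage with a step-level one. It is not the FinePO algorithm, whose details are not given here. The group-normalized baseline, the per-step scores (standing in for what a process reward model might output), the `beta` weight, and both function names are assumptions made purely for illustration.

```python
from statistics import mean, pstdev


def trajectory_advantages(rewards, eps=1e-8):
    """One advantage per response: reward normalized within its sampled group.

    Every step inside a response inherits the same value, so correct and
    incorrect steps are indistinguishable.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def step_advantages(traj_adv, step_scores, beta=1.0):
    """Hypothetical fine-grained variant: shift each step's advantage by how
    far its score sits from the response's mean step score.

    The shifts sum to zero, so the mean over steps stays equal to traj_adv.
    """
    centre = mean(step_scores)
    return [traj_adv + beta * (s - centre) for s in step_scores]


# Four sampled responses to one question, with binary outcome rewards.
group_rewards = [1.0, 0.0, 1.0, 0.0]
traj_adv = trajectory_advantages(group_rewards)

# Assumed per-step scores for the first response (three reasoning steps).
first_response_steps = [0.9, 0.1, 0.8]
fine_adv = step_advantages(traj_adv[0], first_response_steps)
```

In this toy setup the first response receives a single positive advantage at the trajectory level, while the step-level variant gives its low-scoring second step a smaller advantage than its neighbours.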
Taxonomy
Topics: Multimodal Machine Learning Applications · Data Visualization and Analytics · Explainable Artificial Intelligence (XAI)
