VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning

Ruifeng Yuan; Chenghao Xiao; Sicong Leng; Jianyu Wang; Long Li; Weiwen Xu; Hou Pong Chan; Deli Zhao; Tingyang Xu; Zhongyu Wei; Hao Zhang; Yu Rong

arXiv:2507.22607 · cs.CV · August 1, 2025

TL;DR

VL-Cogito is a multimodal reasoning model trained with a multi-stage Progressive Curriculum Reinforcement Learning (PCuRL) framework, which guides the model through tasks of gradually increasing difficulty to improve its reasoning across diverse multimodal contexts.

Contribution

The paper presents VL-Cogito, a novel multimodal reasoning model trained with a progressive curriculum RL framework featuring dynamic difficulty adjustment and reasoning path regulation.

Findings

01 VL-Cogito outperforms existing models on multimodal benchmarks.

02 The multi-stage curriculum improves reasoning stability and accuracy.

03 Dynamic mechanisms balance reasoning efficiency and correctness.

Abstract

Reinforcement learning has proven its effectiveness in enhancing the reasoning capabilities of large language models. Recent research efforts have progressively extended this paradigm to multimodal reasoning tasks. Due to the inherent complexity and diversity of multimodal tasks, especially in semantic content and problem formulations, existing models often exhibit unstable performance across various domains and difficulty levels. To address these limitations, we propose VL-Cogito, an advanced multimodal reasoning model trained via a novel multi-stage Progressive Curriculum Reinforcement Learning (PCuRL) framework. PCuRL systematically guides the model through tasks of gradually increasing difficulty, substantially improving its reasoning abilities across diverse multimodal contexts. The framework introduces two key innovations: (1) an online difficulty soft weighting mechanism,…
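The abstract is truncated before it defines the online difficulty soft weighting mechanism, so the following is only an illustrative sketch of one way such a weighting could work, not the authors' formulation. It assumes binary correctness rewards, estimates a prompt's difficulty from the fraction of correct rollouts, and scales group-relative advantages by a Gaussian-shaped weight centred on a per-stage target accuracy. The names `soft_difficulty_weight`, `weighted_advantages`, `STAGE_TARGETS`, and all numeric values are hypothetical.

```python
import math

# Hypothetical per-stage target accuracies (assumption, not from the paper).
# A high rollout accuracy means the prompt is easy for the current policy.
STAGE_TARGETS = {"easy": 0.75, "medium": 0.5, "hard": 0.25}


def soft_difficulty_weight(rollout_accuracy, target_accuracy, width=0.25):
    """Gaussian-shaped weight that peaks at 1.0 when the prompt's rollout
    accuracy equals the stage's target accuracy and decays smoothly away
    from it. The Gaussian form and the width are illustrative assumptions."""
    if not 0.0 <= rollout_accuracy <= 1.0:
        raise ValueError("rollout_accuracy must lie in [0, 1]")
    z = (rollout_accuracy - target_accuracy) / width
    return math.exp(-0.5 * z * z)


def weighted_advantages(rewards, stage):
    """Scale mean-centred advantages for one prompt's rollout group by the
    soft difficulty weight of the current curriculum stage.

    `rewards` is a list of 0/1 correctness rewards, one per sampled response.
    """
    accuracy = sum(rewards) / len(rewards)  # online difficulty estimate
    weight = soft_difficulty_weight(accuracy, STAGE_TARGETS[stage])
    return [weight * (r - accuracy) for r in rewards]


if __name__ == "__main__":
    group = [1, 0, 0, 0]  # one correct response out of four
    for stage in ("easy", "medium", "hard"):
        print(stage, weighted_advantages(group, stage))
```

Under these assumptions, a prompt that the policy rarely solves contributes little gradient signal during the easy stage and progressively more as training moves to later stages; a soft weight keeps every prompt in the batch rather than filtering by a hard difficulty threshold.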

Code & Models

Models

csyrf/VL-Cogito (Hugging Face model; 4 downloads, 5 likes)

Taxonomy

Topics: Innovative Teaching and Learning Methods · Multi-Agent Systems and Negotiation · Natural Language Processing Techniques