CARE What Fails: Contrastive Anchored-REflection for Verifiable Multimodal Reasoning

Yongxin Wang; Zhicheng Yang; Meng Cao; Mingfei Han; Haokun Lin; Yingying Zhu; Xiaojun Chang; Xiaodan Liang

arXiv:2512.19554 · cs.LG · March 17, 2026

Open Access

TL;DR

CARE is a failure-centric post-training framework for multimodal reasoning that uses errors as supervision, improving accuracy and training stability on visual-reasoning benchmarks.

Contribution

The paper proposes CARE, a method that combines an anchored-contrastive objective with Reflection-Guided Resampling (RGR), a one-shot structured self-repair step, to learn from failed rollouts in multimodal reasoning tasks.

Findings

01. Improves accuracy by 4.6 points on visual-reasoning benchmarks with Qwen2.5-VL-7B.

02. Achieves state-of-the-art results on MathVista and MMMU-Pro.

03. Makes training smoother and extracts more learning signal from failed rollouts.

Abstract

Group-relative reinforcement learning with verifiable rewards (RLVR) often wastes the most informative data it already has: the failures. When all rollouts are wrong, gradients stall; when one happens to be correct, the update usually ignores why the others are close-but-wrong, and credit can be misassigned to spurious chains. We present CARE (Contrastive Anchored REflection), a failure-centric post-training framework for multimodal reasoning that turns errors into supervision. CARE combines: (i) an anchored-contrastive objective that forms a compact subgroup around the best rollout and a set of semantically proximate hard negatives, performs within-subgroup z-score normalization with negative-only scaling, and includes an all-negative rescue to prevent zero-signal batches; and (ii) Reflection-Guided Resampling (RGR), a one-shot structured self-repair that rewrites a representative…
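
The stalled-gradient problem and the anchored subgroup idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes binary verifiable rewards (1 = correct, 0 = wrong) and a precomputed similarity score between each rollout and the best one. The function names, the parameters `k` and `neg_scale`, and the reading of "negative-only scaling" as a multiplier applied only to the negatives' advantages are all hypothetical. The all-negative rescue is only flagged, because the excerpt does not say how it works.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Plain group-relative baseline: z-score each reward within its rollout group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def anchored_subgroup_advantages(rewards, similarity_to_anchor,
                                 k=2, neg_scale=0.5, eps=1e-6):
    """Hypothetical anchored variant (illustration only, not the paper's method).

    Picks the best rollout as the anchor, keeps the k failed rollouts most
    similar to it as hard negatives, z-scores rewards inside that subgroup,
    and scales only the negatives' advantages by neg_scale (an assumed
    interpretation of "negative-only scaling").
    Returns None when no rollout is correct, so the caller can apply a
    rescue step; that step is not specified in the excerpt.
    """
    n = len(rewards)
    anchor = max(range(n), key=lambda i: rewards[i])
    if rewards[anchor] <= 0:
        return None  # all-negative group: needs a rescue step

    # Hard negatives: lower-reward rollouts closest to the anchor.
    negatives = [i for i in range(n) if rewards[i] < rewards[anchor]]
    hard = sorted(negatives, key=lambda i: similarity_to_anchor[i], reverse=True)[:k]

    subgroup = [anchor] + hard
    sub_rewards = [rewards[i] for i in subgroup]
    mu, sigma = mean(sub_rewards), pstdev(sub_rewards)

    advantages = [0.0] * n  # rollouts outside the subgroup get no signal
    for i in subgroup:
        z = (rewards[i] - mu) / (sigma + eps)
        advantages[i] = z if i == anchor else neg_scale * z
    return advantages


if __name__ == "__main__":
    # All rollouts wrong: every group-relative advantage is zero.
    print(group_relative_advantages([0, 0, 0, 0]))
    # One correct rollout with made-up similarity scores to the anchor.
    print(anchored_subgroup_advantages([1, 0, 0, 0], [1.0, 0.9, 0.2, 0.8]))
```

With identical rewards the plain baseline returns all zeros, which is the zero-signal case the abstract describes. The anchored variant restricts normalization to the best rollout and its nearest failures, so the close-but-wrong rollouts carry the negative signal while distant ones are left out.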

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning