C2F-Thinker: Coarse-to-Fine Reasoning with Hint-Guided Reinforcement Learning for Multimodal Sentiment Analysis
Miaosen Luo, Zhenhao Yang, Jieshen Long, Jinghu Sun, Yichu Liu, Sijie Mai

TL;DR
C2F-Thinker introduces a structured coarse-to-fine reasoning framework with hint-guided reinforcement learning for improved multimodal sentiment analysis, enhancing interpretability and cross-domain robustness.
Contribution
It proposes a two-stage training pipeline combining supervised fine-tuning with CoT data and hint-guided RL to improve reasoning and generalization in sentiment analysis.
Findings
Achieves competitive results on fine-grained sentiment regression tasks.
Significantly outperforms baselines in cross-domain generalization.
Enhances interpretability through structured reasoning.
Abstract
Multimodal sentiment analysis aims to integrate textual, acoustic, and visual information for deep emotional understanding. Despite the progress of multimodal large language models (MLLMs) via supervised fine-tuning, their "black-box" nature hinders interpretability. While Chain-of-Thought (CoT) reasoning offers a potential remedy, it is constrained by high manual annotation costs and the inherent challenges of reinforcement learning (RL), such as reward sparsity and low exploration efficiency on hard samples. This paper presents C2F-Thinker, a framework that harmonizes coarse-to-fine structured reasoning with hint-guided RL through a two-stage progressive training pipeline. In the first stage, we conduct cold-start supervised fine-tuning using high-quality CoT data distilled from a larger teacher model, consisting of three distinct phases: polarity judgment, intermediate analysis, and…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
