REC-RL: Referring expression counting via Gaussian and range-based reward optimization
Hui Liu, Yunlai Teng, Kunlong Bai, Pengfei Qi, Haotian Yan, Liang Li, Junlan Feng

TL;DR
REC-RL is a reinforcement learning framework that improves referring expression counting by explicitly optimizing the visual reasoning process, using Gaussian and range-based accuracy rewards and treating intermediate focus prediction as internal decision-making.
Contribution
It introduces a think-range-answer paradigm with Gaussian and range-based rewards, enhancing reasoning quality without extra annotations.
Findings
Consistent improvements over strong baselines.
Robust generalization across benchmarks.
Effective optimization of visual reasoning process.
Abstract
Referring expression counting (REC) is an intention-driven task that requires context-aware visual reasoning. While recent vision-language models incorporate language for visual understanding, most existing REC methods rely on rule-based reinforcement learning with rewards focused primarily on final accuracy, overlooking the quality of intermediate reasoning. We propose REC-RL, a reinforcement learning framework that introduces a think-range-answer paradigm to explicitly optimize the visual reasoning process. REC-RL employs Group Relative Policy Optimization and two lightweight rewards: an accuracy reward that combines range-based interval supervision with Gaussian-based precision guidance, and a format reward that enforces structured outputs. By modeling intermediate focus prediction as internal decision-making, REC-RL avoids additional annotations and better aligns with human…
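To make the two rewards concrete, here is a minimal Python sketch of how a format reward, a range-plus-Gaussian accuracy reward, and the group-relative advantage used by Group Relative Policy Optimization could fit together. It is an illustration only, not the authors' implementation: the abstract does not give the output syntax or the reward formulas, so the tag names (`<think>`, `<range>`, `<answer>`), the `lo-hi` range format, the parameters `sigma` and `w_range`, and all function names are assumptions.

```python
import math
import re
from statistics import mean, pstdev

# Hypothetical output syntax: the abstract names the think-range-answer
# paradigm but does not specify tags, so these are assumptions.
PATTERN = re.compile(
    r"^\s*<think>(?P<think>.*?)</think>\s*"
    r"<range>\s*(?P<lo>\d+)\s*-\s*(?P<hi>\d+)\s*</range>\s*"
    r"<answer>\s*(?P<ans>\d+)\s*</answer>\s*$",
    re.DOTALL,
)


def format_reward(text):
    """Return 1.0 if the output follows the assumed structure, else 0.0."""
    return 1.0 if PATTERN.match(text) else 0.0


def accuracy_reward(text, target, sigma=1.0, w_range=0.5):
    """Mix a range term with a Gaussian term (assumed weighting).

    range term: 1.0 if the predicted interval contains the true count.
    Gaussian term: exp(-(answer - target)^2 / (2 * sigma^2)), which is
    1.0 for an exact count and decays smoothly as the error grows.
    """
    m = PATTERN.match(text)
    if m is None:
        return 0.0  # unparseable outputs earn no accuracy reward
    lo, hi, ans = int(m["lo"]), int(m["hi"]), int(m["ans"])
    range_term = 1.0 if lo <= target <= hi else 0.0
    gauss_term = math.exp(-((ans - target) ** 2) / (2 * sigma ** 2))
    return w_range * range_term + (1.0 - w_range) * gauss_term


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize rewards within one group of sampled outputs."""
    mu = mean(rewards)
    sd = pstdev(rewards)
    return [(r - mu) / (sd + eps) for r in rewards]


if __name__ == "__main__":
    samples = [
        "<think>two red cars, one partly hidden</think><range>2-4</range><answer>3</answer>",
        "<think>only the cars on the left</think><range>1-2</range><answer>2</answer>",
        "3",  # ignores the required structure
    ]
    target = 3
    rewards = [format_reward(s) + accuracy_reward(s, target) for s in samples]
    print(rewards)
    print(group_relative_advantages(rewards))
```

The Gaussian term gives partial credit for near-miss counts, while the range term rewards an interval that brackets the true count. As written, the range term does not penalize overly wide intervals; a practical reward would likely need some width constraint, which this sketch leaves out.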
