Boosting RL-Based Visual Reasoning with Selective Adversarial Entropy Intervention
Yang Yu, Zhuangzhuang Chen, Siqi Wang, Lanqing Li, Xiaomeng Li

TL;DR
This paper introduces SaEI, a novel method that enhances RL-based visual reasoning by distorting visual inputs through adversarial entropy intervention, leading to improved policy exploration and reasoning performance.
Contribution
It proposes entropy-guided adversarial sampling and token-selective entropy computation to improve exploration in RL for visual reasoning tasks.
Findings
Significant improvement in reasoning accuracy on multiple datasets.
Enhanced policy exploration due to adversarial entropy intervention.
Effective in both in-domain and out-of-domain scenarios.
Abstract
Recently, reinforcement learning (RL) has become a common choice for enhancing the reasoning capabilities of vision-language models (VLMs). Among existing RL-based finetuning methods, entropy intervention has proven an effective way to improve exploration and thereby policy performance. Notably, most existing studies intervene in entropy only by controlling the update of specific tokens during RL policy optimization. They overlook entropy intervention during RL sampling, which can boost the performance of GRPO by improving the diversity of responses. In this paper, we propose Selective-adversarial Entropy Intervention (SaEI), which enhances policy entropy by distorting the visual input with a token-selective adversarial objective derived from the entropy of sampled responses. Specifically, we first propose entropy-guided adversarial sampling…
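The abstract is truncated before it gives the exact objective, so the following is only a minimal sketch of the general idea it describes: perturb the visual input so that the entropy of selected response tokens goes up. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy linear "policy" (`toy_policy_logits`), the threshold-based token selection rule (`select_tokens`), the finite-difference gradient, and the signed-gradient step bounded by `epsilon`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_entropy(logits):
    """Shannon entropy (in nats) of one token's predictive distribution."""
    return -sum(p * math.log(p) for p in softmax(logits) if p > 0.0)


def toy_policy_logits(x, weights):
    """Hypothetical stand-in for a VLM: weights[t][v] is a vector that is
    dotted with the flattened image x to give the logit of vocabulary
    entry v at token position t."""
    return [[sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in tok]
            for tok in weights]


def select_tokens(x, weights, threshold):
    """Assumed selection rule: keep tokens whose entropy is at least
    `threshold`. The paper's actual rule is not stated in the abstract."""
    return [token_entropy(l) >= threshold
            for l in toy_policy_logits(x, weights)]


def selective_entropy(x, weights, mask):
    """Mean entropy over the token positions selected by `mask`."""
    ents = [token_entropy(l) for l in toy_policy_logits(x, weights)]
    chosen = [e for e, keep in zip(ents, mask) if keep]
    return sum(chosen) / len(chosen) if chosen else 0.0


def adversarial_step(x, weights, mask, epsilon, h=1e-5):
    """One signed-gradient ascent step on the selective entropy.

    The gradient is estimated by central finite differences, and each input
    coordinate moves by at most `epsilon`. The step is kept only if it
    actually raises the objective; otherwise the input is returned unchanged.
    """
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        diff = (selective_entropy(up, weights, mask)
                - selective_entropy(down, weights, mask))
        grad.append(diff / (2 * h))
    # (g > 0) - (g < 0) evaluates to +1, 0, or -1
    x_adv = [xi + epsilon * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]
    if selective_entropy(x_adv, weights, mask) > selective_entropy(x, weights, mask):
        return x_adv
    return list(x)
```

As a usage example, take a one-dimensional "image" `x = [2.0]` and a single token with a two-way vocabulary whose logits are `[x, 0]`, i.e. `weights = [[[1.0], [0.0]]]`. Calling `adversarial_step(x, weights, [True], epsilon=0.5)` moves the input toward `0`, where the two logits are equal and the token distribution is closest to uniform, so the selected token's entropy rises. In an RL pipeline, responses would then be sampled from the perturbed input; that part is outside the scope of this sketch.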
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Ethics and Social Impacts of AI
