RAVEN++: Pinpointing Fine-Grained Violations in Advertisement Videos with Active Reinforcement Reasoning
Deyi Ji, Yuekui Yang, Liqun Liu, Peng Shu, Haiyang Wu, Shaogang Tang, Xudong Chen, Shaoping Ma, Tianrun Chen, Lanyun Zhu

TL;DR
RAVEN++ is a framework that enhances fine-grained violation detection in advertisement videos by integrating active reinforcement learning, hierarchical reasoning, and multi-stage training, leading to improved accuracy and generalization.
Contribution
It introduces active reinforcement learning, hierarchical reasoning, and progressive training to significantly improve fine-grained violation understanding in ad videos.
Findings
Outperforms RAVEN and general-purpose LLMs in violation detection
Demonstrates superior reasoning and generalization on multiple datasets
Effective in both offline and online deployment scenarios
Abstract
Advertising (Ad) is a cornerstone of the digital economy, yet the moderation of video advertisements remains a significant challenge due to their complexity and the need for precise violation localization. While recent advancements, such as the RAVEN model, have improved coarse-grained violation detection, critical gaps persist in fine-grained understanding, explainability, and generalization. To address these limitations, we propose RAVEN++, a novel framework that introduces three key innovations: 1) Active Reinforcement Learning (RL), which dynamically adapts training to samples of varying difficulty; 2) Fine-Grained Violation Understanding, achieved through hierarchical reward functions and reasoning distillation; and 3) Progressive Multi-Stage Training, which systematically combines knowledge injection, curriculum-based passive RL, and active RL. Extensive experiments on both public…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
Taxonomy
TopicsExplainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning
