Multimodal Label Relevance Ranking via Reinforcement Learning
Taian Guo, Taolin Zhang, Haoqian Wu, Hanjun Li, Ruizhi Qiao, Xing Sun

TL;DR
This paper introduces LR2PPO, a reinforcement learning approach for multimodal label relevance ranking that captures human preferences and reduces annotation needs, supported by a new benchmark dataset LRMovieNet.
Contribution
The paper proposes LR2PPO, a novel reinforcement learning method for label relevance ranking that incorporates partial order relations and introduces the LRMovieNet dataset for evaluation.
Findings
LR2PPO achieves state-of-the-art performance in label relevance ranking.
The method reduces the need for extensive partial order annotations.
Experimental results validate the effectiveness of LR2PPO on the LRMovieNet dataset.
Abstract
Conventional multi-label recognition methods often focus on label confidence, frequently overlooking the pivotal role of partial order relations consistent with human preference. To address this, we introduce a novel method for multimodal label relevance ranking, named Label Relevance Ranking with Proximal Policy Optimization (LR²PPO), which effectively discerns partial order relations among labels. LR²PPO first utilizes partial order pairs in the target domain to train a reward model, which aims to capture human preference intrinsic to the specific scenario. Furthermore, we meticulously design a state representation and a policy loss tailored for ranking tasks, enabling LR²PPO to boost the performance of the label relevance ranking model and largely reduce the requirement of partial order annotation for transferring to new scenes.…
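The abstract says a reward model is trained on partial order pairs, but this excerpt does not give the loss. As a rough illustration only, the sketch below uses a logistic (Bradley-Terry style) pairwise loss, a common choice in preference learning and an assumption here, not the paper's confirmed formulation. The label names, scores, and function names are hypothetical.

```python
import math

def pairwise_preference_loss(reward_preferred, reward_other):
    """Logistic loss for one partial order pair: -log(sigmoid(margin)).

    Assumed loss form; the paper's actual reward model loss may differ.
    """
    margin = reward_preferred - reward_other
    # Numerically stable evaluation of log(1 + exp(-margin))
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def mean_pair_loss(scores, pairs):
    """Average the loss over (preferred_label, other_label) pairs."""
    total = sum(pairwise_preference_loss(scores[a], scores[b]) for a, b in pairs)
    return total / len(pairs)

# Hypothetical partial order annotations for one clip:
# "car chase" is more relevant than "street", which is more relevant than "sky".
pairs = [("car chase", "street"), ("street", "sky")]

# Hypothetical reward scores that agree with the annotated order...
agreeing_scores = {"car chase": 2.0, "street": 0.5, "sky": -1.0}
# ...and scores that contradict it.
reversed_scores = {"car chase": -1.0, "street": 0.5, "sky": 2.0}

loss_agreeing = mean_pair_loss(agreeing_scores, pairs)
loss_reversed = mean_pair_loss(reversed_scores, pairs)
```

Under this assumed loss, scores that respect the annotated order give a lower value than scores that reverse it, which is the signal a reward model would be trained to follow before it is used to guide the ranking policy.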
Taxonomy
Topics: Fuzzy Logic and Control Systems · Text and Document Classification Technologies · Sentiment Analysis and Opinion Mining
