Learning to Selectively Transfer: Reinforced Transfer Learning for Deep Text Matching
Chen Qu, Feng Ji, Minghui Qiu, Liu Yang, Zhiyu Min, Haiqing Chen, Jun Huang, W. Bruce Croft

TL;DR
This paper introduces a reinforced data selection method for transfer learning in deep text matching, jointly training a data selector and transfer model to improve performance and reduce negative transfer.
Contribution
It proposes a novel actor-critic based reinforced data selector integrated with transfer learning, optimizing data selection for better transferability and model performance.
Findings
Significant performance improvements in text matching tasks.
Effective selection of source data close to target domain.
Robustness across different settings and tasks.
Abstract
Deep text matching approaches have been widely studied for many applications including question answering and information retrieval systems. To deal with a domain that has insufficient labeled data, these approaches can be used in a Transfer Learning (TL) setting to leverage labeled data from a resource-rich source domain. To achieve better performance, source domain data selection is essential in this process to prevent the "negative transfer" problem. However, the emerging deep transfer models do not fit well with most existing data selection methods, because the data selection policy and the transfer learning model are not jointly trained, leading to sub-optimal training efficiency. In this paper, we propose a novel reinforced data selector to select high-quality source domain data to help the TL model. Specifically, the data selector "acts" on the source domain data to find a…
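To make the selection idea concrete, here is a minimal sketch of a policy-gradient data selector. It is not the authors' implementation. It uses plain REINFORCE with a moving-average baseline as a simpler stand-in for the actor-critic method the paper names. The class `DataSelector`, the functions `toy_reward` and `train`, the two-feature sample encoding, and the reward itself are all hypothetical. In particular, the reward here is a toy proxy (the fraction of kept samples that are "target-like"), assumed in place of whatever feedback signal the transfer model provides in the paper.

```python
import math
import random


def sigmoid(z):
    # Clamp the logit to avoid overflow in math.exp.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


class DataSelector:
    """Bernoulli keep/drop policy over source samples (illustrative only)."""

    def __init__(self, n_features, lr=0.1, seed=0):
        self.w = [0.0] * n_features
        self.lr = lr
        self.baseline = 0.0  # moving-average reward; crude stand-in for a critic
        self.rng = random.Random(seed)

    def keep_prob(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)))

    def act(self, batch):
        # Sample one keep (1) / drop (0) action per source sample.
        return [1 if self.rng.random() < self.keep_prob(x) else 0 for x in batch]

    def update(self, batch, actions, reward):
        # REINFORCE: grad of log pi(a|x) for a Bernoulli-logistic policy is (a - p) * x.
        advantage = reward - self.baseline
        grad = [0.0] * len(self.w)
        for x, a in zip(batch, actions):
            p = self.keep_prob(x)
            for j, xj in enumerate(x):
                grad[j] += (a - p) * xj
        self.w = [wj + self.lr * advantage * gj for wj, gj in zip(self.w, grad)]
        self.baseline = 0.9 * self.baseline + 0.1 * reward


def toy_reward(batch, actions):
    # ASSUMPTION: a toy proxy reward. A sample counts as "target-like" when
    # its first feature is positive; a real system would instead measure the
    # transfer model on target-domain data.
    kept = [x for x, a in zip(batch, actions) if a == 1]
    if not kept:
        return 0.0
    return sum(1 for x in kept if x[0] > 0) / len(kept)


def train(selector, steps=300, batch_size=16):
    for _ in range(steps):
        # Synthetic source batch: feature 0 is +1 (target-like) or -1; feature 1 is a bias term.
        batch = [[selector.rng.choice([-1.0, 1.0]), 1.0] for _ in range(batch_size)]
        actions = selector.act(batch)
        selector.update(batch, actions, toy_reward(batch, actions))
    return selector


selector = train(DataSelector(n_features=2))
print(selector.keep_prob([1.0, 1.0]), selector.keep_prob([-1.0, 1.0]))
```

In this toy setup the selector should learn to assign a higher keep probability to target-like samples than to the others. A joint-training version would interleave these selector updates with updates to the transfer model on the kept samples.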
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
