TL;DR
This paper introduces a method to learn data selection measures for transfer learning using Bayesian Optimization, improving performance across multiple NLP tasks by considering domain similarity and diversity.
Contribution
It proposes a novel approach to learn data selection measures with Bayesian Optimization, outperforming existing domain similarity metrics across various tasks.
Findings
Learned measures outperform existing similarity measures.
Incorporating diversity improves data selection.
Learned measures transfer, to some degree, across models, domains, and tasks.
Abstract
Domain similarity measures can be used to gauge adaptability and select suitable data for transfer learning, but existing approaches define ad hoc measures that are deemed suitable for respective tasks. Inspired by work on curriculum learning, we propose to learn data selection measures using Bayesian Optimization and evaluate them across models, domains and tasks. Our learned measures outperform existing domain similarity measures significantly on three tasks: sentiment analysis, part-of-speech tagging, and parsing. We show the importance of complementing similarity with diversity, and that learned measures are, to some degree, transferable across models, domains, and even tasks.
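The idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a selection measure that is a weighted sum of one similarity feature (negative Jensen-Shannon divergence to the target term distribution) and one diversity feature (type-token ratio), and it uses plain random search as a stand-in for Bayesian Optimization to keep the example dependency-free. All function names, the toy data, and the `coverage` objective are hypothetical; in a real setup the objective would be the validation score of a task model trained on the selected data.

```python
import math
import random
from collections import Counter


def term_dist(tokens, vocab):
    """Add-one smoothed relative term frequencies over a fixed vocabulary."""
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocab) + len(vocab)
    return [(counts[w] + 1) / total for w in vocab]


def js_divergence(p, q):
    """Jensen-Shannon divergence between two distributions (natural log)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def features(tokens, target_dist, vocab):
    """One similarity feature and one diversity feature for an example."""
    similarity = -js_divergence(term_dist(tokens, vocab), target_dist)
    diversity = len(set(tokens)) / len(tokens)  # type-token ratio
    return [similarity, diversity]


def select_top_n(examples, weights, target_dist, vocab, n):
    """Rank examples by the weighted feature sum and keep the n best."""
    def score(tokens):
        feats = features(tokens, target_dist, vocab)
        return sum(w * f for w, f in zip(weights, feats))

    return sorted(examples, key=score, reverse=True)[:n]


def search_weights(examples, target_dist, vocab, n, evaluate, trials=50, seed=0):
    """Random search over feature weights (stand-in for Bayesian Optimization)."""
    rng = random.Random(seed)
    best_weights, best_value = None, float("-inf")
    for _ in range(trials):
        weights = [rng.uniform(-1, 1) for _ in range(2)]
        selected = select_top_n(examples, weights, target_dist, vocab, n)
        value = evaluate(selected)
        if value > best_value:
            best_weights, best_value = weights, value
    return best_weights, best_value


# Toy source-domain examples and a toy target-domain sample (assumed data).
source = [
    "the film was great great great".split(),
    "the plot of the film was dull".split(),
    "the battery life of this phone is poor".split(),
    "screen and battery are great".split(),
]
target = "great film with a dull plot".split()
vocab = sorted({w for s in source for w in s} | set(target))
target_dist = term_dist(target, vocab)


def coverage(selected):
    """Hypothetical objective: share of target word types seen in the selection."""
    seen = {w for s in selected for w in s}
    return len(seen & set(target)) / len(set(target))


best_weights, best_value = search_weights(source, target_dist, vocab, 2, coverage)
```

Swapping the random-search loop for a Bayesian Optimization library would leave the rest of the sketch unchanged, since the optimizer only needs to propose weight vectors and observe the objective value.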
