TL;DR
This paper introduces Subsets of Interest (SOI), a framework for analyzing the training dynamics of pretrained language models in multi-task, multi-lingual, and multi-source settings, and shows how these training strategies affect model robustness and performance.
Contribution
It proposes the SOI categorization together with SOI transition heatmaps and dataset cartography visualizations, giving new insight into training behavior, and an SOI-based subset selection approach for two-stage fine-tuning that improves model performance.
Findings
Multi-source learning enhances out-of-distribution robustness by up to 7%.
Multi-task learning yields mixed results, with gains in similar task pairs.
Two-stage fine-tuning with SOI-based subset selection improves performance.
Abstract
This work investigates the impact of multi-task, multi-lingual, and multi-source learning approaches on the robustness and performance of pretrained language models. To enhance this analysis, we introduce Subsets of Interest (SOI), a novel categorization framework that identifies six distinct learning behavior patterns during training, including forgettable examples, unlearned examples, and always correct examples. Through SOI transition heatmaps and dataset cartography visualization, we analyze how examples shift between these categories when transitioning from single-setting to multi-setting configurations. We perform comprehensive experiments across three parallel comparisons: multi-task vs. single-task learning using English tasks (entailment, paraphrase, sentiment), multi-source vs. single-source learning using sentiment analysis datasets, and multi-lingual vs. single-lingual…
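To make the categorization idea concrete, here is a minimal sketch of how per-example training histories might be sorted into the three categories named in the abstract, and how transitions between a single-setting and a multi-setting run could be counted for a heatmap. The category definitions are assumptions based on the usual meaning of these terms (a forgetting event is a correct prediction followed by an incorrect one). The abstract does not give the paper's exact definitions or name the other three categories, so those are lumped into a hypothetical `other` bucket. The function names `categorize` and `transition_counts` are illustrative and not from the paper.

```python
from collections import Counter


def categorize(history):
    """Assign one example to a category from its per-epoch correctness.

    history: sequence of bools, one per epoch (True = predicted correctly).
    Category rules here are assumptions, not the paper's definitions.
    """
    if len(history) == 0:
        raise ValueError("history must contain at least one epoch")
    if all(history):
        return "always_correct"
    if not any(history):
        return "unlearned"
    # Forgetting event: correct at epoch t, incorrect at epoch t + 1.
    if any(prev and not curr for prev, curr in zip(history, history[1:])):
        return "forgettable"
    # Placeholder for the categories the abstract does not spell out.
    return "other"


def transition_counts(single_run, multi_run):
    """Count (single-setting category, multi-setting category) pairs.

    Both arguments map example ids to per-epoch correctness histories.
    Only examples present in both runs are counted.
    """
    shared_ids = single_run.keys() & multi_run.keys()
    return Counter(
        (categorize(single_run[i]), categorize(multi_run[i])) for i in shared_ids
    )


if __name__ == "__main__":
    single = {"ex1": [True, False, True], "ex2": [False, False, False]}
    multi = {"ex1": [True, True, True], "ex2": [False, False, False]}
    for (src, dst), n in sorted(transition_counts(single, multi).items()):
        print(f"{src} -> {dst}: {n}")
```

The resulting counts can be arranged as a matrix, with rows for the single-setting category and columns for the multi-setting category, and plotted as a heatmap.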