A Data Cartography based MixUp for Pre-trained Language Models
Seo Yeon Park, Cornelia Caragea

TL;DR
This paper introduces TDMixUp, a novel data augmentation method for pre-trained language models that uses training dynamics to select more informative sample pairs, improving performance and calibration.
Contribution
The paper proposes TDMixUp, which uses training dynamics (confidence, variability, and AUM) to choose which samples MixUp combines for NLP tasks, and reports improved calibration and data efficiency.
Findings
Achieves competitive performance with less data
Yields lower calibration error on BERT
Effective in both in-domain and out-of-domain tasks
Abstract
MixUp is a data augmentation strategy where additional samples are generated during training by combining random pairs of training samples and their labels. However, selecting random pairs is potentially not an optimal choice. In this work, we propose TDMixUp, a novel MixUp strategy that leverages Training Dynamics and allows more informative samples to be combined for generating new data samples. Our proposed TDMixUp first measures confidence and variability (Swayamdipta et al., 2020) and Area Under the Margin (AUM) (Pleiss et al., 2020) to identify the characteristics of training samples (e.g., as easy-to-learn or ambiguous samples), and then interpolates these characterized samples. We empirically validate that our method not only achieves competitive performance using a smaller subset of the training data compared with strong baselines, but also yields lower expected calibration…
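The quantities named in the abstract can be illustrated with a minimal sketch. It follows the usual definitions: confidence and variability are the mean and standard deviation, across epochs, of the probability given to the gold label, and AUM is the average margin between the gold-label logit and the largest other logit. The function names, array shapes, and toy values are assumptions made for illustration, and the rule for choosing which characterized samples to pair is not shown because the abstract does not specify it. This is not the authors' implementation.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def training_dynamics(logits, labels):
    """Per-sample confidence, variability, and AUM.

    logits: float array of shape (epochs, n_samples, n_classes), recorded
            once per epoch during training (hypothetical logging format).
    labels: int array of shape (n_samples,) with the gold class indices.
    """
    idx = np.arange(labels.shape[0])
    probs = softmax(logits)

    # Probability assigned to the gold label at each epoch: (epochs, n_samples)
    gold_probs = probs[:, idx, labels]
    confidence = gold_probs.mean(axis=0)   # mean over epochs
    variability = gold_probs.std(axis=0)   # spread over epochs

    # Margin = gold-label logit minus the largest other logit, per epoch
    gold_logits = logits[:, idx, labels]
    masked = logits.copy()
    masked[:, idx, labels] = -np.inf       # hide the gold class
    largest_other = masked.max(axis=2)
    aum = (gold_logits - largest_other).mean(axis=0)
    return confidence, variability, aum

def mixup(x_a, y_a, x_b, y_b, lam):
    """Standard MixUp interpolation of two inputs and their one-hot labels."""
    x_mix = lam * x_a + (1.0 - lam) * x_b
    y_mix = lam * y_a + (1.0 - lam) * y_b
    return x_mix, y_mix

# Toy run: 2 epochs, 2 samples, 2 classes (values are made up).
toy_logits = np.array([[[2.0, 0.0], [0.0, 1.0]],
                       [[4.0, 0.0], [1.0, 0.0]]])
toy_labels = np.array([0, 1])
confidence, variability, aum = training_dynamics(toy_logits, toy_labels)

# Interpolate two feature vectors; lam is usually drawn from Beta(alpha, alpha).
rng = np.random.default_rng(0)
lam = rng.beta(0.4, 0.4)  # alpha = 0.4 is an arbitrary illustrative choice
x_mix, y_mix = mixup(np.array([1.0, 0.0]), np.array([1.0, 0.0]),
                     np.array([0.0, 1.0]), np.array([0.0, 1.0]), lam)
```

In the toy run, the first sample is classified correctly with a growing margin in both epochs, so it has high confidence and a positive AUM. The second sample flips between epochs, which gives it higher variability and an AUM of zero, the kind of pattern usually described as ambiguous.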
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Attention Dropout · Weight Decay · Layer Normalization · Dense Connections · Dropout
