Dice Loss for Data-imbalanced NLP Tasks
Xiaoya Li, Xiaofei Sun, Yuxian Meng, Junjun Liang, Fei Wu, Jiwei Li

TL;DR
This paper proposes dice loss as a training objective for data-imbalanced NLP tasks, replacing cross entropy to reduce the effect of class imbalance and improve results on several benchmarks.
Contribution
It proposes using dice loss with dynamic weighting to better handle data imbalance in NLP, narrowing the gap between training objectives and evaluation metrics.
Findings
Achieved state-of-the-art results on POS tagging benchmarks.
Attained state-of-the-art results on NER benchmarks.
Demonstrated significant performance improvements on various NLP tasks.
Abstract
Many NLP tasks such as tagging and machine reading comprehension face a severe data-imbalance issue: negative examples significantly outnumber positive examples, and the huge number of background examples (or easy-negative examples) overwhelms the training. The most commonly used cross entropy (CE) criterion is actually an accuracy-oriented objective, and thus creates a discrepancy between training and test: at training time, each training instance contributes equally to the objective function, while at test time the F1 score is more concerned with positive examples. In this paper, we propose to use dice loss in place of the standard cross-entropy objective for data-imbalanced NLP tasks. Dice loss is based on the Sørensen-Dice coefficient or Tversky index, which attaches similar importance to false positives and false negatives, and is more immune to the data-imbalance issue.…
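To make the accuracy-versus-F1 discrepancy concrete, here is a minimal Python sketch. It is not the paper's exact loss: the dynamic weighting mentioned in the summary is not described on this page and is left out. The function names, the `smooth` parameter, and the toy data are assumptions chosen for illustration. The first two functions show that a predictor that labels everything negative scores high accuracy but a Dice coefficient of zero; the third shows a generic "soft" Dice loss that works on predicted probabilities.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances labeled correctly; every instance counts equally."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def dice_coefficient(y_true, y_pred):
    """Sørensen-Dice coefficient 2*TP / (2*TP + FP + FN) for binary labels.

    For hard 0/1 predictions this is the same quantity as the F1 score.
    True negatives do not appear in the formula at all.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention chosen here (an assumption): with no positives anywhere,
    # the prediction is treated as a perfect match.
    return 2 * tp / denom if denom else 1.0


def soft_dice_loss(y_true, probs, smooth=1.0):
    """Generic soft Dice loss: 1 - (2*sum(p*y) + smooth) / (sum(p) + sum(y) + smooth).

    `smooth` is a hypothetical smoothing constant that avoids division by zero.
    """
    intersection = sum(t * p for t, p in zip(y_true, probs))
    total = sum(y_true) + sum(probs)
    return 1.0 - (2.0 * intersection + smooth) / (total + smooth)


# Toy imbalanced data: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
all_negative = [0] * 100

print("accuracy:", accuracy(y_true, all_negative))
print("dice/F1 :", dice_coefficient(y_true, all_negative))

# Soft Dice rewards confident, correct scores on the few positives.
uninformative = [0.01] * 100
informative = [0.9] * 5 + [0.01] * 95
print("soft dice loss (uninformative):", soft_dice_loss(y_true, uninformative))
print("soft dice loss (informative)  :", soft_dice_loss(y_true, informative))
```

Because true negatives never enter the Dice formula, the 95 easy negatives cannot inflate the score the way they inflate accuracy, which is the intuition behind using a Dice-based objective on imbalanced data.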
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Test · Dice Loss
