Dice Loss for Data-imbalanced NLP Tasks

Xiaoya Li; Xiaofei Sun; Yuxian Meng; Junjun Liang; Fei Wu; Jiwei Li

arXiv:1911.02855 · cs.CL · September 1, 2020 · 32 cites

Open Access 4 Repos 1 Models

TL;DR

This paper introduces dice loss as a novel training objective for data-imbalanced NLP tasks, effectively addressing class imbalance issues and improving performance across multiple benchmarks.

Contribution

It proposes using dice loss with dynamic weighting to better handle data imbalance in NLP, narrowing the gap between training objectives and evaluation metrics.

Findings

01. Achieved state-of-the-art results on POS tagging benchmarks.

02. Attained state-of-the-art results on NER benchmarks.

03. Demonstrated significant performance improvements on various NLP tasks.

Abstract

Many NLP tasks, such as tagging and machine reading comprehension, face a severe data imbalance problem: negative examples significantly outnumber positive examples, and the huge number of background examples (or easy-negative examples) overwhelms training. The most commonly used criterion, cross entropy (CE), is an accuracy-oriented objective and thus creates a discrepancy between training and test: at training time, each training instance contributes equally to the objective function, while at test time the F1 score is more concerned with positive examples. In this paper, we propose to use dice loss in place of the standard cross-entropy objective for data-imbalanced NLP tasks. Dice loss is based on the Sørensen-Dice coefficient or Tversky index, which attaches similar importance to false positives and false negatives, and is more immune to the data-imbalance issue.…
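The contrast the abstract draws between cross entropy and dice loss can be illustrated with a small numerical sketch. The code below is not the authors' formulation: it assumes a common "soft" Dice variant computed over a whole batch of binary labels, and the function names `cross_entropy` and `soft_dice_loss`, the `smooth` constant, and the toy data are all hypothetical. It compares a model that predicts "negative" everywhere with one that actually finds the rare positives.

```python
import numpy as np


def cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross entropy: every example contributes equally."""
    probs = np.clip(probs, eps, 1.0 - eps)
    per_example = labels * np.log(probs) + (1 - labels) * np.log(1 - probs)
    return float(-np.mean(per_example))


def soft_dice_loss(probs, labels, smooth=1.0):
    """1 - soft Dice coefficient for the positive class.

    `smooth` is an illustrative smoothing constant (an assumption,
    not a value taken from the paper).
    """
    intersection = np.sum(probs * labels)
    denominator = np.sum(probs) + np.sum(labels)
    return float(1.0 - (2.0 * intersection + smooth) / (denominator + smooth))


# Toy imbalanced batch: 1000 tokens, only 10 of them positive.
labels = np.zeros(1000)
labels[:10] = 1.0

# "Lazy" model: confidently predicts negative for every token.
lazy = np.full(1000, 0.01)

# "Good" model: same on negatives, but assigns 0.9 to the true positives.
good = lazy.copy()
good[:10] = 0.9

for name, probs in [("lazy", lazy), ("good", good)]:
    print(name, cross_entropy(probs, labels), soft_dice_loss(probs, labels))
```

In this toy setting the lazy model already has a small cross entropy because the 990 easy negatives dominate the average, while its dice loss stays close to 1 since it recovers almost none of the positives. With hard 0/1 predictions the Dice coefficient equals 2TP / (2TP + FP + FN), which is the F1 score, so a Dice-based objective tracks the evaluation metric more directly.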

Code & Models

Repositories

Models

🤗
Derify/ChemMRL-alpha
model· 3 dl

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Test · Dice Loss