# Imbalanced Sentiment Classification Enhanced with Discourse Marker

**Authors:** Tao Zhang, Xing Wu, Meng Lin, Jizhong Han, Songlin Hu

arXiv: 1903.11919 · 2019-03-29

## TL;DR

This paper introduces a plug-and-play method that uses transitional discourse markers such as "but", "though", and "while" to sample additional text segments for imbalanced sentiment classification, increasing sample diversity and improving classifier performance.

## Contribution

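The paper proposes a plug-and-play approach that samples discourse segments at transitional discourse markers and validates their sentiment polarity with a pretrained attention-based model. It can serve as an upstream preprocessing step for data augmentation in imbalanced sentiment classification.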

## Key findings

- Consistently improves performance on imbalanced datasets, including highly imbalanced scenarios
- Effective across three public sentiment datasets with several frequently used algorithms
- Integrates easily with oversampling methods
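
To make the oversampling point concrete, here is a minimal sketch of plain random oversampling in standard-library Python. This summary does not say which oversampling method the paper uses, so the `random_oversample` helper, its `seed` parameter, and the idea of applying it after new segments have been added to the training set are all assumptions for illustration.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


def random_oversample(
    texts: Sequence[str], labels: Sequence[str], seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class.

    Hypothetical helper: plain random oversampling, not necessarily the
    oversampling method used in the paper.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_texts, out_labels = list(texts), list(labels)
    for label, count in counts.items():
        # Pool of original samples belonging to this class.
        pool = [t for t, l in zip(texts, labels) if l == label]
        # Draw (with replacement) as many extra samples as the class lacks.
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_texts.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_texts, out_labels


texts = ["great food", "lovely staff", "nice view", "slow service"]
labels = ["pos", "pos", "pos", "neg"]
balanced_texts, balanced_labels = random_oversample(texts, labels)
```

Because random oversampling only repeats existing samples, it adds no new wording by itself. Adding marker-derived segments first gives it a more varied minority pool to draw from.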

## Abstract

Imbalanced data commonly exists in the real world, especially in sentiment-related corpora, making it difficult to train a classifier to distinguish latent sentiment in text data. We observe that humans often express transitional emotion between two adjacent discourses with discourse markers like "but", "though", "while", etc., and that the head discourse and the tail discourse usually indicate opposite emotional tendencies. Based on this observation, we propose a novel plug-and-play method, which first samples discourses according to transitional discourse markers and then validates sentimental polarities with the help of a pretrained attention-based model. Our method increases sample diversity in the first place and can serve as an upstream preprocessing step in data augmentation. We conduct experiments on three public sentiment datasets with several frequently used algorithms. Results show that our method is consistently effective, even in highly imbalanced scenarios, and can easily be integrated with oversampling methods to boost performance on imbalanced sentiment classification.
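
The two steps named in the abstract (sample discourses at transitional markers, then validate polarity) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the marker list contains only the three markers quoted above, the sentence is split at the first marker found, and `toy_predict` is a hypothetical keyword-based stand-in for the pretrained attention-based model. The function names and the `min_confidence` threshold are also invented for this sketch.

```python
import re
from typing import Callable, Iterable, List, Tuple

# Assumed marker list: only the three markers quoted in the abstract.
MARKERS = ("but", "though", "while")
_MARKER_RE = re.compile(r"\b(" + "|".join(MARKERS) + r")\b", re.IGNORECASE)


def split_on_marker(sentence: str) -> List[Tuple[str, str]]:
    """Split a sentence at its first transitional marker.

    Returns [(head, tail)], or [] if there is no marker or one side is empty.
    """
    match = _MARKER_RE.search(sentence)
    if match is None:
        return []
    head = sentence[: match.start()].strip(" ,;")
    tail = sentence[match.end():].strip(" ,;.")
    if not head or not tail:
        return []
    return [(head, tail)]


def toy_predict(text: str) -> Tuple[str, float]:
    """Hypothetical stand-in for the pretrained attention-based validator."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    if words & {"slow", "bad", "rude"}:
        return ("negative", 1.0)
    if words & {"good", "great", "friendly"}:
        return ("positive", 1.0)
    return ("neutral", 0.0)


def sample_segments(
    sentences: Iterable[str],
    predict: Callable[[str], Tuple[str, float]],
    target_label: str,
    min_confidence: float = 0.5,
) -> List[str]:
    """Keep head/tail segments that the validator assigns to target_label."""
    kept = []
    for sentence in sentences:
        for head, tail in split_on_marker(sentence):
            for segment in (head, tail):
                label, confidence = predict(segment)
                if label == target_label and confidence >= min_confidence:
                    kept.append(segment)
    return kept


corpus = ["The food was great, but the service was slow.", "I liked it."]
new_negative_samples = sample_segments(corpus, toy_predict, "negative")
```

In this toy run the only segment kept for the negative class is the tail of the first sentence. In practice `predict` would wrap a trained model, and the threshold would need tuning on held-out data.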

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.11919/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1903.11919/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/1903.11919/full.md

---
Source: https://tomesphere.com/paper/1903.11919