# Combination of Domain Knowledge and Deep Learning for Sentiment Analysis of Short and Informal Messages on Social Media

**Authors:** Khuong Vo, Tri Nguyen, Dang Pham, Mao Nguyen, Minh Truong, Trung Mai, Tho Quan

arXiv: 1902.06050 · 2019-12-23

## TL;DR

This paper enhances sentiment analysis of short, informal social media messages by integrating domain knowledge with deep learning, employing data augmentation, transfer learning, and multitask learning for improved accuracy.

## Contribution

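The paper extends the authors' earlier model, which combined Convolutional Neural Networks with domain knowledge, by adding negation-based data augmentation, transfer learning for word embeddings, combined word-level and character-level embeddings, and multitask learning that attaches domain knowledge rules to the training process.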

## Key findings

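- The enhanced architecture yields a significant performance improvement in experiments on real social media datasets.
- Combining word-level and character-level embeddings with transferred word embeddings is aimed specifically at short, informal messages.
- Domain knowledge enters training through negation-based data augmentation and through multitask learning that attaches domain rules.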

## Abstract

Sentiment analysis has recently emerged as one of the major natural language processing (NLP) tasks in many applications. The task is increasingly crucial because social media channels (e.g. social networks or forums) have become significant sources for brands to observe user opinions about their products. However, real data obtained from social media contains a high volume of short and informal messages posted by users. This kind of data is difficult for existing approaches to handle, especially those based on deep learning. In this paper, we propose an approach to this problem. The work extends our previous work, in which we combined the typical deep learning technique of Convolutional Neural Networks with domain knowledge. That combination was used to obtain additional training data through augmentation and to define a more reasonable loss function. In this work, we further improve our architecture with several substantial enhancements: negation-based data augmentation, transfer learning for word embeddings, the combination of word-level and character-level embeddings, and a multitask learning technique that attaches domain knowledge rules to the learning process. These enhancements, which specifically target short and informal messages, yield a significant improvement in performance in experiments on real datasets.
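To make the idea of negation-based data augmentation concrete, here is a minimal sketch, not the authors' implementation. It assumes a tiny hypothetical sentiment lexicon (`SENTIMENT_WORDS`), a single English negator (`NEGATOR`), and pre-tokenized input; the paper's actual rules and domain knowledge are not described in this summary. The sketch inserts a negator before a sentiment word whose polarity matches the message label and flips the label of the new example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon, for illustration only (not taken from the paper).
SENTIMENT_WORDS: Dict[str, str] = {
    "good": "positive",
    "great": "positive",
    "bad": "negative",
    "slow": "negative",
}
FLIP: Dict[str, str] = {"positive": "negative", "negative": "positive"}
NEGATOR = "not"  # assumed single negation word


def negate_example(tokens: List[str], label: str) -> List[Tuple[List[str], str]]:
    """Build augmented (tokens, label) pairs by negating one sentiment word each."""
    augmented: List[Tuple[List[str], str]] = []
    for i, tok in enumerate(tokens):
        polarity = SENTIMENT_WORDS.get(tok.lower())
        # Only negate words whose lexicon polarity agrees with the message label.
        if polarity is None or polarity != label:
            continue
        # Skip words that are already directly preceded by the negator.
        if i > 0 and tokens[i - 1].lower() == NEGATOR:
            continue
        new_tokens = tokens[:i] + [NEGATOR] + tokens[i:]
        augmented.append((new_tokens, FLIP[label]))
    return augmented


if __name__ == "__main__":
    for new_tokens, new_label in negate_example(["the", "camera", "is", "good"], "positive"):
        print(" ".join(new_tokens), "->", new_label)
```

In this sketch, messages with labels outside the two polarities (for example a neutral class) produce no augmented examples, since no lexicon entry matches their label. A realistic system would need a richer lexicon and handling of informal spellings, which are out of scope here.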

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.06050/full.md

## Figures

18 figures with captions in the complete paper: https://tomesphere.com/paper/1902.06050/full.md

## References

51 references — full list in the complete paper: https://tomesphere.com/paper/1902.06050/full.md

---
Source: https://tomesphere.com/paper/1902.06050