# Efficient Social Network Multilingual Classification using Character, POS n-grams and Dynamic Normalization

**Authors:** Carlos-Emiliano González-Gallardo, Juan-Manuel Torres-Moreno, Azucena Montes Rendón, Gerardo Sierra

arXiv: 1702.06467 · 2017-02-22

## TL;DR

This paper introduces a dynamic normalization process for multilingual social network texts (Facebook and Twitter) that improves author profiling on short texts. After normalization, character and POS n-grams capture the stylistic information in each document; experiments with SVM reached up to 90% performance.

## Contribution

The paper presents a dynamic normalization technique, combined with stylistic feature extraction through character and POS n-grams, to improve multilingual author profiling on social media texts.

## Key findings

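- Experiments with SVM reached up to 90% performance.
- Normalizing before feature extraction lets character and POS n-grams capture stylistic cues such as emoticons, character flooding, capital letters, references to other users, hyperlinks, and hashtags.
- The method targets short, informal social network texts (Facebook and Twitter).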

## Abstract

In this paper we describe a dynamic normalization process applied to multilingual social network documents (Facebook and Twitter) to improve the performance of the author profiling task for short texts. After the normalization process, $n$-grams of characters and $n$-grams of POS tags are obtained to extract all the possible stylistic information encoded in the documents (emoticons, character flooding, capital letters, references to other users, hyperlinks, hashtags, etc.). Experiments with SVM showed performance of up to 90%.
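The pipeline described in the abstract (normalize, then count character and POS n-grams) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the summary does not give the actual normalization rules, so the regular expressions, the placeholder labels (`<url>`, `<user>`, `<hashtag>`), and the function names are all assumptions. The POS tags are taken as already produced by some external tagger.

```python
import re
from collections import Counter

# Hypothetical patterns and labels; the paper's real rule set is not
# given in this summary.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
FLOOD_RE = re.compile(r"(.)\1{2,}")  # one character repeated 3+ times


def normalize(text):
    """Replace volatile tokens with fixed labels, keeping the stylistic cue."""
    text = URL_RE.sub("<url>", text)
    text = MENTION_RE.sub("<user>", text)
    text = HASHTAG_RE.sub("<hashtag>", text)
    # Collapse character flooding to exactly three repeats, so that
    # "sooooo" and "soooooooo" yield the same n-grams.
    return FLOOD_RE.sub(r"\1\1\1", text)


def ngrams(seq, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def char_ngram_counts(text, n):
    """Count character n-grams of a (normalized) string."""
    return Counter("".join(g) for g in ngrams(text, n))


def pos_ngram_counts(tags, n):
    """Count n-grams over a list of POS tags from an external tagger."""
    return Counter(ngrams(tags, n))


if __name__ == "__main__":
    post = "Sooooo cool @ana http://t.co/x #fun"
    clean = normalize(post)
    print(clean)
    print(char_ngram_counts(clean, 3).most_common(5))
    print(pos_ngram_counts(["DET", "NOUN", "VERB", "DET", "NOUN"], 2))
```

The resulting counts could then be turned into feature vectors for an SVM classifier. Mapping every link, mention, and hashtag to a single label keeps the fact that the author used one while discarding the specific value, which would otherwise spread across many rare n-grams.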

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.06467/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1702.06467/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/1702.06467/full.md

---
Source: https://tomesphere.com/paper/1702.06467