# Social Media-based User Embedding: A Literature Review

**Authors:** Shimei Pan, Tao Ding

arXiv: 1907.00725 · 2019-07-02

## TL;DR

This paper reviews recent methods for creating low-dimensional social media user embeddings, highlighting their importance for modeling human traits and behaviors when large-scale ground-truth data is scarce.

## Contribution

It provides a comprehensive survey of techniques for learning unified user embeddings from heterogeneous social media data, and discusses current challenges and future research directions.

## Key findings

- Summarizes key methods for social media user embedding.
- Identifies challenges in heterogeneous data integration.
- Suggests future research directions in the field.

## Abstract

Automated representation learning is behind many recent success stories in machine learning. It is often used to transfer knowledge learned from a large dataset (e.g., raw text) to tasks for which only a small number of training examples are available. In this paper, we review recent advances in learning to represent social media users as low-dimensional embeddings. The technology is critical for creating high-performance social media-based models of human traits and behavior, since the ground truth for assessing latent human traits and behavior is often expensive to acquire at a large scale. In this survey, we review typical methods for learning a unified user embedding from heterogeneous user data (e.g., combining social media texts with images to learn a unified user representation). Finally, we point out some current issues and future directions.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.00725/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1907.00725/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/1907.00725/full.md

---
Source: https://tomesphere.com/paper/1907.00725