# Improving Context-Aware Semantic Relationships in Sparse Mobile Datasets

**Authors:** Peter Hansel, Nik Marda, William Yin

arXiv: 1812.09650 · 2018-12-27

## TL;DR

This paper introduces new algorithms that incorporate multimodal data and external context, such as time and location, into sentence embeddings to improve semantic similarity detection in sparse mobile datasets, demonstrated on Twitter data.

## Contribution

The paper presents novel algorithms that integrate multimodal features and external context into sentence embeddings, enhancing semantic similarity measures in sparse datasets.

## Key findings

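- Applying PCA with eight components to the embedding space and then appending multimodal features yields the best outcomes among the configurations tested.
- Appending each tweet's time and geolocation gives a considerable improvement over purely text-based approaches for discovering similar tweets.
- The authors suggest the approach can help improve semantic understanding in other settings with sparse text.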

## Abstract

Traditional semantic similarity models often fail to encapsulate the external context in which texts are situated. However, textual datasets generated on mobile platforms can help us build a truer representation of semantic similarity by introducing multimodal data. This is especially important in sparse datasets, making solely text-driven interpretation of context more difficult. In this paper, we develop new algorithms for building external features into sentence embeddings and semantic similarity scores. Then, we test them on embedding spaces on data from Twitter, using each tweet's time and geolocation to better understand its context. Ultimately, we show that applying PCA with eight components to the embedding space and appending multimodal features yields the best outcomes. This yields a considerable improvement over pure text-based approaches for discovering similar tweets. Our results suggest that our new algorithm can help improve semantic understanding in various settings.
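The pipeline the abstract describes (reduce the embedding space with PCA, append external features, then compare tweets) can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the function names, the random stand-in embeddings, the choice of hour/latitude/longitude as context columns, the z-score standardization, and the `context_weight` parameter are all assumptions made for the example. Only the eight-component PCA and the idea of appending time and geolocation come from the summary above.

```python
import numpy as np


def pca_reduce(embeddings, n_components=8):
    """Project embeddings onto their top principal components via SVD."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def standardize(features):
    """Z-score each column; constant columns are left at zero."""
    std = features.std(axis=0)
    std[std == 0] = 1.0
    return (features - features.mean(axis=0)) / std


def augment(embeddings, context, n_components=8, context_weight=1.0):
    """Append scaled context columns to PCA-reduced text embeddings.

    Standardizing both parts and weighting the context is an assumption
    of this sketch, not a detail taken from the paper.
    """
    reduced = standardize(pca_reduce(embeddings, n_components))
    scaled_context = context_weight * standardize(context)
    return np.hstack([reduced, scaled_context])


def cosine_similarity_matrix(x):
    """Pairwise cosine similarity between the rows of x."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = x / norms
    return unit @ unit.T


# Hypothetical stand-in data: 50 tweets with 300-dimensional text embeddings.
rng = np.random.default_rng(0)
text_embeddings = rng.normal(size=(50, 300))
hour = rng.uniform(0, 24, size=50)          # time of day, in hours
latitude = rng.uniform(-90, 90, size=50)
longitude = rng.uniform(-180, 180, size=50)
context = np.column_stack([hour, latitude, longitude])

augmented = augment(text_embeddings, context, n_components=8)
similarity = cosine_similarity_matrix(augmented)

# Index of the most similar other tweet for tweet 0.
scores = similarity[0].copy()
scores[0] = -np.inf
nearest_to_first = int(np.argmax(scores))
```

In this sketch each tweet ends up as an 11-dimensional vector (eight principal components plus three context columns). Raising `context_weight` makes time and location count for more relative to the text; how the paper balances the two is not stated in this summary.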

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.09650/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1812.09650/full.md

## References

10 references — full list in the complete paper: https://tomesphere.com/paper/1812.09650/full.md

---
Source: https://tomesphere.com/paper/1812.09650