Linking Tweets with Monolingual and Cross-Lingual News using Transformed Word Embeddings
Aditya Mogadala, Dominik Jung, Achim Rettinger

TL;DR
This paper introduces a transformation framework that aligns word embeddings from tweets and news articles across languages, improving the ability to link social media content with traditional news sources.
Contribution
A transformation framework that narrows the word usage gap between informal tweets and formal news articles, within and across languages, by aligning their word embeddings.
Findings
Improved matching accuracy for monolingual tweets and news articles.
Enhanced cross-lingual comparison capabilities.
Notable improvement over baseline methods.
Abstract
Social media platforms have grown into an important medium to spread information about an event published by the traditional media, such as news articles. Grouping such diverse sources of information that discuss the same topic from varied perspectives provides new insights. But the gap in word usage between informal social media content such as tweets and diligently written content (e.g. news articles) makes such assembling difficult. In this paper, we propose a transformation framework to bridge the word usage gap between tweets and online news articles across languages by leveraging their word embeddings. Using our framework, word embeddings extracted from tweets and news articles are aligned closer to each other across languages, thus facilitating the identification of similarity between news articles and tweets. Experimental results show a notable improvement over baselines for…
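
The abstract does not spell out the transformation itself, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a plain least-squares linear map from a tweet embedding space to a news embedding space, learned from words that have vectors in both. The toy vocabulary, the random vectors, and the helper names (`learn_linear_map`, `doc_vector`, `cosine`) are all hypothetical and exist only for illustration.

```python
import numpy as np


def learn_linear_map(src, tgt):
    """Return W minimizing ||src @ W - tgt||_F (ordinary least squares)."""
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W


def doc_vector(tokens, emb):
    """Average the embeddings of the tokens that are in the vocabulary."""
    return np.mean([emb[t] for t in tokens if t in emb], axis=0)


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy data (assumption): the tweet space is simulated as a rotated,
# slightly noisy copy of the news space.
rng = np.random.default_rng(0)
dim = 4
vocab = ["election", "vote", "storm", "flood", "match", "goal"]
news_emb = {w: rng.normal(size=dim) for w in vocab}
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
tweet_emb = {
    w: news_emb[w] @ rotation + 0.01 * rng.normal(size=dim) for w in vocab
}

# Anchor words with a vector in both spaces are used to fit the map.
X = np.stack([tweet_emb[w] for w in vocab])  # source: tweet space
Y = np.stack([news_emb[w] for w in vocab])   # target: news space
W = learn_linear_map(X, Y)

# Map a tweet into the news space and rank candidate articles by cosine.
tweet = ["storm", "flood"]
articles = {
    "weather": ["storm", "flood"],
    "politics": ["election", "vote"],
    "sports": ["match", "goal"],
}
tweet_vec = doc_vector(tweet, tweet_emb) @ W
scores = {
    name: cosine(tweet_vec, doc_vector(tokens, news_emb))
    for name, tokens in articles.items()
}
best = max(scores, key=scores.get)
```

Under the same assumptions, a cross-lingual variant would fit the map on translation pairs from a bilingual dictionary instead of shared words; whether the paper does this is not stated in the abstract.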
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
