Tube2Vec: Social and Semantic Embeddings of YouTube Channels
Léopaul Boesinger, Manoel Horta Ribeiro, Veniamin Veselovsky, Robert West

TL;DR
This paper introduces machine learning-based embeddings for YouTube channels that capture social sharing behavior and semantic content, providing a scalable alternative to manual annotation for analyzing YouTube data.
Contribution
It presents a novel method for creating social and semantic embeddings of YouTube channels using large-scale data and machine learning, enabling analysis of channels at scale without laborious manual annotation.
Findings
Recommendation embeddings effectively capture social and semantic dimensions.
Social-sharing embeddings correlate better with existing partisan scores than recommendation embeddings do.
Embeddings for 44,000 channels are publicly available.
Abstract
Research using YouTube data often explores social and semantic dimensions of channels and videos. Typically, analyses rely on laborious manual annotation of content and content creators, often found by low-recall methods such as keyword search. Here, we explore an alternative approach, using latent representations (embeddings) obtained via machine learning. Using a large dataset of YouTube links shared on Reddit, we create embeddings that capture social sharing behavior, video metadata (title, description, etc.), and YouTube's video recommendations. We evaluate these embeddings using crowdsourcing and existing datasets, finding that recommendation embeddings excel at capturing both social and semantic dimensions, although social-sharing embeddings better correlate with existing partisan scores. We share embeddings capturing the social and semantic dimensions of 44,000 YouTube channels…
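To make the idea of a social-sharing embedding concrete, here is a minimal sketch and not the authors' pipeline. It assumes a toy channel-by-subreddit count matrix (all channel names and counts are hypothetical) and swaps in log scaling plus truncated SVD as a stand-in for whatever model the paper actually trains. Channels shared in similar communities should end up with similar vectors.

```python
import numpy as np

# Hypothetical toy data: rows are channels, columns are subreddits,
# entries count how often a channel's videos were linked in a subreddit.
channels = ["news_a", "news_b", "gaming_a", "gaming_b"]
share_counts = np.array([
    [40, 35, 1, 0],
    [30, 45, 0, 2],
    [2, 0, 50, 30],
    [0, 1, 25, 60],
], dtype=float)


def social_embeddings(counts, dim=2):
    """Embed each row via log scaling and truncated SVD (illustrative only)."""
    scaled = np.log1p(counts)  # dampen very large share counts
    u, s, _ = np.linalg.svd(scaled, full_matrices=False)
    emb = u[:, :dim] * s[:dim]  # keep the top `dim` components
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    return emb / np.where(norms == 0, 1.0, norms)  # unit-length rows


embeddings = social_embeddings(share_counts)
# Rows are unit length, so the dot product is the cosine similarity.
similarity = embeddings @ embeddings.T

for i, name in enumerate(channels):
    nearest = max((j for j in range(len(channels)) if j != i),
                  key=lambda j: similarity[i, j])
    print(f"{name}: most similar to {channels[nearest]}")
```

In this toy setup the two news channels and the two gaming channels pair up with each other, which is the kind of social structure such embeddings are meant to expose.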
Taxonomy
Topics: Misinformation and Its Impacts · Social Media and Politics · Media Influence and Politics
