The X Types -- Mapping the Semantics of the Twitter Sphere

Ogen Schlachet Drukerman; Einat Minkov

arXiv:2409.14584·cs.CL·September 24, 2024

Open Access

TL;DR

This paper develops a method to assign semantic types to popular Twitter accounts by fine-tuning a transformer model with data aligned to DBpedia and Wikidata, enabling better understanding of the Twitter sphere.

Contribution

It introduces a novel approach to infer semantic types for social media entities using transformer-based embeddings and network information, filling a gap in social media knowledge bases.

Findings

1. High accuracy in semantic type prediction on labeled data

2. Effective application of the model to all entities in the social KB

3. Enhanced entity similarity assessment using semantic embeddings

Abstract

Social networks form a valuable source of world knowledge, where influential entities correspond to popular accounts. Unlike factual knowledge bases (KBs), which maintain a semantic ontology, social media offers no structured semantic information. In this work, we consider a social KB of roughly 200K popular Twitter accounts, which denote entities of interest. We elicit semantic information about those entities. In particular, we associate them with a fine-grained set of 136 semantic types, e.g., determine whether a given entity account belongs to a politician or a musical artist. In the absence of explicit type information in Twitter, we obtain semantic labels for a subset of the accounts via alignment with the KBs of DBpedia and Wikidata. Given the labeled dataset, we fine-tune a transformer-based text encoder to generate semantic embeddings of the entities based on the…
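
As a rough illustration of how semantic embeddings can support entity similarity assessment, the sketch below ranks entities by cosine similarity. It is not the authors' implementation: the entity names, the three-dimensional vectors, and the helper `most_similar` are hypothetical stand-ins, and real embeddings would come from the fine-tuned text encoder.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Similarity is undefined for a zero vector; return 0.0 by convention.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical, hand-made vectors used only for illustration.
# In practice each vector would be produced by the fine-tuned encoder.
toy_embeddings = {
    "politician_a": [0.9, 0.1, 0.0],
    "politician_b": [0.8, 0.2, 0.1],
    "musician_a": [0.1, 0.9, 0.2],
}


def most_similar(name, embeddings):
    """Rank all other entities by cosine similarity to `name`, best first."""
    query = embeddings[name]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != name
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for other, score in most_similar("politician_a", toy_embeddings):
        print(f"{other}: {score:.3f}")
```

With these toy vectors, the two politician entries rank closer to each other than either does to the musician entry, which is the behaviour one would hope for from embeddings that encode semantic type.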

Taxonomy

Topics: Multimedia Communication and Technology

Methods: Sparse Evolutionary Training