EmojiNet: An Open Service and API for Emoji Sense Discovery
Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran

TL;DR
EmojiNet is an open API and dataset that links Unicode emoji to their English meanings, context words, and platform-specific senses, supporting emoji understanding and sense disambiguation.
Contribution
This paper introduces EmojiNet, the largest machine-readable emoji sense inventory with web-extracted senses, context words, and platform-specific interpretations, accessible via an open REST API.
Findings
EmojiNet contains 12,904 sense labels for 2,389 emoji.
The dataset links emoji senses to BabelNet definitions.
Applications include emoji sense disambiguation and similarity.
Abstract
This paper presents the release of EmojiNet, the largest machine-readable emoji sense inventory that links Unicode emoji representations to their English meanings extracted from the Web. EmojiNet is a dataset consisting of: (i) 12,904 sense labels over 2,389 emoji, which were extracted from the Web and linked to machine-readable sense definitions in BabelNet; (ii) context words associated with each emoji sense, inferred for each emoji sense definition through word embedding models trained on a Google News corpus and a Twitter message corpus; and (iii) because emoji are presented differently on different platforms, a specification of the most likely platform-based emoji sense for a selected set of emoji. The dataset is hosted as an open service with a REST API and is available at http://emojinet.knoesis.org/. The development of this dataset, evaluation of its…
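To make the three components of the dataset concrete, the following is a minimal Python sketch of how one emoji entry (sense labels, context words per sense, and platform-specific senses) might be represented in memory, and how context words could drive a simple sense disambiguation. Everything here is an assumption for illustration: the class names, field names, and toy data are hypothetical and do not reflect EmojiNet's actual schema or REST API, and the overlap-based scoring is a simple Lesk-style heuristic, not necessarily the method the authors evaluate.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class EmojiSense:
    """One sense of an emoji (hypothetical structure, not EmojiNet's schema)."""
    label: str                                   # sense label, e.g. "laugh"
    pos: str                                     # part of speech of the label
    context_words: set[str] = field(default_factory=set)


@dataclass
class EmojiEntry:
    """One emoji with its senses and per-platform preferred sense (hypothetical)."""
    unicode: str
    name: str
    senses: list[EmojiSense]
    platform_senses: dict[str, str] = field(default_factory=dict)


def disambiguate(entry: EmojiEntry, message: str) -> EmojiSense:
    """Pick the sense whose context words overlap most with the message.

    Ties are resolved in favor of the sense listed first.
    """
    tokens = set(re.findall(r"[a-z']+", message.lower()))
    return max(entry.senses, key=lambda s: len(s.context_words & tokens))


# Toy data invented for this example; not taken from the dataset.
toy = EmojiEntry(
    unicode="U+1F602",
    name="face with tears of joy",
    senses=[
        EmojiSense("laugh", "verb", {"funny", "joke", "hilarious"}),
        EmojiSense("cry", "verb", {"sad", "tears", "upset"}),
    ],
    platform_senses={"hypothetical_platform_a": "laugh"},
)

if __name__ == "__main__":
    print(disambiguate(toy, "That joke was so funny").label)
```

Keeping the context words as a set per sense makes the overlap computation a single set intersection; a real system would more likely compare embedding vectors, which this sketch does not attempt.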
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Digital Communication and Language
