How Cute is Pikachu? Gathering and Ranking Pokémon Properties from Data with Pokémon Word Embeddings

Mika Hämäläinen; Khalid Alnajjar; Niko Partanen

arXiv:2108.09546·cs.CL·August 24, 2021

Open Access

TL;DR

This paper explores automatic methods for generating descriptive properties for Pokémon, using word embeddings trained on a Pokémon-specific corpus. It compares several embedding models and expands each Pokémon's property list.

Contribution

It introduces domain-specific word embedding models for Pokémon and evaluates how well they rank adjectives and expand property lists.

Findings

01. Domain-specific models outperform pretrained ones.

02. Word2Vec yields less noise than fastText.

03. Automatic property expansion is feasible but noisy.

Abstract

We present different methods for automatically obtaining descriptive properties for the 151 original Pokémon. We train several different word embedding models on a crawled Pokémon corpus, and use them to automatically rank English adjectives based on how characteristic they are of a given Pokémon. Based on our experiments, it is better to train a model with domain-specific data than to use a pretrained model. Word2Vec produces less noise in the results than the fastText model. Furthermore, we expand the list of properties for each Pokémon automatically. However, none of the methods is spot on and there is a considerable amount of noise in the different semantic models. Our models have been released on Zenodo.
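The ranking step described in the abstract can be illustrated with a minimal sketch. It assumes adjectives are scored by cosine similarity between the adjective's vector and the Pokémon's vector; the abstract does not state the exact scoring function, so this is an assumption rather than the authors' method. The function names and the three-dimensional toy vectors are hypothetical and exist only for illustration; in practice the vectors would come from an embedding model trained on a Pokémon corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def rank_adjectives(target, adjectives, vectors):
    """Sort adjectives by similarity to the target word, most similar first.

    Adjectives missing from the vocabulary are skipped.
    """
    target_vec = vectors[target]
    scored = [
        (adj, cosine(target_vec, vectors[adj]))
        for adj in adjectives
        if adj in vectors
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors (made up, NOT taken from the released models).
toy_vectors = {
    "pikachu": [0.9, 0.8, 0.1],
    "cute": [0.8, 0.9, 0.0],
    "electric": [0.9, 0.3, 0.2],
    "heavy": [0.0, 0.1, 0.9],
}

ranking = rank_adjectives("pikachu", ["heavy", "electric", "cute"], toy_vectors)
for adjective, score in ranking:
    print(f"{adjective}: {score:.3f}")
```

With these made-up vectors, "cute" ranks first, "electric" second, and "heavy" last. Swapping `toy_vectors` for a lookup into a trained model's vocabulary would leave the ranking logic unchanged.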

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Authorship Attribution and Profiling

Methods: fastText