How Cute is Pikachu? Gathering and Ranking Pokémon Properties from Data with Pokémon Word Embeddings
Mika Hämäläinen, Khalid Alnajjar, Niko Partanen

TL;DR
This paper explores automatic methods for generating descriptive properties for Pokémon using word embeddings trained on a Pokémon-specific corpus, comparing different models and expanding property lists.
Contribution
It introduces domain-specific word embedding models for Pokémon and evaluates their effectiveness in ranking adjectives and expanding property lists.
Findings
Domain-specific models outperform pretrained ones.
Word2Vec yields less noise than fastText.
Automatic property expansion is feasible but noisy.
Abstract
We present different methods for automatically obtaining descriptive properties for the 151 original Pokémon. We train several word embedding models on a crawled Pokémon corpus and use them to automatically rank English adjectives by how characteristic they are of a given Pokémon. Based on our experiments, it is better to train a model on domain-specific data than to use a pretrained model. Word2Vec produces less noise in the results than the fastText model. Furthermore, we expand the list of properties for each Pokémon automatically. However, none of the methods is spot on, and there is a considerable amount of noise in the different semantic models. Our models have been released on Zenodo.
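The ranking idea described in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity between a Pokémon's vector and each adjective's vector as the ranking score, which the abstract does not specify. The three-dimensional vectors in `TOY_VECTORS` are invented for illustration and are not taken from the released models, and the function names `cosine` and `rank_adjectives` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_adjectives(target, adjectives, vectors):
    """Sort adjectives by similarity to the target word, most similar first.

    Adjectives missing from the vocabulary are skipped.
    """
    scored = [
        (adj, cosine(vectors[target], vectors[adj]))
        for adj in adjectives
        if adj in vectors
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented toy embeddings (assumption): a real model would supply
# vectors learned from the crawled Pokémon corpus.
TOY_VECTORS = {
    "pikachu": [0.9, 0.8, 0.1],
    "cute": [0.8, 0.9, 0.0],
    "electric": [0.7, 0.3, 0.2],
    "heavy": [0.0, 0.1, 0.9],
}

ranking = rank_adjectives("pikachu", ["heavy", "cute", "electric"], TOY_VECTORS)
for adjective, score in ranking:
    print(f"{adjective}: {score:.3f}")
```

With a trained Word2Vec or fastText model, the dictionary lookup would be replaced by that model's own vector lookup; the sorting step stays the same.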
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Authorship Attribution and Profiling
Methods: fastText
