Synthesis and Evaluation of a Domain-specific Large Data Set for Dungeons & Dragons

Akila Peiris; Nisansa de Silva

arXiv:2212.09080·cs.CL·December 20, 2022·1 cites

TL;DR

This paper introduces the Forgotten Realms Wiki dataset, a large, multi-format collection of Dungeons & Dragons lore, and demonstrates its use in domain-specific natural language generation and similarity benchmarking.

Contribution

It provides the first large-scale, multi-format dataset for D&D, enabling advanced NLP tasks and domain-specific language generation in this fantasy setting.

Findings

01

The dataset was extracted from more than 45,200 wiki articles and is released as 11 sub-data sets in various formats.

02

A pairwise similarity benchmark was established using the dataset.

03

Domain-specific natural language generation was successfully demonstrated.

Abstract

This paper introduces the Forgotten Realms Wiki (FRW) data set and domain-specific natural language generation using FRW, along with related analyses. Forgotten Realms is the de facto default setting of the popular open-ended tabletop fantasy role-playing game Dungeons & Dragons. The data set was extracted from the Forgotten Realms Fandom wiki, which consists of more than 45,200 articles. The FRW data set comprises 11 sub-data sets in a number of formats: raw plain text, plain text annotated by article title, directed link graphs, wiki info-boxes annotated by the wiki article title, a Poincaré embedding of the first link graph, and multiple Word2Vec and Doc2Vec models of the corpus. This is the first data set of this size for the Dungeons & Dragons domain. We then present a pairwise similarity comparison benchmark which utilizes similarity measures. In addition, we perform D&D domain…
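The pairwise similarity comparison mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measures the benchmark uses, so cosine similarity is an assumption here. The three-dimensional vectors are invented toy values, not embeddings taken from the FRW Word2Vec or Doc2Vec models, and the function names are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairwise_similarities(vectors):
    """Score every unordered pair of titles; keys are (title_a, title_b) in sorted order."""
    return {
        (a, b): cosine_similarity(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }


# Toy article vectors (assumed values for illustration only, not FRW data).
toy_vectors = {
    "Waterdeep": [0.9, 0.1, 0.0],
    "Baldur's Gate": [0.8, 0.2, 0.1],
    "Beholder": [0.0, 0.1, 0.9],
}

scores = pairwise_similarities(toy_vectors)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

With these toy values the two city articles score as more similar to each other than either does to the monster article, which is the kind of ordering a pairwise benchmark would check against reference judgments.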

Code & Models

Repositories

https://huggingface.co/Akila/ForgottenRealmsFreeTextGenerator
none · Official

Taxonomy

Topics: Wikis in Education and Collaboration · Digital Games and Media · Topic Modeling