Automated Word Puzzle Generation via Topic Dictionaries
Balazs Pinter, Gyula Voros, Zoltan Szabo, Andras Lorincz

TL;DR
This paper introduces a versatile method for automatically generating various types of word puzzles from unstructured text data using topic models and semantic similarity measures, eliminating the need for structured datasets.
Contribution
The proposed approach uniquely generates diverse word puzzles directly from unannotated corpora, enabling domain-specific and difficulty-adjustable puzzles without extensive human annotation.
Findings
Successfully generates multiple puzzle types including odd one out and related word puzzles
Creates domain-specific puzzles by substituting corpus data
Produces puzzles with adjustable difficulty levels for different learners
Abstract
We propose a general method for automated word puzzle generation. Contrary to previous approaches in this novel field, the presented method does not rely on highly structured datasets obtained with serious human annotation effort: it needs only an unstructured and unannotated corpus (i.e., a document collection) as input. The method builds upon two additional pillars: (i) a topic model, which induces a topic dictionary from the input corpus (examples include latent semantic analysis, group-structured dictionaries, and latent Dirichlet allocation), and (ii) a semantic similarity measure of word pairs. Our method can (i) automatically generate a large number of proper word puzzles of different types, including the odd one out, choose the related word, and separate the topics puzzles. (ii) It can easily create domain-specific puzzles by replacing the corpus component. (iii) It is also…
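To make the two pillars concrete, the following is a minimal sketch of how an odd one out puzzle might be assembled from a topic dictionary and a word-pair similarity measure. It is an illustration under stated assumptions, not the authors' algorithm: the function name `odd_one_out`, the two thresholds, the toy two-dimensional word vectors, and the use of cosine similarity are all hypothetical stand-ins. In practice the topic dictionary would come from a topic model fitted to the corpus, and the similarity measure would be whichever one is chosen for the task.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def odd_one_out(topic_words, candidates, vectors,
                set_size=3, min_consistency=0.8, max_odd_similarity=0.4):
    """Return (consistent_set, odd_word), or None if no puzzle qualifies.

    Hypothetical selection rule: every pair in the consistent set must be
    at least `min_consistency` similar, and the odd word must be at most
    `max_odd_similarity` similar to every word in the set.
    """
    for group in combinations(topic_words, set_size):
        consistent = all(
            cosine(vectors[a], vectors[b]) >= min_consistency
            for a, b in combinations(group, 2)
        )
        if not consistent:
            continue
        for cand in candidates:
            if cand in group:
                continue
            if all(cosine(vectors[cand], vectors[w]) <= max_odd_similarity
                   for w in group):
                return list(group), cand
    return None

# Toy data (assumed for illustration only): a topic dictionary with two
# topics and hand-made word vectors standing in for a similarity measure.
topic_dictionary = {
    "animals": ["cat", "dog", "horse"],
    "music": ["piano", "violin"],
}
vectors = {
    "cat": (1.0, 0.1), "dog": (0.9, 0.2), "horse": (0.8, 0.1),
    "piano": (0.1, 1.0), "violin": (0.2, 0.9),
}

puzzle = odd_one_out(topic_dictionary["animals"],
                     topic_dictionary["music"], vectors)
strict = odd_one_out(topic_dictionary["animals"],
                     topic_dictionary["music"], vectors,
                     max_odd_similarity=0.0)
```

In this sketch, raising `max_odd_similarity` admits odd words that are closer to the consistent set, which is one plausible way to make a puzzle harder. Whether the paper adjusts difficulty this way is not stated in the text above.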
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques
