On Semantic Word Cloud Representation
Lukas Barth, Stephen Kobourov, Sergey Pupyrev, Torsten Ueckerdt

TL;DR
This paper formalizes the problem of creating semantic-preserving word clouds as the Word Rectangle Adjacency Contact problem, providing algorithms and demonstrating their effectiveness over heuristics.
Contribution
It introduces the WRAC problem, shows that several general variants are NP-hard, and gives polynomial-time algorithms for some variants together with a number of approximation algorithms.
Findings
Polynomial-time algorithms for some WRAC variants.
NP-hardness results for general WRAC variants.
Experimental evidence showing improved results over heuristics.
Abstract
We study the problem of computing semantic-preserving word clouds in which semantically related words are close to each other. While several heuristic approaches have been described in the literature, we formalize the underlying geometric algorithm problem: Word Rectangle Adjacency Contact (WRAC). In this model each word is associated with a rectangle with fixed dimensions, and the goal is to represent semantically related words by ensuring that the two corresponding rectangles touch. We design and analyze efficient polynomial-time algorithms for some variants of the WRAC problem, show that several general variants are NP-hard, and describe a number of approximation algorithms. Finally, we experimentally demonstrate that our theoretically sound algorithms outperform the early heuristics.
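The contact model in the abstract can be illustrated with a small sketch. It is not the authors' code: the `Rect` type, the helper names, and the sample words are hypothetical. It also assumes that two rectangles "touch" when their interiors are disjoint and their boundaries share a segment of positive length, so meeting only at a corner does not count. The abstract itself does not spell out that detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned word box: lower-left corner (x, y), fixed width and height."""
    x: int
    y: int
    w: int
    h: int


def _overlap(a_lo, a_hi, b_lo, b_hi):
    # Length of the intersection of two intervals; negative if they are apart.
    return min(a_hi, b_hi) - max(a_lo, b_lo)


def touches(a, b):
    """Assumed contact rule: shared boundary segment of positive length."""
    dx = _overlap(a.x, a.x + a.w, b.x, b.x + b.w)
    dy = _overlap(a.y, a.y + a.h, b.y, b.y + b.h)
    return (dx == 0 and dy > 0) or (dy == 0 and dx > 0)


def interiors_overlap(a, b):
    """True if the two boxes overlap in area, which a layout must avoid."""
    dx = _overlap(a.x, a.x + a.w, b.x, b.x + b.w)
    dy = _overlap(a.y, a.y + a.h, b.y, b.y + b.h)
    return dx > 0 and dy > 0


def realized_edges(layout, edges):
    """Return the semantic word pairs whose rectangles touch in the layout."""
    return [(u, v) for u, v in edges if touches(layout[u], layout[v])]


# Hypothetical layout and semantic-relatedness edges, for illustration only.
layout = {
    "graph": Rect(0, 0, 4, 2),
    "drawing": Rect(4, 0, 5, 2),
    "cloud": Rect(0, 2, 3, 1),
    "word": Rect(10, 10, 2, 1),
}
edges = [("graph", "drawing"), ("graph", "cloud"),
         ("cloud", "word"), ("drawing", "cloud")]

print(realized_edges(layout, edges))
```

The sketch uses integer coordinates so that the equality test for contact is exact. With floating-point coordinates a small tolerance would be needed instead. Counting realized edges like this only evaluates a given layout. Finding a layout that realizes as many edges as possible is the optimization problem the paper studies.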
Taxonomy
Topics: Natural Language Processing Techniques · Web Data Mining and Analysis · Graph Theory and Algorithms
