Going Beyond T-SNE: Exposing whatlies in Text Embeddings
Vincent D. Warmerdam, Thomas Kober, Rachael Tatman

TL;DR
whatlies is an open source toolkit that enables visual inspection and analysis of word and sentence embeddings across multiple backends, combining vector arithmetic with visualization tools for better interpretability.
Contribution
It introduces a unified, extensible API and visualization suite for exploring embeddings, supporting various backends and dimensionality reduction techniques.
Findings
Enhanced interpretability of embeddings through visualization
Support for multiple embedding backends and techniques
Interactive visualizations easily shareable via Jupyter notebooks
Abstract
We introduce whatlies, an open source toolkit for visually inspecting word and sentence embeddings. The project offers a unified and extensible API with current support for a range of popular embedding backends including spaCy, tfhub, huggingface transformers, gensim, fastText and BytePair embeddings. The package combines a domain specific language for vector arithmetic with visualisation tools that make exploring word embeddings more intuitive and concise. It offers support for many popular dimensionality reduction techniques as well as many interactive visualisations that can either be statically exported or shared via Jupyter notebooks. The project documentation is available from https://rasahq.github.io/whatlies/.
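The idea of a domain specific language for vector arithmetic can be illustrated with a small, self-contained sketch. This is not the whatlies implementation or its API: the ToyEmbedding class, its operator choices (`>>` for projection onto another vector, `|` for removing the component along another vector) and the three-dimensional vectors are all hypothetical, made up purely to show how operator overloading keeps embedding expressions concise.

```python
import math


class ToyEmbedding:
    """Hypothetical named vector; a stand-in, not the whatlies class."""

    def __init__(self, name, vector):
        self.name = name
        self.vector = [float(v) for v in vector]

    def dot(self, other):
        return sum(a * b for a, b in zip(self.vector, other.vector))

    def __add__(self, other):
        summed = [a + b for a, b in zip(self.vector, other.vector)]
        return ToyEmbedding(f"({self.name} + {other.name})", summed)

    def __sub__(self, other):
        diff = [a - b for a, b in zip(self.vector, other.vector)]
        return ToyEmbedding(f"({self.name} - {other.name})", diff)

    def __rshift__(self, other):
        # Project self onto other: keep only the component along `other`.
        scale = self.dot(other) / other.dot(other)
        projected = [scale * b for b in other.vector]
        return ToyEmbedding(f"({self.name} >> {other.name})", projected)

    def __or__(self, other):
        # Remove the component of self that lies along `other`.
        projection = self >> other
        rest = [a - p for a, p in zip(self.vector, projection.vector)]
        return ToyEmbedding(f"({self.name} | {other.name})", rest)

    def cosine(self, other):
        return self.dot(other) / math.sqrt(self.dot(self) * other.dot(other))


# Made-up 3-d vectors, chosen only so the arithmetic is easy to follow.
king = ToyEmbedding("king", [0.9, 0.8, 0.1])
queen = ToyEmbedding("queen", [0.9, 0.1, 0.9])
man = ToyEmbedding("man", [0.1, 0.9, 0.0])
woman = ToyEmbedding("woman", [0.1, 0.1, 0.9])

# Arithmetic reads like the maths it represents, and names track the expression.
analogy = king - man + woman
print(analogy.name, round(analogy.cosine(queen), 3))

# Removing the "man" direction from "king" leaves a vector orthogonal to "man".
king_without_man = king | man
print(king_without_man.name, abs(king_without_man.dot(man)) < 1e-9)
```

Because every operation returns a new named vector, expressions compose freely and each result carries a label that a plotting layer could display directly. That is the kind of interplay between arithmetic and visualisation the abstract describes.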
Taxonomy
Topics: Topic Modeling · Authorship Attribution and Profiling · Hate Speech and Cyberbullying Detection
Methods: fastText
