A Survey of Text Representation Methods and Their Genealogy
Philipp Siebers, Christian Janiesch, Patrick Zschech

TL;DR
This paper provides a comprehensive survey and taxonomy of modern text representation methods, highlighting their evolution, interrelations, and significance in natural language processing applications.
Contribution
It systematically compiles, arranges in a genealogy, and conceptualizes a taxonomy of current text representation approaches, addressing the rapid evolution in the field.
Findings
Provides a detailed survey of recent text representation methods.
Creates a genealogy illustrating the evolution of these methods.
Develops a taxonomy to classify and understand current approaches.
Abstract
In recent years, with the advent of highly scalable artificial-neural-network-based text representation methods, the field of natural language processing has seen unprecedented growth and sophistication. Building on the distributional hypothesis, it has become possible to distill the complex linguistic information of text into multidimensional dense numeric vectors. As a consequence, text representation methods have been evolving at such a quick pace that the research community is struggling to retain knowledge of the methods and their interrelations. We address this lack of compilation, composition, and systematization in three ways: by providing a survey of current approaches, by arranging them in a genealogy, and by conceptualizing a taxonomy of text representation methods to examine and explain the state of the art. Our research is a valuable guide and reference for artificial…
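The distributional hypothesis mentioned in the abstract (words that occur in similar contexts tend to have similar meanings) can be illustrated with a minimal sketch. This is not a method taken from the paper: the toy corpus, the window size, and the function names `cooccurrence_vectors` and `cosine` are assumptions made for illustration. The sketch builds sparse co-occurrence count vectors, whereas the neural methods the survey covers learn dense vectors; the underlying idea of representing a word by its contexts is the same.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Represent each word by counts of the words within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy corpus: "cat" and "dog" appear in similar contexts.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
]

vectors = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_milk = cosine(vectors["cat"], vectors["milk"])
print(f"cat~dog: {sim_cat_dog:.3f}, cat~milk: {sim_cat_milk:.3f}")
```

On this toy corpus, "cat" and "dog" share most of their contexts, so their similarity comes out higher than that of "cat" and "milk", even though "cat" and "milk" appear in the same sentence.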
