A Survey on Deep Text Hashing: Efficient Semantic Text Retrieval with Binary Representation
Liyang He, Zhenya Huang, Cheng Yang, Rui Li, Zheng Zhang, Kai Zhang, Zhi Li, Qi Liu, Enhong Chen

TL;DR
This survey reviews deep text hashing techniques that enable efficient semantic text retrieval using binary representations, highlighting recent advancements, evaluation methods, applications, and future research directions.
Contribution
The survey categorizes current deep text hashing methods, provides a comprehensive evaluation schema, and discusses integration with large language models as a direction for future improvement.
Findings
Deep text hashing significantly accelerates semantic similarity computation.
Deep neural networks learn compact, semantically rich binary codes.
Evaluation on popular datasets demonstrates the effectiveness of current methods.
Abstract
With the rapid growth of textual content on the Internet, efficient large-scale semantic text retrieval has garnered increasing attention from both academia and industry. Text hashing, which projects original texts into compact binary hash codes, is a crucial method for this task. By using binary codes, the semantic similarity computation for text pairs is significantly accelerated via fast Hamming distance calculations, and storage costs are greatly reduced. With the advancement of deep learning, deep text hashing has demonstrated significant advantages over traditional, data-independent hashing techniques. By leveraging deep neural networks, these methods can learn compact and semantically rich binary representations directly from data, overcoming the performance limitations of earlier approaches. This survey investigates current deep text hashing methods by categorizing them based on…
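The speed-up the abstract attributes to Hamming distance can be illustrated with a minimal sketch. It assumes each text has already been mapped to a fixed-length binary code by some model (not shown here); the function names `pack_bits`, `hamming_distance`, and `top_k`, and the 8-bit toy codes, are hypothetical and chosen only for illustration, not taken from the survey.

```python
from typing import List, Sequence, Tuple


def pack_bits(bits: Sequence[int]) -> int:
    """Pack a sequence of 0/1 values into one integer (first bit is most significant)."""
    code = 0
    for b in bits:
        code = (code << 1) | (1 if b else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count differing bit positions: XOR the codes, then count the set bits."""
    return bin(a ^ b).count("1")


def top_k(query: int, database: Sequence[int], k: int) -> List[Tuple[int, int]]:
    """Return the k closest codes as (distance, index) pairs, nearest first."""
    scored = [(hamming_distance(query, code), idx) for idx, code in enumerate(database)]
    scored.sort()  # sorts by distance, then by index for ties
    return scored[:k]


# Toy 8-bit codes standing in for the output of a learned hash function (assumed).
database = [
    pack_bits([1, 0, 1, 1, 0, 0, 1, 0]),
    pack_bits([1, 0, 1, 0, 0, 0, 1, 0]),
    pack_bits([0, 1, 0, 0, 1, 1, 0, 1]),
]
query = pack_bits([1, 0, 1, 1, 0, 0, 1, 1])

nearest = top_k(query, database, k=2)
print(nearest)  # [(1, 0), (2, 1)]
```

Each comparison reduces to one XOR and one bit count, which is why binary codes are cheap to compare. On storage, as an illustrative calculation only: a 128-bit code occupies 16 bytes, whereas a 768-dimensional float32 embedding (an assumed size, not one stated in the survey) occupies 3,072 bytes.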
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Algorithms and Data Compression · Handwritten Text Recognition Techniques
