Short Text Hashing Improved by Integrating Multi-Granularity Topics and Tags

Jiaming Xu; Bo Xu; Guanhua Tian; Jun Zhao; Fangyuan Wang; Hongwei Hao

arXiv:1503.02801·cs.IR·April 14, 2015

TL;DR

This paper introduces HMTT, a novel short text hashing method that integrates multi-granularity topics and tags to improve semantic similarity preservation, outperforming existing methods on various datasets.

Contribution

The paper proposes a unified approach combining multi-granularity topic selection and tag exploitation for enhanced short text hashing.

Findings

1. HMTT significantly outperforms baseline methods on evaluation metrics.
2. Optimal multi-granularity topic selection depends on dataset type.
3. Incorporating tags improves semantic similarity in hash codes.

Abstract

Due to computational and storage efficiencies of compact binary codes, hashing has been widely used for large-scale similarity search. Unfortunately, many existing hashing methods based on observed keyword features are not effective for short texts due to the sparseness and shortness. Recently, some researchers try to utilize latent topics of certain granularity to preserve semantic similarity in hash codes beyond keyword matching. However, topics of certain granularity are not adequate to represent the intrinsic semantic information. In this paper, we present a novel unified approach for short text Hashing using Multi-granularity Topics and Tags, dubbed HMTT. In particular, we propose a selection method to choose the optimal multi-granularity topics depending on the type of dataset, and design two distinct hashing strategies to incorporate multi-granularity topics. We also propose a…
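
To make the abstract's opening point concrete, the following minimal sketch shows why compact binary codes are cheap to store and compare: each document becomes one integer, and similarity search reduces to XOR plus a bit count (Hamming distance). This is not the HMTT method. The feature vectors, the fixed 0.5 thresholds, and the helper names `binarize`, `hamming`, and `rank_by_hamming` are all assumptions made for illustration.

```python
from typing import List, Sequence


def binarize(features: Sequence[float], thresholds: Sequence[float]) -> int:
    """Pack one bit per feature into an int: bit i is set when features[i] > thresholds[i]."""
    code = 0
    for i, (value, threshold) in enumerate(zip(features, thresholds)):
        if value > threshold:
            code |= 1 << i
    return code


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, codes: List[int]) -> List[int]:
    """Return document indices ordered from nearest to farthest code (ties keep input order)."""
    return sorted(range(len(codes)), key=lambda i: hamming(query, codes[i]))


# Hypothetical 4-dimensional topic-like features for three short texts (illustrative values only).
docs = [
    [0.9, 0.1, 0.8, 0.2],
    [0.1, 0.9, 0.2, 0.7],
    [0.7, 0.3, 0.6, 0.1],
]
thresholds = [0.5, 0.5, 0.5, 0.5]  # assumed fixed thresholds, not learned

codes = [binarize(d, thresholds) for d in docs]
query_code = binarize([0.8, 0.2, 0.9, 0.3], thresholds)
ranking = rank_by_hamming(query_code, codes)
print(codes, ranking)
```

In this toy setup the first and third texts share a code with the query, so they rank ahead of the second. A learned hashing method would replace the fixed thresholds with a function trained to preserve semantic similarity.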

Code & Models

Repositories

jacoxu/short-text-hashing-HMTT (Official)
