Hessian-aware Quantized Node Embeddings for Recommendation

Huiyuan Chen; Kaixiong Zhou; Kwei-Herng Lai; Chin-Chia Michael Yeh; Yan Zheng; Xia Hu; Hao Yang

arXiv:2309.01032·cs.IR·September 6, 2023

TL;DR

This paper introduces HQ-GNN, a Hessian-aware quantized graph neural network that compresses node embeddings into low-bit representations, reducing memory and inference time while maintaining high recommendation accuracy.

Contribution

The paper proposes a novel Hessian-aware quantization method for GNNs that improves gradient stability and performance in discrete node embedding representations.

Findings

01

HQ-GNN reduces memory usage and inference latency.

02

It achieves comparable recommendation accuracy to full-precision GNNs.

03

The method demonstrates effectiveness on large-scale datasets.
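
The memory claim in the findings can be made concrete with back-of-the-envelope arithmetic. The sketch below is only illustrative: the node count, embedding dimension, and bit widths are hypothetical values chosen for the example, not figures reported in the paper, and `embedding_bytes` is a helper name introduced here.

```python
def embedding_bytes(num_nodes: int, dim: int, bits: int) -> int:
    """Bytes needed to store num_nodes embeddings with dim entries of `bits` bits each."""
    total_bits = num_nodes * dim * bits
    return (total_bits + 7) // 8  # round up to whole bytes

# Hypothetical catalogue: 1,000,000 nodes with 64-dimensional embeddings.
full_precision = embedding_bytes(1_000_000, 64, 32)  # 32-bit floats
low_bit = embedding_bytes(1_000_000, 64, 2)          # assumed 2-bit codes

compression_ratio = full_precision / low_bit
print(full_precision, low_bit, compression_ratio)
```

Under these assumed sizes the embedding table shrinks by the ratio of the bit widths (32 / 2), independent of the number of nodes or the dimension.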

Abstract

Graph Neural Networks (GNNs) have achieved state-of-the-art performance in recommender systems. Nevertheless, the process of searching and ranking from a large item corpus usually requires high latency, which limits the widespread deployment of GNNs in industry-scale applications. To address this issue, many methods compress user/item representations into the binary embedding space to reduce space requirements and accelerate inference. Also, they use the Straight-through Estimator (STE) to prevent vanishing gradients during back-propagation. However, the STE often causes the gradient mismatch problem, leading to sub-optimal results. In this work, we present the Hessian-aware Quantized GNN (HQ-GNN) as an effective solution for discrete representations of users/items that enable fast retrieval. HQ-GNN is composed of two components: a GNN encoder for learning continuous node embeddings…
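
To illustrate the baseline the abstract criticizes, here is a minimal pure-Python sketch of binary quantization trained with a straight-through estimator. It is not HQ-GNN's Hessian-aware method; the function names, the toy vectors, and the clipping threshold are assumptions made for the example (clipping the pass-through gradient at |x| <= 1 is one common STE variant).

```python
def quantize_sign(x):
    """Forward pass: map each continuous entry to +1.0 or -1.0."""
    return [1.0 if v >= 0.0 else -1.0 for v in x]

def ste_backward(x, upstream_grad, clip=1.0):
    """Backward pass under the straight-through estimator.

    The true derivative of sign() is zero almost everywhere, so the STE
    pretends the quantizer is the identity and copies the upstream
    gradient, zeroing it where |x| exceeds `clip` (assumed variant).
    """
    return [g if abs(v) <= clip else 0.0 for v, g in zip(x, upstream_grad)]

# Toy continuous embeddings (hypothetical values).
user = [0.3, -1.7, 0.05, -0.4]
item = [0.9, -0.2, -0.6, 0.1]

q_user = quantize_sign(user)
q_item = quantize_sign(item)

# The relevance score is the inner product of the binary codes.
score = sum(u * i for u, i in zip(q_user, q_item))

# d(score)/d(q_user) equals q_item; the STE passes it back to `user`.
grad_user = ste_backward(user, q_item)
print(q_user, score, grad_user)
```

The surrogate gradient is the same for the entries 0.3 and -0.4 as it would be for any other value inside the clipping range, even though the forward pass only sees their signs. That discrepancy between the forward function and the gradient used to train it is the gradient mismatch the abstract refers to.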
