Generative Retrieval with Semantic Tree-Structured Item Identifiers via Contrastive Learning

Zihua Si; Zhongxiang Sun; Jiale Chen; Guozhang Chen; Xiaoxue Zang; Kai Zheng; Yang Song; Xiao Zhang; Jun Xu; Kun Gai

arXiv:2309.13375 · cs.IR · July 9, 2024 · 2 cites

TL;DR

This paper introduces SEATER, a generative retrieval framework that uses semantic tree-structured item identifiers and contrastive learning to improve efficiency and effectiveness in large-scale recommendation systems.

Contribution

SEATER employs a hierarchical tree structure for item identifiers and contrastive learning to enhance retrieval performance and speed in recommendation tasks.

Findings

1. SEATER outperforms state-of-the-art models on multiple datasets.
2. The hierarchical identifier structure maintains semantic consistency.
3. Contrastive learning improves retrieval accuracy.

Abstract

The retrieval phase is a vital component in recommendation systems, requiring the model to be effective and efficient. Recently, generative retrieval has become an emerging paradigm for document retrieval, showing notable performance. These methods enjoy merits like being end-to-end differentiable, suggesting their viability in recommendation. However, these methods fall short in efficiency and effectiveness for large-scale recommendations. To obtain efficiency and effectiveness, this paper introduces a generative retrieval framework, namely SEATER, which learns SEmAntic Tree-structured item identifiERs via contrastive learning. Specifically, we employ an encoder-decoder model to extract user interests from historical behaviors and retrieve candidates via tree-structured item identifiers. SEATER devises a balanced k-ary tree structure of item identifiers, allocating semantic space to…
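The balanced k-ary tree of item identifiers mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `assign_tree_identifiers` is hypothetical, "balanced" is taken here to mean that every item sits at the same depth, and the items are assumed to arrive already ordered so that similar items are adjacent (the step that produces such an ordering is not shown).

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def assign_tree_identifiers(
    items: Sequence[Hashable], k: int
) -> Dict[Hashable, Tuple[int, ...]]:
    """Map each item to a root-to-leaf path in a k-ary tree of uniform depth.

    Assumptions (not taken from the paper): items are unique, hashable, and
    pre-ordered so that neighbours are semantically similar.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    if not items:
        return {}

    # Smallest depth whose k**depth leaves can hold every item.
    depth = 1
    while k ** depth < len(items):
        depth += 1

    identifiers: Dict[Hashable, Tuple[int, ...]] = {}

    def split(chunk: List[Hashable], prefix: Tuple[int, ...], levels_left: int) -> None:
        if levels_left == 0:
            # Even splitting guarantees exactly one item remains at a leaf.
            identifiers[chunk[0]] = prefix
            return
        # Divide the chunk into k contiguous parts whose sizes differ by at most 1.
        base, extra = divmod(len(chunk), k)
        start = 0
        for child in range(k):
            size = base + (1 if child < extra else 0)
            if size == 0:
                break  # remaining children would be empty
            split(chunk[start:start + size], prefix + (child,), levels_left - 1)
            start += size

    split(list(items), (), depth)
    return identifiers


if __name__ == "__main__":
    for item, path in assign_tree_identifiers(["a", "b", "c", "d", "e"], k=2).items():
        print(item, path)
```

With five items and k = 2, every identifier has three tokens, and adjacent items such as "a" and "b" share the prefix (0, 0). A decoder generating such identifiers would therefore take the same number of steps for every item, under the assumptions stated above.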

Code & Models

Repositories

ethan00si/seater_generative_retrieval (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning

Methods: InfoNCE · Contrastive Learning · Triplet Loss · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings