SEMINAR: Search Enhanced Multi-modal Interest Network and Approximate Retrieval for Lifelong Sequential Recommendation

Kaiming Shen; Xichen Ding; Zixiang Zheng; Yuqi Gong; Qianqian Li; Zhongyi Liu; Guannan Zhang

arXiv:2407.10714·cs.IR·July 16, 2024

TL;DR

This paper introduces SEMINAR, a unified model for lifelong multi-modal sequence recommendation that improves user interest modeling, aligns multi-modal embeddings, and accelerates retrieval with approximate methods.

Contribution

The paper proposes SEMINAR, a novel lifelong multi-modal sequence model with a pretraining-finetuning framework and a codebook-based retrieval strategy for recommendation systems.

Findings

01 Effective multi-modal alignment achieved

02 Improved lifelong sequence modeling performance

03 Fast approximate retrieval with codebook strategy

Abstract

Modeling users' behaviors is crucial in modern recommendation systems. Much research focuses on modeling users' lifelong sequences, which can be extremely long and sometimes exceed thousands of items. These models use the target item to search for the most relevant items in the historical sequence. However, training on lifelong sequences for click-through rate (CTR) prediction or personalized search ranking (PSR) is extremely difficult because ID embeddings are insufficiently learned, especially when IDs in the lifelong sequence features do not appear in the samples of the training dataset. Additionally, existing target attention mechanisms struggle to learn the multi-modal representations of items in the sequence well. The distributions of the multi-modal embeddings (text, image and attributes) of a user's interacted items are not properly aligned and there exist…
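The abstract's description of using the target item to search a long behavior history can be illustrated with a small sketch. This is not the SEMINAR model itself: it assumes plain dot-product similarity, an exact top-k search, and softmax-weighted pooling, and the names `top_k_search` and `target_attention` are hypothetical helpers introduced only for illustration.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_search(target: Sequence[float],
                 history: Sequence[Sequence[float]],
                 k: int) -> List[int]:
    """Indices of the k history items most similar to the target.

    Assumption: similarity is a plain dot product over item embeddings.
    """
    scores = [dot(target, h) for h in history]
    order = sorted(range(len(history)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def target_attention(target: Sequence[float],
                     history: Sequence[Sequence[float]],
                     k: int) -> List[float]:
    """Softmax-weighted pooling over the top-k retrieved history items."""
    idx = top_k_search(target, history, k)
    if not idx:
        # Empty history (or k == 0): return a zero interest vector.
        return [0.0] * len(target)
    scores = [dot(target, history[i]) for i in idx]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * history[i][d] for w, i in zip(weights, idx))
            for d in range(len(target))]


# Toy example: four 2-dimensional item embeddings and one target item.
history = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
target = [1.0, 0.0]
retrieved = top_k_search(target, history, k=2)
interest = target_attention(target, history, k=2)
```

The exact search scores every item in the history, so its cost grows with sequence length; that cost is the motivation for the approximate, codebook-based retrieval the summary mentions, which this sketch does not implement.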

Code & Models

Repositories

paper-submission-coder/seminar
pytorch

Taxonomy

Topics: Recommender Systems and Techniques · Machine Learning in Healthcare

Methods: Softmax · Attention Is All You Need · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings