Sliding Window Training -- Utilizing Historical Recommender Systems Data for Foundation Models

Swanand Joshi; Yesu Feng; Ko-Jen Hsiao; Zhe Zhang; Sudarshan Lamkhede

arXiv:2409.14517·cs.IR·September 24, 2024

TL;DR

This paper introduces a sliding window training method for recommender system foundation models that incorporates long user histories during training without increasing the model input dimension, improving long-term preference learning and the quality of items learned during pretraining.

Contribution

The paper proposes a novel sliding window training technique that enables learning from extensive user histories without enlarging model input dimensions.

Findings

01

Improves the learning of long-term user preferences.

02

Enhances the quality of items learned during pretraining.

03

Delivers both quantitative and qualitative improvements to the RecSys foundation model.

Abstract

Long-lived recommender systems (RecSys) often encounter lengthy user-item interaction histories that span many years. To effectively learn long-term user preferences, large RecSys foundation models (FM) need to encode this information in pretraining. Usually, this is done either by using a sequence length long enough to take the full history as input, at the cost of a large model input dimension, or by dropping some parts of the user history to accommodate model size and latency requirements on the production serving side. In this paper, we introduce a sliding window training technique to incorporate long user history sequences during training time without increasing the model input dimension. We show the quantitative and qualitative improvements this technique brings to the RecSys FM in learning long-term user preferences. We additionally show that the average quality of items in…
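The core idea in the abstract, training on a long history while keeping the model input dimension fixed, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sliding_windows`, the `window` and `stride` parameters, the use of overlapping windows, and the choice to always emit the most recent window are all assumptions made for illustration. The excerpt does not say how the paper selects or samples its windows.

```python
from typing import Iterator, List, Sequence


def sliding_windows(
    history: Sequence[int], window: int, stride: int
) -> Iterator[List[int]]:
    """Yield fixed-length windows over one user's chronological item history.

    Hypothetical helper: `window` plays the role of the model input length,
    `stride` is an assumed step between consecutive window starts.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(history) <= window:
        # Short histories already fit in the model input.
        yield list(history)
        return
    last_start = len(history) - window
    for start in range(0, last_start, stride):
        yield list(history[start:start + window])
    # Assumption: always emit the most recent window so that the latest
    # interactions are covered regardless of the stride.
    yield list(history[last_start:])


# Usage: a 10-item history with a model input length of 4.
item_ids = list(range(10))
for training_sequence in sliding_windows(item_ids, window=4, stride=3):
    print(training_sequence)
# prints [0, 1, 2, 3], then [3, 4, 5, 6], then [6, 7, 8, 9]
```

Each emitted sequence has at most `window` items, so the model input size is unchanged, yet every interaction in the history appears in at least one training sequence.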
