Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation

Nikita Severin; Danil Kartushov; Vladislav Urzhumov; Vladislav Kulikov; Oksana Konovalova; Alexey Grishanov; Anton Klenitskiy; Artem Fatkulin; Alexey Vasilev; Andrey Savchenko; and Ilya Makarov

arXiv:2604.21536·cs.IR·April 24, 2026

TL;DR

This paper introduces a knowledge distillation method that leverages textual user profiles from pre-trained LLMs to improve sequential recommenders, achieving rich user understanding without increasing inference costs.

Contribution

It proposes a novel approach to incorporate LLM-derived user profiles into sequential recommenders without needing LLM inference during deployment.

Findings

01 Maintains inference efficiency of traditional models.

02 No architectural modifications or LLM fine-tuning required.

03 Enhances user understanding in recommender systems.

Abstract

Sequential recommender systems have achieved significant success in modeling temporal user behavior but remain limited in capturing rich user semantics beyond interaction patterns. Large Language Models (LLMs) present opportunities to enhance user understanding with their reasoning capabilities, yet existing integration approaches incur prohibitive inference costs in real-time settings. To address these limitations, we present a novel knowledge distillation method that distills textual user profiles generated by pre-trained LLMs into sequential recommenders without requiring LLM inference at serving time. The resulting approach maintains the inference efficiency of traditional sequential models while requiring neither architectural modifications nor LLM fine-tuning.
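The abstract does not spell out the distillation objective, so the following is only a minimal sketch of one common way such a setup can be arranged, not the authors' method. It assumes the LLM-generated profile has already been encoded offline into a fixed vector (`profile_emb`), and that a small training-only projection `W` maps the recommender's user state into that space. The names `training_loss`, `project`, `alpha`, and the choice of cosine distance are all hypothetical. The point it illustrates is that the profile appears only in the training loss, while `recommend` uses the recommender's scores alone.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def project(W, h):
    """Linear map from the recommender's user state to the profile space.

    Hypothetical training-only head; it is not needed at serving time.
    """
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def cross_entropy(scores, target):
    """Next-item cross-entropy over raw item scores (log-sum-exp form)."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[target]


def training_loss(user_state, item_scores, target_item, profile_emb, W, alpha=0.5):
    """Recommendation loss plus an assumed auxiliary distillation term.

    profile_emb is a precomputed embedding of the LLM-written user profile
    (assumption: produced offline, so no LLM call happens in this function).
    """
    rec = cross_entropy(item_scores, target_item)
    distill = cosine_distance(project(W, user_state), profile_emb)
    return rec + alpha * distill


def recommend(item_scores, k=2):
    """Serving path: rank items from recommender scores only, no profile or LLM."""
    return sorted(range(len(item_scores)), key=lambda i: -item_scores[i])[:k]


# Toy values, chosen only to exercise the functions.
user_state = [1.0, 0.0]
W = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # maps a 2-d state to a 3-d profile space
profile_emb = [1.0, 0.0, 0.0]
item_scores = [2.0, 0.5, 1.0]

loss = training_loss(user_state, item_scores, 0, profile_emb, W)
top_items = recommend(item_scores, k=2)
```

Under these assumptions the serving cost is unchanged, because `recommend` never touches `profile_emb` or `W`; whether the paper uses an embedding-alignment term of this kind is not stated in the abstract.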
