On-Device Large Language Models for Sequential Recommendation

Xin Xia; Hongzhi Yin; Shane Culpepper

arXiv:2601.09306 · cs.IR · January 15, 2026

Open Access

TL;DR

This paper introduces OD-LLM, a task-adaptive compression framework that enables efficient, on-device deployment of large language models for sequential recommendation without sacrificing accuracy.

Contribution

The paper presents a novel compression framework combining low-rank SVD and tokenization normalization, along with a progressive alignment algorithm for on-device LLM deployment.
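To make the low-rank SVD idea concrete, here is a minimal sketch of generic truncated-SVD compression applied to a single weight matrix. It is not the OD-LLM algorithm itself: the function names (`low_rank_factorize`, `rank_for_ratio`), the random stand-in matrix, and the rule for picking a rank from a parameter budget are all assumptions made for illustration, and the task-adaptive and alignment steps described in the paper are not modeled.

```python
import numpy as np


def rank_for_ratio(m, n, keep_ratio):
    """Largest rank whose two factors hold at most keep_ratio * m * n parameters.

    Hypothetical helper: a rank-r factorization of an (m, n) matrix stores
    r * (m + n) values, so we solve r * (m + n) <= keep_ratio * m * n.
    """
    return max(1, int(keep_ratio * m * n / (m + n)))


def low_rank_factorize(weight, rank):
    """Split `weight` into factors (a, b) such that a @ b is its best
    rank-`rank` approximation in the Frobenius norm (truncated SVD)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (m, rank), singular values folded in
    b = vt[:rank, :]            # shape (rank, n)
    return a, b


# Random stand-in for one dense layer; a real LLM weight is an assumption here.
rng = np.random.default_rng(0)
weight = rng.standard_normal((512, 256))

rank = rank_for_ratio(*weight.shape, keep_ratio=0.5)
a, b = low_rank_factorize(weight, rank)

compressed_params = a.size + b.size
rel_error = np.linalg.norm(weight - a @ b) / np.linalg.norm(weight)

print(f"rank={rank}, params {weight.size} -> {compressed_params}")
print(f"relative Frobenius error: {rel_error:.3f}")
```

At inference time the layer would compute `(x @ a) @ b` instead of `x @ weight`, which is where the memory and compute savings come from. A random Gaussian matrix has a nearly flat spectrum, so the error printed here is pessimistic compared with a trained weight matrix whose singular values decay quickly.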

Findings

01. No loss in recommendation effectiveness at 50% model size reduction.

02. OD-LLM significantly reduces memory and computational requirements.

03. Scalable and practical for real-time on-device recommendation systems.

Abstract

On-device recommendation is critical for a number of real-world applications, especially in scenarios with strict requirements on execution latency, user privacy, and robust functionality when internet connectivity is unstable or unavailable. While large language models (LLMs) now offer exceptional capabilities for modeling user behavior in sequential recommendation tasks, their substantial memory footprint and computational overhead make deployment on resource-constrained devices a high-risk proposition. In this paper, we propose OD-LLM, the first task-adaptive compression framework explicitly designed to provide efficient and accurate on-device deployment of LLMs for sequential recommendation tasks. OD-LLM uniquely integrates two complementary compression strategies: a low-rank structural compression algorithm that uses Singular Value Decomposition (SVD) to significantly…

Taxonomy

Topics: Recommender Systems and Techniques · Big Data and Digital Economy · Mobile Crowdsensing and Crowdsourcing