Enhancing Bandit Algorithms with LLMs for Time-varying User Preferences in Streaming Recommendations

Chenglei Shen; Yi Zhan; Weijie Yu; Xiao Zhang; Jun Xu

arXiv:2602.08067 · cs.LG · February 10, 2026

Open Access

TL;DR

This paper introduces HyperBandit+, a contextual bandit policy for streaming recommendations. It uses a time-aware hypernetwork to adapt to evolving user preferences and an LLM-assisted warm start (LLM Start) to make exploration-exploitation more efficient in the early online phase.

Contribution

The paper proposes HyperBandit+, integrating a time-aware hypernetwork and LLM-based warm-start to better handle time-varying preferences and enhance early exploration in streaming recommender systems.

Findings

01. HyperBandit+ outperforms state-of-the-art baselines on real-world datasets.

02. Theoretical regret bounds are established for the proposed method.

03. Low-rank factorization reduces training complexity without sacrificing performance.
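
To make the low-rank finding concrete, here is a minimal sketch of why a factorization can cut the number of trainable parameters. The page does not say which matrix the paper factorizes, so the dimension `d`, the rank `r`, and all function names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def full_param_count(d: int) -> int:
    """Parameters in a dense d x d matrix."""
    return d * d


def low_rank_param_count(d: int, r: int) -> int:
    """Parameters in two d x r factors U and V, where W is approximated by U @ V.T."""
    return 2 * d * r


def make_factors(d: int, r: int, seed: int = 0):
    """Random factors, standing in for learned ones (assumption for illustration)."""
    rng = np.random.default_rng(seed)
    return rng.normal(size=(d, r)), rng.normal(size=(d, r))


def apply_low_rank(U: np.ndarray, V: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Compute (U @ V.T) @ x without ever forming the d x d matrix."""
    return U @ (V.T @ x)


if __name__ == "__main__":
    d, r = 64, 4  # hypothetical sizes
    U, V = make_factors(d, r)
    x = np.ones(d)
    print("dense parameters:   ", full_param_count(d))
    print("low-rank parameters:", low_rank_param_count(d, r))
    print("output shape:       ", apply_low_rank(U, V, x).shape)
```

The saving grows with the gap between `d` and `r`: the dense matrix needs `d * d` parameters, the two factors need `2 * d * r`. Whether performance is preserved depends on how well a rank-`r` matrix fits the problem, which is the empirical claim the paper reports.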

Abstract

In real-world streaming recommender systems, user preferences evolve dynamically over time. Existing bandit-based methods treat time merely as a timestamp, neglecting its explicit relationship with user preferences and leading to suboptimal performance. Moreover, online learning methods often suffer from inefficient exploration-exploitation during the early online phase. To address these issues, we propose HyperBandit+, a novel contextual bandit policy that integrates a time-aware hypernetwork to adapt to time-varying user preferences and employs a large language model-assisted warm-start mechanism (LLM Start) to enhance exploration-exploitation efficiency in the early online phase. Specifically, HyperBandit+ leverages a neural network that takes time features as input and generates parameters for estimating time-varying rewards by capturing the correlation between time and user…
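
The hypernetwork idea described in the abstract can be sketched roughly as follows: a small network takes time features as input and emits the parameters of a reward estimator, so the estimator changes with the time of the request. This is a simplified illustration under stated assumptions, not the HyperBandit+ algorithm itself. The cyclic time encoding, the one-hidden-layer network, the linear reward model, the greedy arm choice, and every name in the code (`time_features`, `TinyHypernetwork`, `select_arm`) are hypothetical. Training, exploration bonuses, and the LLM Start warm-start are omitted.

```python
import numpy as np


def time_features(hour: float, day_of_week: int) -> np.ndarray:
    """Encode time cyclically so that 23:00 and 00:00 map to nearby points."""
    h = 2 * np.pi * hour / 24.0
    d = 2 * np.pi * day_of_week / 7.0
    return np.array([np.sin(h), np.cos(h), np.sin(d), np.cos(d)])


class TinyHypernetwork:
    """One-hidden-layer MLP mapping time features to reward-model weights.

    Weights are random here; in practice they would be learned from feedback.
    """

    def __init__(self, time_dim: int, hidden_dim: int, context_dim: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.5, size=(hidden_dim, time_dim))
        self.b1 = np.zeros(hidden_dim)
        self.W2 = rng.normal(scale=0.5, size=(context_dim, hidden_dim))
        self.b2 = np.zeros(context_dim)

    def generate(self, t_feat: np.ndarray) -> np.ndarray:
        """Return the time-dependent weight vector theta_t."""
        hidden = np.tanh(self.W1 @ t_feat + self.b1)
        return self.W2 @ hidden + self.b2


def select_arm(hypernet: TinyHypernetwork, t_feat: np.ndarray, arm_contexts: np.ndarray):
    """Score each arm with the generated weights and pick the best one greedily."""
    theta_t = hypernet.generate(t_feat)
    scores = arm_contexts @ theta_t  # one estimated reward per arm
    return int(np.argmax(scores)), scores


if __name__ == "__main__":
    net = TinyHypernetwork(time_dim=4, hidden_dim=8, context_dim=5)
    arms = np.random.default_rng(1).normal(size=(3, 5))  # 3 hypothetical items
    for hour in (8.0, 20.0):
        arm, scores = select_arm(net, time_features(hour, 2), arms)
        print(f"hour={hour}: chosen arm {arm}, scores {np.round(scores, 3)}")
```

The point of the design is that the same item contexts can receive different scores at different times, because the weights are a function of time rather than a single fixed vector.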

Taxonomy

Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques · Mobile Crowdsensing and Crowdsourcing