Inference Time Feature Injection: A Lightweight Approach for Real-Time Recommendation Freshness
Qiang Chen, Venkatesh Ganapati Hegde, Hongfei Li

TL;DR
This paper introduces a lightweight, inference-time feature injection method for real-time personalization in long-form video streaming recommender systems, yielding a statistically significant 0.47% increase in key user engagement metrics without retraining models.
Contribution
It proposes a novel, model-agnostic approach for intra-day personalization by selectively injecting recent user watch history at inference time, enhancing recommendation freshness.
Findings
Statistically significant 0.47% increase in key user engagement metrics
First evidence of intra-day personalization impact in streaming
No need for model retraining
Abstract
Many recommender systems in long-form video streaming rely on batch-trained models and batch-updated features, where user features are updated daily and served statically throughout the day. While efficient, this approach fails to incorporate a user's most recent actions, often resulting in stale recommendations. In this work, we present a lightweight, model-agnostic approach for intra-day personalization that selectively injects recent watch history at inference time without requiring model retraining. Our approach selectively overrides stale user features at inference time using the recent watch history, allowing the system to adapt instantly to evolving preferences. By reducing the personalization feedback loop from daily to intra-day, we observed a statistically significant 0.47% increase in key user engagement metrics, which ranked among the most substantial engagement gains…
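The override step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `UserFeatures` schema, the `inject_recent_history` helper, and the `max_len` cap are all hypothetical names chosen for illustration, and the merge rule (recent titles first, duplicates dropped) is one plausible assumption about what "selectively overrides" could mean.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UserFeatures:
    """Batch-computed user features, refreshed once per day (hypothetical schema)."""
    watch_history: List[str]  # title IDs, most recent first
    other: Dict[str, float] = field(default_factory=dict)


def inject_recent_history(
    batch: UserFeatures,
    recent_watches: List[str],
    max_len: int = 5,
) -> UserFeatures:
    """Return a copy of the batch features with intra-day watches merged in.

    recent_watches holds title IDs watched since the last batch update,
    most recent first. The batch features themselves are not modified,
    and the model that consumes the result is left untouched.
    """
    merged: List[str] = []
    seen = set()
    # Recent events take priority; older batch entries fill the remaining slots.
    for title in recent_watches + batch.watch_history:
        if title not in seen:
            seen.add(title)
            merged.append(title)
    return UserFeatures(watch_history=merged[:max_len], other=dict(batch.other))


# Example: the daily snapshot knows about a, b, c; the user has since watched d and rewatched b.
batch = UserFeatures(watch_history=["a", "b", "c"], other={"tenure_years": 2.0})
fresh = inject_recent_history(batch, ["d", "b"], max_len=4)
```

Because only the feature values change and not their shape, the same trained model can score the refreshed features directly, which is what makes an approach of this kind model-agnostic and free of retraining.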
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Bandit Algorithms Research · Explainable Artificial Intelligence (XAI)
