STARS: Semantic Tokens with Augmented Representations for Recommendation at Scale
Han Chen, Steven Zhu, Yingrui Li

TL;DR
STARS is a scalable, low-latency transformer-based recommendation system that enhances content understanding and user modeling, significantly improving relevance and engagement metrics in large-scale ecommerce settings.
Contribution
The paper introduces STARS, a novel recommendation framework combining semantic tokens, dual-memory user embeddings, and a two-stage retrieval pipeline for improved performance at scale.
Findings
Over 75% improvement in Hit@5 over existing systems
Statistically significant increases in orders, add-to-cart, and visits per user
Effective handling of cold-start and long-tail items in real-world deployment
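For readers unfamiliar with the Hit@5 metric cited above, here is a minimal sketch of how Hit@K is commonly computed: a recommendation counts as a hit when the held-out target item appears among the top K ranked items, and the metric is the mean over all evaluation cases. The function names and toy data are illustrative assumptions and do not come from the paper, whose exact evaluation protocol is not described in this summary.

```python
def hit_at_k(ranked_items, target, k=5):
    """Return 1.0 if the target is among the top-k ranked items, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def mean_hit_at_k(rankings, targets, k=5):
    """Average Hit@K over evaluation cases (one ranking and one target each)."""
    hits = [hit_at_k(r, t, k) for r, t in zip(rankings, targets)]
    return sum(hits) / len(hits) if hits else 0.0


# Toy data (hypothetical): three users, each with a ranked list and one held-out item.
rankings = [
    ["a", "b", "c", "d", "e", "f"],
    ["x", "y", "z", "w", "v", "u"],
    ["p", "q", "r", "s", "t", "m"],
]
targets = ["c", "u", "p"]

# "c" and "p" fall inside the top 5; "u" is ranked sixth, so it is a miss.
score = mean_hit_at_k(rankings, targets, k=5)
```

With this toy data, two of the three targets land in the top five, so the score is 2/3.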
Abstract
Real-world ecommerce recommender systems must deliver relevant items under strict tens-of-milliseconds latency constraints despite challenges such as cold-start products, rapidly shifting user intent, and dynamic context including seasonality, holidays, and promotions. We introduce STARS, a transformer-based sequential recommendation framework built for large-scale, low-latency ecommerce settings. STARS combines several innovations: dual-memory user embeddings that separate long-term preferences from short-term session intent; semantic item tokens that fuse pretrained text embeddings, learnable deltas, and LLM-derived attribute tags, strengthening content-based matching, long-tail coverage, and cold-start performance; context-aware scoring with learned calendar and event offsets; and a latency-conscious two-stage retrieval pipeline that performs offline embedding generation and online…
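The components named in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract does not state how the text embedding, learnable delta, and attribute tags are fused, how the two user memories are combined, or how context offsets enter the score. The sketch assumes additive fusion, a fixed blend weight `alpha`, and a scalar context offset added to a dot-product score. All function names are hypothetical.

```python
import math


def add(u, v):
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]


def mean_vec(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def semantic_item_token(text_emb, delta, tag_embs):
    """Assumed fusion: pretrained text embedding + learnable delta + mean tag embedding."""
    fused = add(text_emb, delta)
    if tag_embs:
        fused = add(fused, mean_vec(tag_embs))
    return fused


def user_embedding(long_term, short_term, alpha=0.5):
    """Assumed blend of long-term preferences and short-term session intent."""
    return [alpha * l + (1 - alpha) * s for l, s in zip(long_term, short_term)]


def score(user, item, context_offset=0.0):
    """Assumed scoring: dot product plus a scalar calendar/event offset."""
    return sum(a * b for a, b in zip(user, item)) + context_offset


# Toy 2-d example. A cold-start item could use a zero delta and rely on text and tags alone.
item = semantic_item_token([1.0, 0.0], [0.0, 0.5], [[1.0, 1.0], [0.0, 1.0]])
user = user_embedding([1.0, 0.0], [0.0, 1.0], alpha=0.5)
base = score(user, item)
boosted = score(user, item, context_offset=0.2)  # e.g. a hypothetical holiday offset
```

In a deployed system these pieces would be learned jointly rather than hand-set; the point of the sketch is only to show how content signals, two user memories, and context can feed a single relevance score.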
Taxonomy
Topics: Recommender Systems and Techniques · Machine Learning in Healthcare · Topic Modeling
