Wukong: Towards a Scaling Law for Large-Scale Recommendation

Buyun Zhang; Liang Luo; Yuxin Chen; Jade Nie; Xi Liu; Daifeng Guo; Yanli Zhao; Shen Li; Yuchen Hao; Yantao Yao; Guna Lakshminarayanan; Ellie Dingqiao Wen; Jongsoo Park; Maxim Naumov; Wenlin Chen

arXiv:2403.02545 · cs.LG · June 5, 2024 · 1 citation

TL;DR

This paper introduces Wukong, a novel recommendation model with a scalable architecture based on stacked factorization machines, demonstrating consistent quality improvements and a new scaling law across diverse datasets and model complexities.

Contribution

Wukong presents a new scalable recommendation model architecture and upscaling strategy that establish a scaling law, enabling effective handling of complex datasets and larger models.

Findings

01. Wukong outperforms state-of-the-art models on six public datasets.

02. Wukong maintains superior quality across two orders of magnitude in model complexity.

03. Wukong demonstrates scalability on large-scale internal datasets.

Abstract

Scaling laws play an instrumental role in the sustainable improvement of model quality. Unfortunately, recommendation models to date do not exhibit laws similar to those observed in the domain of large language models, owing to the inefficiencies of their upscaling mechanisms. This limitation poses significant challenges in adapting these models to increasingly complex real-world datasets. In this paper, we propose an effective network architecture based purely on stacked factorization machines, and a synergistic upscaling strategy, collectively dubbed Wukong, to establish a scaling law in the domain of recommendation. Wukong's unique design makes it possible to capture diverse interactions of any order simply through taller and wider layers. We conducted extensive evaluations on six public datasets, and our results demonstrate that Wukong consistently outperforms…
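To make the idea of stacked factorization machines concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's architecture: the names `fm_block` and `stacked_fm`, the dense projection `W`, the residual connection, and all sizes are hypothetical choices made for this example. Each block takes pairwise dot products between feature embeddings (a second-order interaction) and projects them back to embeddings, so stacking blocks lets the interaction order grow with depth.

```python
import numpy as np

def fm_block(X, W):
    """One simplified factorization-machine block (hypothetical sketch).

    X: (n, d) array of n feature embeddings of width d.
    W: (n * n, n * d) projection mapping flattened pairwise
       interactions back to n embeddings of width d.
    """
    n, d = X.shape
    interactions = X @ X.T                 # (n, n) pairwise dot products
    return (interactions.reshape(-1) @ W).reshape(n, d)

def stacked_fm(X, weights):
    """Apply FM blocks in sequence with a residual connection.

    The residual keeps lower-order terms, so the output mixes
    interactions of several orders rather than only the highest one.
    """
    for W in weights:
        X = X + fm_block(X, W)
    return X

rng = np.random.default_rng(0)
n, d, layers = 4, 8, 3                     # illustrative sizes only
X0 = rng.normal(size=(n, d))
weights = [rng.normal(scale=0.1, size=(n * n, n * d)) for _ in range(layers)]
out = stacked_fm(X0, weights)
print(out.shape)
```

Because a single block is quadratic in its input, doubling the input embeddings multiplies the block's output by four. Applying a block to embeddings that are already quadratic in the raw features yields quartic terms, which is the sense in which taller stacks reach higher-order interactions in this simplified setting.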

Taxonomy

Topics: Machine Learning in Healthcare