CS3: Efficient Online Capability Synergy for Two-Tower Recommendation
Lixiang Wang, Shaoyun Shi, Peng Wang, Wenjin Wu, Peng Jiang

TL;DR
This paper introduces CS3, an online framework that enhances two-tower recommender models by improving their capacity and alignment without sacrificing real-time performance.
Contribution
The paper proposes a novel, plug-and-play framework with three mechanisms to strengthen two-tower models for online recommendation systems.
Findings
Achieves a revenue increase of up to 8.36% in large-scale deployment.
Demonstrates consistent improvements on three public datasets.
Maintains millisecond-level latency in real-time scenarios.
Abstract
To balance effectiveness and efficiency in recommender systems, multi-stage pipelines commonly use lightweight two-tower models for large-scale candidate retrieval. However, the isolated two-tower architecture restricts representation capacity, embedding-space alignment, and cross-feature interactions. Existing solutions such as late interaction and knowledge distillation can mitigate these issues, but often increase latency or are difficult to deploy in online learning settings. We propose Capability Synergy (CS3), an efficient online framework that strengthens two-tower retrievers while preserving real-time constraints. CS3 introduces three mechanisms: (1) Cycle-Adaptive Structure for self-revision via adaptive feature denoising within each tower; (2) Cross-Tower Synchronization to improve alignment through lightweight mutual awareness between towers; and (3) Cascade-Model Sharing to…
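For readers unfamiliar with the baseline the abstract refers to, here is a minimal sketch of a plain two-tower retriever. It is not CS3 and implements none of its three mechanisms. It only illustrates the "isolated" design: each tower encodes its own input independently, and the two sides meet only in a final dot product. All names (`make_tower`, `encode`, `score`, `retrieve`), the dimensions, and the random linear weights are assumptions made for illustration, standing in for trained networks.

```python
import math
import random

def make_tower(in_dim, out_dim, seed):
    """Random linear layer (list of weight rows); a stand-in for a trained tower."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

def encode(weights, features):
    """Apply the linear tower, then L2-normalize the resulting embedding."""
    raw = [sum(w * x for w, x in zip(row, features)) for row in weights]
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0
    return [v / norm for v in raw]

def score(user_emb, item_emb):
    """The only user-item interaction in this baseline: a dot product."""
    return sum(u * i for u, i in zip(user_emb, item_emb))

def retrieve(user_emb, item_embs, k):
    """Brute-force top-k item indices by score, highest first."""
    ranked = sorted(
        range(len(item_embs)),
        key=lambda j: score(user_emb, item_embs[j]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical sizes: 4 user features, 5 item features, 3-dim shared space.
user_tower = make_tower(4, 3, seed=0)
item_tower = make_tower(5, 3, seed=1)

rng = random.Random(2)
items = [[rng.random() for _ in range(5)] for _ in range(10)]

# Item embeddings never depend on the user, so they can be computed ahead of time.
item_embs = [encode(item_tower, f) for f in items]

# At request time only the user tower runs, followed by cheap dot products.
user_emb = encode(user_tower, [0.2, 0.1, 0.7, 0.4])
top = retrieve(user_emb, item_embs, k=3)
```

Because neither tower sees the other's features before the dot product, item embeddings can be cached, which keeps latency low. The same separation is the source of the capacity, alignment, and cross-feature limitations the abstract describes.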
