CS3: Efficient Online Capability Synergy for Two-Tower Recommendation
Lixiang Wang, Shaoyun Shi, Peng Wang, Wenjin Wu, Peng Jiang

TL;DR
The paper introduces CS3, an online framework that enhances two-tower recommendation models by improving their representation and alignment capabilities without sacrificing real-time performance.
Contribution
CS3 provides a novel, efficient online enhancement for two-tower models through cycle-adaptive structure, cross-tower synchronization, and cascade model sharing.
Findings
CS3 increases online ad revenue by up to 8.36%.
CS3 maintains millisecond-level latency in large-scale deployment.
CS3 is compatible with various two-tower architectures.
Abstract
To balance effectiveness and efficiency in recommender systems, multi-stage pipelines employ lightweight two-tower models for large-scale candidate retrieval. However, their isolated architecture inherently hampers representation capacity, embedding-space alignment, and cross-feature modeling. Prior studies have explored incorporating late interaction or knowledge distillation to mitigate these issues, but such approaches often significantly increase model latency or pose challenges for implementation in online learning scenarios. To address these limitations, we propose an efficient online framework called Capability Synergy (CS3), which enhances two-tower models through three key innovations: (1) Cycle-Adaptive Structure, enabling self-revision via adaptive feature denoising within individual towers; (2) Cross-Tower Synchronization, improving representation alignment through mutual…
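For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of a generic two-tower retrieval model, not of CS3 itself. It assumes each tower is a single random linear projection followed by L2 normalization, and that scoring is a plain dot product. The names `make_tower`, `encode`, `score`, and `retrieve`, as well as all dimensions and feature values, are hypothetical and chosen only for illustration.

```python
import math
import random


def make_tower(in_dim, out_dim, seed):
    """Random linear projection: a hypothetical stand-in for a trained tower."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def encode(weights, features):
    """Project features through one tower and L2-normalize the result."""
    raw = [sum(w * x for w, x in zip(row, features)) for row in weights]
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0  # guard against a zero vector
    return [v / norm for v in raw]


def score(user_vec, item_vec):
    """Dot product: the only place where the two towers interact."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


def retrieve(user_vec, item_vecs, k):
    """Return the indices of the k highest-scoring items."""
    ranked = sorted(
        range(len(item_vecs)),
        key=lambda j: score(user_vec, item_vecs[j]),
        reverse=True,
    )
    return ranked[:k]


# The towers take different inputs and share no parameters (assumed sizes).
user_tower = make_tower(in_dim=4, out_dim=3, seed=0)
item_tower = make_tower(in_dim=5, out_dim=3, seed=1)

# Item embeddings can be precomputed, since the item tower never sees user features.
rng = random.Random(2)
items = [[rng.random() for _ in range(5)] for _ in range(10)]
item_vecs = [encode(item_tower, f) for f in items]

# At request time only the user tower runs, followed by cheap dot products.
user_vec = encode(user_tower, [0.2, 0.5, 0.1, 0.9])
top3 = retrieve(user_vec, item_vecs, k=3)
```

Because the user and item features meet only in `score`, item embeddings can be cached and served with low latency, but the model cannot learn cross-features between the two sides. That isolation is the limitation the abstract describes.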
