Distribution-Aware End-to-End Embedding for Streaming Numerical Features in Click-Through Rate Prediction
Jiahao Liu, Hongji Ruan, Weimin Zhang, Ziye Tong, Derick Tang, Zhanpeng Zeng, Qinsong Zeng, Peng Zhang, Tun Lu, Ning Gu

TL;DR
This paper introduces DAES, an end-to-end framework for embedding streaming numerical features in CTR prediction that captures distributional information and adapts to non-i.i.d. streaming data.
Contribution
The paper proposes DAES, which integrates distribution estimation and modulation strategies into neural embeddings for streaming numerical features, addressing semantic drift and distributional shifts.
Findings
DAES outperforms existing methods in offline and online tests.
DAES has been deployed on a large-scale short-video platform.
DAES improves embedding quality for streaming numerical features.
Abstract
This paper explores effective numerical feature embedding for Click-Through Rate prediction in streaming environments. Conventional static binning methods rely on offline statistics of numerical distributions; however, this inherently two-stage process often triggers semantic drift during bin boundary updates. While neural embedding methods enable end-to-end learning, they often discard explicit distributional information. Integrating such information end-to-end is challenging because streaming features often violate the i.i.d. assumption, precluding unbiased estimation of the population distribution via the expectation of order statistics. Furthermore, the critical context dependency of numerical distributions is often neglected. To this end, we propose DAES, an end-to-end framework designed to tackle numerical feature embedding in streaming training scenarios by integrating…
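The semantic drift that the abstract attributes to static binning can be illustrated with a small sketch. This is not the DAES method or the authors' code. It is a toy example that assumes equal-frequency (quantile) binning, a synthetic exponential feature whose scale shifts between two windows, and a randomly initialized embedding table. The names `quantile_boundaries`, `bin_ids`, and `embedding_table` are hypothetical.

```python
import numpy as np


def quantile_boundaries(values, num_bins):
    """Interior bin boundaries from the empirical quantiles of `values`."""
    qs = np.linspace(0.0, 1.0, num_bins + 1)[1:-1]
    return np.quantile(values, qs)


def bin_ids(values, boundaries):
    """Map each value to a bin index in 0 .. len(boundaries)."""
    return np.searchsorted(boundaries, values, side="right")


rng = np.random.default_rng(0)
num_bins = 4

# Stage 1 (offline): boundaries computed from an earlier window of the stream.
old_window = rng.exponential(scale=10.0, size=10_000)
old_bounds = quantile_boundaries(old_window, num_bins)

# Later, the feature distribution shifts and the boundaries are recomputed.
# The shift is synthetic and chosen only for illustration.
new_window = rng.exponential(scale=30.0, size=10_000)
new_bounds = quantile_boundaries(new_window, num_bins)

# Stage 2: each bin index selects one row of an embedding table that was
# trained while the OLD boundaries were in place (random values stand in here).
embedding_table = rng.normal(size=(num_bins, 8))

probe = np.array([5.0, 15.0, 35.0])  # raw feature values seen before and after
old_ids = bin_ids(probe, old_bounds)
new_ids = bin_ids(probe, new_bounds)

for value, i_old, i_new in zip(probe, old_ids, new_ids):
    same_row = np.array_equal(embedding_table[i_old], embedding_table[i_new])
    print(f"value={value:5.1f}  old bin={i_old}  new bin={i_new}  same embedding row={same_row}")
```

In this toy setup, the same raw value selects a different embedding row once the boundaries are refreshed, even though each row was learned under the old bin semantics. That mismatch between bin index and learned meaning is what the abstract calls semantic drift in the two-stage process.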
Taxonomy
Topics: Image and Video Quality Assessment · Domain Adaptation and Few-Shot Learning · Caching and Content Delivery
