Time Aggregation Features for XGBoost Models
Mykola Pinchuk

TL;DR
This paper investigates the impact of time aggregation features on XGBoost models for click-through rate prediction, demonstrating that trailing window features offer consistent improvements over target encoding alone.
Contribution
It evaluates entity history time aggregation features under several window designs against a strong time-aware target encoding baseline, in a strict out-of-time setting where features for hour H use only impressions from hours strictly before H.
Findings
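Trailing window features improve ROC AUC by about 0.0066 to 0.0082 and PR AUC by about 0.0084 to 0.0094 relative to target encoding alone, across two rolling-tail folds on a deterministic ten percent sample.
Event count windows provide the only consistent improvement over trailing windows, and the gain is small.
Gap windows and bucketized windows underperform simple trailing windows.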
Abstract
This paper studies time aggregation features for XGBoost models in click-through rate prediction. The setting is the Avazu click-through rate prediction dataset with strict out-of-time splits and a no-lookahead feature constraint. Features for hour H use only impressions from hours strictly before H. This paper compares a strong time-aware target encoding baseline to models augmented with entity history time aggregation under several window designs. Across two rolling-tail folds on a deterministic ten percent sample, a trailing window specification improves ROC AUC by about 0.0066 to 0.0082 and PR AUC by about 0.0084 to 0.0094 relative to target encoding alone. Within the time aggregation design grid, event count windows provide the only consistent improvement over trailing windows, and the gain is small. Gap windows and bucketized windows underperform simple trailing windows in this…
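The no-lookahead constraint described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `trailing_window_features`, the `(hour, entity, click)` tuple layout, and the choice of impression and click counts as the aggregated quantities are all assumptions made for illustration. For each impression in hour H, the sketch counts the same entity's impressions and clicks over the trailing hours H - window_hours through H - 1, so nothing from hour H itself or later leaks into the feature.

```python
from collections import defaultdict


def trailing_window_features(events, window_hours):
    """Hypothetical helper: trailing-window counts with no lookahead.

    events: list of (hour, entity, click) tuples, where hour is an integer
            hour index, entity is any hashable key, and click is 0 or 1.
    Returns one (impressions, clicks) pair per event, counted for the same
    entity over hours [hour - window_hours, hour - 1].
    """
    # Pre-aggregate counts per (entity, hour) so each lookup is cheap.
    imps = defaultdict(int)
    clicks = defaultdict(int)
    for hour, entity, click in events:
        imps[(entity, hour)] += 1
        clicks[(entity, hour)] += click

    feats = []
    for hour, entity, _ in events:
        # range() excludes `hour`, so only strictly earlier hours are used.
        past_hours = range(hour - window_hours, hour)
        n_imp = sum(imps.get((entity, h), 0) for h in past_hours)
        n_clk = sum(clicks.get((entity, h), 0) for h in past_hours)
        feats.append((n_imp, n_clk))
    return feats


# Toy data (made up): entity "a" over four hours, entity "b" once.
events = [
    (0, "a", 1), (0, "a", 0),
    (1, "a", 1), (1, "b", 0),
    (2, "a", 0),
    (3, "a", 1),
]
features = trailing_window_features(events, window_hours=2)
```

With a two-hour window, the impression in hour 2 for entity "a" sees the three impressions and two clicks from hours 0 and 1, while the two impressions in hour 0 see nothing. Aggregating at hour granularity means impressions within the same hour never inform each other, which is the stricter reading of the constraint stated in the abstract.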
