Bootstrapping your behavior: a new pretraining strategy for user behavior sequence data
Weichang Wu, Xiaolu Zhang, Jun Zhou, Yuchen Li, Wenwen Xia

TL;DR
This paper introduces a novel pretraining strategy for user behavior sequence modeling that automatically constructs supervision embeddings, eliminating manual vocabulary selection and improving performance and efficiency in industrial applications.
Contribution
The paper proposes Bootstrapping Your Behavior, a new pretraining method that automatically generates supervision embeddings using a student-teacher encoder scheme, removing the need for manual behavior vocabulary construction.
Findings
Achieves 3.9% average AUC improvement on real-world datasets.
Increases training throughput by 98.9%.
Enhances online risk prediction accuracy, reducing bad debt risk.
Abstract
User Behavior Sequence (UBS) modeling is crucial in industrial applications. As data scale and task diversity grow, UBS pretraining methods have become increasingly pivotal. State-of-the-art UBS pretraining methods rely on predicting behavior distributions. The key step in these methods is constructing a selected behavior vocabulary. However, this manual step is labor-intensive and prone to bias. The limitation of vocabulary capacity also directly affects models' generalization ability. In this paper, we introduce Bootstrapping Your Behavior (BYB), a novel UBS pretraining strategy that predicts an automatically constructed supervision embedding summarizing all behaviors' information within a future time window, eliminating the manual behavior vocabulary selection. In implementation, we incorporate a student-teacher encoder scheme to construct the pretraining supervision…
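To make the student-teacher idea in the abstract concrete, here is a minimal toy sketch in Python with NumPy. It is not the authors' implementation. The linear mean-pooling encoder, the squared-error loss, the exponential-moving-average (EMA) teacher update, and every name and size in it (`encode`, `ema_update`, `DIM_IN`, `DIM_EMB`, the learning rate, the momentum) are assumptions chosen for illustration; the abstract only states that a student-teacher encoder scheme builds the supervision embedding from behaviors in a future time window.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM_IN, DIM_EMB = 8, 4  # hypothetical feature and embedding sizes


def encode(weights, behaviors):
    """Toy encoder: mean-pool behavior features, L2-normalize, project linearly."""
    pooled = behaviors.mean(axis=0)
    pooled = pooled / np.linalg.norm(pooled)
    return pooled @ weights


def ema_update(teacher_w, student_w, momentum=0.99):
    """Assumed teacher update: exponential moving average of the student weights."""
    return momentum * teacher_w + (1.0 - momentum) * student_w


def loss_and_grad(student_w, teacher_w, history, future):
    """Squared error between the student's prediction and the teacher's target."""
    target = encode(teacher_w, future)  # supervision embedding, treated as a constant
    diff = encode(student_w, history) - target
    pooled = history.mean(axis=0)
    pooled = pooled / np.linalg.norm(pooled)
    grad = 2.0 * np.outer(pooled, diff)  # gradient w.r.t. the student weights only
    return float(diff @ diff), grad


# Synthetic stand-ins for one user's data: past behaviors and a future window.
history = rng.normal(size=(20, DIM_IN))
future = rng.normal(size=(5, DIM_IN))

student_w = rng.normal(size=(DIM_IN, DIM_EMB))
teacher_w = student_w.copy()

losses = []
for _ in range(50):
    loss, grad = loss_and_grad(student_w, teacher_w, history, future)
    losses.append(loss)
    student_w = student_w - 0.1 * grad            # gradient step on the student
    teacher_w = ema_update(teacher_w, student_w)  # teacher slowly follows the student
```

The point of the sketch is that the training target is itself an embedding produced by an encoder, so no hand-picked behavior vocabulary appears anywhere. A real system would replace the toy encoder with a sequence model and train over many users and time windows.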
Taxonomy
Topics: Human Mobility and Location-Based Analysis
