Predictive Modeling with Delayed Information: a Case Study in E-commerce Transaction Fraud Control
Junxuan Li, Yung-wen Liu, Yuting Jia, Yifei Ren, Jay Nanduri

TL;DR
This paper addresses the challenge of predictive modeling in e-commerce fraud detection when transaction labels are delayed, proposing frameworks that utilize mature and partially mature data to improve environment estimation accuracy.
Contribution
Introduces CEI and FEI frameworks that leverage long-term and short-term data to enhance probabilistic environment estimation in delayed label scenarios.
Findings
Both frameworks significantly improved environment estimation accuracy.
Multiple learning methods were tested, including neural networks and tree-based models.
Results demonstrated better fraud detection performance on real-world Microsoft data.
Abstract
In Business Intelligence, accurate predictive modeling is the key to providing adaptive decisions. This research studies predictive modeling problems motivated by real-world cases that Microsoft data scientists encountered while making e-commerce transaction fraud control decisions from streaming transaction data in an uncertain probabilistic decision environment. The values of most features of an online transaction are available instantly, while the true fraud labels return only after a stochastic delay. Using partially mature data directly for predictive modeling in such an environment would lead to significant inaccuracy in risk decision-making. To estimate the probabilistic prediction environment more accurately, which in turn leads to more accurate predictive modeling, two frameworks, Current Environment Inference (CEI) and Future…
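The bias from partially mature data can be illustrated with a small simulation. This is a minimal sketch and not the paper's CEI or FEI method. It assumes that a transaction counts as non-fraud until its fraud label arrives, that label delays are exponentially distributed, and that transaction ages are uniform over the observation window. The function name and all parameter values are hypothetical.

```python
import random


def simulate_label_delay(n=20000, fraud_rate=0.02, mean_delay_days=30.0,
                         window_days=60.0, seed=0):
    """Compare the true fraud rate with the rate visible in partially mature data.

    Assumptions (illustrative only, not taken from the paper):
    - a transaction is treated as non-fraud until its fraud label arrives;
    - label delays follow an exponential distribution;
    - transaction ages are uniform over the observation window.
    """
    rng = random.Random(seed)
    true_fraud = 0
    observed_fraud = 0
    for _ in range(n):
        age_days = rng.uniform(0.0, window_days)   # time since the transaction
        is_fraud = rng.random() < fraud_rate
        if is_fraud:
            true_fraud += 1
            delay_days = rng.expovariate(1.0 / mean_delay_days)
            if delay_days <= age_days:             # label has already returned
                observed_fraud += 1
    return true_fraud / n, observed_fraud / n


true_rate, observed_rate = simulate_label_delay()
print(f"true fraud rate:     {true_rate:.4f}")
print(f"observed fraud rate: {observed_rate:.4f}")
```

Under these assumptions the observed rate falls below the true rate, because some fraudulent transactions have not yet received their labels. A model trained directly on such data would see a distorted picture of the current environment.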
Taxonomy
Topics: Data Stream Mining Techniques · Imbalanced Data Classification Techniques · Machine Learning and Data Classification
