CASPR: Customer Activity Sequence-based Prediction and Representation
Pin-Jung Chen, Sahil Bhatnagar, Sagar Goyal, Damian Konrad Kowalczyk, Mayank Shrivastava

TL;DR
CASPR introduces a Transformer-based method to encode customer activity sequences into generic representations, reducing the need for application-specific feature engineering and improving predictive performance across various enterprise tasks.
Contribution
The paper presents CASPR, a novel Transformer-based approach for encoding customer activity sequences into universal representations applicable to multiple enterprise prediction tasks.
Findings
CASPR improves prediction accuracy across different applications.
It reduces development time by minimizing feature engineering.
It is validated at scale on both small and large enterprise datasets.
Abstract
Tasks critical to enterprise profitability, such as customer churn prediction, fraudulent account detection or customer lifetime value estimation, are often tackled by models trained on features engineered from customer data in tabular format. Application-specific feature engineering adds development, operationalization and maintenance costs over time. Recent advances in representation learning present an opportunity to simplify and generalize feature engineering across applications. When applying these advancements to tabular data, researchers deal with data heterogeneity, variations in customer engagement history or the sheer volume of enterprise datasets. In this paper, we propose a novel approach to encode tabular data containing customer transactions, purchase history and other interactions into a generic representation of a customer's association with the business. We then evaluate…
Taxonomy
Topics: Customer churn and segmentation · Imbalanced Data Classification Techniques · Artificial Intelligence in Healthcare
Methods: Multi-Head Attention · Attention Is All You Need · Position-Wise Feed-Forward Layer · Label Smoothing · Residual Connection · Linear Layer · Adam · Layer Normalization · Softmax · Absolute Position Encodings
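To make a few of the listed methods concrete, here is a minimal sketch of how a variable-length sequence of customer events could be turned into one fixed-size vector using absolute position encodings, softmax attention and a residual connection. It is not the CASPR architecture: it uses a single attention head with no learned weights, and the event features, the function names (`positional_encoding`, `self_attention`, `encode_customer`) and the mean-pooling step are assumptions made purely for illustration.

```python
import math


def positional_encoding(position, dim):
    """Sinusoidal absolute position encoding for a single position."""
    return [
        math.sin(position / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(position / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head scaled dot-product attention.

    Assumption: queries, keys and values are the inputs themselves,
    i.e. there are no learned projection matrices in this toy version.
    """
    dim = len(vectors[0])
    attended = []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(dim)]
        )
    return attended


def encode_customer(events):
    """Map a list of numeric event vectors to one fixed-size vector."""
    dim = len(events[0])
    # Add position information so the order of events matters.
    x = [
        [f + p for f, p in zip(event, positional_encoding(t, dim))]
        for t, event in enumerate(events)
    ]
    # Residual connection around the attention step.
    attended = self_attention(x)
    x = [[a + b for a, b in zip(row, att)] for row, att in zip(x, attended)]
    # Mean-pool over time (an illustrative choice, not taken from the paper).
    return [sum(row[j] for row in x) / len(x) for j in range(dim)]


# Hypothetical features per event:
# [scaled amount, is_purchase, is_support_contact, scaled days since last event]
short_history = [[0.2, 1.0, 0.0, 0.1], [0.0, 0.0, 1.0, 0.5]]
long_history = short_history + [[0.9, 1.0, 0.0, 0.3], [0.4, 1.0, 0.0, 0.05]]

short_embedding = encode_customer(short_history)
long_embedding = encode_customer(long_history)
print(len(short_embedding), len(long_embedding))
```

Both customers end up with a vector of the same length regardless of how many events they have, which is the property that lets one representation feed several downstream predictors such as churn or fraud models.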
