Transformer Choice Net: A Transformer Neural Network for Choice Prediction
Hanzhao Wang, Xiaocheng Li, Kalyan Talluri

TL;DR
This paper introduces the Transformer Choice Net, a transformer neural network architecture for predicting multiple customer choices, as in e-commerce shopping. It outperforms existing models on benchmark datasets by capturing both the choice context and the customer's past choices.
Contribution
The paper presents a novel transformer-based neural network model for multi-choice prediction, addressing limitations of traditional discrete-choice models in complex selection scenarios.
Findings
Outperforms existing models on benchmark datasets
Effectively captures context and past choices
No need for custom tuning per instance
Abstract
Discrete-choice models, such as Multinomial Logit, Probit, or Mixed-Logit, are widely used in Marketing, Economics, and Operations Research: given a set of alternatives, the customer is modeled as choosing one of the alternatives to maximize a (latent) utility function. However, extending such models to situations where the customer chooses more than one item (such as in e-commerce shopping) has proven problematic. While one can construct reasonable models of the customer's behavior, estimating such models becomes very challenging because of the combinatorial explosion in the number of possible subsets of items. In this paper we develop a transformer neural network architecture, the Transformer Choice Net, that is suitable for predicting multiple choices. Transformer networks turn out to be especially suitable for this task as they take into account not only the features of the customer…
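To make the abstract's two technical points concrete, the following is a minimal illustrative sketch, not the paper's model or code. It computes single-choice Multinomial Logit probabilities from assumed utilities, then counts the non-empty subsets a multi-choice model would have to assign probabilities to. The function names and the utility values are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def mnl_probabilities(utilities):
    """Multinomial Logit choice probabilities: P(i) = exp(u_i) / sum_j exp(u_j)."""
    m = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - m) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def count_nonempty_subsets(n_items):
    """Number of non-empty subsets of n_items alternatives: 2^n - 1."""
    return 2 ** n_items - 1


# Hypothetical utilities for three alternatives (assumed values, not from the paper).
utilities = [1.0, 0.5, 0.0]
probs = mnl_probabilities(utilities)

# Enumerate every non-empty subset of three items explicitly.
subsets_of_three = [
    combo for k in range(1, 4) for combo in combinations(range(3), k)
]

print(probs)
print(len(subsets_of_three), count_nonempty_subsets(20))
```

With a single choice, the model assigns one probability per alternative. With multiple choices, the outcome space is the set of subsets, which roughly doubles with every added item. This growth is the combinatorial explosion the abstract cites as the obstacle to estimating classical multi-choice models.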
Taxonomy
Topics: Customer churn and segmentation · Consumer Market Behavior and Pricing · Forecasting Techniques and Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Byte Pair Encoding · Label Smoothing · Softmax · Residual Connection · Absolute Position Encodings · Layer Normalization
