Machine Learning to Predict Digital Frustration from Clickstream Data

Jibin Joseph

arXiv:2512.20438 · cs.LG · December 24, 2025

TL;DR

This paper develops machine learning models, including XGBoost and an LSTM, to predict user frustration in e-commerce sessions from clickstream data, reaching about 90-91% accuracy and remaining effective with only the first 20 to 30 interactions of a session.

Contribution

It introduces a novel approach combining rule-based frustration labeling with deep learning and tree-based models for early prediction of user frustration.

Findings

01

XGBoost achieves 90% accuracy and 0.9579 ROC AUC.

02

LSTM achieves 91% accuracy and 0.9705 ROC AUC.

03

Early prediction is effective with only 20-30 interactions.
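
One way to picture the early-prediction setting in finding 03 is to cut every session down to its first k events before building features or feeding a sequence model. The sketch below is only an illustration under assumptions: the function name `truncate_sessions`, the session dictionary layout, and the event names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def truncate_sessions(sessions: Dict[str, List[str]], k: int) -> Dict[str, List[str]]:
    """Keep only the first k events of each session (hypothetical helper)."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Slicing is safe for sessions shorter than k: they are returned unchanged.
    return {sid: events[:k] for sid, events in sessions.items()}


# Toy sessions with made-up event names, ordered by time.
sessions = {
    "s1": ["view", "click", "click", "add_to_cart", "remove_from_cart"],
    "s2": ["search", "view"],
}

prefixes = truncate_sessions(sessions, k=3)
```

A model trained or evaluated on `prefixes` sees only what would be available early in a live session, which is the situation the finding describes with k between 20 and 30.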

Abstract

Many businesses depend on their mobile apps and websites, so user frustration while trying to complete a task on these channels can cause lost sales and complaints. In this research, I use clickstream data from a real e-commerce site to predict whether a session is frustrated or not. I define frustration using rules based on rage bursts, back-and-forth navigation (U-turns), cart churn, search struggle, and long wandering sessions, and apply these rules to 5.4 million raw clickstream events (304,881 sessions). From each session, I build tabular features and train standard classifier models. I also use the full event sequence to train a discriminative LSTM classifier. XGBoost reaches about 90% accuracy and a ROC AUC of 0.9579, while the LSTM performs best with about 91% accuracy and a ROC AUC of 0.9705. Finally, the research shows that with only the first 20 to 30 interactions, the…
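
The rule-based labeling described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `Event` fields, the event names, and every threshold (`min_clicks`, `window_s`, `min_churn`, `max_events`) are assumptions chosen for the example, and only three of the five rule families (rage bursts, cart churn, long sessions) are shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    t: float      # seconds since session start (assumed representation)
    kind: str     # hypothetical event type, e.g. "click", "add_to_cart"
    target: str   # hypothetical element or product identifier


def has_rage_burst(events: List[Event], min_clicks: int = 3, window_s: float = 2.0) -> bool:
    """True if min_clicks consecutive clicks hit the same target within window_s seconds."""
    clicks = [e for e in events if e.kind == "click"]
    for i in range(len(clicks) - min_clicks + 1):
        run = clicks[i:i + min_clicks]
        same_target = all(e.target == run[0].target for e in run)
        if same_target and run[-1].t - run[0].t <= window_s:
            return True
    return False


def cart_churn_count(events: List[Event]) -> int:
    """Count removals of items that were previously added to the cart."""
    in_cart = set()
    churn = 0
    for e in events:
        if e.kind == "add_to_cart":
            in_cart.add(e.target)
        elif e.kind == "remove_from_cart" and e.target in in_cart:
            in_cart.discard(e.target)
            churn += 1
    return churn


def is_frustrated(events: List[Event], min_churn: int = 2, max_events: int = 200) -> bool:
    """Label a session frustrated if any illustrative rule fires (thresholds are assumptions)."""
    ordered = sorted(events, key=lambda e: e.t)
    return (
        has_rage_burst(ordered)
        or cart_churn_count(ordered) >= min_churn
        or len(ordered) > max_events
    )


calm = [Event(0, "view", "p1"), Event(5, "click", "buy"), Event(20, "add_to_cart", "p1")]
angry = [Event(0, "click", "buy"), Event(0.5, "click", "buy"), Event(1.0, "click", "buy")]
churny = [
    Event(0, "add_to_cart", "p1"), Event(3, "remove_from_cart", "p1"),
    Event(6, "add_to_cart", "p1"), Event(9, "remove_from_cart", "p1"),
]
```

Labels produced this way could then serve as targets for the tabular and sequence classifiers the abstract mentions. Because the labels derive from the same events the models see, the actual rule definitions and thresholds matter a great deal, and those are given in the paper rather than here.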

Taxonomy

Topics: Personal Information Management and User Behavior · Spam and Phishing Detection · Mind wandering and attention