Dynamic User Segmentation and Usage Profiling
Animesh Mitra, Saswata Sahoo, Soumyabrata Dey

TL;DR
This paper introduces a novel clustering method for ultra-high dimensional, sparse categorical user data by transforming it into a lower-dimensional space based on covariate classes, improving cluster quality.
Contribution
It proposes a feature transformation technique that leverages dataset sparsity to enable effective clustering of categorical big data, outperforming traditional methods.
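As a rough illustration of how sparsity can shrink such data, the sketch below collapses users with identical usage patterns into one group each. It assumes "covariate class" carries its usual statistical meaning (observations that share the same covariate values); the summary does not spell out the paper's actual transformation, so this is not the authors' method. The function name `covariate_classes` and the toy `usage` data are hypothetical.

```python
from collections import defaultdict

def covariate_classes(usage):
    """Group users whose sets of used categories are identical.

    Assumption: a covariate class is the set of users sharing one
    exact usage pattern. The paper's own definition may differ.
    """
    classes = defaultdict(list)
    for user, items in usage.items():
        # frozenset is hashable, so the pattern itself can be the key
        classes[frozenset(items)].append(user)
    return dict(classes)

# Hypothetical toy data: user -> categories (e.g. songs) they used
usage = {
    "u1": {"song_a", "song_b"},
    "u2": {"song_b", "song_a"},
    "u3": {"song_c"},
    "u4": {"song_a", "song_b"},
    "u5": set(),
}

classes = covariate_classes(usage)
for pattern, users in classes.items():
    print(sorted(pattern), users)
```

In sparse data many users tend to share a pattern, so the number of distinct classes can be far smaller than the number of users; any clustering step would then work on the classes, weighted by their sizes, rather than on individual users.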
Findings
Achieved similar-sized user clusters with minimal overlap (8%)
Validated on a large song playlist dataset
Demonstrated effectiveness for diverse business applications
Abstract
Usage data for a group of users, distributed across categories such as songs, movies, webpages, links, regular household products, mobile apps, and games, can be ultra-high dimensional and massive in size. Such data is often categorical and sparse, which makes it even harder to interpret underlying hidden patterns such as clusters of users. However, if this information can be estimated accurately, it can have a large impact in business areas such as user recommendations for apps, songs, movies, and similar products; health analytics using electronic health record (EHR) data; and driver profiling for insurance premium estimation or fleet management. In this work, we propose a clustering strategy for such categorical big data that exploits the hidden sparsity of the dataset. Most traditional clustering methods fail to give proper…
Taxonomy
Topics: Human Mobility and Location-Based Analysis · Recommender Systems and Techniques
