Cluster Analysis on Jester Dataset: A Review

Navoneel Chakrabarty

arXiv:2110.02740·cs.LG·October 7, 2021

TL;DR

This paper reviews and validates a cluster analysis approach on the Jester joke dataset, emphasizing data preparation challenges due to missing ratings and proposing future research directions.

Contribution

It provides a review and validation of the only known cluster analysis method applied to the Jester dataset, highlighting data imputation and analysis steps.

Findings

01 Validated clustering results on Jester dataset

02 Identified data imputation as a key challenge

03 Suggested future research directions

Abstract

Unsupervised machine learning paradigms are often the only methodology available for a pattern recognition task in which no target labels or annotations are present. In such scenarios, data preparation is a crucial step so that the unsupervised paradigms perform as well as possible. But when data are insufficient or missing in every instance of a dataset, data preparation itself becomes a challenge. One such case study is the Jester dataset, whose missing values are ratings that joke readers did not give for some of a specified set of 100 jokes. To perform a cluster analysis on such a dataset, the data preparation step should fill the missing ratings with appropriate values, followed by cluster analysis using an unsupervised ML paradigm. In this study, the most recent and probably the only work that involves…
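The two-step pipeline the abstract describes (fill the missing ratings, then cluster) can be sketched as follows. This is a minimal illustration only: the abstract as shown here does not name the imputation rule or the clustering algorithm of the reviewed work, so per-joke mean imputation and a plain k-means are stand-ins chosen for brevity. The `MISSING = 99` sentinel, the function names, and the toy ratings are assumptions introduced for the example.

```python
# Assumption: unrated jokes are encoded with the sentinel 99, and real
# ratings lie in roughly [-10, 10]. The toy matrix below is made up.
MISSING = 99


def impute_column_means(ratings, missing=MISSING):
    """Replace each missing rating with the mean observed rating of that joke."""
    n_jokes = len(ratings[0])
    means = []
    for j in range(n_jokes):
        observed = [row[j] for row in ratings if row[j] != missing]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [means[j] if row[j] == missing else float(row[j]) for j in range(n_jokes)]
        for row in ratings
    ]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-first initialisation."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Pick the point farthest from its nearest existing centroid.
        farthest = max(
            points, key=lambda p: min(squared_distance(p, c) for c in centroids)
        )
        centroids.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each reader goes to the nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: 4 readers x 4 jokes, 99 marks an unrated joke.
toy = [
    [8.0, 7.5, 99, -6.0],
    [9.0, 99, 6.5, -7.0],
    [-8.0, -7.0, 99, 5.0],
    [-9.0, 99, -6.0, 6.5],
]
filled = impute_column_means(toy)
labels, centroids = kmeans(filled, k=2)
```

On this toy matrix the first two readers share one cluster and the last two share the other. Mean imputation pulls sparse raters toward the average reader, which is one reason the choice of fill-in values matters for the clusters that follow.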

Taxonomy

Topics: Machine Learning and Data Classification · Face and Expression Recognition