New Approach to Clustering Random Attributes

Zenon Gniazdowski

arXiv:2412.09748·cs.LG·December 16, 2024

TL;DR

This paper introduces a universal clustering method for numerical and nominal attributes, utilizing encoding and factor analysis to enable similarity-based clustering across different data types.

Contribution

The paper presents a clustering algorithm that handles both numerical and nominal attributes: nominal values are first encoded numerically, and all attributes are then clustered by their similarity to factors obtained through factor analysis.

Findings

01. Method effectively clusters numerical attributes.

02. Method successfully clusters nominal attributes after encoding.

03. Allows simultaneous clustering of numerical and encoded nominal attributes.

Abstract

This paper proposes a new method for similarity analysis and, consequently, a new algorithm for clustering different types of random attributes, both numerical and nominal. For nominal attributes to be clustered, however, their values must first be properly encoded. In the encoding process, nominal attributes obtain a new representation in numerical form. Only numerical attributes can be subjected to factor analysis, which allows them to be clustered in terms of their similarity to factors. The proposed method was tested on several sample datasets and was found to be universal. On the one hand, it allows clustering of numerical attributes. On the other hand, it provides the ability to cluster nominal attributes. It also allows simultaneous clustering of numerical attributes and numerically encoded nominal attributes.
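The pipeline described in the abstract (encode nominal attributes, extract factors, group attributes by similarity to factors) can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the encoding or the factor-extraction procedure, so the sketch assumes one-hot encoding and uses unrotated principal-component loadings of the correlation matrix as a stand-in for factor analysis. The function names `one_hot` and `cluster_by_loadings` and the synthetic data are hypothetical.

```python
import numpy as np

def one_hot(values):
    """Encode a nominal attribute as 0/1 indicator columns (an assumed encoding)."""
    categories = sorted(set(values))
    columns = np.array([[1.0 if v == c else 0.0 for c in categories] for v in values])
    return columns, categories

def cluster_by_loadings(X, n_factors):
    """Assign each column of X to the factor on which it loads most strongly."""
    R = np.corrcoef(X, rowvar=False)             # attribute correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)         # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_factors]  # indices of the largest eigenvalues
    # Loading = correlation between an attribute and a retained component
    # (principal components stand in for factors here; this is an assumption).
    loadings = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))
    return np.argmax(np.abs(loadings), axis=1), loadings

# Synthetic data: two independent latent variables drive two attribute groups.
rng = np.random.default_rng(0)
n = 500
a, c = rng.normal(size=n), rng.normal(size=n)
numeric = np.column_stack(
    [a + 0.3 * rng.normal(size=n) for _ in range(3)]    # columns 0-2 follow a
    + [c + 0.3 * rng.normal(size=n) for _ in range(2)]  # columns 3-4 follow c
)
nominal = np.where(a > 0, "hi", "lo")                   # nominal attribute tied to a
encoded, categories = one_hot(nominal)                  # columns 5-6 after stacking

X = np.hstack([numeric, encoded])
labels, loadings = cluster_by_loadings(X, n_factors=2)
print(labels)
```

In this toy setup the encoded nominal columns are clustered alongside the numerical columns that share their latent source, which mirrors the simultaneous clustering the abstract describes. The two groups are deliberately unequal in size so that the leading eigenvalues are well separated; with nearly equal eigenvalues, unrotated components can mix groups and a rotation step would likely be needed.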
