Condensed Representation of Machine Learning Data

Rahman Salim Zengin (1), Volkan Sezer (1) ((1) Istanbul Technical University)

arXiv:2212.14229·cs.LG·January 2, 2023

Open Access

TL;DR

This paper introduces a novel condensed data representation method for machine learning that reduces redundancy and computational costs while maintaining acceptable accuracy, using K-means clustering with corrections.

Contribution

A new condensed data representation technique combining K-means with correction mechanisms for efficient machine learning training.

Findings

01

Reduced computational resource utilization.

02

Maintained acceptable model accuracy.

03

Effective on synthetically generated data.

Abstract

Training a Machine Learning model requires sufficient data. Sufficiency is not always a matter of quantity; it also depends on relevance and low redundancy. Data-generating processes create massive amounts of data, and using such big data raw consumes substantial computational resources. A proper Condensed Representation can be used in place of the raw data. This paper introduces a novel Condensed Representation method for Machine Learning applications that combines K-means, a well-known clustering method, with correction and refinement facilities. To present the method meaningfully and visually, synthetically generated data is employed. The results show that training on the condensed representation instead of the raw data yields acceptably accurate models.
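
The core idea in the abstract, replacing raw samples with K-means cluster representatives, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' method: it runs plain Lloyd's K-means and keeps each centroid together with its member count as a weight. The correction and refinement steps mentioned in the abstract are not described on this page and are left out. The names `condense`, `k`, `iters`, and `seed` are hypothetical, as is the two-blob synthetic dataset.

```python
import random


def dist2(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def condense(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means (assumes iters >= 1 and k <= len(points)).

    Returns a list of (centroid, count) pairs, one per non-empty cluster.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k data points
    dim = len(points[0])
    for _ in range(iters):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist2(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                mean = tuple(
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                )
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[j])  # empty cluster: keep as is
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    # Each kept centroid is the mean of its members; the count acts as a weight.
    return [(c, len(m)) for c, m in zip(centroids, clusters) if m]


# Hypothetical synthetic data: two Gaussian blobs of 200 points each.
data_rng = random.Random(1)
raw = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(200)]
raw += [(data_rng.gauss(5, 0.5), data_rng.gauss(5, 0.5)) for _ in range(200)]

condensed = condense(raw, k=8)
print(len(raw), "raw points ->", len(condensed), "weighted representatives")
```

A downstream model could then be trained on the centroids, using the counts as sample weights where the learner supports them. Because every kept centroid is the mean of its members, the weighted mean of the condensed set matches the mean of the raw data.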

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Data Classification