RAZOR: Refining Accuracy by Zeroing Out Redundancies

Daniel Riccio; Genoveffa Tortora; Mara Sangiovanni

arXiv:2410.14254·cs.LG·October 21, 2024

Open Access

TL;DR

RAZOR is a new instance selection method that efficiently reduces data redundancy, improving learning efficiency without sacrificing accuracy, applicable in both supervised and unsupervised contexts.

Contribution

The paper introduces RAZOR, a scalable and robust instance selection technique that outperforms existing methods in effectiveness and efficiency on large-scale datasets.

Findings

01. RAZOR significantly reduces dataset size while maintaining accuracy.

02. It outperforms recent state-of-the-art techniques in effectiveness.

03. RAZOR is applicable in both supervised and unsupervised settings.

Abstract

In many application domains, the proliferation of sensors and devices is generating vast volumes of data, imposing significant pressure on existing data analysis and data mining techniques. Nevertheless, an increase in data volume does not inherently imply an increase in informational content, as a substantial portion may be redundant or represent noise. This challenge is particularly evident in the deep learning domain, where the utility of additional data is contingent on its informativeness. In the absence of such, larger datasets merely exacerbate the computational cost and complexity of the learning process. To address these challenges, we propose RAZOR, a novel instance selection technique designed to extract a significantly smaller yet sufficiently informative subset from a larger set of instances without compromising the learning process. RAZOR has been specifically engineered…

Taxonomy

Topics: Fault Detection and Control Systems · AI-based Problem Solving and Planning

Methods: Sparse Evolutionary Training