cardinalR: Generating Interesting High-Dimensional Data Structures
Jayani P. Gamage, Dianne Cook, Paul Harrison, Michael Lydeamore, Thiyanga S. Talagala

TL;DR
cardinalR is an R package that provides new methods for generating diverse high-dimensional data structures, aiding researchers in testing and improving analytical algorithms, especially in nonlinear dimension reduction.
Contribution
The paper introduces novel data generation methods and an R package for creating complex high-dimensional datasets with various structures.
Findings
Provides a versatile R package for high-dimensional data simulation
Enables testing of dimension reduction and learning algorithms
Includes example datasets for benchmarking and validation
Abstract
Simulated high-dimensional data is useful for testing, validating, and improving algorithms used in dimension reduction and in supervised and unsupervised learning. High-dimensional data is characterized by multiple variables that are dependent or associated in some way, for example through linear or nonlinear relationships, clustering, or anomalies. Here we provide new methods for generating a variety of high-dimensional structures using mathematical functions and statistical distributions, organized into the R package cardinalR. Several example data sets are also provided. These will help researchers better understand how different analytical methods work and how they can be improved, with a special focus on nonlinear dimension reduction methods. This package enriches the existing toolset of benchmark datasets for evaluating algorithms.
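To make the idea of building structure from "mathematical functions and statistical distributions" concrete, here is a minimal sketch in Python using only the standard library. It is not cardinalR's API, and it is not taken from the paper: the function names `make_curve_with_noise_dims` and `make_gaussian_clusters`, their parameters, and the specific shapes (a quadratic curve, spherical Gaussian clusters) are assumptions chosen purely for illustration.

```python
import random


def make_curve_with_noise_dims(n, p, noise_sd=0.05, seed=0):
    """Hypothetical helper: n points in p dimensions.

    The first two coordinates follow a noisy quadratic curve (a nonlinear
    association); the remaining p - 2 coordinates are small Gaussian noise.
    """
    if p < 2:
        raise ValueError("p must be at least 2")
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        t = rng.uniform(-1.0, 1.0)             # latent 1-D parameter
        x1 = t
        x2 = t * t + rng.gauss(0.0, noise_sd)  # nonlinear dependence on x1
        noise = [rng.gauss(0.0, noise_sd) for _ in range(p - 2)]
        rows.append([x1, x2] + noise)
    return rows


def make_gaussian_clusters(n_per_cluster, centers, sd=0.1, seed=0):
    """Hypothetical helper: spherical Gaussian clusters around given centers."""
    rng = random.Random(seed)
    rows, labels = [], []
    for label, center in enumerate(centers):
        for _ in range(n_per_cluster):
            rows.append([rng.gauss(c, sd) for c in center])
            labels.append(label)
    return rows, labels


# Usage: a 4-D nonlinear structure and two well-separated 4-D clusters.
curve = make_curve_with_noise_dims(n=200, p=4)
clusters, labels = make_gaussian_clusters(
    50, centers=[[0, 0, 0, 0], [3, 3, 3, 3]]
)
print(len(curve), len(curve[0]), len(clusters), sorted(set(labels)))
```

Data of this kind has a known ground-truth structure, so the output of a dimension reduction method can be checked against what was deliberately put in. Fixing the seed keeps the benchmark reproducible.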
Taxonomy
Topics: Machine Learning in Materials Science · Neural Networks and Applications · Advanced Clustering Algorithms Research
