cardinalR: Generating Interesting High-Dimensional Data Structures
By: Jayani P. Gamage, Dianne Cook, Paul Harrison and more
Simulated high-dimensional data is useful for testing, validating, and improving algorithms used in dimension reduction and in supervised and unsupervised learning. High-dimensional data is characterized by multiple variables that are dependent or associated in some way, for example through linear or nonlinear relationships, clustering, or anomalies. Here we provide new methods for generating a variety of high-dimensional structures using mathematical functions and statistical distributions, organized into the R package cardinalR. Several example data sets are also provided. These will help researchers better understand how different analytical methods work and how they can be improved, with a special focus on nonlinear dimension reduction methods. The package enriches the existing toolset of benchmark data sets for evaluating algorithms.
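To make the idea of building structures from mathematical functions and statistical distributions concrete, here is a minimal sketch in Python using only the standard library. It is not the cardinalR API, and it is not taken from the paper. The function names (`make_curve_with_noise`, `make_gaussian_cluster`, `make_example`), the quadratic curve, and all parameter values are assumptions chosen for illustration. The sketch places a nonlinear curve in the first two coordinates, fills the remaining coordinates with small Gaussian noise, and adds a separate Gaussian cluster.

```python
import math
import random


def make_curve_with_noise(n, p, noise_sd=0.05, seed=0):
    """Hypothetical helper: n points in p dimensions.

    The first two coordinates follow x2 = x1^2 plus noise (a nonlinear
    structure); the remaining p - 2 coordinates are pure Gaussian noise.
    """
    if p < 2:
        raise ValueError("p must be at least 2")
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        x1 = rng.uniform(-1.0, 1.0)
        x2 = x1 ** 2 + rng.gauss(0.0, noise_sd)
        noise = [rng.gauss(0.0, noise_sd) for _ in range(p - 2)]
        points.append([x1, x2] + noise)
    return points


def make_gaussian_cluster(n, center, sd=0.1, seed=1):
    """Hypothetical helper: n points drawn around `center`, one
    independent Gaussian per coordinate."""
    rng = random.Random(seed)
    return [[rng.gauss(c, sd) for c in center] for _ in range(n)]


def make_example(n_curve=200, n_cluster=100, p=4, seed=0):
    """Combine the curve and one cluster into a single labelled data set."""
    curve = make_curve_with_noise(n_curve, p, seed=seed)
    # Assumed cluster location: above the curve in the second coordinate.
    center = [0.0, 2.0] + [0.0] * (p - 2)
    cluster = make_gaussian_cluster(n_cluster, center, seed=seed + 1)
    labels = ["curve"] * n_curve + ["cluster"] * n_cluster
    return curve + cluster, labels


if __name__ == "__main__":
    data, labels = make_example()
    print(len(data), "points in", len(data[0]), "dimensions")
```

A data set like this can be passed to a dimension reduction method to check whether the curve and the cluster remain distinguishable in the low-dimensional layout. The fixed seeds make the output reproducible, which matters when the data serves as a benchmark.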