Method of fractal diversity in data science problems
Vitalii Vladimirov, Elena Vladimirova

TL;DR
This paper introduces a fractal-based method to measure cross-correlation in data, applicable to large, non-Gaussian datasets, with potential uses in identifying biologically active conformers in X-ray spectra.
Contribution
The paper presents a novel fractal diversity method for quantifying data cross-correlation that is invariant under linear transformations and applicable to complex data sets.
Findings
The method effectively distinguishes Gaussian from non-Gaussian data.
It successfully identifies biologically active conformers in X-ray diffraction spectra.
The approach is universal: it does not depend on the data distribution or on the nature of the correlation.
Abstract
A signal-to-noise ratio (SNR) parameter is obtained that distinguishes the Gaussian function, which describes the distribution of random variables in the absence of cross-correlation, from other functions, making it possible to describe collective states with strong cross-correlation of data. The SNR is defined in one-dimensional space, and a calculation algorithm based on the fractal variety of the Cantor dust in a closed loop is given. The algorithm is invariant under linear transformations of the initial data set, has renormalization-group invariance, and determines the intensity of cross-correlation (collective effect) in the data. The description of the collective state is universal and does not depend on the nature of the data correlation, just as the distribution of random variables is universal in the absence of data correlation. The method is applicable for large sets of…
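As background on the Cantor dust mentioned in the abstract, the sketch below builds the standard middle-thirds Cantor set and estimates its box-counting dimension. It is not the authors' algorithm: the abstract does not say how the closed-loop variant or the SNR parameter is computed, and the function names (cantor_left_endpoints, box_count, box_counting_dimension) are hypothetical, chosen only for illustration.

```python
import math

def cantor_left_endpoints(level):
    """Left endpoints of the 2**level intervals at a given construction level.

    Endpoints are integers in units of 3**-level, so the arithmetic is exact.
    Assumption: the standard middle-thirds construction, not the paper's variant.
    """
    points = [0]
    for _ in range(level):
        # Split each interval into thirds and keep the two outer thirds.
        points = [3 * p for p in points] + [3 * p + 2 for p in points]
    return sorted(points)

def box_count(points, level, k):
    """Number of boxes of side 3**-k (with k <= level) that the set touches."""
    scale = 3 ** (level - k)
    return len({p // scale for p in points})

def box_counting_dimension(level, k):
    """Estimate log N(eps) / log(1 / eps) for box size eps = 3**-k."""
    points = cantor_left_endpoints(level)
    return math.log(box_count(points, level, k)) / math.log(3 ** k)

dimension = box_counting_dimension(level=8, k=5)
```

For this construction every box size 3**-k is touched by exactly 2**k boxes, so the estimate equals log 2 / log 3 (about 0.631) at every scale. That scale independence is the kind of self-similarity that a renormalization-group invariance claim relies on.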
Taxonomy
Topics: Computational Drug Discovery Methods · Fractal and DNA sequence analysis · Advanced Scientific Research Methods
