Analysis of multiple data sequences with different distributions: defining common principal component axes by ergodic sequence generation and multiple reweighting composition
Ikuo Fukuda, Kei Moritsugu

TL;DR
This paper introduces a method to define common principal component axes for multiple data sequences with different distributions by using ergodic sampling and reweighting techniques, enabling fair comparison across diverse datasets.
Contribution
It proposes a novel approach combining ergodic sequence generation and reweighting to find common PCA axes for multiple distributions, addressing a key challenge in multisequence analysis.
Findings
Effective common PC axes for diverse sequences
Accurate recovery of target distributions through reweighting
Enhanced comparison of multi-distribution data sets
Abstract
Principal component analysis (PCA) defines a reduced space described by PC axes for a given multidimensional-data sequence to capture the variations of the data. In practice, we need multiple data sequences that accurately obey individual probability distributions, and for a fair comparison of the sequences, we need PC axes that are common to the multiple sequences but properly capture these multiple distributions. To meet these requirements, we present individual ergodic samplings for these sequences and provide special reweighting for recovering the target distributions.
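The idea of common PC axes can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes each sequence already comes with per-sample weights (uniform by default) that stand in for the reweighting step, pools the sequences with equal total weight per sequence, and diagonalizes one weighted covariance matrix. The function name `common_pc_axes`, the equal-weight pooling rule, and the synthetic Gaussian data are all assumptions made for illustration.

```python
import numpy as np

def common_pc_axes(sequences, weights=None):
    """Pool several (n_i, d) sequences and diagonalize one weighted covariance.

    Hypothetical helper: `weights` holds one 1-D array of non-negative
    per-sample weights for each sequence (uniform if omitted).
    """
    if weights is None:
        weights = [np.ones(len(x)) for x in sequences]
    # Normalize within each sequence so every sequence carries equal total weight.
    w = np.concatenate(
        [np.asarray(wi, dtype=float) / np.sum(wi) for wi in weights]
    ) / len(sequences)
    x = np.vstack(sequences)
    mean = w @ x                                  # weighted pooled mean
    centered = x - mean
    cov = (centered * w[:, None]).T @ centered    # weighted pooled covariance
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]             # reorder to descending variance
    return mean, eigvals[order], eigvecs[:, order]

# Two synthetic sequences drawn from different distributions (illustrative only).
rng = np.random.default_rng(0)
a = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 0.3])
b = rng.normal(size=(800, 3)) * np.array([2.5, 1.2, 0.3]) + np.array([1.0, 0.0, 0.0])

mean, variances, axes = common_pc_axes([a, b])

# Project both sequences onto the same two leading axes for a like-for-like comparison.
proj_a = (a - mean) @ axes[:, :2]
proj_b = (b - mean) @ axes[:, :2]
```

Because both projections use the same mean and the same axes, differences between `proj_a` and `proj_b` reflect differences between the two distributions rather than differences between two separately fitted coordinate systems.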
Taxonomy
Topics: Gene expression and cancer classification · Machine Learning in Bioinformatics · Spectroscopy and Chemometric Analyses
