Cross-validation of matching correlation analysis by resampling matching weights
Hidetoshi Shimodaira

TL;DR
This paper introduces a cross-validation method for matching correlation analysis (MCA) that resamples matching weights to estimate true matching errors, extending MCA to cross-domain data with theoretical guarantees.
Contribution
It develops a novel resampling-based cross-validation scheme for MCA, providing asymptotically unbiased estimates of the matching error and extending MCA to cross-domain data analysis.
Findings
Cross-validation with resampled matching weights yields asymptotically unbiased estimates of the matching error.
The method applies to multi-domain data with different dimensions.
Theoretical analysis confirms the asymptotic unbiasedness of the approach.
Abstract
The strength of association between a pair of data vectors is represented by a nonnegative real number, called the matching weight. For dimensionality reduction, we consider a linear transformation of the data vectors and define the matching error as the weighted sum of squared distances between transformed vectors with respect to the matching weights. Given data vectors and matching weights, the optimal linear transformation minimizing the matching error is obtained by the spectral graph embedding of Yan et al. (2007). This method is a generalization of canonical correlation analysis and is called matching correlation analysis (MCA). In this paper, we consider a novel sampling scheme in which the observed matching weights are randomly sampled from the underlying true matching weights with small probability, whereas the data vectors are treated as constants. We then investigate a…
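The quantities described in the abstract can be illustrated with a small NumPy sketch. It is not the paper's implementation. It assumes a symmetric weight matrix W, takes the matching error to be half the weighted sum of squared distances over all ordered pairs, normalizes the transformation by the row sums of W, and adds a small ridge term for numerical stability. The function names (matching_error, fit_mca, sample_weights), the ridge parameter, and the Bernoulli reading of "sampled with small probability" are all hypothetical choices made for illustration.

```python
import numpy as np

def matching_error(Y, W):
    """0.5 * sum_ij W[i, j] * ||Y[i] - Y[j]||^2 for a symmetric, nonnegative W."""
    diff = Y[:, None, :] - Y[None, :, :]      # pairwise differences, shape (n, n, k)
    sq_dist = np.sum(diff ** 2, axis=-1)      # squared distances, shape (n, n)
    return 0.5 * np.sum(W * sq_dist)

def fit_mca(X, W, k, ridge=1e-8):
    """Find A (p x k) minimizing the matching error of Y = X @ A.

    Assumed normalization: A.T @ (X.T @ M @ X + ridge * I) @ A = I,
    where M = diag(row sums of W). The ridge is an illustrative addition.
    """
    M = np.diag(W.sum(axis=1))
    G_err = X.T @ (M - W) @ X                 # tr(A.T @ G_err @ A) equals the matching error
    G_norm = X.T @ M @ X + ridge * np.eye(X.shape[1])
    # Reduce the generalized eigenproblem to a standard one via Cholesky whitening.
    L = np.linalg.cholesky(G_norm)
    L_inv = np.linalg.inv(L)
    C = L_inv @ G_err @ L_inv.T
    C = 0.5 * (C + C.T)                       # symmetrize against round-off
    evals, U = np.linalg.eigh(C)              # eigenvalues in ascending order
    A = L_inv.T @ U[:, :k]                    # keep the k smallest-error directions
    return A, evals[:k]

def sample_weights(W_true, eps, rng):
    """Keep each unordered pair (i, j), i != j, independently with probability eps.

    This is one possible reading of the sampling scheme, not the paper's definition.
    """
    keep = np.triu(rng.random(W_true.shape) < eps, 1)
    keep = keep | keep.T                      # keep the observed weights symmetric
    return np.where(keep, W_true, 0.0)

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))                  # 30 data vectors of dimension 4
B = rng.random((30, 30))
W_true = 0.5 * (B + B.T)                      # symmetric "true" matching weights
np.fill_diagonal(W_true, 0.0)
W_obs = sample_weights(W_true, eps=0.3, rng=rng)
A, evals = fit_mca(X, W_obs, k=2)
print(matching_error(X @ A, W_obs), matching_error(X @ A, W_true))
```

The last line prints the matching error of the fitted transformation under the observed weights and under the true weights. The two are on different scales here because no rescaling by the sampling probability is applied. The gap between fitting error and true error is the quantity that the paper's cross-validation is designed to estimate, and that estimation step is not reproduced in this sketch.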
Taxonomy
Topics: Face and Expression Recognition · Neural Networks and Applications · Topological and Geometric Data Analysis
