Large-Scale Approximate Kernel Canonical Correlation Analysis
Weiran Wang, Karen Livescu

TL;DR
This paper introduces a scalable stochastic optimization method for approximate kernel canonical correlation analysis (KCCA), enabling it to handle large datasets with high-dimensional feature spaces efficiently.
Contribution
It proposes a stochastic optimization approach for approximate KCCA that reduces computational demands and allows large-scale applications.
Findings
Successfully applied to a speech dataset with 1.4 million samples
Handled a random feature space of dimension 100,000 on a standard workstation
Demonstrated significant computational efficiency improvements
Abstract
Kernel canonical correlation analysis (KCCA) is a nonlinear multi-view representation learning technique with broad applicability in statistics and machine learning. Although there is a closed-form solution for the KCCA objective, it involves solving an N × N eigenvalue system where N is the training set size, making its computational requirements in both memory and time prohibitive for large-scale problems. Various approximation techniques have been developed for KCCA. A commonly used approach is to first transform the original inputs to an M-dimensional random feature space so that inner products in the feature space approximate kernel evaluations, and then apply linear CCA to the transformed inputs. In many applications, however, the dimensionality M of the random feature space may need to be very large in order to obtain a sufficiently good approximation; it then becomes…
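The two-step baseline described in the abstract (map each view to random features, then run linear CCA) can be sketched as below. This is a minimal illustration and not the paper's stochastic method. It assumes a Gaussian kernel approximated with random Fourier features, a small ridge term `reg`, and synthetic two-view data; the abstract specifies none of these, and all function and variable names are hypothetical.

```python
import numpy as np

def random_fourier_features(X, W, b):
    """Map rows of X to M features: sqrt(2/M) * cos(X W + b)."""
    M = W.shape[1]
    return np.sqrt(2.0 / M) * np.cos(X @ W + b)

def linear_cca(F1, F2, k, reg=1e-3):
    """Regularized linear CCA via whitening and an SVD.

    Returns projection matrices A, B and the top-k canonical correlations.
    """
    n = F1.shape[0]
    F1 = F1 - F1.mean(axis=0)
    F2 = F2 - F2.mean(axis=0)
    S11 = F1.T @ F1 / n + reg * np.eye(F1.shape[1])
    S22 = F2.T @ F2 / n + reg * np.eye(F2.shape[1])
    S12 = F1.T @ F2 / n
    L1 = np.linalg.cholesky(S11)  # S11 = L1 L1^T
    L2 = np.linalg.cholesky(S22)  # S22 = L2 L2^T
    # T = L1^{-1} S12 L2^{-T} is the whitened cross-covariance.
    T = np.linalg.solve(L1, S12)
    T = np.linalg.solve(L2, T.T).T
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    A = np.linalg.solve(L1.T, U[:, :k])
    B = np.linalg.solve(L2.T, Vt[:k].T)
    return A, B, s[:k]

# Synthetic two-view data sharing a latent variable z (assumption, for illustration).
rng = np.random.default_rng(0)
N, M, k, sigma = 2000, 500, 3, 1.0
z = rng.uniform(-1.0, 1.0, size=N)
X1 = np.column_stack([np.cos(np.pi * z), np.sin(np.pi * z)]) + 0.05 * rng.normal(size=(N, 2))
X2 = np.column_stack([z, z ** 2]) + 0.05 * rng.normal(size=(N, 2))

# For a Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)), draw W ~ N(0, 1/sigma^2)
# and b ~ Uniform[0, 2 pi], independently for each view.
W1, b1 = rng.normal(scale=1.0 / sigma, size=(2, M)), rng.uniform(0, 2 * np.pi, size=M)
W2, b2 = rng.normal(scale=1.0 / sigma, size=(2, M)), rng.uniform(0, 2 * np.pi, size=M)
F1 = random_fourier_features(X1, W1, b1)
F2 = random_fourier_features(X2, W2, b2)

# Sanity check: feature inner products should approximate the exact kernel.
P = X1[:20]
sq_dists = ((P[:, None, :] - P[None, :, :]) ** 2).sum(axis=-1)
K_exact = np.exp(-sq_dists / (2 * sigma ** 2))
kernel_err = np.abs(K_exact - F1[:20] @ F1[:20].T).mean()

A, B, corrs = linear_cca(F1, F2, k)
print("mean kernel approximation error:", kernel_err)
print("top canonical correlations:", corrs)
```

The sketch forms and factorizes M × M covariance matrices, so its memory cost grows as M² and its time cost as M³. With M on the order of 100,000, as reported in the findings, this direct solve becomes impractical on a single machine, which is the regime the paper's stochastic optimization approach is meant to address.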
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Blind Source Separation Techniques
