Out-of-Sample Embedding with Proximity Data: Projection versus Restricted Reconstruction
Michael W. Trosset, Kaiyi Tan, Minh Tang, Carey E. Priebe

TL;DR
This paper surveys kernel methods for out-of-sample embedding using proximity data, comparing projection and restricted reconstruction strategies, and analyzing their mathematical foundations and practical implications.
Contribution
It clarifies the theoretical basis of out-of-sample embedding methods by categorizing them into projection and restricted reconstruction approaches.
Findings
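Projection is analogous to a well-known formula for adding a point to a principal component analysis.
Restricted reconstruction holds the previously obtained vector diagram fixed and leads to a nonlinear optimization problem.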
Different scenarios favor different embedding strategies.
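To make the projection idea concrete, here is a minimal sketch for the classical multidimensional scaling case, assuming squared Euclidean-like dissimilarities and a Gower-style "add a point" formula. The function names `classical_mds` and `add_point_projection` are illustrative and not taken from the paper, and the paper's own formulation for general kernels may differ.

```python
import numpy as np

def classical_mds(D2, d):
    """Embed n objects in d dimensions from an n x n matrix of squared dissimilarities."""
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ D2 @ J                        # doubly centered inner products
    evals, evecs = np.linalg.eigh(B)             # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:d]            # indices of the d largest
    lam = evals[top]                             # assumed positive here
    X = evecs[:, top] * np.sqrt(lam)             # rows are the embedded points
    return X, lam, np.diag(B)

def add_point_projection(X, lam, b_diag, delta2):
    """Project a new object into the fixed configuration X.

    delta2 holds the squared dissimilarities from the new object to the
    n original objects; b_diag holds the squared distances of the original
    objects from their centroid (the diagonal of B).
    """
    return 0.5 * (X.T @ (b_diag - delta2)) / lam

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = rng.normal(size=(6, 2))                  # synthetic original points
    q = rng.normal(size=2)                       # synthetic new point
    D2 = ((P[:, None, :] - P[None, :, :]) ** 2).sum(-1)
    delta2 = ((P - q) ** 2).sum(1)
    X, lam, b_diag = classical_mds(D2, 2)
    y = add_point_projection(X, lam, b_diag, delta2)
    print("max distance error:", np.abs(((X - y) ** 2).sum(1) - delta2).max())
```

When the dissimilarities are exactly Euclidean in `d` dimensions, the projected point reproduces the given distances to the original configuration; with noisy or non-Euclidean data it is only an approximation.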
Abstract
The problem of using proximity (similarity or dissimilarity) data for the purpose of "adding a point to a vector diagram" was first studied by J.C. Gower in 1968. Since then, a number of methods -- mostly kernel methods -- have been proposed for solving what has come to be called the problem of *out-of-sample embedding*. We survey the various kernel methods that we have encountered and show that each can be derived from one or the other of two competing strategies: *projection* or *restricted reconstruction*. Projection can be analogized to a well-known formula for adding a point to a principal component analysis. Restricted reconstruction poses a different challenge: how to best approximate redoing the entire multivariate analysis while holding fixed the vector diagram that was previously obtained. This strategy results in a nonlinear optimization problem that can be simplified to a…
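As an illustration of restricted reconstruction, the sketch below poses one possible version of the problem for classical multidimensional scaling: the augmented squared dissimilarities are centered with respect to the original objects only, and the new point `y` is chosen to minimize the squared error of the augmented inner-product matrix while the original coordinates `X` stay fixed. This particular objective, the plain backtracking gradient descent, and all function names are assumptions made for illustration; they are not presented as the paper's algorithm.

```python
import numpy as np

def embed_cmds(D2, d):
    """Classical MDS of an n x n matrix of squared dissimilarities into d dimensions."""
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    evals, evecs = np.linalg.eigh(-0.5 * J @ D2 @ J)
    top = np.argsort(evals)[::-1][:d]
    return evecs[:, top] * np.sqrt(evals[top])

def restricted_targets(D2, delta2):
    """Target inner products for the new object, centered on the original objects."""
    g = D2.mean()
    b = -0.5 * (delta2 - delta2.mean() - D2.mean(axis=1) + g)  # targets for x_i . y
    beta = delta2.mean() - 0.5 * g                              # target for y . y
    return b, beta

def objective(y, X, b, beta):
    # The off-diagonal block appears twice in the symmetric matrix, hence the factor 2.
    return 2.0 * np.sum((b - X @ y) ** 2) + (beta - y @ y) ** 2

def gradient(y, X, b, beta):
    return -4.0 * X.T @ (b - X @ y) - 4.0 * (beta - y @ y) * y

def add_point_restricted(X, b, beta, iters=500):
    """Minimize the quartic objective in y by gradient descent with backtracking."""
    y = np.linalg.lstsq(X, b, rcond=None)[0]     # start from the least-squares fit
    for _ in range(iters):
        g = gradient(y, X, b, beta)
        f0 = objective(y, X, b, beta)
        step = 1.0
        while step > 1e-12 and objective(y - step * g, X, b, beta) > f0 - 1e-4 * step * (g @ g):
            step *= 0.5                          # shrink until sufficient decrease
        if step <= 1e-12:
            break                                # no further progress possible
        y = y - step * g
    return y
```

The objective is quartic in `y` and not convex in general, so a local method like this one can depend on its starting point; starting from the least-squares fit is a convenience, not a guarantee of a global minimum.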
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Control Systems and Identification · Face and Expression Recognition
