Bootstrap Confidence Regions for Learned Feature Embeddings

Kris Sankaran

arXiv:2202.00180 · stat.CO · February 2, 2022 · J. Comput. Graph. Stat.

TL;DR

This paper develops bootstrap-based methods to quantify uncertainty in low-dimensional projections of learned feature embeddings from high-dimensional non-matrix data, aiding interpretability.

Contribution

It adapts bootstrap techniques for PCA to learned feature embeddings, providing a new way to assess uncertainty in these projections.

Findings

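01 The derived confidence regions are compared empirically in simulations that vary factors influencing both feature learning and the bootstrap.

02 The approaches are illustrated on spatial proteomic data.

03 Code, data, and trained models are released as an R compendium.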

Abstract

Algorithmic feature learners provide high-dimensional vector representations for non-matrix structured signals, like images, audio, text, and graphs. Low-dimensional projections derived from these representations can be used to explore variation across collections of these data. However, it is not clear how to assess the uncertainty associated with these projections. We adapt methods developed for bootstrapping principal components analysis to the setting where features are learned from non-matrix data. We empirically compare the derived confidence regions in simulations, varying factors that influence both feature learning and the bootstrap. Approaches are illustrated on spatial proteomic data. Code, data, and trained models are released as an R compendium.
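
The bootstrap-and-align idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure or the code in the lfbcr repository: it assumes features have already been extracted into an n x p NumPy array X (the feature learner is treated as fixed and is not retrained on each resample), it uses a plain nonparametric row bootstrap, and it aligns each replicate to the full-data projection with an orthogonal Procrustes rotation. The function names pca_axes, procrustes_rotation, bootstrap_projections, and sample_covariances are hypothetical.

```python
import numpy as np


def pca_axes(X, k=2):
    """Return the top-k principal axes (p x k) of the rows of X."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T


def procrustes_rotation(A, B):
    """Orthogonal R (k x k) minimizing the Frobenius norm of A @ R - B."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt


def bootstrap_projections(X, k=2, n_boot=200, seed=0):
    """Project every sample onto axes refit on each bootstrap resample.

    Assumption: X holds features already produced by some learner;
    only the PCA step is refit here, not the learner itself.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    ref = Xc @ pca_axes(X, k)              # full-data reference projection
    aligned = np.empty((n_boot, n, k))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        scores_b = Xc @ pca_axes(X[idx], k)
        # Rotate/reflect the replicate so it is comparable to the reference.
        aligned[b] = scores_b @ procrustes_rotation(scores_b, ref)
    return ref, aligned


def sample_covariances(aligned):
    """Per-sample k x k covariance of the aligned bootstrap coordinates."""
    n = aligned.shape[1]
    return np.stack([np.cov(aligned[:, i, :], rowvar=False) for i in range(n)])
```

The spread of aligned[:, i, :] around ref[i] gives one rough picture of the uncertainty in sample i's projected position, and the covariance from sample_covariances could be used to draw an ellipse around it. Alignment is needed because principal axes are only determined up to sign and rotation, so unaligned replicates would overstate the variability.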

Code & Models

Repositories

krisrs1128/lfbcr
pytorch · Official

Taxonomy

Topics: Data Analysis with R · Gene expression and cancer classification