Analyzing the Structure of Handwritten Digits: A Comparative Study of PCA, Factor Analysis, and UMAP
Jyotiraditya Gupta

TL;DR
This study compares PCA, Factor Analysis (FA), and UMAP as tools for analyzing the intrinsic low-dimensional structure of handwritten digits in MNIST, and finds that the three methods give complementary views of the digits' geometric and statistical organization.
Contribution
The paper compares three dimensionality reduction methods (PCA, FA, and UMAP) to characterize the latent structure of handwritten digits, rather than to optimize classification accuracy.
Findings
PCA captures the dominant global variance directions and supports high-fidelity reconstructions from a small number of components.
FA identifies interpretable handwriting primitives such as strokes, loops, and symmetry.
UMAP reveals nonlinear manifolds reflecting stylistic transitions.
Abstract
Handwritten digit images lie in a high-dimensional pixel space but exhibit strong geometric and statistical structure. This paper investigates the latent organization of handwritten digits in the MNIST dataset using three complementary dimensionality reduction techniques: Principal Component Analysis (PCA), Factor Analysis (FA), and Uniform Manifold Approximation and Projection (UMAP). Rather than focusing on classification accuracy, we study how each method characterizes intrinsic dimensionality, shared variation, and nonlinear geometry. PCA reveals dominant global variance directions and enables high-fidelity reconstructions using a small number of components. FA decomposes digits into interpretable latent handwriting primitives corresponding to strokes, loops, and symmetry. UMAP uncovers nonlinear manifolds that reflect smooth stylistic transitions between digit classes. Together,…
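The PCA claim in the abstract (high-fidelity reconstruction from a small number of components) can be illustrated with a minimal sketch. This is not the author's code: the helper names `pca_fit`, `pca_reconstruct`, and `relative_error` are hypothetical, and the data is a synthetic low-rank matrix with 784 columns standing in for flattened 28×28 images, not MNIST itself. It assumes only NumPy is available.

```python
import numpy as np


def pca_fit(X, k):
    """Return the data mean, the top-k principal directions (as rows),
    and the fraction of variance each of those directions explains."""
    mu = X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return mu, Vt[:k], explained[:k]


def pca_reconstruct(X, mu, components):
    """Project X onto the given components and map back to input space."""
    Z = (X - mu) @ components.T      # low-dimensional codes
    return mu + Z @ components       # reconstruction in the original space


def relative_error(X, k):
    """Reconstruction error with k components, relative to total variation."""
    mu, comps, _ = pca_fit(X, k)
    X_hat = pca_reconstruct(X, mu, comps)
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X - mu)


# ASSUMPTION: synthetic stand-in data, not MNIST. Each row is a 784-dim
# vector generated from 10 latent factors plus a small amount of noise.
rng = np.random.default_rng(0)
n_samples, n_pixels, n_latent = 500, 784, 10
latent = rng.normal(size=(n_samples, n_latent))
basis = rng.normal(size=(n_latent, n_pixels))
X = latent @ basis + 0.05 * rng.normal(size=(n_samples, n_pixels))

errors = {k: relative_error(X, k) for k in (2, 5, 10, 20)}
for k, err in errors.items():
    print(f"k={k:2d}  relative reconstruction error={err:.3f}")
```

On this synthetic data the error falls sharply once the number of components reaches the true latent dimension. With real digit images the decay would be more gradual, and the exact behavior depends on the dataset and preprocessing.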
Taxonomy
Topics: Handwritten Text Recognition Techniques · Face Recognition and Perception · Aesthetic Perception and Analysis
