Learning Ordered Representations in Latent Space for Intrinsic Dimension Estimation via Principal Component Autoencoder

Qipeng Zhan; Zhuoping Zhou; Zexuan Wang; Li Shen

arXiv:2601.19179·cs.LG·January 28, 2026

Open Access · 4 Reviews

TL;DR

This paper introduces a novel autoencoder framework that combines non-uniform variance regularization with an isometric constraint, effectively generalizing PCA to nonlinear settings for better intrinsic dimension estimation.

Contribution

It proposes a new autoencoder design that preserves PCA-like ordered representations in nonlinear latent spaces, improving nonlinear dimensionality reduction.

Findings

01. Successfully captures ordered principal components in nonlinear autoencoders

02. Enhances intrinsic dimension estimation accuracy

03. Maintains variance retention similar to PCA

Abstract

Autoencoders have long been considered a nonlinear extension of Principal Component Analysis (PCA). Prior studies have demonstrated that linear autoencoders (LAEs) can recover the ordered, axis-aligned principal components of PCA by incorporating non-uniform $ℓ_{2}$ regularization or by adjusting the loss function. However, these approaches become insufficient in the nonlinear setting, as the remaining variance cannot be properly captured independently of the nonlinear mapping. In this work, we propose a novel autoencoder framework that integrates non-uniform variance regularization with an isometric constraint. This design serves as a natural generalization of PCA, enabling the model to preserve key advantages, such as ordered representations and variance retention, while remaining effective for nonlinear dimensionality reduction tasks.
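As a rough illustration of the ingredients named in the abstract, the sketch below evaluates three loss terms in the linear special case: a reconstruction error, a variance penalty whose weights grow with the latent index, and a soft isometry penalty. It then picks a dimension by thresholding cumulative latent variance. The exact form of each term, the names `loss_terms` and `estimate_dim`, the weights `lambdas`, and the 0.95 threshold are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def loss_terms(X, W_enc, W_dec, lambdas):
    """Hypothetical loss terms for a linear encoder/decoder pair (illustrative only)."""
    Z = X @ W_enc.T                      # latent codes, shape (n, k)
    X_hat = Z @ W_dec.T                  # reconstructions, shape (n, d)
    recon = float(np.mean(np.sum((X - X_hat) ** 2, axis=1)))
    # Non-uniform variance penalty: later latent coordinates are penalised more.
    var_pen = float(np.sum(lambdas * Z.var(axis=0)))
    # Soft isometry penalty: zero when the encoder rows are orthonormal.
    k = W_enc.shape[0]
    iso_pen = float(np.sum((W_enc @ W_enc.T - np.eye(k)) ** 2))
    return recon, var_pen, iso_pen

def estimate_dim(latent_var, threshold=0.95):
    """Smallest number of leading coordinates whose variance share reaches threshold."""
    ratios = np.cumsum(latent_var) / np.sum(latent_var)
    return int(np.searchsorted(ratios, threshold) + 1)

# Synthetic data: a 2-D signal embedded in 5-D plus small noise (assumed setup).
rng = np.random.default_rng(0)
n, d = 500, 5
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))          # random orthonormal basis
latent = rng.normal(size=(n, 2)) * np.array([3.0, 2.0])
X = latent @ Q[:, :2].T + 0.01 * rng.normal(size=(n, d))
X = X - X.mean(axis=0)

# Use the PCA directions as a reference encoder (rows sorted by variance).
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
order = np.argsort(eigvals)[::-1]
W = eigvecs[:, order].T                               # shape (d, d)

lambdas = np.linspace(0.0, 1.0, d)                    # increasing penalty weights
recon, var_pen, iso_pen = loss_terms(X, W, W.T, lambdas)
latent_var = (X @ W.T).var(axis=0)
est_dim = estimate_dim(latent_var)
print(recon, var_pen, iso_pen, est_dim)
```

With the PCA directions as the encoder, the isometry term vanishes up to rounding and the latent variances come out in decreasing order, which is the ordered, variance-retaining behaviour the abstract attributes to PCA. A nonlinear model would replace the matrices with networks and would need a different isometry measure, which this sketch does not attempt.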

Peer Reviews

Decision · ICLR 2026 Conference Withdrawn Submission

Reviewer 01 · Rating 2 · Confidence 4

Strengths

1. The studied problem of finding latent components with geometric interpretation is important to the machine learning community.
2. The experimental results look very convincing.

Weaknesses

1. It is difficult to ascertain the correctness of the experiments since the code is not provided in the supplementary material and the experimental setup is not described in sufficient detail (see questions below).
2. The theoretical results and insights are very limited. Moreover, since the isometry is “enforced” via a soft regularization, it is not necessarily true that the resulting encoder will be an isometry. Further, it is not clear why notions such as correlation are useful in a nonlinea

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. The addressed problem is of significance to the community.
2. Good clarity and compactness in context.
3. The method is simple but gives new insights rather than in a direct manner.

Weaknesses

1. The contribution and the key novelty should be further clarified. From the title and most of the presented experiments, it seems that intrinsic dimension estimation is positioned as a main part. However, the proposed algorithm and its mechanisms and technical contributions are mainly enclosed in Sec. 4.2, i.e., PCAE. In Sec. 4.3, the determination of intrinsic dimensions is introduced in short, which is simply taking the conventional technique of choosing a threshold; even it takes the releva

Reviewer 03 · Rating 2 · Confidence 4

Strengths

I found the main, high-level idea of reweighting variances to find intrinsic dimensionality interesting.

Weaknesses

I found the detail of the reasoning of the paper difficult to follow; I put some examples below. The biggest weakness I found was the motivation of Section 4.2. As far as I can tell, the goal is to take the variances of the components found by the PCA and re-weight them with scalars to penalise variances in the later components. But these scalars are completely arbitrary, and have no link to the data at hand. You could put anything you like there, and all it will do is change how much variabili

Reviewer 04 · Rating 2 · Confidence 5

Strengths

1. The paper presents a novel method supported by rigorous theoretical analysis, offering some valuable insights, especially Lemma 2.
2. The paper is well-written with good structure, easy to follow.

Weaknesses

Overall, the authors should conduct experiments more rigorously and exercise caution when making claims. Their reliance on MLE for dimension estimation is the main flaw in their result. Some components of their method are also well-known easy victims of the curse of dimensionality. These combined undermine the soundness of this work.

Major

1. Many experimental results are unsound and accompanied by overly strong claims. Some flaws may even undermine the method’s legitimacy.
   1. PCAE's robu

Taxonomy

Topics: Face and Expression Recognition · Generative Adversarial Networks and Image Synthesis · Tensor decomposition and applications