Disentanglement Learning via Topology
Nikita Balabin, Daria Voronkova, Ilya Trofimov, Evgeny Burnaev, Serguei Barannikov

TL;DR
This paper introduces TopDis, a novel method for learning disentangled data representations by incorporating a differentiable topological loss, which improves disentanglement scores without requiring labeled data or independence of factors.
Contribution
The paper presents the first differentiable topological loss for disentanglement learning, offering a new perspective based on topological properties of data manifolds.
Findings
TopDis improves disentanglement scores such as MIG, FactorVAE score, SAP score, and DCI.
The method preserves reconstruction quality while enhancing disentanglement.
It works in an unsupervised manner and handles correlated factors of variation.
Abstract
We propose TopDis (Topological Disentanglement), a method for learning disentangled representations via adding a multi-scale topological loss term. Disentanglement is a crucial property of data representations substantial for the explainability and robustness of deep learning models and a step towards high-level cognition. The state-of-the-art methods are based on VAE and encourage the joint distribution of latent variables to be factorized. We take a different perspective on disentanglement by analyzing topological properties of data manifolds. In particular, we optimize the topological similarity for data manifolds traversals. To the best of our knowledge, our paper is the first one to propose a differentiable topological loss for disentanglement learning. Our experiments have shown that the proposed TopDis loss improves disentanglement scores such as MIG, FactorVAE score, SAP score,…
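As a rough illustration of the pattern the abstract describes (a base VAE objective plus a weighted topological term), here is a minimal NumPy sketch. It is not the paper's RTD-based loss and it is not differentiable. The toy term compares only 0-dimensional persistence, using the standard fact that the death times of connected components in a Vietoris-Rips filtration equal the edge lengths of a minimum spanning tree. The names `h0_death_times`, `toy_topological_gap`, `total_loss`, and `gamma` are hypothetical, and the two point clouds are assumed to have the same number of points.

```python
import numpy as np

def pairwise_distances(x):
    """Euclidean distance matrix for points of shape (n, d)."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def h0_death_times(points):
    """Sorted MST edge lengths, found with Prim's algorithm.

    These equal the H0 death times of the Vietoris-Rips filtration.
    """
    d = pairwise_distances(points)
    n = len(points)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = d[0].copy()  # cheapest known edge from the tree to each vertex
    edges = []
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        edges.append(candidates[j])
        in_tree[j] = True
        best = np.minimum(best, d[j])
    return np.sort(np.array(edges))

def toy_topological_gap(a, b):
    """Stand-in discrepancy (assumption): L1 gap between sorted H0 death times."""
    return float(np.abs(h0_death_times(a) - h0_death_times(b)).sum())

def total_loss(vae_loss, topo_loss, gamma):
    """Base objective plus a weighted topological regularizer."""
    return vae_loss + gamma * topo_loss

# Two small point clouds: a line segment and the same segment scaled by 2.
cloud = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
scaled = 2.0 * cloud
gap = toy_topological_gap(cloud, scaled)
loss = total_loss(vae_loss=1.0, topo_loss=gap, gamma=0.5)
```

In an actual training loop the topological term would have to be differentiable and computed on batches of data and their traversed counterparts, which this toy version does not attempt.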
Peer Reviews
Decision: ICML 2024 Poster
1. The paper is clearly written and easy to follow. In detail: a. The authors explain the task of disentanglement rather clearly by providing a succinct overview of previous works. b. The motivation and contribution of the paper are also clearly defined with an intuitive explanation of the designed methodology. 2. The authors provide a variety of experiments and ablations, helping to evaluate their proposed disentanglement regularization loss practically. In detail: a. The experiments (Table
1. One of the contributions the authors mention is: “We improve the reconstruction quality by applying gradient orthogonalization;” - however, this contribution is only briefly mentioned in the conclusion and analyzed in greater detail in the Appendix. We suggest that the authors move the gradient orthogonalization part to the main paper. 2. As the authors explained, the RTD was defined in a previous work, but we believe it is important to define it in the main paper. 3. In section 4.1, bulle
1. It is important to explore constraints on the latent-space manifold for disentanglement, given the statistical arguments of Locatello et al. (2019). The paper takes a topological approach and proposes a regularization term that can be easily optimized. 2. The paper provides a good formulation of the TopDis loss and of how to optimize it in the VAE framework.
1. The relation between the constraint on the latent space and disentanglement is still unclear. TopDis is built on the VAE framework, which is probabilistic, whereas the paper refers to a group-based definition of disentanglement. The paper fails to connect these two frameworks, which makes the proposed TopDis only an intuitive necessary condition, as shown in Figure 3. 2. From Appendix L, the best performance hyperparameters are quite different across different methods and d
(1) Inspired by [1], the proposed differentiable Representation Topology Divergence (RTD) as a loss for the VAE-framework looks promising to improve the disentanglement. (2) Rich experiments are conducted to evaluate the performance of the proposed TopDis loss for various VAE-based methods. [1] Barannikov, Serguei, et al. "Representation topology divergence: A method for comparing neural network representations." ICML 2022.
(1) It is unclear how the hyper-parameters in Eqn (4) affect the performance. There are γ_1 and γ_2 in Table 9 (appendix N), but there is only one γ in Eqn (4). (2) In Table 1, it seems that some advanced disentanglement methods performed significantly worse than the vanilla VAE (e.g. FactorVAE on 3dshapes, and β-TCVAE on MPI3D, etc), which casts some doubt on the experimental results and/or the model selection for the baselines. Besides, two important evaluations of VAE+TopDis and β-TCVAE
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Cell Image Analysis Techniques · Topological and Geometric Data Analysis
