Deep Incomplete Multi-view Learning via Cyclic Permutation of VAEs
Xin Gao, Jian Pu

TL;DR
This paper introduces MVP, a novel multi-view learning method using cyclic permutation of VAEs to handle incomplete data, improve view consistency, and enhance multi-view clustering and generation performance.
Contribution
The paper proposes a cyclic permutation-based VAE framework for incomplete multi-view learning, enabling invariant view relationships and missing view inference.
Findings
Outperforms existing methods on seven datasets with various missing ratios.
Effectively infers missing views and improves clustering accuracy.
Enhances multi-view generation quality.
Abstract
Multi-View Representation Learning (MVRL) aims to derive a unified representation from multi-view data by leveraging shared and complementary information across views. However, when views are irregularly missing, the incomplete data can lead to representations that lack sufficiency and consistency. To address this, we propose Multi-View Permutation of Variational Auto-Encoders (MVP), which excavates invariant relationships between views in incomplete data. MVP establishes inter-view correspondences in the latent space of Variational Auto-Encoders, enabling the inference of missing views and the aggregation of more sufficient information. To derive a valid Evidence Lower Bound (ELBO) for learning, we apply permutations to randomly reorder variables for cross-view generation and then partition them by views to maintain invariant meanings under permutations. Additionally, we enhance…
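The abstract's two ideas, cyclically reordering views for cross-view generation and partitioning latent variables by view so that missing views can be inferred, can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the function names, the (source view, target view, dim) array layout, and the simple averaging rule are all assumptions made only for illustration.

```python
import numpy as np


def cyclic_permutations(num_views):
    """Return every cyclic shift of the view indices 0..num_views-1.

    Shift s pairs source view v with target view (v + s) % num_views.
    """
    base = np.arange(num_views)
    return [np.roll(base, -s) for s in range(num_views)]


def aggregate_over_observed(cross_latents, observed):
    """Average per-target latent estimates over the observed source views.

    cross_latents: array of shape (V, V, d); entry [v, l] is a hypothetical
        estimate of view l's latent block produced from source view v.
    observed: boolean mask of length V marking which source views exist.
    Returns an array of shape (V, d), one aggregated block per target view,
    including targets whose own view is missing.
    """
    cross_latents = np.asarray(cross_latents, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    if not observed.any():
        raise ValueError("at least one view must be observed")
    # Drop rows from missing source views, then average the rest.
    return cross_latents[observed].mean(axis=0)


if __name__ == "__main__":
    for perm in cyclic_permutations(3):
        print(perm.tolist())  # [0, 1, 2], then [1, 2, 0], then [2, 0, 1]

    # Toy latents: the estimate of target l from source v is v * 10 + l.
    cross = np.array(
        [[[v * 10 + l] * 2 for l in range(3)] for v in range(3)], dtype=float
    )
    # View 1 is missing; its block is inferred from views 0 and 2.
    print(aggregate_over_observed(cross, [True, False, True])[1])
```

In this sketch, a missing view contributes no source row, yet its target block is still filled in from the views that are present. The paper's actual ELBO, encoders, and aggregation rule are not reproduced here.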
Peer Reviews
Decision: ICLR 2025 Poster
The application of cyclic permutations in the VAE latent space presents a unique and innovative approach to modeling inter-view relationships. This method appears more robust than prior approaches, effectively capturing and maintaining consistency across incomplete multi-view data. MVP is rigorously tested across multiple datasets under varying missing rates, demonstrating strong adaptability and consistently superior performance over previous methods in both partially and fully observed data s
Although the paper centers on incomplete multi-view learning, much of the processing related to incomplete data, such as section C1, is relegated to the appendix. This structure may hinder readability and comprehension. A more detailed description of the incomplete data processing steps in the main text would improve accessibility and clarity for readers. While the proposed method’s effectiveness in handling random arrangements of latent variables is supported by experiments, it also introduces
1. This paper is well-motivated, since missing data is a common and significant problem in real-world multi-view learning. The modelling of the missing-data problem is very well designed, and the method of applying permutations and segmentations in the latent space is very novel.
2. The paper provides a well-structured overview that is easy for the reader to understand. In addition, the implementation details of the selected technology are presented in detail.
3. The paper provides comprehensive e
1. The complexity of the proposed method is not adequately discussed. It would be helpful to compare the computation cost of the proposed method to the baselines.
2. This paper assumes that the first k dimensions of z capture information common to all views, so how is it set on different datasets?
3. This paper only shows the results generated on the PolyMNIST and MVShapeNet datasets, where the views are all RGB type. How are the results generated on datasets with other types of views, such as t
1. Extensive experiments have demonstrated the effectiveness of the methodology designed in the paper.
2. The paper is well-organized.
3. The motivation is clearly described.
1. The use of VAE for invariant feature learning has been extensively studied [1, 2, 3, 4]. Please analyze the differences.
2. The paper does not provide the code, yet the results in Table 2 show significant performance of the proposed method. The authenticity and fairness of the experiments are questionable. Therefore, please release the code during the rebuttal process, as this will be a key criterion in my evaluation.
3. The construction details of the PolyMNIST and MVShapeNet datasets in lin
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Face and Expression Recognition · Advanced Vision and Imaging
