MDL-Pool: Adaptive Multilevel Graph Pooling Based on Minimum Description Length

Jan von Pichowski; Christopher Blöcker; Ingo Scholtes

arXiv:2409.10263·cs.LG·May 16, 2025

Open Access · 3 Reviews

TL;DR

MDL-Pool introduces an adaptive graph pooling method based on the MDL principle, effectively modeling hierarchical interdependencies and selecting optimal pooling depths for improved graph classification.

Contribution

It proposes a novel MDL-based pooling operator that explicitly accounts for hierarchical interdependencies and adapts to varying graph sizes, unlike fixed-depth approaches.

Findings

1. Competitive performance on standard graph classification datasets
2. Effective modeling of hierarchical interdependencies
3. Adaptive pooling depth selection improves accuracy

Abstract

Graph pooling compresses graphs and summarises their topological properties and features in a vectorial representation. It is an essential part of deep graph representation learning and is indispensable in graph-level tasks like classification or regression. Current approaches pool hierarchical structures in graphs by iteratively applying shallow pooling operators up to a fixed depth. However, they disregard the interdependencies between structures at different hierarchical levels and do not adapt to datasets that contain graphs with different sizes that may require pooling with various depths. To address these issues, we propose MDL-Pool, a pooling operator based on the minimum description length (MDL) principle, whose loss formulation explicitly models the interdependencies between different hierarchical levels and facilitates a direct comparison between multiple pooling alternatives…
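The abstract does not spell out the loss, but the reviews below note that it builds on the map equation. As a rough illustration only, the following sketch computes the standard two-level map equation for an undirected graph and uses it to compare pooling alternatives by description length. It is not the authors' multilevel, differentiable formulation; the function names `plogp_sum` and `map_equation`, the dense NumPy implementation, and the toy graph are all assumptions made for this example.

```python
import numpy as np


def plogp_sum(x):
    """Return sum(x * log2(x)) over the positive entries of x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    x = x[x > 0]
    return float(np.sum(x * np.log2(x)))


def map_equation(adj, labels):
    """Two-level map equation codelength (bits) of a hard partition.

    adj    : symmetric (n, n) adjacency matrix of an undirected graph
    labels : length-n array assigning each node to a module
    """
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    total = adj.sum()                      # twice the total edge weight
    p_node = adj.sum(axis=1) / total       # stationary node visit rates
    modules = np.unique(labels)
    # One-hot assignment matrix S with shape (n, m).
    S = (labels[:, None] == modules[None, :]).astype(float)
    flow = S.T @ (adj / total) @ S         # module-to-module flow
    p_mod = flow.sum(axis=1)               # total visit rate per module
    q_exit = p_mod - np.diag(flow)         # flow leaving each module
    return (plogp_sum(q_exit.sum())
            - 2.0 * plogp_sum(q_exit)
            - plogp_sum(p_node)
            + plogp_sum(q_exit + p_mod))


# Toy graph (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

# Candidate pooling alternatives, compared directly by codelength.
candidates = {
    "one module": np.zeros(6, dtype=int),
    "two triangles": np.array([0, 0, 0, 1, 1, 1]),
    "singletons": np.arange(6),
}
codelengths = {name: map_equation(adj, lab) for name, lab in candidates.items()}
best = min(codelengths, key=codelengths.get)
print(best, codelengths)
```

Because every candidate is scored in the same unit (bits), the shortest codelength can be selected without a separate hyperparameter for the number of clusters. The paper's operator presumably applies the same idea to soft assignments and to several hierarchical levels, which this sketch does not attempt.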

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 2 · Confidence 4

Strengths

1. The method automatically determines the optimal pooling depth per graph instance, addressing a long-standing hyperparameter issue in hierarchical pooling.
2. The paper provides experiments on both synthetic and real-world datasets, including ablations on architecture variants and pooling depths.

Weaknesses

Limited performance gain: In Tables 2 and 3, MDL-Pool does not consistently outperform baselines. For community detection, results are comparable or even worse than baselines on several datasets. Similarly, in graph classification, MDL-Pool’s average accuracy is not higher than several baselines, indicating limited empirical advantage.

Insufficient justification of benefits: While the motivation is sound, the claimed benefits (interdependency modeling and adaptive depth) are not strongly suppor

Reviewer 02 · Rating 6 · Confidence 4

Strengths

The map equation is beneficial for observing the overall network structure and for finding relevant clusters. The clustering supports balanced training when fitting the model to downstream graph analytics tasks. MDL is beneficial because it detects the depth of the input graph, which assists effective hierarchical graph learning. The overall results are better than those of the other baselines.

Weaknesses

In the experiments, the authors do not discuss the impact of hyperparameters on the model. The manuscript does not provide runtime details. Is minimum description length feasible on large-volume datasets? The optimization of map equations involves nested matrix operations, which can result in a computationally heavy model. Please compare the model's runtime against simpler pooling operations like Top-kPool and SAGPool. In the case of community detection, the datasets are very sparse. Is t

Reviewer 03 · Rating 4 · Confidence 3

Strengths

- The integration of the MDL principle and map equation into deep graph pooling is well-motivated. It provides a principled way to address overfitting and model complexity while enhancing interpretability.
- The proposed multilevel loss seamlessly integrates hierarchical information, overcoming optimization issues caused by layer-wise independence in stacked pooling.
- The MDL framework naturally implements Occam’s razor, removing the need for hyperparameter tuning for cluster count or levels.

Weaknesses

- The MDL-based loss focuses on topological structure and does not fully leverage node features in evaluating community quality, which might reduce performance on feature-dominant tasks.
- Experiments show most graphs select only one or two pooling levels; it remains unclear whether MDL-Pool is beneficial in tasks with truly deep hierarchies.
- The computation of multilevel flow matrices has quadratic cost in graph size, which may hinder scalability to very large graphs. No experiments on large-

Taxonomy

Topics: Semantic Web and Ontologies · Advanced Graph Neural Networks · Natural Language Processing Techniques

Methods: Minimum Description Length