Neural Spacetimes for DAG Representation Learning
Haitz Sáez de Ocáriz Borde, Anastasis Kratsios, Marc T. Law, Xiaowen Dong, Michael Bronstein

TL;DR
This paper introduces Neural Spacetimes, a trainable geometric framework for representing DAGs as events in a spacetime manifold, capturing both edge weights and causality with theoretical guarantees and improved embedding quality.
Contribution
The paper presents Neural Spacetimes, a novel differentiable geometry that adaptively encodes DAG structure and causality, with a universal embedding theorem and practical advantages over fixed geometries.
Findings
Achieves lower embedding distortions than fixed spacetime models.
Universal embedding with logarithmic distortion for any DAG.
Efficient parameterization with sub-cubic complexity in DAG size.
Abstract
We propose a class of trainable deep learning-based geometries called Neural Spacetimes (NSTs), which can universally represent nodes in weighted directed acyclic graphs (DAGs) as events in a spacetime manifold. While most works in the literature focus on undirected graph representation learning or causality embedding separately, our differentiable geometry can encode both graph edge weights in its spatial dimensions and causality in the form of edge directionality in its temporal dimensions. We use a product manifold that combines a quasi-metric (for space) and a partial order (for time). NSTs are implemented as three neural networks trained in an end-to-end manner: an embedding network, which learns to optimize the location of nodes as events in the spacetime manifold, and two other networks that optimize the space and time geometries in parallel, which we call a neural (quasi-)metric…
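As a rough illustration of the space/time split described in the abstract, the sketch below represents each DAG node as an event with separate space and time coordinates. It is not the authors' implementation. The toy graph, the hand-picked coordinates, and the helper names (`precedes`, `space_distance`, `max_distortion`) are all hypothetical. A plain L1 distance stands in for the learned neural (quasi-)metric, and a coordinate-wise comparison stands in for the learned partial order.

```python
# Hypothetical toy DAG: weighted directed edges (u, v) -> weight.
edges = {("a", "b"): 1.0, ("b", "c"): 2.0, ("a", "c"): 3.0}

# Hypothetical hand-picked events: (space coords, time coords) per node.
# In the paper these locations are learned by an embedding network;
# here they are fixed by hand purely for illustration.
events = {
    "a": ((0.0,), (0.0, 0.0)),
    "b": ((1.0,), (1.0, 0.5)),
    "c": ((3.0,), (2.0, 1.0)),
}

def precedes(u, v):
    """Coordinate-wise partial order on the time dimensions (assumed form):
    u precedes v iff every time coordinate of u is <= v's and they differ."""
    tu, tv = events[u][1], events[v][1]
    return tu != tv and all(x <= y for x, y in zip(tu, tv))

def space_distance(u, v):
    """Stand-in for a learned quasi-metric: L1 distance on the space coords.
    (Any metric is also a quasi-metric; a learned one may be asymmetric.)"""
    su, sv = events[u][0], events[v][0]
    return sum(abs(x - y) for x, y in zip(su, sv))

def max_distortion():
    """Worst-case ratio between embedded and true edge weights (1.0 = exact)."""
    ratios = [space_distance(u, v) / w for (u, v), w in edges.items()]
    return max(max(r, 1.0 / r) for r in ratios)

# Every edge u -> v should be causally ordered forward and never backward.
directions_ok = all(precedes(u, v) and not precedes(v, u) for u, v in edges)
```

In this toy setup the time coordinates alone decide edge direction and the space coordinates alone decide edge weight, which is the division of labor the abstract attributes to the temporal and spatial dimensions.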
Peer Reviews
Decision·ICLR 2025 Poster
1. The problem tackled in this paper, i.e., representation learning in DAGs, is challenging and significant in the graph machine learning community. 2. Theoretical guarantees of the proposed approach are provided in the manuscript.
Despite the merits of the paper, I have the following concerns. 1. Most results presented in this paper pertain to distortion and directionality, which demonstrate that NSTs perform robustly when evaluated by these two measures. However, there are no concrete results showing the performance of NSTs in real-world learning tasks, e.g., (semi-supervised/unsupervised) node classification, link prediction, and graph classification in synthetic DAGs, real-world DAGs (e.g., web page hyperlink graphs an
The paper offers originality within its domain by constructing a novel architecture capable of embedding a DAG into a pseudo-Riemannian manifold which has several spatial and temporal components. In particular, adding the capability of embedding a given DAG into more than one temporal component is a novel contribution and allows the network to represent and preserve more complex causal structure from the DAG. Furthermore, the authors modify the neural snowflake model for use as a learnable quasi
The paper would benefit from further experimentation, e.g., showcasing the results of the network on various distributions of DAGs and being explicit about any potential weaknesses. It would be nice to know where this architecture breaks down or how robust it is to perturbations in the assumptions. The language in lines 420-422 becomes imprecise and can lead to confusion. I would suggest sticking to a single convention here when using variables in a description. Similarly when the switch is ma
* The paper is well-organized with a clear problem statement, goal (Section 1), and proposed solution (Section 3). * The theoretical framework in Section 3 decouples spatial and temporal components while maintaining their interaction through the embedding network. * The theoretical development in Section 3.1 establishes embedding guarantees through Theorem 1, showing NSTs can universally embed k-point DAGs. * The experimental validation in Section 4 demonstrates clear improvements over basel
* In Section 3.1, while the paper proves the existence of neural spacetime embeddings with low distortion, it provides limited practical guidance on selecting the number of temporal dimensions. This could be important for complex DAGs where the minimal sufficient dimension is not immediately clear. * The optimization approach in Section 3.2 focuses on preserving local structure during training. While the authors provide theoretical guarantees about global structure preservation, the empirical validat
Taxonomy
Topics: Neural Networks and Applications
Methods: Focus
