FACTS: A Factored State-Space Framework For World Modelling

Li Nanbo; Firas Laakom; Yucheng Xu; Wenyi Wang; Jürgen Schmidhuber

arXiv:2410.20922 · cs.AI · March 3, 2025

TL;DR

The paper introduces FACTS, a recurrent framework for spatial-temporal world modelling that captures spatial and temporal dependencies in long, high-dimensional sequences, supports parallel computation, and matches or outperforms state-of-the-art models across diverse tasks.

Contribution

FACTS is a new graph-structured memory framework with permutation invariance and parallel processing, advancing state-space models for world modelling.

Findings

01 Outperforms or matches state-of-the-art models in diverse tasks

02 Supports parallel computation for high-dimensional sequences

03 Provides permutation-invariant and adaptable memory representations
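The second finding can be illustrated with a generic example. The sketch below is not taken from the FACTS repository; it only shows why a linear, input-gated recurrence of the form h_t = a_t * h_{t-1} + b_t can be evaluated for all time steps at once rather than step by step. The function names, the shapes, and the cumulative-product closed form (a simple stand-in for an associative scan, and numerically fragile when the gates are close to zero) are all assumptions made for illustration.

```python
import numpy as np

def sequential_scan(a, b, h0):
    """Reference loop: h_t = a_t * h_{t-1} + b_t, one step at a time."""
    h = h0
    out = []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return np.stack(out)

def parallel_scan(a, b, h0):
    """Same recurrence in closed form, computed for all t at once.

    With A_t = a_1 * ... * a_t, unrolling the recurrence gives
    h_t = A_t * (h0 + sum_{s<=t} b_s / A_s).
    Illustrative only: dividing by A_s is unstable for very small gates.
    """
    A = np.cumprod(a, axis=0)
    return A * (h0 + np.cumsum(b / A, axis=0))

# Hypothetical sizes: 16 time steps, 4 state dimensions.
rng = np.random.default_rng(0)
a = rng.uniform(0.5, 1.0, size=(16, 4))  # per-step decay gates
b = rng.normal(size=(16, 4))             # per-step inputs
h0 = rng.normal(size=4)                  # initial state
same = np.allclose(sequential_scan(a, b, h0), parallel_scan(a, b, h0))
```

Both functions return an array of shape (16, 4), and `same` records whether the two evaluations agree. The parallel form relies on the update being linear in the previous state.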

Abstract

World modelling is essential for understanding and predicting the dynamics of complex systems by learning both spatial and temporal dependencies. However, current frameworks, such as Transformers and selective state-space models like Mambas, exhibit limitations in efficiently encoding spatial and temporal structures, particularly in scenarios requiring long-term high-dimensional sequence modelling. To address these issues, we propose a novel recurrent framework, the FACTored State-space (FACTS) model, for spatial-temporal world modelling. The FACTS framework constructs a graph-structured memory with a routing mechanism that learns permutable memory representations, ensuring invariance to input permutations while adapting through selective state-space propagation. Furthermore, FACTS supports parallel computation of high-dimensional sequences. We empirically…
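To make the routing idea in the abstract concrete, here is a minimal sketch, assuming a plain cross-attention from memory factors to input features followed by an input-dependent gated update. It is not the authors' formulation: the function names, the parameter names (`Wq`, `Wk`, `Wv`, `Wg`), the softmax over inputs, and the sigmoid gate are all hypothetical choices. Because this sketch uses the current memory as attention queries, it runs sequentially and does not show the parallel computation the abstract mentions. Its only purpose is to show how pooling over input features makes the memory update invariant to the order of those features.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def route(memory, inputs, Wq, Wk, Wv):
    """Route m input features to k memory factors via cross-attention.

    memory: (k, d) latent factors, used as queries.
    inputs: (m, d) unordered input features, used as keys and values.
    """
    q = memory @ Wq
    keys = inputs @ Wk
    vals = inputs @ Wv
    # Each factor (row) distributes its attention over the m inputs.
    attn = softmax(q @ keys.T / np.sqrt(q.shape[-1]), axis=-1)
    # The weighted sum over inputs does not depend on their order.
    return attn @ vals

def step(memory, inputs, params):
    """One memory update with an input-dependent (selective) gate."""
    u = route(memory, inputs, params["Wq"], params["Wk"], params["Wv"])
    gate = 1.0 / (1.0 + np.exp(-(u @ params["Wg"])))  # values in (0, 1)
    return gate * memory + (1.0 - gate) * u

# Hypothetical sizes: 3 memory factors, 5 input features, width 8.
rng = np.random.default_rng(0)
k, m, d = 3, 5, 8
params = {n: rng.normal(size=(d, d)) / np.sqrt(d) for n in ("Wq", "Wk", "Wv", "Wg")}
memory = rng.normal(size=(k, d))
inputs = rng.normal(size=(m, d))
shuffled = inputs[rng.permutation(m)]
invariant = np.allclose(step(memory, inputs, params), step(memory, shuffled, params))
```

Shuffling the rows of `inputs` permutes the keys and values together, so the attention-weighted sum, and with it the updated memory, is unchanged up to floating-point error. The flag `invariant` records that check.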

Peer Reviews

Decision: ICLR 2025 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

- The proposed architecture introduces a permutable memory structure, allowing flexible handling of unordered or dynamically changing inputs. The paper achieves improved performance over baselines by compressing history efficiently and hence capturing long-term dependencies.
- The paper is easy to read and comprehend.
- The results shown on long-term forecasting are interesting and help the reviewer better understand the implications of the proposed work (especially forecasting with pre-d

Weaknesses

- Object-centric video modelling results are a bit weak. It would be interesting to also report results on OBJ3D (another benchmark used in the SlotFormer paper). It would also be helpful to report downstream results such as predictive VQA on CLEVRER and Physion (similar to the experiments in the SlotFormer paper).

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1. The paper addresses the critical challenge of input feature variance, an interesting issue in spatial-temporal learning, by introducing a novel method that utilizes a memory-input routing mechanism. This approach effectively manages the dynamic relationships between input features, ensuring robust modeling even when input orders change.
2. The proposed FACTS model is both simple and highly effective due to its memory-input routing mechanism, which dynamically assigns input features to latent

Weaknesses

For the slot dynamics prediction experiment, the method proposed in the paper relies on a pre-trained encoder and is not end-to-end, which may limit its applicability.

Reviewer 03 · Rating 6 · Confidence 4

Strengths

- The paper is well-written and easy to follow.
- Modular recurrent architectures are well studied for world modelling in recent literature. This paper introduces modularity into SSMs while maintaining their parallel processing capabilities, so the approach seems promising in terms of efficiency.

Weaknesses

- One integral component of the model is the attention mechanism, which assigns input nodes to latent factors. I believe that similar attention mechanisms, for tasks similar to the ones studied in this paper, have already been explored in various past works [1, 2, 3]. I wonder if the authors could present a comparison of their method to these approaches, or at least highlight the differences. Specifically, [2] proposes to also incorporate modularity and factorization into SSMs, it

Code & Models

Repositories

nanboli/facts
PyTorch · Official

Videos

FACTS: A Factored State-Space Framework for World Modelling · SlidesLive

Taxonomy

Topics: Modeling and Simulation Systems · Simulation Techniques and Applications