How to Square Tensor Networks and Circuits Without Squaring Them

Lorenzo Loconte; Adrián Javaloy; Antonio Vergari

arXiv:2512.17090·cs.LG·December 22, 2025

Open Access · 3 Reviews

TL;DR

This paper introduces a novel parameterization of squared tensor networks and circuits that simplifies marginalization, enabling efficient distribution estimation without losing expressiveness.

Contribution

It proposes a new way to parameterize squared circuits inspired by canonical forms, reducing computational complexity in marginalization tasks.

Findings

01

Efficient marginalization achieved in squared circuits.

02

No loss of expressiveness in distribution estimation.

03

Improved learning efficiency demonstrated in experiments.

Abstract

Squared tensor networks (TNs) and their extension as computational graphs--squared circuits--have been used as expressive distribution estimators, yet supporting closed-form marginalization. However, the squaring operation introduces additional complexity when computing the partition function or marginalizing variables, which hinders their applicability in ML. To solve this issue, canonical forms of TNs are parameterized via unitary matrices to simplify the computation of marginals. However, these canonical forms do not apply to circuits, as they can represent factorizations that do not directly map to a known TN. Inspired by the ideas of orthogonality in canonical forms and determinism in circuits enabling tractable maximization, we show how to parameterize squared circuits to overcome their marginalization overhead. Our parameterizations unlock efficient marginalization even in…
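
The canonical-form idea mentioned in the abstract can be illustrated on the simplest squared tensor network, a matrix product state (MPS) whose squared amplitude defines the density. The following is a minimal NumPy sketch under stated assumptions, not the paper's circuit parameterization: all function names are hypothetical, the variables are binary, and each core is made an isometry via a QR decomposition (left-canonical form). Under those assumptions the partition function equals 1 without contracting the doubled network, and marginalizing a prefix of the variables touches only the remaining cores.

```python
import itertools
import numpy as np


def random_left_canonical_mps(n_vars, dim=2, bond=2, seed=0):
    """Random MPS whose cores, reshaped to (r_in*dim, r_out), are isometries.

    Assumes r_in * dim >= r_out at every site so that an isometry exists.
    """
    rng = np.random.default_rng(seed)
    bonds = [1] + [bond] * (n_vars - 1) + [1]
    cores = []
    for i in range(n_vars):
        r_in, r_out = bonds[i], bonds[i + 1]
        # Reduced QR yields Q with orthonormal columns: Q.T @ Q = I.
        q, _ = np.linalg.qr(rng.standard_normal((r_in * dim, r_out)))
        cores.append(q.reshape(r_in, dim, r_out))
    return cores


def amplitude(cores, x):
    """psi(x): product of the core slices selected by the assignment x."""
    v = np.ones((1,))
    for core, xi in zip(cores, x):
        v = v @ core[:, xi, :]
    return v.item()


def partition_function_doubled(cores):
    """Z = sum_x psi(x)^2, computed by contracting the network with a copy of itself."""
    env = np.ones((1, 1))
    for core in cores:
        env = np.einsum("ab,axc,bxd->cd", env, core, core)
    return env.item()


def suffix_marginal(cores, k, x_suffix):
    """Unnormalized marginal of variables k..n-1 with variables 0..k-1 summed out.

    In left-canonical form the summed-out prefix contracts to the identity,
    so only the suffix cores are evaluated.
    """
    m = np.eye(cores[k].shape[0])
    for core, xi in zip(cores[k:], x_suffix):
        m = m @ core[:, xi, :]
    return float(np.sum(m ** 2))


if __name__ == "__main__":
    cores = random_left_canonical_mps(n_vars=4)
    print("Z via doubled contraction:", partition_function_doubled(cores))
    print("marginal of last two variables at (1, 0):", suffix_marginal(cores, 2, (1, 0)))
```

With isometric cores, the doubled contraction should agree with 1 up to floating-point error, so in practice it can be skipped. The sketch only covers a chain-shaped tensor network; per the abstract, extending this kind of saving to circuits that do not map to a known TN is what the paper addresses.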

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 3

Strengths

1. Very well-written paper
2. Novel connection between tensor networks and probabilistic circuits. The paper conceptually unifies canonical forms in tensor networks with determinism in probabilistic circuits and introduces orthogonality as a more general tool for tractable inference.
3. Theoretical contributions with clear practical implications. Orthogonality and unitarity are shown to guarantee O(|c|) marginalization in squared circuits instead of O(|c|²), and results extend to non-structur

Weaknesses

1. Experiments cover MNIST-style tabular/image datasets; evaluation on more complex tasks (e.g., high-dimensional continuous density estimation, conditional queries, or sampling quality metrics) would strengthen the real-world significance.
2. The core intuition behind orthogonality vs determinism and its practical implications could be communicated more clearly for a broader audience.
3. Orthogonality/unitary constraints often complicate optimization; although addressed here, more ablation on

Reviewer 02 · Rating 8 · Confidence 4

Strengths

- Interesting discussion, connecting topics studied in different communities
- Well-written
- Promising approach to learning high-dimensional probability distributions with tractable marginalization

Weaknesses

- Empirical analysis is very preliminary
- Very long text (including appendices), difficult to revise given time constraints of a conference

Reviewer 03 · Rating 6 · Confidence 2

Strengths

+ Provides several novel, previously unknown results in the context of probabilistic circuits, i.e. a new class of circuits that is expressive and tractable
+ The connection between determinism and orthogonality seems to be unique and perhaps will offer other research directions
+ Guarantees on inference complexity with unitary circuits being in normalized form
+ Paper is high on rigor with proofs for all the key results

Weaknesses

- While the paper makes a strong contribution in probabilistic circuits with the introduction of unitary circuit learning and inference, it does not show why unitary circuits are better than existing tractable probabilistic models. The baseline comparison is with variants of squared PCs, but perhaps the benefits of squared PCs over other approaches are not as clear. Maybe this is an empirical aspect that seems missing in the paper.
- The choice of experiments and benchmarks was not so clear (MNIS

Taxonomy

Topics: Tensor decomposition and applications · Advanced Graph Neural Networks · Generative Adversarial Networks and Image Synthesis