The Expressivity Boundary of Probabilistic Circuits: A Comparison with Large Language Models
Zhiyu Zhao, Xuejie Liu, Muhan Zhang, Anji Liu

TL;DR
This paper compares probabilistic circuits and large language models, revealing expressivity limitations of PCs and proposing logit-space parameterization to improve their language modeling capabilities.
Contribution
It introduces a unified autoregressive formulation for PCs and LLMs, analyzes their expressivity limits, and demonstrates how logit-space parameterization narrows the gap.
Findings
Logit-space parameterization improves PC expressivity in language modeling.
Structured-decomposable PCs can match Transformer separation rank, but only on vtree-aligned partitions.
Decomposable PCs are more expressive than structured-decomposable ones, but are harder to optimize.
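
The separation rank named in the findings can be made concrete on a toy example. The sketch below is not taken from the paper: it assumes the standard definition (the smallest number of product terms needed to write a function across a partition of its variables), which for a finite joint table equals the matrix rank of the table reshaped with one side of the partition as rows and the other as columns. The helper `separation_rank` and the four-variable toy distribution are hypothetical, chosen only to show that the same distribution can have a low rank on one partition and a high rank on another.

```python
import numpy as np

def separation_rank(joint, part_a, tol=1e-10):
    """Rank of the joint table matricized as (variables in part_a) x (the rest).

    Hypothetical helper for illustration; `joint` has one axis per variable.
    """
    part_a = list(part_a)
    part_b = [i for i in range(joint.ndim) if i not in part_a]
    # Bring the part_a axes to the front, then flatten each side into one axis.
    moved = np.transpose(joint, part_a + part_b)
    rows = int(np.prod([joint.shape[i] for i in part_a]))
    return int(np.linalg.matrix_rank(moved.reshape(rows, -1), tol=tol))

# A correlated pair of binary variables (a full-rank 2x2 table).
pair = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

# Toy joint over (x0, x1, x2, x3) = p(x0, x2) * p(x1, x3):
# x0 is coupled with x2, and x1 is coupled with x3.
joint = np.einsum("ac,bd->abcd", pair, pair)

# Partition that keeps coupled variables together: the table factorizes.
rank_together = separation_rank(joint, [0, 2])
# Partition that cuts through both couplings: the rank is maximal.
rank_split = separation_rank(joint, [0, 1])

print(rank_together, rank_split)
```

Here the partition {x0, x2} versus {x1, x3} gives rank 1, while {x0, x1} versus {x2, x3} gives rank 4, the largest possible for a 4 x 4 matricization. How this quantity behaves for PCs and Transformers is the subject of the paper's analysis; the snippet only illustrates the measurement itself.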
Abstract
Probabilistic Circuits (PCs) are deep generative models that support exact and efficient probabilistic inference. Yet in autoregressive language modeling, PCs still lag behind Transformer-based large language models (LLMs), suggesting an important expressivity gap. In this work, we compare PCs and LLMs under a unified autoregressive formulation and identify two bottlenecks. First, an output bottleneck: PCs parameterize predictions as convex combinations in probability space, which struggle to represent the sharp distributions typical of language; adopting a logit-space parameterization substantially narrows this gap. Second, a context-encoding bottleneck: we prove that structured-decomposable PCs can match Transformer separation rank on vtree-aligned partitions, but show, both theoretically and empirically, that this capacity is limited to partitions aligned with the fixed routing structure, leading to severe…
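
The output bottleneck described in the abstract can be illustrated with a small numerical sketch. It is not the paper's implementation. It assumes that "probability space" means a sum node mixing normalized next-token distributions with weights on the simplex, and that "logit space" means combining unnormalized scores with unconstrained weights and normalizing once at the end. The function names and the toy components are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def prob_space_mixture(component_probs, weights):
    # Convex combination of already-normalized distributions.
    # Each entry is a weighted average, so the result can never be
    # more peaked than the most peaked component.
    return weights @ component_probs

def logit_space_mixture(component_logits, weights):
    # Assumed logit-space variant: combine scores, then normalize once.
    # Weights are not restricted to the simplex here.
    return softmax(weights @ component_logits)

vocab, k = 8, 3
logits = np.zeros((k, vocab))
logits[:, 3] = 2.0           # every component mildly prefers token 3
for i in range(k):
    logits[i, i] = 1.0       # each component has a different distractor

probs = np.array([softmax(row) for row in logits])

p_mix = prob_space_mixture(probs, np.full(k, 1.0 / k))
q_mix = logit_space_mixture(logits, np.ones(k))

print("probability-space mass on token 3:", round(float(p_mix[3]), 3))  # about 0.46
print("logit-space mass on token 3:", round(float(q_mix[3]), 3))        # about 0.97
```

In this toy setting every component agrees on the preferred token, yet the convex combination cannot place more mass on it than any single component does, while the summed logits concentrate almost all of the mass there. The sharpening comes from the unconstrained weights: with the same simplex weights applied to the logits, the result would be no sharper than the components.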
