Low-Rank Constraints for Fast Inference in Structured Models
Justin T. Chiu, Yuntian Deng, Alexander M. Rush

TL;DR
This paper introduces a low-rank constraint that reduces the time and memory cost of inference in structured probabilistic models, while keeping accuracy comparable to standard models at large state spaces.
Contribution
It proposes a simple low-rank constraint on the central inference step, viewed as a matrix-vector product, so that the rank controls the trade-off between model expressivity and speed. The approach applies to a large class of structured models.
Findings
Speeds up inference, which otherwise takes time and space quadratic in the number of hidden states for HMMs and cubic for PCFGs.
Maintains accuracy comparable to standard models.
Applicable to language modeling, polyphonic music modeling, grammar induction, and video modeling.
Abstract
Structured distributions, i.e. distributions over combinatorial spaces, are commonly used to learn latent probabilistic representations from observed data. However, scaling these models is bottlenecked by the high computational and memory complexity with respect to the size of the latent representations. Common models such as Hidden Markov Models (HMMs) and Probabilistic Context-Free Grammars (PCFGs) require time and space quadratic and cubic in the number of hidden states respectively. This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models. We show that by viewing the central inference step as a matrix-vector product and using a low-rank constraint, we can trade off model expressivity and speed via the rank. Experiments with neural parameterized structured models for language modeling, polyphonic music modeling,…
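The matrix-vector view in the abstract can be illustrated with a single HMM forward step. The sketch below is not the authors' implementation: it assumes, for illustration only, that the N x N transition matrix factors as A = U W, with U (N x R) and W (R x N) both row-stochastic so that A remains a valid transition matrix. The names `forward_dense`, `forward_low_rank`, `U`, `W`, and `emit` are hypothetical. Multiplying by U and then by W costs O(N R) per step instead of O(N^2), which is the sense in which the rank R trades expressivity for speed.

```python
import random


def normalize(row):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(row)
    return [x / total for x in row]


def random_stochastic(rows, cols, rng):
    """Random matrix whose rows are probability distributions."""
    return [normalize([rng.random() for _ in range(cols)]) for _ in range(rows)]


def vec_mat(vec, mat):
    """Row vector times matrix: out[j] = sum_i vec[i] * mat[i][j]."""
    cols = len(mat[0])
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(cols)]


def mat_mul(a, b):
    """Dense matrix product, used here only to build A for comparison."""
    return [vec_mat(row, b) for row in a]


def forward_dense(alpha, A, emit):
    """One forward step with the full N x N transition matrix: O(N^2)."""
    pred = vec_mat(alpha, A)
    return [p * e for p, e in zip(pred, emit)]


def forward_low_rank(alpha, U, W, emit):
    """Same step with A = U W (assumed factorization), never forming A: O(N R)."""
    compressed = vec_mat(alpha, U)  # length R
    pred = vec_mat(compressed, W)   # length N
    return [p * e for p, e in zip(pred, emit)]


rng = random.Random(0)
N, R = 8, 3  # toy sizes; the saving matters when R is much smaller than N
U = random_stochastic(N, R, rng)
W = random_stochastic(R, N, rng)
A = mat_mul(U, W)  # rank at most R, rows still sum to 1

alpha = normalize([rng.random() for _ in range(N)])  # current state beliefs
emit = [rng.random() for _ in range(N)]              # emission likelihoods

dense = forward_dense(alpha, A, emit)
low_rank = forward_low_rank(alpha, U, W, emit)
max_gap = max(abs(d - l) for d, l in zip(dense, low_rank))
```

Both functions compute the same quantity up to floating-point rounding; only the order of multiplication differs. A model trained with such a constraint never needs to materialize A at all.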
