Morello: Compiling Fast Neural Networks with Dynamic Programming and Spatial Compression

Samuel J. Kaufman, René Just, and Rastislav Bodik

arXiv:2505.01637 · cs.PL · May 6, 2025

TL;DR

This paper introduces Morello, a compiler that uses dynamic programming and spatial compression to explore a large search space of optimized neural network programs, producing high-throughput inference code for x86 CPUs.

Contribution

It presents a novel dynamic programming approach with a memory-efficient memoization technique for neural network program optimization, enabling exploration of larger search spaces than prior methods.
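The memory-saving idea behind the memoization table can be illustrated with a small sketch. The `RunTable` class below is hypothetical and one-dimensional; it is not Morello's data structure, only an assumption-laden illustration of storing identical, adjacent solutions once when keys are nonnegative integer coordinates.

```python
import bisect


class RunTable:
    """Hypothetical 1-D memo table: equal values at adjacent coordinates share one run."""

    def __init__(self):
        self.starts = []  # sorted first coordinate of each run
        self.ends = []    # inclusive last coordinate of each run
        self.values = []  # value stored for each run

    def get(self, x):
        # Find the last run starting at or before x, then check that it covers x.
        i = bisect.bisect_right(self.starts, x) - 1
        if i >= 0 and x <= self.ends[i]:
            return self.values[i]
        return None

    def put(self, x, value):
        """Store a value at an empty coordinate, merging with equal neighbours."""
        assert self.get(x) is None, "coordinate already filled"
        i = bisect.bisect_right(self.starts, x)  # index where a new run would go
        left = i > 0 and self.ends[i - 1] == x - 1 and self.values[i - 1] == value
        right = (i < len(self.starts) and self.starts[i] == x + 1
                 and self.values[i] == value)
        if left and right:
            # x bridges two runs with the same value: fuse them into one.
            self.ends[i - 1] = self.ends[i]
            del self.starts[i], self.ends[i], self.values[i]
        elif left:
            self.ends[i - 1] = x
        elif right:
            self.starts[i] = x
        else:
            self.starts.insert(i, x)
            self.ends.insert(i, x)
            self.values.insert(i, value)
```

In this sketch, a table in which many neighbouring specifications share the same solution needs one entry per run rather than one per coordinate. How the actual system extends this to multi-dimensional coordinates is not described in this summary.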

Findings

1. Morello successfully synthesized high-throughput matrix multiplication programs.
2. An affine cost model effectively guides program selection for optimization.
3. The approach enables practical deployment of optimized neural network kernels.

Abstract

High-throughput neural network inference requires coordinating many optimization decisions, including parallel tiling, microkernel selection, and data layout. The product of these decisions forms a search space of programs which is typically intractably large. Existing approaches (e.g., auto-schedulers) often address this problem by sampling this space heuristically. In contrast, we introduce a dynamic-programming-based approach to explore more of the search space by iteratively decomposing large program specifications into smaller specifications reachable from a set of rewrites, then composing a final program from each rewrite that minimizes an affine cost model. To reduce memory requirements, we employ a novel memoization table representation, which indexes specifications by coordinates in $\mathbb{Z}_{\geq 0}$ and compresses identical, adjacent solutions. This approach can visit a much larger…
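The decomposition described in the abstract can be sketched as a memoized top-down search. Everything in the example below is an assumption made for illustration: the specification is reduced to a matrix multiplication shape `(m, k, n)`, the only rewrite is tiling one dimension into equal pieces, and `KERNEL_COSTS` and `LOOP_OVERHEAD` are made-up numbers. The toy cost is affine in the cost of the sub-program; the paper's actual rewrites and cost model are not shown in this summary.

```python
from functools import lru_cache

LOOP_OVERHEAD = 1  # hypothetical fixed cost added per loop iteration
KERNEL_COSTS = {   # hypothetical microkernels: shape -> cost
    (1, 1, 1): 4,
    (4, 4, 4): 40,
}


@lru_cache(maxsize=None)  # memoization: each spec is solved once
def best(m, k, n):
    """Return (cost, plan) for the cheapest program found for an m x k x n matmul."""
    candidates = []
    # Leaf rewrite: a microkernel that implements this spec directly.
    if (m, k, n) in KERNEL_COSTS:
        candidates.append((KERNEL_COSTS[(m, k, n)], f"kernel{(m, k, n)}"))
    # Tiling rewrite: loop over equal tiles of one dimension.
    for axis, size in enumerate((m, k, n)):
        for tile in range(1, size):
            if size % tile:
                continue  # only consider tiles that divide the dimension evenly
            sub = [m, k, n]
            sub[axis] = tile
            sub_cost, sub_plan = best(*sub)  # optimal solution of the smaller spec
            iters = size // tile
            cost = iters * (sub_cost + LOOP_OVERHEAD)  # affine in sub_cost
            candidates.append((cost, f"loop(axis={axis}, tile={tile}) -> {sub_plan}"))
    return min(candidates)  # lowest cost wins; ties broken by plan string
```

Calling `best(8, 4, 4)` in this toy setting explores every tiling of every dimension, reusing cached sub-results, rather than sampling a few schedules heuristically.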

Code & Models

Repositories

samkaufman/morello (Official)

Taxonomy

Topics: Neural Networks and Applications

Methods: Sparse Evolutionary Training