Compiling ONNX Neural Network Models Using MLIR
Tian Jin, Gheorghe-Teodor Bercea, Tung D. Le, Tong Chen, Gong Su, Haruki Imai, Yasushi Negishi, Anh Leu, Kevin O'Brien, Kiyokuni Kawachiya, and Alexandre E. Eichenberger

TL;DR
This paper introduces onnx-mlir, an open-source compiler built on the MLIR infrastructure that translates ONNX models into optimized inference code for diverse deployment environments.
Contribution
The paper presents an MLIR-based compiler that adds two new dialects, one representing ONNX operations and one loop-based, through which models are compiled and optimized for inference.
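As a rough illustration of what lowering between a tensor-level dialect and a loop-level dialect means, the sketch below rewrites one high-level operation into an explicit loop nest. It is a toy in plain Python, not onnx-mlir's API or MLIR syntax; the names TensorAdd, lower_to_loops, and run_lowered are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class TensorAdd:
    """Hypothetical tensor-level op: elementwise add of two rank-2 tensors."""
    lhs: str      # name of the left operand
    rhs: str      # name of the right operand
    out: str      # name of the preallocated result
    shape: tuple  # (rows, cols), assumed known at compile time


def lower_to_loops(op):
    """Lower the tensor-level op to a loop nest, returned as lines of text.

    The loop nest plays the role of a lower-level representation: the
    whole-tensor operation is gone and only scalar work inside loops remains.
    """
    rows, cols = op.shape
    return [
        f"for i in range({rows}):",
        f"    for j in range({cols}):",
        f"        {op.out}[i][j] = {op.lhs}[i][j] + {op.rhs}[i][j]",
    ]


def run_lowered(lines, env):
    """Execute the lowered loop nest against concrete buffers in env."""
    exec("\n".join(lines), {}, env)
    return env


op = TensorAdd("A", "B", "C", (2, 2))
loops = lower_to_loops(op)
env = {
    "A": [[1, 2], [3, 4]],
    "B": [[10, 20], [30, 40]],
    "C": [[0, 0], [0, 0]],
}
run_lowered(loops, env)
print("\n".join(loops))
print(env["C"])  # [[11, 22], [33, 44]]
```

A real compiler would apply optimizations at each level before lowering further; the toy only shows the change in representation.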
Findings
Early optimization results show promising performance improvements.
The compiler supports various ONNX models through dialect-based representations.
The approach facilitates portability and efficiency in deploying neural networks.
Abstract
Deep neural network models are becoming increasingly popular and have been used in various tasks such as computer vision, speech recognition, and natural language processing. Machine learning models are commonly trained in a resource-rich environment and then deployed in a distinct environment such as high availability machines or edge devices. To assist the portability of models, the open-source community has proposed the Open Neural Network Exchange (ONNX) standard. In this paper, we present a high-level, preliminary report on our onnx-mlir compiler, which generates code for the inference of deep neural network models described in the ONNX format. Onnx-mlir is an open-source compiler implemented using the Multi-Level Intermediate Representation (MLIR) infrastructure recently integrated in the LLVM project. Onnx-mlir relies on the MLIR concept of dialects to implement its…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Neural Network Applications · Embedded Systems Design Techniques
