No Saved Kaleidosope: an 100% Jitted Neural Network Coding Language with Pythonic Syntax
Augusto Seben da Rosa, Marlon Daniel Angeli, Jorge Aikes Junior, Alef Iury Ferreira, Lucas Rafael Gris, Anderson da Silva Soares, Arnaldo Candido Junior, Frederico Santos de Oliveira, Gabriel Trevisan Damke, Rafael Teixeira Sousa

TL;DR
This paper introduces a fully jitted neural network coding language with Pythonic syntax, combining object-oriented features, automatic differentiation, and high-performance GPU operations, achieving comparable results to PyTorch on CIFAR-10.
Contribution
It presents a jitted compiler for neural networks with Python-like syntax that integrates VRAM management through caching and pooling with automatic differentiation, and it matches PyTorch's performance at about the same speed on CIFAR-10.
Findings
Achieves similar speed and accuracy to PyTorch on CIFAR-10.
Reaches similar speed but degraded performance on ImageNet, and similar accuracy but degraded speed on the GRU experiments.
Provides an open-source implementation for further research and development.
Abstract
We developed a jitted compiler for training Artificial Neural Networks using C++, LLVM and CUDA. It features object-oriented characteristics, strong typing, parallel workers for data pre-processing, Pythonic syntax for expressions, PyTorch-like model declaration and Automatic Differentiation. We implement caching and pooling mechanisms to manage VRAM, and use cuBLAS for high-performance matrix multiplication and cuDNN for convolutional layers. In our experiments with Residual Convolutional Neural Networks on ImageNet, we reach similar speed but degraded performance. In the GRU network experiments, we reach similar accuracy, but our compiler has degraded speed. However, our compiler demonstrates promising results on the CIFAR-10 benchmark, where we reach the same performance and about the same speed as PyTorch. We make the code publicly available at:…
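The abstract mentions cache and pooling mechanisms for managing VRAM but does not describe their design. As a rough illustration of the general idea only, the sketch below shows a size-bucketed buffer pool that reuses released buffers instead of allocating again. It is not the authors' implementation: the class name CachingPool, its methods, and the 512-byte rounding granularity are all hypothetical, and a plain bytearray stands in for a device allocation.

```python
from collections import defaultdict


class CachingPool:
    """Toy size-bucketed buffer pool (hypothetical; not the paper's code).

    A bytearray stands in for a block of device memory.
    """

    def __init__(self):
        self._free = defaultdict(list)  # rounded size -> idle buffers
        self.fresh_allocations = 0      # times the "real" allocator was hit

    @staticmethod
    def _round_up(nbytes, granularity=512):
        # Round requests up so near-equal sizes share one bucket.
        return ((nbytes + granularity - 1) // granularity) * granularity

    def acquire(self, nbytes):
        size = self._round_up(nbytes)
        bucket = self._free[size]
        if bucket:
            # Cache hit: hand back a previously released buffer.
            return bucket.pop()
        # Cache miss: fall through to a fresh allocation.
        self.fresh_allocations += 1
        return bytearray(size)

    def release(self, buf):
        # Keep the buffer for reuse instead of freeing it.
        self._free[len(buf)].append(buf)


pool = CachingPool()
a = pool.acquire(1000)  # fresh allocation, rounded up to 1024 bytes
pool.release(a)
b = pool.acquire(1024)  # same bucket, so the released buffer is reused
```

In a training loop, tensors of the same shape are requested at every step, so a pool of this kind would mostly serve requests from its free lists after the first iteration.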
Taxonomy
Topics: Parallel Computing and Optimization Techniques
Methods: Gated Recurrent Unit · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
