EONSim: An NPU Simulator for On-Chip Memory and Embedding Vector Operations
Sangun Choi, Yunho Oh

TL;DR
EONSim is an NPU simulator that models both matrix and embedding vector operations, with reported average errors of 1.4% in inference time and 2.2% in memory access count, enabling flexible architecture exploration for modern deep learning workloads.
Contribution
It introduces a holistic simulation framework supporting diverse on-chip memory management schemes for embedding workloads, filling a gap in existing NPU simulators.
Findings
Achieves 1.4% average inference time error
Achieves 2.2% average memory access count error
Supports various on-chip memory policies
Abstract
Embedding vector operations are a key component of modern deep neural network workloads. Unlike matrix operations with deterministic access patterns, embedding vector operations exhibit input data-dependent and non-deterministic memory accesses. Existing neural processing unit (NPU) simulators focus on matrix computations with simple double-buffered on-chip memory systems, lacking the modeling capability for realistic embedding behavior. Next-generation NPUs, however, call for more flexible on-chip memory architectures that can support diverse access and management schemes required by embedding workloads. To enable flexible exploration and design of emerging NPU architectures, we present EONSim, an NPU simulator that holistically models both matrix and embedding vector operations. EONSim integrates a validated performance model for matrix computations with detailed memory simulation for…
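The contrast the abstract draws between deterministic matrix accesses and input-dependent embedding accesses can be illustrated with a minimal sketch. This is not EONSim's code or API. The function names, the LRU policy, and the skewed random index generator are all assumptions chosen for illustration, standing in for "an on-chip memory management policy" and "input-dependent indices".

```python
from collections import OrderedDict
import random


def count_onchip_hits(row_ids, capacity):
    """Count hits in a hypothetical LRU-managed on-chip buffer.

    `capacity` is the number of rows (or tiles) the buffer can hold.
    LRU is an illustrative policy choice, not one taken from the paper.
    """
    buffer = OrderedDict()
    hits = 0
    for row in row_ids:
        if row in buffer:
            hits += 1
            buffer.move_to_end(row)  # mark as most recently used
        else:
            if len(buffer) >= capacity:
                buffer.popitem(last=False)  # evict least recently used
            buffer[row] = True
    return hits


def matrix_tile_order(num_tiles, passes):
    """Deterministic access order: the same tile sequence on every pass,
    fully known before execution starts."""
    return [t for _ in range(passes) for t in range(num_tiles)]


def embedding_lookup_order(num_rows, num_lookups, seed):
    """Input-dependent access order: row indices come from the input data.

    A skewed random draw is used here as a stand-in for real lookup indices
    (assumption: a few rows are requested far more often than the rest).
    """
    rng = random.Random(seed)
    return [
        min(int(rng.paretovariate(1.2)) - 1, num_rows - 1)
        for _ in range(num_lookups)
    ]


if __name__ == "__main__":
    tiles = matrix_tile_order(num_tiles=8, passes=4)
    print("matrix hits:", count_onchip_hits(tiles, capacity=8))
    for seed in (0, 1):
        lookups = embedding_lookup_order(num_rows=10_000, num_lookups=32, seed=seed)
        print(f"embedding hits (seed {seed}):", count_onchip_hits(lookups, capacity=8))
```

In this sketch the matrix trace is identical on every run, so its hit count can be computed ahead of time, whereas the embedding trace depends on the seed (the stand-in for input data) and the hit count depends on how the buffer policy interacts with that particular trace.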
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices
