TL;DR
LLAMA is a C++ library that provides a zero-runtime-overhead abstraction over the memory layout of data structures, so layouts can be chosen per hardware architecture to improve the performance and portability of code that runs on heterogeneous systems.
Contribution
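LLAMA introduces a C++ abstraction layer that decouples the memory layout of data structures from the rest of the program, so layouts can be switched or extended without changing the code that accesses the data. The paper demonstrates performance benefits in real-world applications.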
Findings
LLAMA-generated layouts match manual implementations in performance.
Layout-aware copy routines outperform naive copying methods.
Integrations show significant speedups in real-world benchmarks.
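To make the second finding concrete, here is a minimal sketch of what "layout-aware" copying can mean. It does not use LLAMA's API or reproduce its copy routines. All names (`AoSLayout`, `SoALayout`, `Buffer`, `copy_records`, `kFields`) are hypothetical and chosen only for illustration. The sketch assumes records with two `float` fields. When source and destination share a layout, the whole storage can be copied in one contiguous pass. When the layouts differ, the copy falls back to translating every field of every record.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record shape: two float fields per record.
constexpr std::size_t kFields = 2;

// Array-of-structs: the fields of one record are adjacent in memory.
struct AoSLayout {
    static std::size_t index(std::size_t n, std::size_t record, std::size_t field) {
        (void)n;  // the record count is not needed for this layout
        return record * kFields + field;
    }
};

// Struct-of-arrays: all values of one field are adjacent in memory.
struct SoALayout {
    static std::size_t index(std::size_t n, std::size_t record, std::size_t field) {
        return field * n + record;
    }
};

// Flat storage whose element order is decided by the Layout parameter.
template <typename Layout>
struct Buffer {
    explicit Buffer(std::size_t n) : count(n), data(n * kFields, 0.0f) {}

    float& at(std::size_t record, std::size_t field) {
        assert(record < count && field < kFields);
        return data[Layout::index(count, record, field)];
    }
    float at(std::size_t record, std::size_t field) const {
        assert(record < count && field < kFields);
        return data[Layout::index(count, record, field)];
    }

    std::size_t count;
    std::vector<float> data;
};

// General case: layouts may differ, so every field of every record is
// translated through both index functions (the "naive" field-wise copy).
template <typename SrcLayout, typename DstLayout>
void copy_records(const Buffer<SrcLayout>& src, Buffer<DstLayout>& dst) {
    assert(src.count == dst.count);
    for (std::size_t r = 0; r < src.count; ++r) {
        for (std::size_t f = 0; f < kFields; ++f) {
            dst.at(r, f) = src.at(r, f);
        }
    }
}

// Same layout on both sides: this more specialized overload is selected,
// and the storage is copied as one contiguous range.
template <typename Layout>
void copy_records(const Buffer<Layout>& src, Buffer<Layout>& dst) {
    assert(src.count == dst.count);
    std::copy(src.data.begin(), src.data.end(), dst.data.begin());
}
```

Calling `copy_records` with two `Buffer<AoSLayout>` objects picks the bulk overload, while copying from `Buffer<AoSLayout>` to `Buffer<SoALayout>` picks the field-wise one. The choice is made at compile time through overload resolution, so the caller's code is the same in both cases.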
Abstract
The performance gap between CPU and memory widens continuously. Choosing the best memory layout for each hardware architecture is increasingly important as more and more programs become memory bound. For portable codes that run across heterogeneous hardware architectures, the choice of the memory layout for data structures is ideally decoupled from the rest of a program. This can be accomplished via a zero-runtime-overhead abstraction layer, underneath which memory layouts can be freely exchanged. We present the Low-Level Abstraction of Memory Access (LLAMA), a C++ library that provides such a data structure abstraction layer with example implementations for multidimensional arrays of nested, structured data. LLAMA provides fully C++ compliant methods for defining and switching custom memory layouts for user-defined data types. The library is extensible with third-party allocators.…
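The idea of exchanging memory layouts underneath an abstraction layer can be illustrated with a small, self-contained sketch. This is not LLAMA's API. The names `AoSMapping`, `SoAMapping`, `View`, `fill`, and `sum_of_products` are hypothetical, and the sketch assumes a fixed record of two `float` fields (`x` and `y`) in a one-dimensional array. The layout is a template parameter that maps a (record, field) pair to an offset in flat storage, so the algorithm code never mentions the layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record shape: field 0 is x, field 1 is y.
constexpr std::size_t kFieldCount = 2;

// Array-of-structs mapping: x0 y0 x1 y1 x2 y2 ...
struct AoSMapping {
    std::size_t count;
    std::size_t offset(std::size_t record, std::size_t field) const {
        return record * kFieldCount + field;
    }
};

// Struct-of-arrays mapping: x0 x1 x2 ... y0 y1 y2 ...
struct SoAMapping {
    std::size_t count;
    std::size_t offset(std::size_t record, std::size_t field) const {
        return field * count + record;
    }
};

// A view owns flat storage and delegates all address computation to the
// mapping. The mapping is a compile-time parameter, so the offset
// calculation can be inlined by the compiler.
template <typename Mapping>
class View {
public:
    explicit View(std::size_t count)
        : mapping_{count}, storage_(count * kFieldCount, 0.0f) {}

    float& x(std::size_t i) { return at(i, 0); }
    float& y(std::size_t i) { return at(i, 1); }
    std::size_t size() const { return mapping_.count; }
    const std::vector<float>& raw() const { return storage_; }

private:
    float& at(std::size_t i, std::size_t field) {
        assert(i < mapping_.count);
        return storage_[mapping_.offset(i, field)];
    }

    Mapping mapping_;
    std::vector<float> storage_;
};

// Layout-agnostic code: written once, works for any mapping.
template <typename Mapping>
void fill(View<Mapping>& view) {
    for (std::size_t i = 0; i < view.size(); ++i) {
        view.x(i) = static_cast<float>(i + 1);
        view.y(i) = 10.0f * static_cast<float>(i + 1);
    }
}

template <typename Mapping>
float sum_of_products(View<Mapping>& view) {
    float total = 0.0f;
    for (std::size_t i = 0; i < view.size(); ++i) {
        total += view.x(i) * view.y(i);
    }
    return total;
}
```

Switching from `View<AoSMapping>` to `View<SoAMapping>` changes only the type at the point of construction. `fill` and `sum_of_products` stay untouched and produce the same result, while the order of values in `raw()` differs between the two layouts.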
