Designing and Implementing a Generator Framework for a SIMD Abstraction Library
Johannes Pietrzyk, Alexander Krause, Dirk Habich, Wolfgang Lehner

TL;DR
This paper introduces TSLGen, a framework that generates SIMD abstraction libraries which are easier to maintain and extend while preserving performance across diverse hardware architectures.
Contribution
The paper presents TSLGen, a novel end-to-end framework for creating SIMD abstraction libraries that are easier to maintain and extend than existing solutions.
Findings
The generated TSL achieves performance comparable to existing SIMD abstraction libraries.
The programming effort with TSL is similar to that of current libraries.
The framework supports disruptive interface changes and provides valuable insights.
Abstract
The Single Instruction Multiple Data (SIMD) parallel paradigm is a well-established and heavily used hardware-driven technique to increase single-thread performance in different system domains such as database systems or machine learning. Depending on the hardware vendor and the specific processor generation/version, SIMD capabilities come in different flavors concerning the register size and the supported SIMD instructions. Due to this heterogeneity and the lack of standardized calling conventions, building high-performance and portable systems is a challenging task. To address this challenge, academia and industry have invested remarkable effort into creating SIMD abstraction libraries that provide unified access to different SIMD hardware capabilities. However, those one-size-fits-all library approaches are inherently complex, which hampers maintainability and extensibility.…
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
