Efficient GPU Implementation for Single Block Orthogonal Dictionary Learning
Paul Irofti

TL;DR
This paper introduces a GPU-accelerated implementation of the SBO dictionary learning algorithm using OpenCL, significantly speeding up the process while maintaining high data representation quality.
Contribution
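The paper presents a lock-free GPU implementation of SBO in OpenCL that follows the map-reduce model for the sparse-coding stage and uses the Partitioned Global Address Space (PGAS) model for parallel dictionary updates, with the aim of keeping the GPU fully occupied.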
Findings
Achieves significant acceleration in dictionary learning time
Achieves a favourable trade-off between algorithm complexity and data representation quality compared to PAK-SVD
Demonstrates effective GPU utilization with lock-free design
Abstract
Dictionary training for sparse representations involves dealing with large chunks of data and complex algorithms that determine time consuming implementations. SBO is an iterative dictionary learning algorithm based on constructing unions of orthonormal bases via singular value decomposition, that represents each data item through a single best fit orthobase. In this paper we present a GPGPU approach of implementing SBO in OpenCL. We provide a lock-free solution that ensures full-occupancy of the GPU by following the map-reduce model for the sparse-coding stage and by making use of the Partitioned Global Address Space (PGAS) model for developing parallel dictionary updates. The resulting implementation achieves a favourable trade-off between algorithm complexity and data representation quality compared to PAK-SVD which is the standard overcomplete dictionary learning approach. We…
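The abstract's description of representing each data item "through a single best fit orthobase" can be illustrated with a small sketch. This is not the paper's OpenCL code: it is a plain Python illustration that assumes a fixed sparsity target `s` and hard thresholding of the coefficients, neither of which the abstract spells out. The names `kept_energy` and `best_base` are hypothetical. The per-base energy computation plays the role of the map step and the argmax over bases plays the role of the reduce step.

```python
# Hypothetical sketch of single-best-orthobase selection (map + reduce).
# Assumes a sparsity target s and hard thresholding; names are illustrative only.

def kept_energy(base, signal, s):
    """Map step: energy retained when `signal` is coded in `base` with s atoms.

    `base` is a list of orthonormal atoms (each a list of floats). Because the
    base is orthonormal, the coefficients are plain inner products, and the
    approximation error equals ||signal||^2 minus the kept energy.
    """
    coeffs = [sum(a * y for a, y in zip(atom, signal)) for atom in base]
    largest = sorted((c * c for c in coeffs), reverse=True)[:s]
    return sum(largest)


def best_base(bases, signal, s):
    """Reduce step: index of the orthobase that retains the most energy."""
    energies = [kept_energy(base, signal, s) for base in bases]  # map
    return max(range(len(bases)), key=energies.__getitem__)      # reduce (argmax)


# Two orthonormal bases of R^2: the identity and a 45-degree rotation.
r = 2 ** -0.5
bases = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[r, r], [-r, r]],
]
signals = [[3.0, 0.1], [2.0, 2.0]]
assignment = [best_base(bases, y, s=1) for y in signals]
# assignment == [0, 1]: the axis-aligned signal picks the identity base,
# the diagonal signal picks the rotated base.
```

Each (signal, base) pair is evaluated independently of the others, which is what makes this stage a natural fit for a map-reduce formulation; how the work is actually distributed across GPU work-items is described in the paper, not here.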
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Seismic Imaging and Inversion Techniques · Geophysical Methods and Applications
