Scratchpad Sharing in GPUs

Vishwesh Jatala; Jayvant Anantpur; Amey Karkare

arXiv:1607.03238 · cs.AR · February 14, 2017

TL;DR

This paper introduces Scratchpad Sharing, a combination of architectural and compiler optimizations for GPUs that enhances scratchpad memory utilization, leading to significant performance improvements in GPGPU applications.

Contribution

It proposes a novel scratchpad sharing technique with scheduling and compiler strategies to better utilize scratchpad memory in GPUs.

Findings

01. Average performance improvement of 19% across tested kernels

02. Maximum improvement of 92.17% in certain kernels

03. Effective use of otherwise unutilized scratchpad memory

Abstract

GPGPU applications exploit on-chip scratchpad memory available in the Graphics Processing Units (GPUs) to improve performance. The amount of thread level parallelism present in the GPU is limited by the number of resident threads, which in turn depends on the availability of scratchpad memory in its streaming multiprocessor (SM). Since the scratchpad memory is allocated at thread block granularity, part of the memory may remain unutilized. In this paper, we propose architectural and compiler optimizations to improve the scratchpad utilization. Our approach, Scratchpad Sharing, addresses scratchpad under-utilization by launching additional thread blocks in each SM. These thread blocks use unutilized scratchpad and also share scratchpad with other resident blocks. To improve the performance of scratchpad sharing, we propose Owner Warp First (OWF) scheduling that schedules warps from the…
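
The under-utilization described in the abstract follows from simple arithmetic: scratchpad is handed out per whole thread block, so whatever remains after the last block that fits goes unused. The sketch below illustrates that calculation only. The function names and the configuration values (16 KB per SM, 7 KB per block, a limit of 8 blocks) are assumptions chosen for illustration and are not taken from the paper.

```python
def resident_blocks(sm_scratchpad, block_scratchpad, max_blocks):
    """Blocks that fit in one SM when scratchpad is allocated per whole block."""
    if block_scratchpad == 0:
        return max_blocks  # scratchpad is not the limiting resource
    return min(max_blocks, sm_scratchpad // block_scratchpad)


def unutilized_scratchpad(sm_scratchpad, block_scratchpad, max_blocks):
    """Bytes left over after the resident blocks take their allocations."""
    blocks = resident_blocks(sm_scratchpad, block_scratchpad, max_blocks)
    return sm_scratchpad - blocks * block_scratchpad


# Hypothetical configuration, chosen only for illustration.
SM_SCRATCHPAD = 16 * 1024    # bytes of scratchpad per SM
BLOCK_SCRATCHPAD = 7 * 1024  # bytes requested by each thread block
MAX_BLOCKS = 8               # some other per-SM limit on resident blocks

blocks = resident_blocks(SM_SCRATCHPAD, BLOCK_SCRATCHPAD, MAX_BLOCKS)
leftover = unutilized_scratchpad(SM_SCRATCHPAD, BLOCK_SCRATCHPAD, MAX_BLOCKS)
print(blocks, leftover)  # prints "2 2048"
```

With these assumed numbers, only two blocks are resident and 2 KB of scratchpad sits idle. This leftover is the memory that, per the abstract, Scratchpad Sharing aims to use by launching additional thread blocks that also share scratchpad with resident blocks.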

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Interconnection Networks and Systems