CIAO: Cache Interference-Aware Throughput-Oriented Architecture and Scheduling for GPUs

Jie Zhang; Shuwen Gao; Nam Sung Kim; Myoungsoo Jung

arXiv:1805.07718·cs.AR·May 22, 2018

TL;DR

This paper introduces CIAO, a GPU on-chip memory architecture and warp scheduling method that reduces cache interference by redirecting memory requests to unused shared memory and selectively throttling warps, improving performance by 54% over prior methods.

Contribution

The paper presents a novel cache interference-aware architecture and warp scheduling approach that adaptively manages cache interference and enhances throughput.

Findings

1. 54% performance improvement over prior methods
2. Effective cache interference reduction through shared memory redirection
3. Selective warp throttling improves overall GPU throughput

Abstract

A modern GPU aims to simultaneously execute more warps for higher Thread-Level Parallelism (TLP) and performance. When generating many memory requests, however, warps contend for limited cache space and thrash the cache, which in turn severely degrades performance. To reduce such cache thrashing, we may adopt cache locality-aware warp scheduling, which gives higher execution priority to warps with higher potential for data locality. However, we observe that warps with high potential for data locality often incur far more cache thrashing or interference than warps with low potential for data locality. Consequently, cache locality-aware warp scheduling may undesirably increase cache interference and/or unnecessarily decrease TLP. In this paper, we propose Cache Interference-Aware throughput-Oriented (CIAO) on-chip memory architecture and warp scheduling which exploit unused shared memory space…
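
The cache thrashing the abstract describes can be illustrated with a toy model. The sketch below is not CIAO's mechanism: it does not model interference detection or shared memory redirection. It only assumes a shared LRU cache, warps issued round-robin, and each warp cycling over its own small working set. The function name `simulate_hit_rate` and all parameters are hypothetical, chosen for illustration.

```python
from collections import OrderedDict


def simulate_hit_rate(num_warps, lines_per_warp, cache_lines, rounds):
    """Round-robin `num_warps` warps over one shared LRU cache; return the hit rate.

    Toy assumption: every warp loops over its own `lines_per_warp` cache lines.
    """
    cache = OrderedDict()  # keys are (warp, line) tags, least recently used first
    hits = 0
    accesses = 0
    for step in range(rounds * lines_per_warp):
        line = step % lines_per_warp
        for warp in range(num_warps):  # one access per active warp per step
            tag = (warp, line)
            accesses += 1
            if tag in cache:
                hits += 1
                cache.move_to_end(tag)  # refresh LRU position on a hit
            else:
                cache[tag] = True
                if len(cache) > cache_lines:
                    cache.popitem(last=False)  # evict the least recently used line
    return hits / accesses


# 8 warps x 4 lines = 32 lines competing for a 16-line cache: every access misses.
all_warps = simulate_hit_rate(num_warps=8, lines_per_warp=4, cache_lines=16, rounds=10)
# Throttling to 4 active warps makes the combined working set fit the cache.
throttled = simulate_hit_rate(num_warps=4, lines_per_warp=4, cache_lines=16, rounds=10)
print(all_warps, throttled)
```

In this toy setting, halving the number of active warps turns a fully thrashing cache into one that misses only on first touch, but it also halves TLP. That trade-off is the reason the abstract argues for choosing carefully which warps to isolate or throttle rather than throttling indiscriminately.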

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Interconnection Networks and Systems