Hermes: Accelerating Long-Latency Load Requests via Perceptron-Based Off-Chip Load Prediction

Rahul Bera; Konstantinos Kanellopoulos; Shankar Balachandran; David Novo; Ataberk Olgun; Mohammad Sadrosadati; Onur Mutlu

arXiv:2209.00188 · cs.AR · October 3, 2022 · 1 cites

TL;DR

Hermes introduces a perceptron-based predictor that identifies which loads are likely to go off-chip and issues speculative memory requests for them, taking the on-chip cache hierarchy access latency off the critical path of off-chip loads and improving processor performance.

Contribution

The paper presents a perceptron-based predictor for off-chip loads, together with a technique that speculatively fetches the data for predicted off-chip loads so that the on-chip cache access latency no longer adds to their total latency.

Findings

01. Hermes improves processor performance across various workloads.

02. The perceptron predictor accurately identifies off-chip loads.

03. Speculative fetching reduces cache hierarchy access latency.
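
The third finding can be read as simple critical-path arithmetic. The sketch below is illustrative only: it assumes the speculative request is issued in parallel with the normal cache lookup, and every cycle count, along with the function names and the `issue_delay` parameter, is a hypothetical value chosen for the example rather than a number from the paper.

```python
def baseline_off_chip_latency(l1: int, l2: int, llc: int, dram: int) -> int:
    """Off-chip load without speculation: every cache level is checked
    (and misses) before the request reaches main memory."""
    return l1 + l2 + llc + dram


def speculative_off_chip_latency(l1: int, l2: int, llc: int, dram: int,
                                 issue_delay: int) -> int:
    """Off-chip load with a correctly predicted speculative request.

    Assumption: the memory request starts `issue_delay` cycles after the
    load begins and overlaps with the cache lookups, so the load finishes
    when the slower of the two paths finishes.
    """
    cache_walk = l1 + l2 + llc
    return max(cache_walk, issue_delay + dram)


# Hypothetical latencies in cycles, for illustration only.
L1, L2, LLC, DRAM, ISSUE_DELAY = 5, 10, 40, 150, 6

baseline = baseline_off_chip_latency(L1, L2, LLC, DRAM)
speculative = speculative_off_chip_latency(L1, L2, LLC, DRAM, ISSUE_DELAY)
saved = baseline - speculative
print(f"baseline={baseline} speculative={speculative} saved={saved}")
```

Under these assumed numbers the cache walk is hidden behind the memory access, so the saving is roughly the on-chip lookup time minus the issue delay. A mispredicted speculative request would instead cost extra memory bandwidth, which this sketch does not model.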

Abstract

Long-latency load requests continue to limit the performance of high-performance processors. To increase the latency tolerance of a processor, architects have primarily relied on two key techniques: sophisticated data prefetchers and large on-chip caches. In this work, we show that: 1) even a sophisticated state-of-the-art prefetcher can only predict half of the off-chip load requests on average across a wide range of workloads, and 2) due to the increasing size and complexity of on-chip caches, a large fraction of the latency of an off-chip load request is spent accessing the on-chip cache hierarchy. The goal of this work is to accelerate off-chip load requests by removing the on-chip cache access latency from their critical path. To this end, we propose a new technique called Hermes, whose key idea is to: 1) accurately predict which load requests might go off-chip, and 2)…
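
To make the prediction step in the abstract concrete, here is a minimal sketch of a hashed-perceptron predictor that classifies a load as off-chip or on-chip. It is not the authors' implementation: the three features, the table size, the weight bounds, both thresholds, and the names `OffChipPerceptron` and `_fold` are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


def _fold(value: int, table_size: int) -> int:
    """Map a feature value to a table index with a simple XOR-fold (assumed hash)."""
    return (value ^ (value >> 10) ^ (value >> 20)) % table_size


@dataclass
class OffChipPerceptron:
    table_size: int = 1024          # entries per feature table (assumed)
    weight_min: int = -16           # saturating weight bounds (assumed)
    weight_max: int = 15
    activation_threshold: int = 1   # predict off-chip when the sum reaches this (assumed)
    training_threshold: int = 8     # stop training once the sum is this confident (assumed)
    tables: List[List[int]] = field(init=False)

    def __post_init__(self) -> None:
        # One weight table per feature, all weights start at zero.
        self.tables = [[0] * self.table_size for _ in range(3)]

    def _indices(self, pc: int, addr: int) -> List[int]:
        line_offset = (addr >> 6) & 0x3F   # 64 B line within a 4 KiB page
        byte_offset = addr & 0x3F          # byte within the 64 B line
        features = (pc, pc ^ line_offset, pc ^ byte_offset)  # illustrative features
        return [_fold(f, self.table_size) for f in features]

    def predict(self, pc: int, addr: int) -> Tuple[bool, int]:
        """Return (predicted_off_chip, weight_sum) for a load."""
        total = sum(t[i] for t, i in zip(self.tables, self._indices(pc, addr)))
        return total >= self.activation_threshold, total

    def train(self, pc: int, addr: int, went_off_chip: bool) -> None:
        """Update weights on a misprediction or a low-confidence prediction."""
        predicted, total = self.predict(pc, addr)
        if predicted == went_off_chip and abs(total) >= self.training_threshold:
            return
        step = 1 if went_off_chip else -1
        for t, i in zip(self.tables, self._indices(pc, addr)):
            t[i] = max(self.weight_min, min(self.weight_max, t[i] + step))


# Toy training run: one load PC streams through memory (always off-chip),
# another keeps touching four hot lines (always served on-chip).
predictor = OffChipPerceptron()
STREAMING_PC, HOT_PC = 0x400100, 0x400200
for i in range(200):
    predictor.train(STREAMING_PC, 0x10000000 + i * 64, went_off_chip=True)
    predictor.train(HOT_PC, 0x2000 + (i % 4) * 64, went_off_chip=False)

streaming_pred, _ = predictor.predict(STREAMING_PC, 0x10000000 + 200 * 64)
hot_pred, _ = predictor.predict(HOT_PC, 0x2000)
print(streaming_pred, hot_pred)
```

In this toy run the predictor learns to flag the streaming load as off-chip and the hot load as on-chip. The two thresholds are kept separate so that the decision boundary and the point at which training stops can be tuned independently.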

Code & Models

Repositories

cmu-safari/hermes (Official)

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Advanced Data Storage Technologies