NDPage: Efficient Address Translation for Near-Data Processing Architectures via Tailored Page Table
Qingcai Jiang, Buxin Tu, Hong An

TL;DR
NDPage introduces a tailored page table design for NDP architectures that reduces address translation overhead, significantly improving performance for data-intensive workloads by optimizing cache usage and page table structure.
Contribution
The paper proposes NDPage, a novel page table design that combines cache bypass with a flattened structure and is optimized specifically for near-data processing systems.
Findings
NDPage improves single-core NDP performance by 14.3%.
NDPage enhances 4-core NDP performance by 9.8%.
NDPage boosts 8-core NDP performance by 30.5%.
Abstract
Near-Data Processing (NDP) has been a promising architectural paradigm to address the memory wall problem for data-intensive applications. Practical implementation of NDP architectures calls for system support for better programmability, where having virtual memory (VM) is critical. Modern computing systems incorporate a 4-level page table design to support address translation in VM. However, simply adopting an existing 4-level page table in NDP systems causes significant address translation overhead because (1) NDP applications generate a lot of address translations, and (2) the limited L1 cache in NDP systems cannot cover the accesses to page table entries (PTEs). We extensively analyze the 4-level page table design in the NDP scenario and observe that (1) the memory access to page table entries is highly irregular, thus cannot benefit from the L1 cache, and (2) the last two levels of…
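To make the cost of a conventional 4-level walk concrete, the following minimal sketch splits a virtual address into its four page table indices and a page offset. It assumes an x86-64-style layout (48-bit virtual addresses, 4 KiB pages, 9 index bits per level); the function name and constants are illustrative, not taken from the paper, and the NDP configuration evaluated by the authors may differ.

```python
# Assumption: x86-64-style layout (48-bit virtual address, 4 KiB pages,
# 9 index bits per level). Names and constants here are illustrative only.
PAGE_OFFSET_BITS = 12
INDEX_BITS = 9
LEVELS = 4


def split_virtual_address(va: int) -> tuple[list[int], int]:
    """Return ([top-level ... leaf-level indices], page offset) for `va`."""
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    indices = []
    # Walk from the top level (highest bits) down to the leaf level.
    for level in range(LEVELS - 1, -1, -1):
        shift = PAGE_OFFSET_BITS + level * INDEX_BITS
        indices.append((va >> shift) & ((1 << INDEX_BITS) - 1))
    return indices, offset


if __name__ == "__main__":
    indices, offset = split_virtual_address(0x7F12_3456_789A)
    print("per-level indices:", indices, "offset:", hex(offset))
```

Under these assumptions, and without page-walk caches, each TLB miss needs one dependent PTE read per index, so up to four memory accesses before the data access itself. This is the translation overhead the abstract refers to.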
Taxonomy
Topics: Network Packet Processing and Optimization · Embedded Systems Design Techniques · Parallel Computing and Optimization Techniques
