On the Accelerating of Two-dimensional Smart Laplacian Smoothing on the GPU
Kunyang Zhao, Gang Mei, Nengxiong Xu, and Jiayin Zhang

TL;DR
This paper introduces a GPU-accelerated implementation of 2D Smart Laplacian smoothing for triangular meshes. It reports speedups of up to 44x on a GT640 GPU and compares data layouts (AoS and SoA), iteration forms, and the use of CUDA Dynamic Parallelism.
Contribution
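It presents a GPU implementation of 2D Smart Laplacian smoothing that represents triangular meshes in two data layouts (AoS and SoA), adopts two iteration forms that differ in how intermediate data are swapped, and uses CUDA Dynamic Parallelism for nested parallelization.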
Findings
Achieved speedups of up to 44x on the GT640 GPU.
The AoS data layout is always more efficient than the SoA layout.
Using CUDA Dynamic Parallelism slightly improves performance.
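The two data layouts compared in the findings can be illustrated with a minimal Python sketch. It only shows how the same nodal coordinates are organized in each layout; it says nothing about GPU memory behavior, and the names `Node`, `aos_to_soa`, `centroid_aos`, and `centroid_soa` are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One mesh node; in AoS, x and y of the same node sit together."""
    x: float
    y: float


def aos_to_soa(nodes):
    """Convert Array-of-Structures to Structure-of-Arrays.

    The result holds one list per coordinate, so all x values are
    stored together and all y values are stored together.
    """
    return {"x": [n.x for n in nodes], "y": [n.y for n in nodes]}


def centroid_aos(nodes):
    """Average position, reading one record per node."""
    count = len(nodes)
    return (sum(n.x for n in nodes) / count, sum(n.y for n in nodes) / count)


def centroid_soa(soa):
    """Average position, reading one array per coordinate."""
    count = len(soa["x"])
    return (sum(soa["x"]) / count, sum(soa["y"]) / count)


# Illustrative data: the three nodes of a single triangle.
nodes_aos = [Node(0.0, 0.0), Node(1.0, 0.0), Node(0.0, 1.0)]
nodes_soa = aos_to_soa(nodes_aos)
```

Both functions compute the same value; only the access pattern differs, which is the property the paper evaluates on the GPU.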
Abstract
This paper presents a GPU-accelerated implementation of two-dimensional Smart Laplacian smoothing. This implementation is developed under the guideline of our paradigm for accelerating Laplacian-based mesh smoothing [13]. Two types of commonly used data layouts, Array-of-Structures (AoS) and Structure-of-Arrays (SoA), are used to represent triangular meshes in our implementation. Two iteration forms that differ in how intermediate data are swapped are also adopted. Furthermore, the feature CUDA Dynamic Parallelism (CDP) is employed to realize the nested parallelization in Smart Laplacian smoothing. Experimental results demonstrate that: (1) our implementation achieves speedups of up to 44x on the GPU GT640; (2) the data layout AoS always obtains better efficiency than the SoA layout; (3) the form that needs to swap intermediate nodal coordinates is always slower…
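For readers unfamiliar with the method, the following is a minimal serial Python sketch of Smart Laplacian smoothing as it is commonly described: each interior node is moved to the average of its neighbors only if that move improves the quality of its incident triangles. Several things here are assumptions rather than details from the paper: the quality measure (a normalized area-to-edge-length ratio), the acceptance rule based on the worst incident triangle, and the two-buffer iteration that swaps old and new coordinates after every pass. All function names are hypothetical, and this is not the authors' GPU implementation.

```python
import math


def tri_quality(a, b, c):
    """Assumed quality measure: 1 for equilateral, <= 0 if inverted."""
    area = 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))
    edges2 = sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                 for p, q in ((a, b), (b, c), (c, a)))
    return 4.0 * math.sqrt(3.0) * area / edges2 if edges2 > 0.0 else 0.0


def build_adjacency(n_nodes, triangles):
    """Collect neighboring nodes and incident triangles for every node."""
    neighbors = [set() for _ in range(n_nodes)]
    incident = [[] for _ in range(n_nodes)]
    for t, tri in enumerate(triangles):
        for v in tri:
            incident[v].append(t)
            neighbors[v].update(w for w in tri if w != v)
    return neighbors, incident


def local_quality(coords, triangles, tri_ids, node, position):
    """Worst quality among a node's triangles if it sat at `position`."""
    def at(v):
        return position if v == node else coords[v]
    return min(tri_quality(*(at(v) for v in triangles[t])) for t in tri_ids)


def smart_pass(old, new, triangles, interior, neighbors, incident):
    """One pass: read only from `old`, write every node into `new`."""
    for i in range(len(old)):
        new[i] = old[i]  # default: keep the node where it is
        if i not in interior or not neighbors[i]:
            continue  # boundary nodes stay fixed
        nbrs = neighbors[i]
        cand = (sum(old[j][0] for j in nbrs) / len(nbrs),
                sum(old[j][1] for j in nbrs) / len(nbrs))
        before = local_quality(old, triangles, incident[i], i, old[i])
        after = local_quality(old, triangles, incident[i], i, cand)
        if after > before:  # the "smart" check: accept only improvements
            new[i] = cand


def smooth(coords, triangles, interior, iterations=1):
    """Run several passes, swapping the two coordinate buffers each time."""
    neighbors, incident = build_adjacency(len(coords), triangles)
    old, new = list(coords), list(coords)
    for _ in range(iterations):
        smart_pass(old, new, triangles, interior, neighbors, incident)
        old, new = new, old  # swap intermediate nodal coordinates
    return old


# Illustrative mesh: a unit square split into four triangles around node 4.
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.8, 0.7)]
triangles = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]
smoothed = smooth(coords, triangles, interior={4}, iterations=3)
```

In this example the off-center node 4 moves to the middle of the square on the first pass, because that raises the worst triangle quality, and later passes leave it there. Because each pass reads only from the old buffer, the per-node updates are independent of one another, which is what makes this style of iteration a natural fit for one-thread-per-node parallelization.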
