An efficient implementation of parallel simulated annealing algorithm in GPUs

A.M. Ferreiro, J.A. García, J.G. López-Salas, and C. Vázquez

arXiv:2408.00018 · cs.DC · August 2, 2024

TL;DR

This paper presents a highly optimized GPU implementation of the simulated annealing algorithm using CUDA, including sequential, asynchronous, and synchronous versions, with benchmarking on the Schwefel function and a hybrid approach for improved performance.

Contribution

The paper introduces a novel, efficient GPU-based implementation of simulated annealing with multiple parallel versions and a hybrid method, enhancing performance for large-scale optimization problems.

Findings

01 Synchronous GPU version outperforms asynchronous in convergence and accuracy.

02 Benchmark results show significant speedup over CPU implementations.

03 Hybrid method improves solution quality and computational efficiency.

Abstract

In this work we propose a highly optimized version of a simulated annealing (SA) algorithm adapted to the more recently developed Graphics Processing Units (GPUs). The programming has been carried out with the CUDA toolkit, designed specifically for Nvidia GPUs. For this purpose, efficient versions of SA were first analyzed and adapted to GPUs. An appropriate sequential SA algorithm was developed as a starting point. Next, a straightforward asynchronous parallel version was implemented, followed by a specific and more efficient synchronous version. A broad, suitable benchmark has been considered to illustrate the performance properties of the implementation. Among all tests, a classical sample problem provided by the minimization of the normalized Schwefel function has been selected to compare the behavior of the sequential, asynchronous, and synchronous…
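To make the three variants named in the abstract concrete, here is a minimal CPU-only Python sketch. It is not the authors' CUDA code. The function form (normalized Schwefel as -(1/n) Σ x_i sin(√|x_i|)), the search box [-512, 512], the geometric cooling schedule, the single-coordinate proposal move, and every name below are assumptions chosen for illustration. "Asynchronous" is read here as independent chains reduced once at the end, and "synchronous" as chains that restart from the best point after each temperature level; on a GPU each chain would map to a thread rather than a loop iteration.

```python
import math
import random

LOWER, UPPER = -512.0, 512.0  # assumed search box for each coordinate


def schwefel_normalized(x):
    """Assumed normalized Schwefel form: -(1/n) * sum(x_i * sin(sqrt(|x_i|)))."""
    return -sum(xi * math.sin(math.sqrt(abs(xi))) for xi in x) / len(x)


def metropolis_chain(f, x, fx, temp, length, rng):
    """Run `length` Metropolis steps at a fixed temperature; return the final state."""
    x = list(x)
    for _ in range(length):
        i = rng.randrange(len(x))
        old = x[i]
        x[i] = rng.uniform(LOWER, UPPER)  # proposal: resample one coordinate
        f_new = f(x)
        delta = f_new - fx
        # Always accept improvements; accept uphill moves with prob exp(-delta/T).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            fx = f_new
        else:
            x[i] = old  # reject: restore the previous coordinate
    return x, fx


def sequential_sa(f, dim, t0=100.0, t_min=0.01, rho=0.9, length=50, seed=0):
    """Single chain with geometric cooling; returns the best level-end state seen."""
    rng = random.Random(seed)
    x = [rng.uniform(LOWER, UPPER) for _ in range(dim)]
    fx = f(x)
    best_x, best_f = list(x), fx
    temp = t0
    while temp > t_min:
        x, fx = metropolis_chain(f, x, fx, temp, length, rng)
        if fx < best_f:
            best_x, best_f = list(x), fx
        temp *= rho
    return best_x, best_f


def asynchronous_sa(f, dim, chains=4, **kwargs):
    """Independent chains (one seed each); a single reduction at the very end."""
    runs = [sequential_sa(f, dim, seed=s, **kwargs) for s in range(chains)]
    return min(runs, key=lambda r: r[1])


def synchronous_sa(f, dim, chains=4, t0=100.0, t_min=0.01, rho=0.9,
                   length=50, seed=0):
    """All chains start each temperature level from the best point of the last one."""
    rng = random.Random(seed)
    x = [rng.uniform(LOWER, UPPER) for _ in range(dim)]
    fx = f(x)
    temp = t0
    while temp > t_min:
        results = [metropolis_chain(f, x, fx, temp, length, rng)
                   for _ in range(chains)]
        x, fx = min(results, key=lambda r: r[1])  # reduction across chains
        temp *= rho
    return x, fx
```

A call such as `synchronous_sa(schwefel_normalized, dim=4)` returns a point and its function value. The per-level reduction in `synchronous_sa` is the step that would require thread synchronization on a GPU, which is the main structural difference from the asynchronous variant in this sketch.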