On The Performance of Prefix-Sum Parallel Kalman Filters and Smoothers on GPUs
Simo Särkkä, Ángel F. García-Fernández

TL;DR
This paper evaluates the performance of parallel prefix-sum algorithms for Kalman filters and smoothers on GPUs, introduces a novel parallel two-filter smoother, and provides open-source Julia implementations.
Contribution
It provides a comprehensive experimental analysis of all-prefix-sum algorithms for Kalman filtering on GPUs and proposes a new parallel two-filter smoother.
Findings
The efficiency of parallel scan algorithms varies with implementation and hardware.
The novel parallel two-filter smoother shows promising performance improvements.
Open-source Julia code facilitates reproducibility and further research.
Abstract
This paper presents an experimental evaluation of parallel-in-time Kalman filters and smoothers using graphics processing units (GPUs). In particular, the paper evaluates different all-prefix-sum algorithms, that is, parallel scan algorithms for temporal parallelization of Kalman filters and smoothers in two ways: by calculating the required number of operations via simulation, and by measuring the actual run time of the algorithms on real GPU hardware. In addition, a novel parallel-in-time two-filter smoother is proposed and experimentally evaluated. Julia code for Metal and CUDA implementations of all the algorithms is made publicly available.
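As a rough illustration of what an all-prefix-sum (parallel scan) computes and how its operation count can be obtained by simulation, here is a minimal Python sketch. It is not taken from the paper's Julia code. The function names (`sequential_scan`, `hillis_steele_scan`, `compose_affine`) are hypothetical, and the Hillis-Steele scheme is chosen here only as one well-known scan variant. Composition of scalar affine maps stands in as a simple associative operator; the actual Kalman filter and smoother elements are more involved.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def sequential_scan(op: Callable[[T, T], T], xs: Sequence[T]) -> List[T]:
    """Reference inclusive scan: out[k] = xs[0] op xs[1] op ... op xs[k]."""
    out: List[T] = []
    for i, x in enumerate(xs):
        out.append(x if i == 0 else op(out[-1], x))
    return out


def hillis_steele_scan(
    op: Callable[[T, T], T], xs: Sequence[T]
) -> Tuple[List[T], int, int]:
    """Simulate a Hillis-Steele inclusive scan on a single thread.

    Returns (result, parallel_steps, total_combines). Each pass of the
    while loop corresponds to one step whose combines are independent of
    each other and could therefore run concurrently on a GPU.
    """
    out = list(xs)
    n = len(out)
    steps = 0
    combines = 0
    d = 1
    while d < n:
        prev = out[:]  # every combine in this step reads the previous step's values
        for i in range(d, n):
            out[i] = op(prev[i - d], prev[i])  # earlier block on the left
            combines += 1
        d *= 2
        steps += 1
    return out, steps, combines


def compose_affine(f: Tuple[int, int], g: Tuple[int, int]) -> Tuple[int, int]:
    """Compose affine maps: (a, b) stands for x -> a*x + b; f is applied first."""
    a1, b1 = f
    a2, b2 = g
    return (a2 * a1, a2 * b1 + b2)


if __name__ == "__main__":
    # Hypothetical toy recursion x_k = 2 * x_{k-1} + k for k = 1..8.
    maps = [(2, k) for k in range(1, 9)]
    result, steps, combines = hillis_steele_scan(compose_affine, maps)
    assert result == sequential_scan(compose_affine, maps)
    print(steps, combines)
```

For 8 elements this simulation takes 3 parallel steps and 17 combines, against 7 combines in the sequential loop. It shows the general trade-off of more total work for a shorter critical path, which is why counted operations and measured GPU run time are worth evaluating separately.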
Taxonomy
Topics: Digital Filter Design and Implementation · Image and Signal Denoising Methods · Advanced Data Compression Techniques
