MAP-UOT: A Memory-Efficient Approach to Unbalanced Optimal Transport Implementation
Chengyu Sun, Jinyu Hu, Hong Jiang

TL;DR
This paper introduces MAP-UOT, a memory-efficient implementation of unbalanced optimal transport (UOT) for CPU and GPU platforms. By targeting the memory bottlenecks that make UOT heavily memory-bound, it outperforms the state-of-the-art implementations POT and COFFEE.
Contribution
It proposes a memory-optimized approach to implementing UOT, motivated by experiments showing that UOT is heavily memory-bound, and validates it experimentally against state-of-the-art implementations.
Findings
Single-threaded performance up to 2.9X faster than POT and 2.4X faster than COFFEE (1.9X and 1.6X on average)
Parallelized performance up to 3.5X faster
Effective on CPU, GPU, and supercomputers
Abstract
Unbalanced optimal transport (UOT) has been widely used as a fundamental tool in many application domains, where it often dominates the application running time. While many researchers have proposed various optimizations for UOT, few have attempted to optimize it from a computer-architecture perspective. In this paper, we first study the performance bottlenecks of UOT through a series of experiments, which reveal that UOT is heavily memory-bound. Guided by these findings, we propose MAP-UOT, a Memory-efficient APproach to the implementation and optimization of UOT on CPU and GPU platforms. Our experimental evaluations show that the proposed strategy consistently and significantly outperforms the state-of-the-art (SOTA) implementations. Specifically, it provides single-threaded performance improvement over POT/COFFEE by up to 2.9X/2.4X, with an average of 1.9X/1.6X. At the same time,…
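To illustrate why a UOT solver tends to be memory-bound, here is a minimal NumPy sketch of the textbook entropic UOT scaling iteration with KL marginal penalties. It is not the MAP-UOT implementation, and it is not the code of POT or COFFEE. The function names uot_sinkhorn and flops_per_byte, the parameters eps and lam, and the simplified traffic model are assumptions made for this example.

```python
import numpy as np

def uot_sinkhorn(a, b, C, eps=0.1, lam=1.0, n_iter=100):
    """Textbook entropic UOT scaling (illustrative, not MAP-UOT).

    a, b : nonnegative source/target weights, shapes (m,) and (n,)
    C    : cost matrix, shape (m, n)
    eps  : entropic regularization strength (assumed name)
    lam  : KL marginal-penalty strength (assumed name)
    """
    K = np.exp(-C / eps)            # Gibbs kernel, shape (m, n)
    fi = lam / (lam + eps)          # exponent from the relaxed marginals
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = (a / (K @ v)) ** fi     # streams all of K once
        v = (b / (K.T @ u)) ** fi   # streams all of K a second time
    return u[:, None] * K * v[None, :]   # transport plan, shape (m, n)

def flops_per_byte(m, n, itemsize=8):
    """Rough arithmetic intensity of one matrix-vector product with K.

    Counts 2*m*n flops against m*n*itemsize bytes read from K and
    ignores the much smaller vectors (a simplifying assumption).
    """
    return (2 * m * n) / (m * n * itemsize)
```

Each iteration streams the full m-by-n matrix from memory twice while doing only about two floating-point operations per element, roughly 0.25 flops per byte in double precision under this model. That low ratio is consistent with the memory-bound behavior the abstract reports.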
Taxonomy
Topics: Transportation and Mobility Innovations · Transportation Planning and Optimization · Smart Parking Systems Research
