Linear-Complexity Relaxed Word Mover's Distance with GPU Acceleration
Kubilay Atasu, Thomas Parnell, Celestine Dünner, Manolis Sifalakis, Haralampos Pozidis, Vasileios Vasileiadis, Michail Vlachos, Cesar Berrospi, Abdel Labbi

TL;DR
This paper introduces LC-RWMD, a GPU-accelerated, linear-time implementation of the Relaxed Word Mover's Distance, significantly speeding up document similarity computations for large-scale text data.
Contribution
The paper presents a novel linear-complexity, GPU-optimized implementation of RWMD that outperforms previous quadratic and cubic methods in speed, enabling scalable large-scale text analysis.
Findings
Improves performance by two orders of magnitude over the previous GPU-based quadratic RWMD implementation
Achieves three to four orders of magnitude speedup over distributed WMD implementations
Efficiently scales across GPU clusters for large document sets
Abstract
The amount of unstructured text-based data is growing every day. Querying, clustering, and classifying this big data requires similarity computations across large sets of documents. Whereas low-complexity similarity metrics are available, attention has been shifting towards more complex methods that achieve a higher accuracy. In particular, the Word Mover's Distance (WMD) method proposed by Kusner et al. is a promising new approach, but its time complexity grows cubically with the number of unique words in the documents. The Relaxed Word Mover's Distance (RWMD) method, again proposed by Kusner et al., reduces the time complexity from cubic to quadratic and results in a limited loss in accuracy compared with WMD. Our work contributes a low-complexity implementation of the RWMD that reduces the average time complexity to linear when operating on large sets of documents. Our…
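To make the quadratic-versus-linear distinction in the abstract concrete, here is a minimal CPU sketch in plain Python. It is not the authors' GPU code. `rwmd_one_sided` and `rwmd_pairwise` follow the usual RWMD definition (each word moves entirely to its nearest word in the other document, and the larger of the two one-sided costs is kept). `one_sided_to_query` illustrates one way the per-pair work can be shared across a document set: the nearest-query-word distance is computed once per vocabulary word and then reused by every document. The function names, the toy 2-D embeddings, and the two-phase split are assumptions made for illustration.

```python
import math

def euclid(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rwmd_one_sided(doc_a, doc_b, emb):
    """Cost of moving every word of doc_a to its nearest word in doc_b.

    doc_a, doc_b: dict mapping word -> normalized weight (weights sum to 1).
    Work is proportional to len(doc_a) * len(doc_b), i.e. quadratic per pair.
    """
    return sum(
        weight * min(euclid(emb[word], emb[other]) for other in doc_b)
        for word, weight in doc_a.items()
    )

def rwmd_pairwise(doc_a, doc_b, emb):
    """Symmetric RWMD: the tighter (larger) of the two one-sided lower bounds."""
    return max(rwmd_one_sided(doc_a, doc_b, emb),
               rwmd_one_sided(doc_b, doc_a, emb))

def one_sided_to_query(docs, query, emb):
    """One-sided costs of moving each document in docs into query.

    Phase 1 (once per query): for each vocabulary word, the distance to the
    closest query word. Phase 2 (per document): a weighted sum over the
    document's own words, so the per-document work is linear in its size.
    """
    nearest = {w: min(euclid(emb[w], emb[q]) for q in query) for w in emb}
    return [sum(weight * nearest[w] for w, weight in d.items()) for d in docs]

# Hypothetical toy vocabulary with 2-D embeddings (illustrative values only).
emb = {
    "obama": (0.0, 1.0), "president": (0.1, 0.9),
    "speaks": (1.0, 0.0), "greets": (0.9, 0.1),
    "media": (0.5, 0.5), "press": (0.6, 0.4),
}
docs = [{"obama": 0.5, "speaks": 0.25, "media": 0.25}, {"press": 1.0}]
query = {"president": 0.5, "greets": 0.25, "press": 0.25}

costs = one_sided_to_query(docs, query, emb)
```

The shared-vocabulary version returns the same one-sided costs as calling `rwmd_one_sided` on every pair. The saving comes from Phase 1 being paid once per query rather than once per document pair. The symmetric distance additionally needs the reverse direction, which can be obtained by swapping the roles of the two document sets.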
