A Novel SIMD-Optimized Implementation for Fast and Memory-Efficient Trigonometric Computation
Nikhil Dev Goyal, Parth Arora

TL;DR
This paper introduces a SIMD-optimized trigonometric computation method that is reported to be 5x faster than the inbuilt C++ functions and needs no precomputation, which makes it especially useful for resource-constrained devices such as low-end FPGAs.
Contribution
The paper presents a novel SIMD-optimized approach for trigonometric functions that outperforms inbuilt and Taylor-based methods in speed and memory usage.
Findings
5x faster than inbuilt C++ functions
Requires no precomputations
Significantly lower estimated hardware usage than inbuilt functions
Abstract
This paper proposes a novel set of trigonometric implementations which are 5x faster than the inbuilt C++ functions. The proposed implementation is also highly memory efficient, requiring no precomputations of any kind. Benchmark comparisons are done versus inbuilt functions and an optimized Taylor implementation. Further, device usage estimates are also obtained, showing significant hardware usage reduction compared to inbuilt functions. This improvement could be particularly useful for low-end FPGAs or other resource-constrained devices.
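For readers unfamiliar with the Taylor baseline the abstract mentions, here is a minimal scalar sketch of a Taylor-series sine with range reduction. It is not the authors' SIMD method or their optimized Taylor implementation. The function name `taylor_sin`, the degree-11 truncation, and the range-reduction strategy are all assumptions chosen for illustration.

```cpp
#include <cmath>

// Hypothetical illustration (not from the paper): sine via a degree-11
// Taylor polynomial, evaluated with Horner's rule after range reduction.
double taylor_sin(double x) {
    constexpr double kPi = 3.14159265358979323846;
    constexpr double kTwoPi = 2.0 * kPi;

    // Reduce the argument to [-pi, pi] using the period of sine.
    x = std::remainder(x, kTwoPi);

    // Fold into [-pi/2, pi/2] using sin(pi - x) = sin(x),
    // where the truncated series is most accurate.
    if (x > kPi / 2.0) {
        x = kPi - x;
    } else if (x < -kPi / 2.0) {
        x = -kPi - x;
    }

    // sin(x) ~= x * (1 - x^2/3! + x^4/5! - x^6/7! + x^8/9! - x^10/11!)
    const double x2 = x * x;
    double p = -1.0 / 39916800.0;   // -1/11!
    p = p * x2 + 1.0 / 362880.0;    // +1/9!
    p = p * x2 - 1.0 / 5040.0;      // -1/7!
    p = p * x2 + 1.0 / 120.0;       // +1/5!
    p = p * x2 - 1.0 / 6.0;         // -1/3!
    p = p * x2 + 1.0;
    return x * p;
}
```

With this truncation the first omitted term, x^13/13!, is below 1e-7 on the folded interval, so the sketch agrees with `std::sin` to roughly single-precision accuracy. It needs no lookup table, which is the property the abstract refers to as requiring no precomputations.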
Taxonomy
Topics: Inertial Sensor and Navigation · Geophysics and Gravity Measurements · Statistical and numerical algorithms
