A Fast, Vectorizable Algorithm for Producing Single-Precision Sine-Cosine Pairs

Marcus H. Mendenhall

arXiv:cs/0406049·cs.MS·May 23, 2007

TL;DR

This paper introduces a fast, branch-free, vectorizable algorithm for computing single-precision sine-cosine pairs, with an example implementation for PowerPC AltiVec that should port readily to other architectures such as Intel SSE.

Contribution

The paper proposes a branch-free algorithm for computing sine-cosine pairs to modest accuracy. Because it contains no conditional tests, it vectorizes well and should port across processor architectures.

Findings

01. High-speed sine-cosine pair computation without branches

02. Efficient implementation on PowerPC AltiVec processors

03. Easy adaptation to architectures like Intel SSE

Abstract

This paper presents an algorithm for computing sine-cosine pairs to modest accuracy, but in a manner that contains no conditional tests or branching, making it highly amenable to vectorization. An exemplary implementation for PowerPC AltiVec processors is included, but the algorithm should be easily portable to other architectures, such as Intel SSE.

Taxonomy

Topics: Advanced Algorithms and Applications