CoPE: A Lightweight Complex Positional Encoding
Avinash Amballa

TL;DR
CoPE is a lightweight complex-valued positional encoding for transformers that encodes content and position in a single embedding; the authors report improved performance and efficiency on the GLUE benchmark.
Contribution
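The paper proposes CoPE, a complex-valued positional encoding whose real part carries semantic content and whose imaginary part carries position. It is paired with phase-aware attention in the first transformer layer, which the authors report improves transformer performance and efficiency.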
Findings
Outperforms RoPE, Sinusoidal, and Learned encodings on GLUE
Does not exhibit long-term decay in positional information
Compatible with linear attention methods
Abstract
Recent studies have demonstrated the effectiveness of position encoding in transformer architectures. By incorporating positional information, this approach provides essential guidance for modeling dependencies between elements across different sequence positions. We introduce CoPE (a lightweight Complex Positional Encoding), a novel architecture that leverages complex-valued encoding to encode both content and positional information. Our approach replaces traditional positional encodings with complex embeddings where the real part captures semantic content and the imaginary part encodes positional information. We introduce phase-aware attention in the first layer of the transformer model to capture position-dependent patterns, followed by standard attention layers for higher levels. We show that CoPE does not exhibit long-term decay and is compatible with linear attention. Experimental…
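The abstract describes the embedding only at a high level, so the following is a minimal NumPy sketch of the idea rather than the paper's formulation. It assumes a complex embedding z = content + i * position and scores attention by the real part of a Hermitian inner product between queries and keys. The function names (complex_embed, phase_aware_scores), the real-valued projection matrices, and the random position table are all hypothetical choices made for illustration.

```python
import numpy as np

def complex_embed(content, position):
    """Combine real content and real position vectors into one complex embedding.

    content, position: arrays of shape (seq_len, d_model).
    Real part = content, imaginary part = position (as the abstract describes).
    """
    return content + 1j * position

def phase_aware_scores(z, w_q, w_k):
    """ASSUMED scoring rule (not taken from the paper): Re(Q K^H) / sqrt(d)."""
    q = z @ w_q                      # complex queries
    k = z @ w_k                      # complex keys
    d = q.shape[-1]
    return np.real(q @ k.conj().T) / np.sqrt(d)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
content = rng.normal(size=(seq_len, d_model))    # stand-in token embeddings
position = rng.normal(size=(seq_len, d_model))   # hypothetical position table
w_q = rng.normal(size=(d_model, d_model))        # real projections, for simplicity
w_k = rng.normal(size=(d_model, d_model))

z = complex_embed(content, position)
scores = phase_aware_scores(z, w_q, w_k)
attn = softmax(scores)                           # (seq_len, seq_len) attention weights
```

Under these assumptions (real projection matrices in particular), the score splits into a content-content term plus a position-position term, so position affects the attention pattern without a separate additive encoding. Complex projections would add content-position cross terms; which variant the paper uses is not stated in the abstract.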
