CoPE: A Lightweight Complex Positional Encoding
By: Avinash Amballa
Potential Business Impact:
Helps computers understand word order better.
Recent studies have demonstrated the effectiveness of position encoding in transformer architectures. Positional information provides essential guidance for modeling dependencies between elements at different sequence positions. We introduce CoPE (a lightweight Complex Positional Encoding), a novel architecture that uses complex-valued embeddings to represent both content and positional information. Our approach replaces traditional positional encodings with complex embeddings whose real part captures semantic content and whose imaginary part encodes positional information. We introduce phase-aware attention in the first layer of the transformer to capture position-dependent patterns, followed by standard attention layers at higher levels. We show that CoPE does not exhibit long-term decay and is compatible with linear attention. Experimental evaluation on the GLUE benchmark suggests that our approach achieves superior performance with lower computational complexity compared to RoPE, sinusoidal, and learned positional encodings.
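The abstract does not spell out the exact formulation, but the core idea can be illustrated with a minimal sketch. In the code below, the class names (ComplexPositionalEmbedding, PhaseAwareAttention), the choice of learned positional embeddings for the imaginary part, the use of Re(q · conj(k)) as the phase-aware attention score, and the decision to keep values real-valued are all illustrative assumptions, not the paper's actual method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ComplexPositionalEmbedding(nn.Module):
    """Sketch: real part carries token content, imaginary part carries position."""

    def __init__(self, vocab_size, d_model, max_len=512):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)   # content -> real part
        self.pos = nn.Embedding(max_len, d_model)      # position -> imaginary part

    def forward(self, token_ids):
        # token_ids: (batch, seq_len)
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        real = self.tok(token_ids)                     # (B, L, D) content
        imag = self.pos(positions)[None].expand_as(real)  # (B, L, D) position
        return torch.complex(real, imag)


class PhaseAwareAttention(nn.Module):
    """Sketch of a phase-aware first layer: the attention score is the real part of
    the Hermitian inner product q * conj(k), so content (real) and position
    (imaginary) both shape the attention pattern."""

    def __init__(self, d_model):
        super().__init__()
        self.scale = d_model ** -0.5
        # Separate projections for real and imaginary parts (illustrative choice).
        self.q_re, self.q_im = nn.Linear(d_model, d_model), nn.Linear(d_model, d_model)
        self.k_re, self.k_im = nn.Linear(d_model, d_model), nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)

    def forward(self, x):
        # x: complex tensor of shape (B, L, D)
        q = torch.complex(self.q_re(x.real), self.q_im(x.imag))
        k = torch.complex(self.k_re(x.real), self.k_im(x.imag))
        v = self.v(x.real)                             # values stay real-valued
        # Re(q * conj(k)) = q.real @ k.real^T + q.imag @ k.imag^T
        scores = (q.real @ k.real.transpose(-2, -1)
                  + q.imag @ k.imag.transpose(-2, -1)) * self.scale
        attn = F.softmax(scores, dim=-1)
        return attn @ v                                # real output for standard upper layers


# Usage (hypothetical sizes): the phase-aware layer produces a real-valued tensor
# that subsequent standard transformer layers can consume unchanged.
emb = ComplexPositionalEmbedding(vocab_size=30522, d_model=64)
attn = PhaseAwareAttention(d_model=64)
ids = torch.randint(0, 30522, (2, 16))
out = attn(emb(ids))  # shape (2, 16, 64)
```

One design point this sketch highlights: because the position signal lives in the imaginary channel rather than being added to the content vector, only the first (phase-aware) layer needs to handle complex values, which is consistent with the abstract's claim of lower computational overhead than applying RoPE-style rotations at every layer.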
Similar Papers
SeqPE: Transformer with Sequential Position Encoding
Machine Learning (CS)
Helps AI understand longer texts and images.
HoPE: Hyperbolic Rotary Positional Encoding for Stable Long-Range Dependency Modeling in Large Language Models
Computation and Language
Makes AI understand long sentences better.
Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings
Machine Learning (CS)
Helps AI understand long texts better.