Score: 2

CRoPE: Efficient Parametrization of Rotary Positional Embedding

Published: January 6, 2026 | arXiv ID: 2601.02728v1

By: Beicheng Lou, Zifei Xu

BigTech Affiliations: Stanford University

Potential Business Impact:

Cuts the number of parameters in transformer attention layers by nearly half, reducing model size and compute cost.

Business Areas:
Image Recognition, Data and Analytics, Software

Rotary positional embedding has become the state-of-the-art approach to encoding position information in transformer-based models. While it is often succinctly expressed in the language of complex linear algebra, we note that the actual implementation of the $Q/K/V$ projections is not equivalent to a complex linear transformation. We argue that a complex linear transformation is a more natural parametrization and saves nearly 50% of the parameters within the attention block. We show empirically that removing this redundancy has negligible impact on model performance, both in-sample and out-of-sample. Our modification achieves more efficient parameter usage, as well as a cleaner interpretation of the representation space.
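The parameter-count claim is easy to check concretely. Below is a minimal PyTorch sketch (our own illustration, not the authors' code): a real $d \times d$ projection has $d^2$ parameters, while a complex $(d/2) \times (d/2)$ projection acting on the same $d$ real coordinates, paired the way RoPE pairs them for its rotations, has only $d^2/2$ real parameters.

```python
import torch

d = 64                     # head dimension (real); equivalently d // 2 complex dims
n_c = d // 2

# Standard attention: each Q/K/V projection is a real d x d matrix.
real_proj = torch.nn.Linear(d, d, bias=False)
real_params = sum(p.numel() for p in real_proj.parameters())  # d**2 = 4096

# Complex-linear parametrization (a sketch of the idea, not the paper's
# exact code): one complex (d/2) x (d/2) matrix, i.e. 2 * (d/2)**2
# = d**2 / 2 real parameters -- half of the real-valued projection.
W = torch.randn(n_c, n_c, dtype=torch.cfloat)
complex_params = 2 * W.numel()                                # d**2 / 2 = 2048

def complex_proj(x: torch.Tensor) -> torch.Tensor:
    """Apply W by viewing adjacent coordinate pairs of x as complex
    numbers, matching the pairing RoPE uses for its rotations."""
    z = torch.view_as_complex(x.reshape(*x.shape[:-1], n_c, 2))
    return torch.view_as_real(z @ W.T).reshape(x.shape)

x = torch.randn(8, d)
print(real_params, complex_params, complex_proj(x).shape)
# 4096 2048 torch.Size([8, 64])
```

The halving comes purely from the parametrization: the complex matrix couples each coordinate pair as a single unit, so it needs half as many real degrees of freedom to map the same $d$-dimensional space.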

Country of Origin
πŸ‡ΊπŸ‡Έ United States

Page Count
11 pages

Category
Computer Science:
Machine Learning (CS)