Alternative positional encoding functions for neural transformers
By: Ezequiel Lopez-Rubio, Macoris Decena-Gimenez, Rafael Marcos Luque-Baena
A key module in neural transformer-based deep architectures is positional encoding. This module provides a suitable way to encode positional information as input to the transformer layers. Its success has been rooted in the use of sinusoidal functions of various frequencies, which capture recurrent patterns with differing typical periods. In this work, an alternative set of periodic functions is proposed for positional encoding. These functions preserve some key properties of the sinusoidal ones while departing from them in fundamental ways. Tentative experiments are reported in which the original sinusoidal version is substantially outperformed, which strongly suggests that the alternative functions may be useful in other transformer architectures as well.
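For context, the baseline the abstract refers to is the standard sinusoidal positional encoding of Vaswani et al. (2017), in which each embedding dimension oscillates at a different frequency. The sketch below illustrates that baseline only; the paper's proposed alternative periodic functions are not specified in this abstract, so they are not shown here.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Standard sinusoidal positional encoding (the baseline, not the
    authors' proposed alternative functions).

    Each position is encoded by sines and cosines whose frequencies form
    a geometric progression, so different dimensions capture recurrent
    patterns with different typical periods.
    """
    positions = np.arange(seq_len)[:, np.newaxis]        # shape (seq_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]       # shape (1, d_model // 2)
    angles = positions / np.power(10000.0, dims / d_model)

    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions: sine
    pe[:, 1::2] = np.cos(angles)   # odd dimensions: cosine
    return pe

# Example: encodings for a 50-token sequence with model width 128.
pe = sinusoidal_positional_encoding(seq_len=50, d_model=128)
print(pe.shape)  # (50, 128)
```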