Composing Linear Layers from Irreducibles
By: Travis Pence, Daisuke Yamada, Vikas Singh
Potential Business Impact:
Shrinks the linear layers inside large AI models by building them from simple geometric rotations, so they need far fewer parameters while matching standard layers.
Contemporary large models often exhibit behaviors suggesting the presence of low-level primitives that compose into modules with richer functionality, but these fundamental building blocks remain poorly understood. We investigate this compositional structure in linear layers by asking: can we identify/synthesize linear transformations from a minimal set of geometric primitives? Using Clifford algebra, we show that linear layers can be expressed as compositions of bivectors -- geometric objects encoding oriented planes -- and introduce a differentiable algorithm that decomposes them into products of rotors. This construction uses only O(log^2 d) parameters, versus O(d^2) required by dense matrices. Applied to the key, query, and value projections in LLM attention layers, our rotor-based layers match the performance of strong baselines such as block-Hadamard and low-rank approximations. Our findings provide an algebraic perspective on how these geometric primitives can compose into higher-level functions within deep models.
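To make the idea concrete, here is a minimal, hypothetical PyTorch sketch of the general construction the abstract describes, not the paper's actual algorithm: a square linear map is parameterized as a product of roughly log^2 d plane rotations, each obtained by exponentiating a fixed skew-symmetric (bivector-like) generator scaled by one learned angle, so the trainable parameter count is O(log^2 d) rather than O(d^2). The class name RotorSketchLinear, the random choice of coordinate pairs, and the zero-initialized angles are illustrative assumptions.

```python
import math
import torch
import torch.nn as nn


class RotorSketchLinear(nn.Module):
    """Illustrative sketch (not the paper's method): a d x d map composed
    from O(log^2 d) single-plane rotations, each with one learned angle."""

    def __init__(self, d: int, seed: int = 0):
        super().__init__()
        self.d = d
        g = torch.Generator().manual_seed(seed)
        # O(log^2 d) rotations, matching the parameter budget quoted in the abstract.
        num_rotors = max(1, math.ceil(math.log2(d)) ** 2)
        # Fixed coordinate pairs (i, j) chosen at random; only the angles are learned.
        pairs = torch.stack([torch.randperm(d, generator=g)[:2] for _ in range(num_rotors)])
        self.register_buffer("pairs", pairs)
        self.angles = nn.Parameter(torch.zeros(num_rotors))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        R = torch.eye(self.d, dtype=x.dtype, device=x.device)
        for (i, j), theta in zip(self.pairs.tolist(), self.angles):
            # Skew-symmetric generator of a rotation in the (i, j) plane;
            # its matrix exponential stands in for a single rotor here.
            G = torch.zeros(self.d, self.d, dtype=x.dtype, device=x.device)
            G[i, j], G[j, i] = -1.0, 1.0
            R = torch.matrix_exp(theta * G) @ R
        return x @ R.T


if __name__ == "__main__":
    layer = RotorSketchLinear(d=64)
    y = layer(torch.randn(8, 64))
    # 64-dim output, with only ceil(log2 64)^2 = 36 trainable parameters.
    print(y.shape, sum(p.numel() for p in layer.parameters()))
```

In an attention block, a module of this kind would be dropped in where a dense d x d key, query, or value projection normally sits; the dense-matrix baselines it is compared against in the abstract (block-Hadamard, low-rank) would occupy the same slot.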
Similar Papers
Layer Specialization Underlying Compositional Reasoning in Transformers
Machine Learning (CS)
Computers learn to build new ideas from old ones.
Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate
Machine Learning (CS)
Builds smarter AI by combining and growing parts.
Directional Non-Commutative Monoidal Structures with Interchange Law via Commutative Generators
Machine Learning (CS)
Unifies math tools for better data understanding.