Lag Operator SSMs: A Geometric Framework for Structured State Space Modeling
By: Sutashu Tomonaga, Kenji Doya, Noboru Murata
Structured State Space Models (SSMs), which are at the heart of the recently popular Mamba architecture, are powerful tools for sequence modeling. However, their theoretical foundation relies on a multi-stage process of continuous-time modeling followed by discretization, which can obscure intuition. We introduce a direct, first-principles framework for constructing discrete-time SSMs that is both flexible and modular. Our approach is based on a novel lag operator, which derives the discrete-time recurrence geometrically by measuring how the system's basis functions "slide" and change from one timestep to the next. The resulting state matrices are computed via a single inner product involving this operator, which opens a modular design space in which new SSMs are built by combining different basis functions and time-warping schemes. To validate the approach, we demonstrate that a specific instance exactly recovers the recurrence of the influential HiPPO model. Numerical simulations confirm the derivation. Together, these results provide new theoretical tools for designing flexible and robust sequence models.
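For readers unfamiliar with the term, the "discrete-time recurrence" mentioned in the abstract can be illustrated with a generic linear state update. The sketch below is not the paper's lag-operator construction: the function name `run_ssm` and the matrices `A`, `B`, `C` are hypothetical placeholders chosen only to show the form of recurrence that any such framework ultimately has to produce.

```python
import numpy as np

def run_ssm(A, B, C, u):
    """Run the generic recurrence x_k = A x_{k-1} + B u_k, y_k = C x_k.

    A: (n, n) state matrix, B: (n,) input vector, C: (n,) readout vector,
    u: 1-D input sequence. All values here are illustrative assumptions.
    """
    x = np.zeros(A.shape[0])      # initial state x_0 = 0
    ys = []
    for u_k in u:
        x = A @ x + B * u_k       # state update
        ys.append(C @ x)          # scalar readout
    return np.array(ys)

# Placeholder matrices (NOT derived from the paper or from HiPPO).
A = np.array([[0.9, 0.0],
              [0.1, 0.8]])
B = np.array([1.0, 0.0])
C = np.array([0.0, 1.0])

# Response to a unit impulse at the first timestep.
y = run_ssm(A, B, C, np.array([1.0, 0.0, 0.0]))
```

In the framework described above, `A` and `B` would instead be obtained from the inner product involving the lag operator, for a chosen basis and time-warping scheme; that computation is not reproduced here.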