What can we learn from signals and systems in a transformer? Insights for probabilistic modeling and inference architecture

Published: August 27, 2025 | arXiv ID: 2508.20211v1

By: Heng-Sheng Chang, Prashant G. Mehta

Potential Business Impact:

Lets computers predict the next word by learning patterns in past text.

Business Areas:
Predictive Analytics, Artificial Intelligence, Data and Analytics, Software

In the 1940s, Wiener introduced the linear predictor, in which the future prediction is computed by linearly combining past data. A transformer generalizes this idea: it is a nonlinear predictor in which the next-token prediction is computed by nonlinearly combining the past tokens. In this essay, we present a probabilistic model that interprets transformer signals as surrogates of conditional measures, and layer operations as fixed-point updates. An explicit form of the fixed-point update is described for the special case when the probabilistic model is a hidden Markov model (HMM). In part, this paper is an attempt to bridge classical nonlinear filtering theory with modern inference architectures.
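The contrast drawn in the abstract can be illustrated with a toy sketch: a Wiener-style predictor combines past data with fixed linear weights, while an attention-style predictor combines the same data with weights that depend nonlinearly on the data itself. This is a minimal illustration of that distinction only, not the paper's model; the weights, similarity scores, and series are invented for the example (Wiener's theory derives the linear weights from the signal's correlation structure).

```python
import numpy as np

rng = np.random.default_rng(0)
past = rng.standard_normal(8)  # a short scalar time series of past observations

# --- Wiener-style linear predictor ---
# Prediction = fixed linear combination of past data.
# (Uniform weights here, purely for illustration.)
w = np.ones_like(past) / len(past)
linear_pred = w @ past

# --- Attention-style nonlinear predictor (single head, toy) ---
# Prediction = data-dependent combination of past tokens:
# the weights come from a softmax over similarity scores,
# so they change with the input rather than staying fixed.
def softmax(x):
    z = np.exp(x - x.max())  # shift for numerical stability
    return z / z.sum()

query = past[-1]          # attend from the most recent token
scores = query * past     # toy similarity scores
attn = softmax(scores)    # nonlinear, data-dependent weights
nonlinear_pred = attn @ past

print(linear_pred, nonlinear_pred)
```

The point of the sketch is only that `w` is independent of the data while `attn` is a nonlinear function of it, which is the sense in which a transformer generalizes the linear predictor.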

Country of Origin
🇺🇸 United States

Page Count
21 pages

Category
Computer Science:
Machine Learning (CS)