Efficient High-Accuracy PDEs Solver with the Linear Attention Neural Operator
By: Ming Zhong, Zhenya Yan
Potential Business Impact:
Computers solve science problems faster and more accurately.
Neural operators offer a powerful data-driven framework for learning mappings between function spaces, but transformer-based neural operator architectures face a fundamental scalability-accuracy trade-off: softmax attention provides excellent fidelity yet incurs quadratic complexity $\mathcal{O}(N^2 d)$ in the number of mesh points $N$ and hidden dimension $d$, while linear attention variants reduce the cost to $\mathcal{O}(N d^2)$ but often suffer significant accuracy degradation. To address this challenge, we present a new type of neural operator, the Linear Attention Neural Operator (LANO), which achieves both scalability and high accuracy by reformulating attention through an agent-based mechanism. LANO resolves the dilemma by introducing a compact set of $M$ agent tokens $(M \ll N)$ that mediate global interactions among the $N$ tokens. This agent attention mechanism yields an operator layer with linear complexity $\mathcal{O}(MNd)$ while preserving the expressive power of softmax attention. Theoretically, we establish the universal approximation property and show improved conditioning and stability properties. Empirically, LANO surpasses current state-of-the-art neural PDE solvers, including Transolver with its slice-based softmax attention, achieving an average $19.5\%$ accuracy improvement across standard benchmarks. By bridging the gap between linear complexity and softmax-level performance, LANO establishes a scalable, high-accuracy foundation for scientific machine learning applications.
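To make the agent-attention idea concrete, below is a minimal NumPy sketch of an agent-mediated attention step, assuming a standard two-stage formulation (agents first pool information from all tokens, then tokens query the agents). The function name `agent_attention`, the shapes, and the scaled dot-product form are illustrative assumptions, not the paper's exact LANO layer.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def agent_attention(q, k, v, agents):
    """Agent-mediated attention sketch (illustrative, not the paper's exact layer).

    q, k, v : (N, d) token queries / keys / values on N mesh points
    agents  : (M, d) agent tokens, with M << N
    Returns : (N, d) updated token features

    Both score matrices are M x N or N x M, so cost scales as O(M N d)
    instead of the O(N^2 d) of full softmax attention.
    """
    d = q.shape[-1]
    scale = 1.0 / np.sqrt(d)
    # Stage 1: agents aggregate global information from all N tokens.
    agent_scores = softmax(agents @ k.T * scale, axis=-1)   # (M, N)
    agent_values = agent_scores @ v                          # (M, d)
    # Stage 2: each token attends to the M agents.
    token_scores = softmax(q @ agents.T * scale, axis=-1)   # (N, M)
    return token_scores @ agent_values                       # (N, d)

# Toy usage: N = 4096 mesh points, d = 64 channels, M = 32 agents.
rng = np.random.default_rng(0)
N, d, M = 4096, 64, 32
q, k, v = (rng.standard_normal((N, d)) for _ in range(3))
agents = rng.standard_normal((M, d))
print(agent_attention(q, k, v, agents).shape)  # (4096, 64)
```

Because the attention matrices never exceed $M \times N$, memory and compute grow linearly in $N$ for fixed $M$, which is the source of the $\mathcal{O}(MNd)$ complexity quoted in the abstract.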
Similar Papers
Integrating Locality-Aware Attention with Transformers for General Geometry PDEs
Machine Learning (CS)
Solves tricky math problems on weird shapes.
Transformer Based Linear Attention with Optimized GPU Kernel Implementation
Machine Learning (CS)
Makes AI learn faster and use less memory.
Linear Attention with Global Context: A Multipole Attention Mechanism for Vision and Physics
CV and Pattern Recognition
Lets computers see huge pictures faster and cheaper.