Joint Optimization of Neural Autoregressors via Scoring Rules
By: Jonas Landsgesell
Potential Business Impact:
Helps computers learn from few examples.
Non-parametric distributional regression has achieved significant milestones in recent years. Among these, the Tabular Prior-Data Fitted Network (TabPFN) has demonstrated state-of-the-art performance on various benchmarks. However, extending these grid-based approaches to a truly multivariate setting remains challenging. In a naive non-parametric discretization with $N$ bins per dimension, an explicit joint grid over $d$ dimensions has $N^d$ cells, so its size grows exponentially with the number of dimensions and the parameter count of the neural network rises sharply. This scaling is particularly detrimental in low-data regimes: the final projection layer alone would need one output per grid cell, requiring a vast number of parameters and leading to severe overfitting and intractability.
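To make the scaling argument concrete, here is a minimal illustrative sketch (not code from the paper) comparing the output-layer parameter count of an explicit joint grid with that of an autoregressive factorization $p(x_1,\dots,x_d)=\prod_i p(x_i \mid x_{<i})$ into per-dimension $N$-way heads. The variable names and numeric values (`n_bins`, `n_dims`, `hidden_dim`) are assumptions chosen purely for illustration.

```python
# Sketch: output-layer size of an explicit joint grid vs. an autoregressive
# factorization, under an assumed final linear projection from a hidden state.

def joint_grid_output_params(n_bins: int, n_dims: int, hidden_dim: int) -> int:
    # Explicit joint grid: one logit per cell, i.e. N**d outputs (weights + biases).
    n_cells = n_bins ** n_dims
    return hidden_dim * n_cells + n_cells

def autoregressive_output_params(n_bins: int, n_dims: int, hidden_dim: int) -> int:
    # Autoregressive factorization: d separate N-way heads (weights + biases each).
    return n_dims * (hidden_dim * n_bins + n_bins)

if __name__ == "__main__":
    n_bins, n_dims, hidden_dim = 100, 4, 512  # illustrative values only
    print(joint_grid_output_params(n_bins, n_dims, hidden_dim))      # 51,300,000,000
    print(autoregressive_output_params(n_bins, n_dims, hidden_dim))  # 205,200
```

Even at only four dimensions, the explicit joint grid's projection layer is roughly five orders of magnitude larger than the per-dimension heads, which is why the exponential blow-up is untenable in low-data regimes.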
Similar Papers
A general framework for deep learning
Statistics Theory
Teaches computers to learn from messy data.
Consistency of Learned Sparse Grid Quadrature Rules using NeuralODEs
Numerical Analysis
Makes math problems with many parts easier to solve.
Beyond One-Size-Fits-All: Neural Networks for Differentially Private Tabular Data Synthesis
Machine Learning (CS)
Makes fake data that's more private and accurate.