Directly Constructing Low-Dimensional Solution Subspaces in Deep Neural Networks
By: Yusuf Kalyoncuoglu
While it is well established that the weight matrices and feature manifolds of deep neural networks exhibit a low Intrinsic Dimension (ID), current state-of-the-art models still rely on massive high-dimensional widths. This redundancy is not required for representation, but is strictly necessary to solve the non-convex optimization search problem of finding a global minimum, which remains intractable for compact networks. In this work, we propose a constructive approach that bypasses this optimization bottleneck. By decoupling the solution geometry from the ambient search space, we empirically demonstrate across ResNet-50, ViT, and BERT that the classification head can be compressed by factors as large as 16 with negligible performance degradation. This motivates Subspace-Native Distillation as a novel paradigm: by defining the distillation target directly in the constructed subspace, we provide a stable geometric coordinate system for student models, potentially allowing them to circumvent the high-dimensional search problem entirely and realize the vision of "Train Big, Deploy Small".
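To make the idea concrete, the sketch below illustrates one way a compressed classification head and a subspace-defined distillation target could be wired together. This is not the authors' implementation: the module names (SubspaceHead, subspace_distillation_loss), the 2048-to-128 projection (a 16x width reduction), the use of a learned linear projection, and the loss weighting alpha are all illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code) of a low-dimensional
# classification head and a subspace-native distillation loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SubspaceHead(nn.Module):
    """Project backbone features into a low-dimensional subspace, then classify."""

    def __init__(self, feat_dim: int = 2048, subspace_dim: int = 128, num_classes: int = 1000):
        super().__init__()
        # e.g. 2048 -> 128 is a 16x reduction of the head's input width (assumed numbers)
        self.proj = nn.Linear(feat_dim, subspace_dim, bias=False)
        self.classifier = nn.Linear(subspace_dim, num_classes)

    def forward(self, features: torch.Tensor):
        z = self.proj(features)           # coordinates in the constructed subspace
        return z, self.classifier(z)      # (subspace code, logits)


def subspace_distillation_loss(student_z, teacher_z, student_logits, labels, alpha=0.5):
    """Match the student's subspace coordinates to the teacher's, plus the task loss."""
    geometric = F.mse_loss(student_z, teacher_z.detach())
    task = F.cross_entropy(student_logits, labels)
    return alpha * geometric + (1.0 - alpha) * task


if __name__ == "__main__":
    # Toy usage: a wide "teacher" head and a compact "student" head that share
    # the same 128-dimensional target subspace.
    teacher_head = SubspaceHead(feat_dim=2048, subspace_dim=128)
    student_head = SubspaceHead(feat_dim=512, subspace_dim=128)

    feats_t = torch.randn(8, 2048)   # stand-in for ResNet-50 teacher features
    feats_s = torch.randn(8, 512)    # stand-in for a compact student backbone
    labels = torch.randint(0, 1000, (8,))

    z_t, _ = teacher_head(feats_t)
    z_s, logits_s = student_head(feats_s)
    loss = subspace_distillation_loss(z_s, z_t, logits_s, labels)
    print(loss.item())
```

Under these assumptions, the student never has to rediscover the solution geometry in the full feature width: it only has to land on the teacher's fixed 128-dimensional coordinates, which is the "stable geometric coordinate system" the abstract refers to.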