HyperAdapt: Simple High-Rank Adaptation
By: Abel Gurung, Joseph Campbell
Potential Business Impact:
Makes smart computer programs learn faster with less effort.
Foundation models excel across diverse tasks, but adapting them to specialized applications often requires fine-tuning, an approach that is memory- and compute-intensive. Parameter-efficient fine-tuning (PEFT) methods mitigate this by updating only a small subset of weights. In this paper, we introduce HyperAdapt, a PEFT method that significantly reduces the number of trainable parameters compared to state-of-the-art methods like LoRA. Specifically, HyperAdapt adapts a pre-trained weight matrix by applying row- and column-wise scaling through diagonal matrices, thereby inducing a high-rank update while requiring only $n+m$ trainable parameters for an $n \times m$ matrix. Theoretically, we establish an upper bound on the rank of HyperAdapt's updates, and empirically, we confirm that it consistently induces high-rank transformations across model layers. Experiments on GLUE, arithmetic reasoning, and commonsense reasoning benchmarks with models up to 14B parameters demonstrate that HyperAdapt matches or nearly matches the performance of full fine-tuning and state-of-the-art PEFT methods while using orders of magnitude fewer trainable parameters.
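The abstract describes the update as row- and column-wise scaling of a frozen pre-trained weight matrix by two trainable diagonal matrices, i.e. roughly $W' = \mathrm{diag}(r)\, W\, \mathrm{diag}(c)$, which gives $n+m$ trainable parameters for an $n \times m$ matrix. The sketch below illustrates that idea for a single linear layer in PyTorch; the class name HyperAdaptLinear, the initialization of the scales to ones, and the wrapping of nn.Linear are illustrative assumptions based on the abstract, not the paper's reference implementation.

```python
import torch
import torch.nn as nn

class HyperAdaptLinear(nn.Module):
    """Frozen linear layer adapted by trainable row/column diagonal scaling.

    The frozen weight W (out_features x in_features) is reparameterized as
    diag(row_scale) @ W @ diag(col_scale), so only row_scale (n values) and
    col_scale (m values) are trained -- n + m parameters for an n x m matrix.
    """

    def __init__(self, base: nn.Linear):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pre-trained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Trainable scales, initialized to ones so the adapted layer starts
        # identical to the pre-trained one (an assumption, not from the paper).
        self.row_scale = nn.Parameter(torch.ones(base.out_features))
        self.col_scale = nn.Parameter(torch.ones(base.in_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Equivalent to applying diag(row_scale) @ W @ diag(col_scale).
        w = self.row_scale.unsqueeze(1) * self.base.weight * self.col_scale.unsqueeze(0)
        return nn.functional.linear(x, w, self.base.bias)


if __name__ == "__main__":
    layer = HyperAdaptLinear(nn.Linear(768, 768))
    out = layer(torch.randn(4, 768))
    print(out.shape)  # torch.Size([4, 768])
```

In practice one would presumably wrap the target projection matrices of a pre-trained model this way and train only the scale vectors (plus any task head); the exact layer placement and optimizer settings are not specified in the abstract.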
Similar Papers
Towards Higher Effective Rank in Parameter-efficient Fine-tuning using Khatri--Rao Product
Machine Learning (CS)
Makes AI learn better without needing more power.
Generalized Tensor-based Parameter-Efficient Fine-Tuning via Lie Group Transformations
Machine Learning (CS)
Makes AI learn new things faster and cheaper.
HyperAdaLoRA: Accelerating LoRA Rank Allocation During Training via Hypernetworks without Sacrificing Performance
Machine Learning (CS)
Makes AI learn faster without needing more power.