Reversible Deep Equilibrium Models
By: Sam McCallum, Kamran Arora, James Foster
Potential Business Impact:
Makes AI learn better with fewer steps.
Deep Equilibrium Models (DEQs) are a class of implicit models in which the model output is implicitly defined as the fixed point of a learned function. These models have been shown to outperform explicit (fixed-depth) models on large-scale tasks by trading many deep layers for a single layer that is iterated many times. However, gradient calculation through DEQs is approximate. This often leads to unstable training dynamics, which must be fixed with regularisation or many function evaluations. Here, we introduce Reversible Deep Equilibrium Models (RevDEQs), which allow exact gradient calculation, require no regularisation, and use far fewer function evaluations than DEQs. We show that RevDEQs achieve state-of-the-art performance on language modelling and image classification tasks against comparable implicit and explicit models.
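The DEQ forward pass described in the abstract can be sketched as a fixed-point iteration: the single learned layer f is applied repeatedly until the hidden state converges to z* = f(z*, x). The minimal example below illustrates only that idea; the tanh layer, the weight scaling, and the tolerance are illustrative assumptions, not the architecture from the paper.

```python
import numpy as np

# Illustrative DEQ-style forward pass: iterate one learned layer f
# until it reaches a fixed point z* = f(z*, x).
rng = np.random.default_rng(0)
d = 8
W = rng.normal(size=(d, d)) * 0.1  # small weights keep f contractive,
U = rng.normal(size=(d, d)) * 0.1  # so the iteration converges
b = rng.normal(size=d) * 0.1

def f(z, x):
    # One "layer": the function whose fixed point defines the output.
    return np.tanh(W @ z + U @ x + b)

def deq_forward(x, tol=1e-8, max_iters=100):
    # Iterate z <- f(z, x) until successive iterates stop changing.
    z = np.zeros_like(x)
    for k in range(max_iters):
        z_next = f(z, x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next, k + 1
        z = z_next
    return z, max_iters

x = rng.normal(size=d)
z_star, iters = deq_forward(x)
# z_star approximately satisfies the fixed-point condition z* = f(z*, x)
residual = np.linalg.norm(z_star - f(z_star, x))
```

In practice DEQs use faster root-finding (e.g. Anderson acceleration or Newton-type solvers) rather than plain iteration, and the training difficulty the paper addresses lies in backpropagating through this implicit solve, which standard DEQs can only do approximately.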
Similar Papers
Gradient flow for deep equilibrium single-index models
Machine Learning (CS)
Makes super-deep computer brains learn faster.
DDEQs: Distributional Deep Equilibrium Models through Wasserstein Gradient Flows
Machine Learning (CS)
Helps computers understand shapes and groups of dots.
Equivariant Deep Equilibrium Models for Imaging Inverse Problems
Image and Video Processing
Trains AI to fix images without perfect examples.