Generative or Discriminative? Revisiting Text Classification in the Era of Transformers
By: Siva Rajesh Kasa, Karan Gupta, Sumegh Roychowdhury, and more
Potential Business Impact:
Shows which AI text classifiers work best when labeled data is limited.
The comparison between discriminative and generative classifiers has intrigued researchers since Efron's seminal analysis of logistic regression versus discriminant analysis. While early theoretical work established that generative classifiers exhibit lower sample complexity but higher asymptotic error in simple linear settings, these trade-offs remain unexplored in the transformer era. We present the first comprehensive evaluation of modern generative and discriminative architectures for text classification: Auto-regressive modeling, Masked Language Modeling, Discrete Diffusion, and Encoders. Our study reveals that the classical 'two regimes' phenomenon manifests distinctly across different architectures and training paradigms. Beyond accuracy, we analyze sample efficiency, calibration, noise robustness, and ordinality across diverse scenarios. Our findings offer practical guidance for selecting the most suitable modeling approach based on real-world constraints such as latency and data limitations.
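The 'two regimes' trade-off the abstract refers to can be illustrated with a minimal sketch outside the transformer setting: a generative classifier (Gaussian Naive Bayes) against a discriminative one (logistic regression), trained on increasing amounts of labeled data. This is an illustrative stand-in, not the paper's experimental setup; the synthetic dataset and the scikit-learn model choices below are assumptions made for demonstration only.

```python
# Minimal sketch (not from the paper) of the generative-vs-discriminative
# "two regimes" comparison studied by Efron and later by Ng & Jordan:
# compare test accuracy of a generative model (GaussianNB) and a
# discriminative model (LogisticRegression) as the training set grows.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.RandomState(0)

# Synthetic stand-in for a text-classification dataset.
X, y = make_classification(n_samples=5000, n_features=20,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

for n in [20, 50, 100, 500, 2000]:  # increasing labeled-data budgets
    idx = rng.choice(len(X_train), size=n, replace=False)
    gen = GaussianNB().fit(X_train[idx], y_train[idx])              # generative
    disc = LogisticRegression(max_iter=1000).fit(X_train[idx],      # discriminative
                                                 y_train[idx])
    print(f"n={n:5d}  generative acc={gen.score(X_test, y_test):.3f}  "
          f"discriminative acc={disc.score(X_test, y_test):.3f}")
```

In this classical setting, the generative model tends to be stronger at small n (lower sample complexity) while the discriminative model catches up or overtakes it as n grows (lower asymptotic error); the paper asks how this pattern carries over to modern architectures such as auto-regressive, masked, diffusion, and encoder-based models.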
Similar Papers
Generative Classifiers Avoid Shortcut Solutions
Machine Learning (CS)
Makes AI smarter by learning real causes, not tricks.
Assessing Classical Machine Learning and Transformer-based Approaches for Detecting AI-Generated Research Text
Computation and Language
Finds if writing is from a person or AI.
A Review on Generative AI For Text-To-Image and Image-To-Image Generation and Implications To Scientific Images
CV and Pattern Recognition
Creates pictures from words and changes pictures.