Neologism Learning as a Parameter-Efficient Alternative to Fine-Tuning for Model Steering
By: Sungjoon Park, Varun Ramamurthi, Owen Terry
Potential Business Impact:
Teaches computers new words to follow instructions better.
In language modeling, neologisms are new tokens trained to represent a concept not already included in a given model's vocabulary. Neologisms can be used to encourage specific behaviors in a model, for example by appending "Give me a neologism answer." to a prompt. Behavioral steering can also be achieved through fine-tuning, albeit with more compute and less flexibility: learning a neologism trains only d parameters (a single d-dimensional embedding vector) and still allows the user to access the model's default behavior. We compare the performance of neologism learning against low-rank adaptation (LoRA) fine-tuning, finding that neologisms outperform LoRA-fine-tuned models under a matched training setup (same data and hyperparameters). We also investigate self-verbalizations of neologisms and observe that the model will occasionally make up its own new words when asked about a neologism.
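To make the "only d parameters" point concrete, here is a minimal sketch of neologism learning, assuming a Hugging Face causal LM. The model name, token string, and training pair are illustrative assumptions, not details from the paper: a new token is added to the vocabulary, every existing weight is frozen, and only the new token's embedding row receives updates.

```python
# Minimal sketch of neologism learning (illustrative; not the paper's exact setup).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Add the neologism as a brand-new token and grow the embedding matrix.
tokenizer.add_tokens(["<neologism>"])
model.resize_token_embeddings(len(tokenizer))
new_id = tokenizer.convert_tokens_to_ids("<neologism>")

# Freeze all existing parameters; only the new token's d-dimensional
# embedding row will be updated.
for p in model.parameters():
    p.requires_grad = False
emb = model.get_input_embeddings()
emb.weight.requires_grad = True

optimizer = torch.optim.Adam([emb.weight], lr=1e-3)

def train_step(prompt: str, target: str) -> float:
    """One update on a (prompt, target) pair exhibiting the desired behavior."""
    ids = tokenizer(prompt + target, return_tensors="pt").input_ids
    out = model(ids, labels=ids)
    out.loss.backward()
    # Zero the gradient for every embedding row except the neologism's,
    # so exactly d parameters change.
    mask = torch.zeros_like(emb.weight.grad)
    mask[new_id] = 1.0
    emb.weight.grad *= mask
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()

# Hypothetical steering example: the neologism appears in the prompt,
# and the target shows the behavior it should elicit.
loss = train_step("Give me a <neologism> answer. Q: ...", " A: ...")
```

Because the base weights never change, omitting the new token at inference time recovers the model's default behavior, whereas a LoRA-fine-tuned model would need its adapter detached.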
Similar Papers
Neologism Learning for Controllability and Self-Verbalization
Computation and Language
Teaches computers new words to control their answers.
Efficient Continual Learning in Neural Machine Translation: A Low-Rank Adaptation Approach
Computation and Language
Teaches computers new languages without forgetting old ones.
Continual Learning via Sparse Memory Finetuning
Computation and Language
Lets AI learn new things without forgetting old ones.