LoRe: Personalizing LLMs via Low-Rank Reward Modeling
By: Avinandan Bose, Zhihan Xiong, Yuejie Chi, and more
Potential Business Impact:
Teaches AI to learn what you like.
Personalizing large language models (LLMs) to accommodate diverse user preferences is essential for enhancing alignment and user satisfaction. Traditional reinforcement learning from human feedback (RLHF) approaches often rely on monolithic value representations, limiting their ability to adapt to individual preferences. We introduce a novel framework that leverages low-rank preference modeling to efficiently learn and generalize user-specific reward functions. By representing reward functions in a low-dimensional subspace and modeling individual preferences as weighted combinations of shared basis functions, our approach avoids rigid user categorization while enabling scalability and few-shot adaptation. We validate our method on multiple preference datasets, demonstrating superior generalization to unseen users and improved accuracy in preference prediction tasks.
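The abstract's core idea, expressing each user's reward as a weighted combination of a small set of shared basis reward functions and adapting only the user weights from a few preference comparisons, can be illustrated with a short sketch. This is a minimal illustration under assumed details, not the paper's implementation: the response embeddings, the basis matrix `B`, and the helpers `basis_rewards` and `fit_user_weights` are hypothetical names chosen for the example, and the few-shot fit uses a plain Bradley-Terry (logistic) loss with gradient descent.

```python
# Sketch of low-rank reward modeling for personalization.
# Illustrative only; names and shapes are assumptions, not the paper's code.
import numpy as np

rng = np.random.default_rng(0)

d_feat, k = 16, 4                   # response-embedding dim, number of shared basis rewards
B = rng.normal(size=(k, d_feat))    # shared basis: each row acts as one basis reward function

def basis_rewards(x):
    """Evaluate the k shared basis reward functions on a response embedding x."""
    return B @ x                    # shape (k,)

def user_reward(w, x):
    """User-specific reward = weighted combination of the shared basis rewards."""
    return w @ basis_rewards(x)     # scalar

def fit_user_weights(pairs, lr=0.1, steps=200):
    """Few-shot adaptation: fit one user's weights w from a handful of
    (preferred, rejected) response pairs via a Bradley-Terry loss."""
    w = np.zeros(k)
    for _ in range(steps):
        grad = np.zeros(k)
        for x_pos, x_neg in pairs:
            diff = basis_rewards(x_pos) - basis_rewards(x_neg)  # shape (k,)
            p = 1.0 / (1.0 + np.exp(-w @ diff))                 # P(user prefers x_pos)
            grad += (p - 1.0) * diff                            # gradient of -log p w.r.t. w
        w -= lr * grad / len(pairs)
    return w

# Toy few-shot adaptation: five preference pairs for a new user
pairs = [(rng.normal(size=d_feat), rng.normal(size=d_feat)) for _ in range(5)]
w_user = fit_user_weights(pairs)
print("learned user weights:", np.round(w_user, 3))
```

In this reading, the shared basis is learned once across many users, so adapting to a new user reduces to estimating a k-dimensional weight vector rather than a full reward model, which is what makes few-shot personalization and generalization to unseen users tractable.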
Similar Papers
Language Model Personalization via Reward Factorization
Machine Learning (CS)
Makes AI understand what *you* like best.
Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
Machine Learning (Stat)
Makes AI understand what people want better.
A Shared Low-Rank Adaptation Approach to Personalized RLHF
Machine Learning (CS)
AI learns what *you* like, not just what most people like.