CoPL: Collaborative Preference Learning for Personalizing LLMs
By: Youngbin Choi, Seunghyuk Cho, Minjong Lee, and more
Potential Business Impact:
Teaches AI to understand what you like best.
Personalizing large language models (LLMs) is important for aligning outputs with diverse user preferences, yet existing methods struggle with flexibility and generalization. We propose CoPL (Collaborative Preference Learning), a graph-based collaborative filtering framework that models user-response relationships to enhance preference estimation, particularly in sparse annotation settings. By integrating a mixture of LoRA experts, CoPL efficiently fine-tunes LLMs while dynamically balancing shared and user-specific preferences. Additionally, an optimization-free adaptation strategy enables generalization to unseen users without fine-tuning. Experiments on UltraFeedback-P show that CoPL outperforms existing personalized reward models and effectively captures both common and controversial preferences, making it a scalable approach to personalized LLM alignment. The code is available at https://github.com/ml-postech/CoPL.
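To make the "mixture of LoRA experts" idea concrete, below is a minimal sketch of one way such a layer could look: a frozen base linear layer augmented with several low-rank experts whose outputs are mixed by a gate conditioned on a user embedding (e.g., one produced by collaborative filtering over the user-response graph). This is an illustrative assumption, not the authors' implementation; all class names, dimensions, and the gating scheme are hypothetical.

```python
# Hedged sketch of a user-gated mixture of LoRA-style experts (not CoPL's actual code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixtureOfLoRAExperts(nn.Module):
    """A frozen linear layer plus several low-rank experts, combined with
    user-conditioned gating weights to balance shared and user-specific preferences."""

    def __init__(self, in_dim: int, out_dim: int, num_experts: int = 4,
                 rank: int = 8, user_dim: int = 32):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim)
        for p in self.base.parameters():          # freeze the shared base weights
            p.requires_grad_(False)
        # Low-rank expert factors: delta_e(x) = B_e(A_e(x)), as in LoRA.
        self.A = nn.ModuleList(nn.Linear(in_dim, rank, bias=False) for _ in range(num_experts))
        self.B = nn.ModuleList(nn.Linear(rank, out_dim, bias=False) for _ in range(num_experts))
        # Gate maps a user embedding to mixture weights over the experts.
        self.gate = nn.Linear(user_dim, num_experts)

    def forward(self, x: torch.Tensor, user_emb: torch.Tensor) -> torch.Tensor:
        weights = F.softmax(self.gate(user_emb), dim=-1)                          # (batch, E)
        deltas = torch.stack([B(A(x)) for A, B in zip(self.A, self.B)], dim=1)    # (batch, E, out)
        mixed = (weights.unsqueeze(-1) * deltas).sum(dim=1)                       # (batch, out)
        return self.base(x) + mixed


# Toy usage: score a response representation for a given user.
if __name__ == "__main__":
    layer = MixtureOfLoRAExperts(in_dim=768, out_dim=1)
    response_hidden = torch.randn(2, 768)   # e.g., pooled LLM hidden states for two responses
    user_embedding = torch.randn(2, 32)     # e.g., user representations from the user-response graph
    reward = layer(response_hidden, user_embedding)
    print(reward.shape)                     # torch.Size([2, 1])
```

In this sketch, an unseen user could be handled without fine-tuning by constructing their embedding from related users or responses and feeding it to the same gate, which is one plausible reading of the optimization-free adaptation described in the abstract.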
Similar Papers
Personalized LLM Decoding via Contrasting Personal Preference
Computation and Language
Makes AI understand what you like best.
Online Preference Alignment for Language Models via Count-based Exploration
Machine Learning (CS)
Helps AI learn better by trying new things.
A Survey on Personalized and Pluralistic Preference Alignment in Large Language Models
Computation and Language
Makes AI understand what you like best.