Preference-Driven Multi-Objective Combinatorial Optimization with Conditional Computation
By: Mingfeng Fan, Jianan Zhou, Yifeng Zhang, and more
Potential Business Impact:
Helps computers find better trade-off solutions to hard problems that involve several competing objectives.
Recent deep reinforcement learning methods have achieved remarkable success in solving multi-objective combinatorial optimization problems (MOCOPs) by decomposing them into multiple subproblems, each associated with a specific weight vector. However, these methods typically treat all subproblems equally and solve them with a single model, which hinders effective exploration of the solution space and leads to suboptimal performance. To overcome this limitation, we propose POCCO, a novel plug-and-play framework that adaptively selects model structures for subproblems, which are then optimized based on preference signals rather than explicit reward values. Specifically, we design a conditional computation block that routes subproblems to specialized neural architectures. Moreover, we propose a preference-driven optimization algorithm that learns pairwise preferences between winning and losing solutions. We evaluate the efficacy and versatility of POCCO by applying it to two state-of-the-art neural methods for MOCOPs. Experimental results across four classic MOCOP benchmarks demonstrate its significant superiority and strong generalization.
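To make the two ingredients described in the abstract concrete, below is a minimal PyTorch sketch, not the authors' implementation: a gated block that routes subproblem embeddings (conditioned on their weight vectors) to specialized sub-networks, and a Bradley-Terry-style pairwise loss over winning and losing solutions. All names, layer sizes, and the soft-mixture gating are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConditionalComputationBlock(nn.Module):
    """Hypothetical sketch of a conditional computation block: each subproblem,
    defined by its weight vector over the objectives, is softly routed to a set
    of specialized feed-forward branches ("experts")."""

    def __init__(self, embed_dim: int, num_objectives: int, num_experts: int = 4):
        super().__init__()
        # Gate scores each expert from the subproblem's weight vector.
        self.gate = nn.Linear(num_objectives, num_experts)
        # Each expert is a small specialized feed-forward branch.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(embed_dim, embed_dim), nn.ReLU(),
                          nn.Linear(embed_dim, embed_dim))
            for _ in range(num_experts)
        )

    def forward(self, h: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
        # h:       (batch, embed_dim)      subproblem embeddings
        # weights: (batch, num_objectives) weight vectors defining each subproblem
        probs = F.softmax(self.gate(weights), dim=-1)                 # (batch, E)
        outputs = torch.stack([e(h) for e in self.experts], dim=1)    # (batch, E, embed_dim)
        return (probs.unsqueeze(-1) * outputs).sum(dim=1)             # soft mixture


def pairwise_preference_loss(logp_win: torch.Tensor,
                             logp_lose: torch.Tensor,
                             beta: float = 1.0) -> torch.Tensor:
    """Bradley-Terry-style loss on (winning, losing) solution pairs: push the
    policy's log-likelihood of the winner above the loser's, instead of
    regressing on explicit reward values."""
    return -F.logsigmoid(beta * (logp_win - logp_lose)).mean()


if __name__ == "__main__":
    block = ConditionalComputationBlock(embed_dim=128, num_objectives=2)
    h = torch.randn(8, 128)   # embeddings for 8 subproblems
    w = torch.rand(8, 2)      # their weight vectors
    print(block(h, w).shape)  # torch.Size([8, 128])

    logp_win, logp_lose = torch.randn(8), torch.randn(8)
    print(pairwise_preference_loss(logp_win, logp_lose))
```

A soft mixture over experts keeps the block differentiable and plug-and-play on top of an existing encoder; a hard top-1 routing variant would also fit the description but requires extra care (e.g., load-balancing) during training.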
Similar Papers
Multi-Objective Reward and Preference Optimization: Theory and Algorithms
Machine Learning (CS)
Teaches computers to make safe, smart choices.
BOPO: Neural Combinatorial Optimization via Best-anchored and Objective-guided Preference Optimization
Machine Learning (CS)
Solves hard puzzles much faster with smart computer learning.
UCPO: A Universal Constrained Combinatorial Optimization Method via Preference Optimization
Neural and Evolutionary Computing
Helps computers solve hard problems with fewer rules.