ModeX: Evaluator-Free Best-of-N Selection for Open-Ended Generation
By: Hyeong Kyu Choi, Sharon Li
Potential Business Impact:
Picks the best answer from many AI-generated guesses, without needing a separate judge.
Selecting a single high-quality output from multiple stochastic generations remains a fundamental challenge for large language models (LLMs), particularly in open-ended tasks where no canonical answer exists. While Best-of-N and self-consistency methods show that aggregating multiple generations can improve performance, existing approaches typically rely on external evaluators, reward models, or exact string-match voting, limiting their applicability and efficiency. We propose Mode Extraction (ModeX), an evaluator-free Best-of-N selection framework that generalizes majority voting to open-ended text generation by identifying the modal output, i.e., the generation that represents the dominant semantic consensus among the candidates. ModeX constructs a similarity graph over candidate generations and recursively applies spectral clustering to select a representative centroid, without requiring additional inference or auxiliary models. We further instantiate this selection principle as ModeX-Lite, a variant of ModeX that adds early pruning for efficiency. Across open-ended tasks -- including text summarization, code generation, and mathematical reasoning -- our approaches consistently outperform standard single- and multi-path baselines, providing a computationally efficient solution for robust open-ended text generation. Code is released at https://github.com/deeplearning-wisc/ModeX.
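To make the selection step concrete, here is a minimal sketch of evaluator-free mode extraction over N candidate generations. It is illustrative only: the embedding function embed is a caller-supplied assumption (any off-the-shelf sentence embedder would do), and the single clustering pass with two clusters plus the average-similarity centroid rule are simplifications, whereas the paper's ModeX applies spectral clustering recursively and ModeX-Lite adds early pruning.

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import SpectralClustering

def select_mode(candidates, embed, n_clusters=2):
    """Return the candidate closest to the dominant semantic consensus.

    candidates: list of generated texts; embed: assumed text-embedding
    function mapping a string to a 1-D vector (not part of the paper's API).
    """
    # Build a similarity graph over the candidate generations.
    emb = np.stack([embed(c) for c in candidates])
    sim = cosine_similarity(emb)
    # Spectral clustering expects non-negative affinities.
    np.clip(sim, 0.0, 1.0, out=sim)

    # Partition the graph; the largest cluster approximates the "mode".
    labels = SpectralClustering(
        n_clusters=n_clusters, affinity="precomputed", random_state=0
    ).fit_predict(sim)
    mode_label = np.bincount(labels).argmax()
    members = np.flatnonzero(labels == mode_label)

    # Centroid: the member with the highest average similarity to its cluster.
    intra = sim[np.ix_(members, members)].mean(axis=1)
    return candidates[members[intra.argmax()]]

The key property this sketch preserves is that selection needs no extra LLM inference, reward model, or external evaluator: only pairwise similarities among the N candidates are used.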
Similar Papers
Majority of the Bests: Improving Best-of-N via Bootstrapping
Machine Learning (CS)
Finds better answers by picking the most common choice.
Scalable Best-of-N Selection for Large Language Models via Self-Certainty
Computation and Language
Makes AI smarter without needing extra brainpower.
MODE: Multi-Objective Adaptive Coreset Selection
Machine Learning (CS)
Makes computer learning use less memory.