Don't Throw Away Your Beams: Improving Consistency-based Uncertainties in LLMs via Beam Search
By: Ekaterina Fadeeva, Maiya Goloburda, Aleksandr Rubashevskii, and more
Potential Business Impact:
Makes AI answers more trustworthy by giving more reliable confidence estimates.
Consistency-based methods have emerged as an effective approach to uncertainty quantification (UQ) in large language models. These methods typically rely on several generations obtained via multinomial sampling and measure their level of agreement. However, in short-form QA, multinomial sampling is prone to producing duplicates due to peaked output distributions, and its stochasticity introduces considerable variance in uncertainty estimates across runs. We introduce a new family of methods that employ beam search to generate candidates for consistency-based UQ, yielding improved performance and reduced variance compared to multinomial sampling. We also provide a theoretical lower bound on the beam set probability mass above which beam search achieves a smaller error than multinomial sampling. We empirically evaluate our approach on six QA datasets and find that its consistent improvements over multinomial sampling lead to state-of-the-art UQ performance.
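To make the candidate-generation step concrete, here is a minimal Python sketch (not the authors' implementation) of consistency-based UQ where the candidates come from beam search via Hugging Face transformers rather than multinomial sampling. The model name, prompt, and the simple exact-match agreement score are illustrative assumptions; the paper evaluates more sophisticated consistency measures.

```python
# Minimal sketch: consistency-based uncertainty from beam-search candidates.
# Assumptions (not from the paper): model "gpt2", a toy QA prompt, and
# exact-match agreement as the consistency measure.
from collections import Counter

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works for the sketch
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Q: What is the capital of France?\nA:"
inputs = tokenizer(prompt, return_tensors="pt")

k = 5  # number of candidate answers
with torch.no_grad():
    out = model.generate(
        **inputs,
        num_beams=k,              # beam search instead of multinomial sampling
        num_return_sequences=k,   # keep all k beams, not only the best one
        do_sample=False,
        max_new_tokens=8,
        early_stopping=True,
        pad_token_id=tokenizer.eos_token_id,
    )

# Strip the prompt tokens and decode only the generated continuations.
prompt_len = inputs["input_ids"].shape[1]
answers = [
    tokenizer.decode(seq[prompt_len:], skip_special_tokens=True).strip()
    for seq in out
]

# Consistency score: fraction of beams agreeing with the most frequent answer.
# Lower agreement -> higher uncertainty. Unlike multinomial sampling, beam
# search returns k distinct sequences, so duplicates cannot inflate agreement.
counts = Counter(answers)
top_answer, top_count = counts.most_common(1)[0]
agreement = top_count / k
uncertainty = 1.0 - agreement
print(f"answer={top_answer!r} agreement={agreement:.2f} uncertainty={uncertainty:.2f}")
```

Because beam search is deterministic given the model and prompt, this estimate is also stable across runs, which is the variance-reduction benefit the abstract highlights.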
Similar Papers
The Illusion of Certainty: Uncertainty quantification for LLMs fails under ambiguity
Machine Learning (CS)
Makes AI understand when it's unsure.
Conformal Sets in Multiple-Choice Question Answering under Black-Box Settings with Provable Coverage Guarantees
Computation and Language
Makes AI answers more trustworthy and less wrong.
Can Linear Probes Measure LLM Uncertainty?
Machine Learning (CS)
Makes AI more sure about its answers.