Efficient Contrastive Decoding with Probabilistic Hallucination Detection: Mitigating Hallucinations in Large Vision Language Models

Published: April 16, 2025 | arXiv ID: 2504.12137v1

By: Laura Fieback, Nishilkumar Balar, Jakob Spiegelberg, and more

Potential Business Impact:

Stops AI from making up fake answers about pictures.

Business Areas:
Visual Search, Internet Services

Despite recent advances in Large Vision Language Models (LVLMs), these models still suffer from generating hallucinatory responses that do not align with the visual input provided. To mitigate such hallucinations, we introduce Efficient Contrastive Decoding (ECD), a simple method that leverages probabilistic hallucination detection to shift the output distribution towards contextually accurate answers at inference time. By contrasting token probabilities and hallucination scores, ECD subtracts hallucinated concepts from the original distribution, effectively suppressing hallucinations. Notably, our proposed method can be applied to any open-source LVLM and does not require additional LVLM training. We evaluate our method on several benchmark datasets and across different LVLMs. Our experiments show that ECD effectively mitigates hallucinations, outperforming state-of-the-art methods with respect to performance on LVLM benchmarks and computation time.
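The core idea, contrasting the model's next-token distribution with per-token hallucination scores, can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the hallucination scores are assumed to come from some probabilistic detector, and `alpha` is an illustrative penalty hyperparameter.

```python
import numpy as np

def ecd_step(logits, hallucination_scores, alpha=5.0):
    """One decoding step in the spirit of ECD (illustrative sketch).

    logits: (vocab,) next-token logits from the LVLM.
    hallucination_scores: (vocab,) assumed detector scores in [0, 1],
        higher = more likely a hallucinated concept.
    alpha: penalty strength (hypothetical hyperparameter).
    """
    # Log-softmax of the original LVLM distribution.
    log_probs = logits - np.logaddexp.reduce(logits)
    # Subtract hallucinated concepts: penalize flagged tokens.
    adjusted = log_probs - alpha * hallucination_scores
    # Renormalize the shifted distribution.
    adjusted -= np.logaddexp.reduce(adjusted)
    return int(np.argmax(adjusted)), np.exp(adjusted)

# Example: token 2 has the highest original logit but is flagged
# as hallucinatory, so the adjusted distribution demotes it.
logits = np.array([1.0, 0.5, 2.0, 0.2])
scores = np.array([0.05, 0.10, 0.90, 0.05])
token, probs = ecd_step(logits, scores)
```

Because the adjustment happens purely at inference time on the output distribution, a scheme like this can wrap any open-source LVLM without retraining, which matches the paper's stated design goal.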

Country of Origin
🇩🇪 Germany

Page Count
15 pages

Category
Computer Science:
CV and Pattern Recognition