Metric-Fair Prompting: Treating Similar Samples Similarly
By: Jing Wang, Jie Shen, Xing Niu, and more
Potential Business Impact:
Makes AI answer medical questions more fairly.
We introduce \emph{Metric-Fair Prompting}, a fairness-aware prompting framework that guides large language models (LLMs) to make decisions under metric-fairness constraints. In multiple-choice medical question answering, each (question, option) pair is treated as a binary instance with label $+1$ (correct) or $-1$ (incorrect). To promote \emph{individual fairness} (treating similar instances similarly), we compute question similarity from NLP embeddings and solve items in \emph{joint pairs of similar questions} rather than in isolation. The prompt enforces a global decision protocol: extract decisive clinical features, map each $(\text{question}, \text{option})$ pair $x$ to a confidence score $f(x)$, and impose a Lipschitz-style constraint so that similar inputs receive similar scores and, hence, consistent outputs. Evaluated on the MedQA (US) benchmark, Metric-Fair Prompting improves accuracy over standard single-item prompting, demonstrating that fairness-guided, confidence-oriented reasoning can enhance LLM performance on high-stakes clinical multiple-choice questions.
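The abstract describes two mechanical steps: pairing similar questions via embedding similarity, and prompting the LLM to score each (question, option) pair under a Lipschitz-style consistency rule. Below is a minimal Python sketch of both steps. The embedding model name, the similarity threshold, the greedy pairing strategy, and the prompt wording are all assumptions for illustration, not taken from the paper.

```python
# Sketch of the two mechanical pieces of Metric-Fair Prompting.
# Assumptions (not from the paper): the sentence-transformers library for
# embeddings, a 0.8 cosine-similarity threshold, greedy most-similar-first
# pairing, and the exact prompt wording.
from itertools import combinations

from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model


def pair_similar_questions(questions, threshold=0.8):
    """Greedily pair questions whose cosine similarity exceeds `threshold`,
    so each pair can be solved jointly; leftovers are returned as singletons."""
    emb = embedder.encode(questions, normalize_embeddings=True)
    sims = emb @ emb.T  # cosine similarity (embeddings are unit-normalized)
    pairs, used = [], set()
    # Visit candidate pairs from most to least similar.
    for i, j in sorted(combinations(range(len(questions)), 2),
                       key=lambda ij: -sims[ij]):
        if sims[i, j] >= threshold and i not in used and j not in used:
            pairs.append((i, j))
            used.update((i, j))
    singletons = [i for i in range(len(questions)) if i not in used]
    return pairs, singletons


def joint_prompt(question_a, question_b, lipschitz_const=1.0):
    """Build a joint prompt asking the LLM to score every (question, option)
    pair and keep scores of similar inputs close (a Lipschitz-style rule)."""
    return (
        "Extract the decisive clinical features of each question, then assign "
        "each (question, option) pair a confidence score f(x) in [-1, 1], "
        "where +1 means correct and -1 means incorrect.\n"
        "Constraint: if two (question, option) pairs are clinically similar, "
        f"their scores must differ by at most {lipschitz_const} times their "
        "dissimilarity, so similar inputs receive consistent answers.\n\n"
        f"Question A: {question_a}\n\nQuestion B: {question_b}\n"
        "Return one score per option for each question."
    )
```

In this sketch, each pair returned by `pair_similar_questions` would be sent to the LLM through `joint_prompt`, while the remaining singletons fall back to standard single-item prompting.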
Similar Papers
More Bias, Less Bias: BiasPrompting for Enhanced Multiple-Choice Question Answering
Computation and Language
Helps AI answer tricky questions through better reasoning.
Prompt perturbation and fraction facilitation sometimes strengthen Large Language Model scores
Digital Libraries
Helps computers judge research quality better.
Prompt Fairness: Sub-group Disparities in LLMs
Machine Learning (CS)
Makes AI answer questions fairly for everyone.