Humans overrely on overconfident language models, across languages
By: Neil Rathi, Dan Jurafsky, Kaitlyn Zhou
Potential Business Impact:
Makes AI tell you when it's not sure.
As large language models (LLMs) are deployed globally, it is crucial that their responses are calibrated across languages to accurately convey uncertainty and limitations. Prior work shows that LLMs are linguistically overconfident in English, leading users to overrely on confident generations. However, the usage and interpretation of epistemic markers (e.g., 'I think it's') differs sharply across languages. Here, we study the risks of multilingual linguistic (mis)calibration, overconfidence, and overreliance across five languages to evaluate LLM safety in a global context. Our work finds that overreliance risks are high across languages. We first analyze the distribution of LLM-generated epistemic markers and observe that LLMs are overconfident across languages, frequently generating strengtheners even as part of incorrect responses. Model generations are, however, sensitive to documented cross-linguistic variation in usage: for example, models generate the most markers of uncertainty in Japanese and the most markers of certainty in German and Mandarin. Next, we measure human reliance rates across languages, finding that reliance behaviors differ cross-linguistically: for example, participants are significantly more likely to discount expressions of uncertainty in Japanese than in English (i.e., ignore their 'hedging' function and rely on generations that contain them). Taken together, these results indicate a high risk of reliance on overconfident model generations across languages. Our findings highlight the challenges of multilingual linguistic calibration and stress the importance of culturally and linguistically contextualized model safety evaluations.
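To make the abstract's two measurements concrete, below is a minimal Python sketch of how one might (a) count strengtheners and weakeners in model generations per language, (b) estimate linguistic overconfidence as the share of incorrect generations that still carry a strengthener, and (c) compute a reliance rate on hedged generations. The marker lexicons, record formats, and function names here are illustrative assumptions for exposition, not the authors' released code or marker inventories.

```python
from collections import Counter

# Hypothetical per-language lexicons of epistemic markers. The paper's
# actual marker inventories are not reproduced here; these entries are
# illustrative placeholders only.
STRENGTHENERS = {
    "en": ["certainly", "definitely", "i'm sure"],
    "ja": ["絶対に", "間違いなく"],
}
WEAKENERS = {
    "en": ["i think", "maybe", "possibly"],
    "ja": ["かもしれません", "たぶん"],
}

def marker_counts(generations, lang):
    """Count generations containing at least one strengthener / weakener."""
    counts = Counter()
    for text in generations:
        lowered = text.lower()
        if any(m in lowered for m in STRENGTHENERS.get(lang, [])):
            counts["strengthened"] += 1
        if any(m in lowered for m in WEAKENERS.get(lang, [])):
            counts["weakened"] += 1
    return counts

def overconfidence_rate(records, lang):
    """Share of *incorrect* generations that still carry a strengthener.

    `records` is a list of (generation_text, is_correct) pairs.
    """
    wrong = [text for text, is_correct in records if not is_correct]
    if not wrong:
        return 0.0
    return marker_counts(wrong, lang)["strengthened"] / len(wrong)

def reliance_rate_on_hedged(trials):
    """Fraction of hedged-generation trials where the participant relied.

    `trials` is a list of (generation_was_hedged, participant_relied)
    pairs; a high rate suggests participants discount the hedge.
    """
    hedged = [(h, r) for h, r in trials if h]
    if not hedged:
        return 0.0
    return sum(r for _, r in hedged) / len(hedged)
```

Comparing `reliance_rate_on_hedged` across per-language trial sets would mirror the cross-linguistic comparison the abstract describes, e.g. a higher rate for Japanese than for English would indicate stronger hedge discounting in Japanese.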
Similar Papers
Large Language Models are overconfident and amplify human bias
Software Engineering
Computers think they know more than they do.
How Overconfidence in Initial Choices and Underconfidence Under Criticism Modulate Change of Mind in Large Language Models
Machine Learning (CS)
Shows when AI sticks to or changes its first answer.
Gauging Overprecision in LLMs: An Empirical Study
Computation and Language
Helps computers know when they are wrong.