LLMs Learn Constructions That Humans Do Not Know
By: Jonathan Dunn, Mai Mohamed Eida
Potential Business Impact:
Finds that AI makes up grammar rules that humans don't recognize.
This paper investigates false positive constructions: grammatical structures which an LLM hallucinates as distinct constructions but which human introspection does not support. Both a behavioural probing task using contextual embeddings and a meta-linguistic probing task using prompts are included, allowing us to distinguish between implicit and explicit linguistic knowledge. Both methods reveal that models do indeed hallucinate constructions. We then simulate hypothesis testing to determine what would have happened if a linguist had falsely hypothesized that these hallucinated constructions do exist. The high accuracy obtained shows that such false hypotheses would have been overwhelmingly confirmed. This suggests that construction probing methods suffer from a confirmation bias and raises the issue of what unknown and incorrect syntactic knowledge these models also possess.
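The behavioural probing setup described in the abstract can be pictured roughly as follows: extract contextual embeddings for sentences that supposedly instantiate a candidate construction and for matched controls, train a lightweight classifier, and read high held-out accuracy as confirmation of the hypothesis. The sketch below is a minimal illustration under assumptions of our own: the model name (bert-base-uncased), the example sentences, and the logistic-regression probe are placeholders, not the paper's actual data or code.

```python
# Minimal sketch of a behavioural probing task over contextual embeddings.
# All sentences, labels, and the encoder choice are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

MODEL_NAME = "bert-base-uncased"  # assumption: any contextual encoder would do
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

# Hypothetical data: sentences claimed to instantiate a candidate
# construction (label 1) versus superficially similar controls (label 0).
sentences = [
    ("She talked her way into the meeting", 1),
    ("He argued his way out of trouble", 1),
    ("She walked into the meeting", 0),
    ("He got out of trouble", 0),
    # in practice, many more examples per class
]

def embed(text: str) -> torch.Tensor:
    """Mean-pool the final-layer token embeddings for one sentence."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state.mean(dim=1).squeeze(0)

X = torch.stack([embed(s) for s, _ in sentences]).numpy()
y = [label for _, label in sentences]

# High held-out accuracy is then taken as evidence that the model encodes the
# hypothesized construction; the paper's point is that this style of test can
# also "confirm" constructions that human introspection does not support.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y
)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probing accuracy:", accuracy_score(y_test, probe.predict(X_test)))
```

The asymmetry the abstract highlights is that a probe like this can score well even when the tested "construction" is one the model hallucinated, which is why high probing accuracy on its own is weak evidence.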
Similar Papers
Large Language Models Do NOT Really Know What They Don't Know
Computation and Language
AI can't tell what it doesn't know.
A comprehensive taxonomy of hallucinations in Large Language Models
Computation and Language
Sorts out the different ways AI makes things up.
Grammaticality Judgments in Humans and Language Models: Revisiting Generative Grammar with LLMs
Computation and Language
Compares how humans and AI judge whether sentences are grammatical.