Assessing Code Understanding in LLMs
By: Cosimo Laneve, Alvise Spanò, Dalila Ressi, and others
Potential Business Impact:
Helps computers understand code changes better.
We present an empirical evaluation of how well Large Language Models understand code subjected to non-trivial, semantics-preserving program transformations such as copy propagation or constant folding. Our findings show that LLMs fail to judge semantic equivalence in approximately 41% of cases when no context is provided and in 29% when given a simple generic context. To improve accuracy, we advocate integrating LLMs with code-optimization tools to enhance training and facilitate more robust program understanding.
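The transformations named here are standard compiler optimizations. As a minimal illustration (not taken from the paper's benchmark), the hypothetical Python pair below shows the kind of question posed to a model: the second function applies constant folding and copy propagation to the first, so the two are semantically equivalent, and the evaluation asks whether an LLM can recognize that.

# Minimal sketch (illustrative only, not from the paper) of a
# semantics-preserving transformation pair: constant folding and
# copy propagation. Both functions return the same value for every input.

def original(x):
    a = 2 * 3          # constant expression
    b = a              # copy of a
    return b + x       # uses the copy

def transformed(x):
    # constant folding: 2 * 3 -> 6; copy propagation: b -> a -> 6
    return 6 + x

# Quick sanity check that the pair agrees on a range of inputs.
if __name__ == "__main__":
    assert all(original(x) == transformed(x) for x in range(-100, 100))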
Similar Papers
How Accurately Do Large Language Models Understand Code?
Software Engineering
Tests if computers truly understand code.
Are Large Language Models Robust in Understanding Code Against Semantics-Preserving Mutations?
Software Engineering
Helps computers understand code, not just guess.
Evaluating Programming Language Confusion
Software Engineering
Studies when code-generating models accidentally switch programming languages.