Characterizing Knowledge Graph Tasks in LLM Benchmarks Using Cognitive Complexity Frameworks
By: Sara Todorovikj, Lars-Peter Meyer, Michael Martin
Potential Business Impact:
Shows how cognitively demanding AI benchmark questions really are, so models can be tested on harder and more varied problems.
Large Language Models (LLMs) are increasingly used for tasks involving Knowledge Graphs (KGs), where evaluation typically focuses on accuracy and output correctness. We propose a complementary task characterization approach based on three complexity frameworks from cognitive psychology. Applying it to the LLM-KG-Bench framework, we highlight the distribution of complexity values, identify underrepresented cognitive demands, and motivate richer interpretation of and greater diversity in benchmark evaluation tasks.
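The approach can be pictured as annotating each benchmark task with scores along the dimensions of the chosen complexity frameworks and then inspecting how those scores are distributed across the benchmark. Below is a minimal sketch of that idea, assuming hypothetical dimension names, an illustrative 1-5 scale, and hand-assigned example scores; the paper's actual frameworks, scales, and annotations are not reproduced here.

from collections import Counter
from dataclasses import dataclass

# Hypothetical complexity dimensions; the paper applies three frameworks
# from cognitive psychology, whose concrete scales are not reproduced here.
DIMENSIONS = ["relational_complexity", "working_memory_load", "reasoning_steps"]

@dataclass
class TaskCharacterization:
    task_id: str
    scores: dict  # dimension name -> ordinal score (illustrative 1-5 scale)

# Hand-assigned example annotations for a few illustrative benchmark tasks.
tasks = [
    TaskCharacterization("sparql_generation", {"relational_complexity": 4,
                                               "working_memory_load": 3,
                                               "reasoning_steps": 4}),
    TaskCharacterization("turtle_syntax_fix", {"relational_complexity": 2,
                                               "working_memory_load": 2,
                                               "reasoning_steps": 1}),
    TaskCharacterization("fact_extraction",   {"relational_complexity": 3,
                                               "working_memory_load": 4,
                                               "reasoning_steps": 2}),
]

# Value distribution per dimension: which score levels occur, and how often.
# Levels that never occur point to underrepresented cognitive demands.
for dim in DIMENSIONS:
    distribution = Counter(t.scores[dim] for t in tasks)
    missing = set(range(1, 6)) - set(distribution)
    print(f"{dim}: distribution={dict(distribution)}, missing levels={sorted(missing)}")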
Similar Papers
KG-LLM-Bench: A Scalable Benchmark for Evaluating LLM Reasoning on Textualized Knowledge Graphs
Computation and Language
Helps computers learn facts better from text.
LLM-KG-Bench 3.0: A Compass for Semantic Technology Capabilities in the Ocean of LLMs
Artificial Intelligence
Tests how well AI understands and uses knowledge graphs.
Enhancing Large Language Models with Reliable Knowledge Graphs
Computation and Language
Makes AI smarter and more truthful.