Human-Level Reasoning: A Comparative Study of Large Language Models on Logical and Abstract Reasoning
By: Benjamin Grando Moreira
Potential Business Impact:
Tests if AI can think like a person.
Evaluating reasoning ability in Large Language Models (LLMs) is important for advancing artificial intelligence, as it transcends mere linguistic task performance. It involves determining whether these models truly understand information, perform inferences, and draw conclusions in a logically valid way. This study compares the logical and abstract reasoning skills of several LLMs - including GPT, Claude, DeepSeek, Gemini, Grok, Llama, Mistral, Perplexity, and Sabiá - using a set of eight custom-designed reasoning questions. The LLM results are benchmarked against human performance on the same tasks, revealing significant differences and indicating areas where LLMs struggle with deduction.
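The paper does not publish its evaluation harness, so the snippet below is only a minimal sketch of the kind of comparison the abstract describes: pose the same fixed questions to each model, score the answers, and report accuracy next to a human baseline. The question, the `models` stubs, the exact-match scoring rule, and the baseline value are all illustrative assumptions, not details from the study.

```python
from typing import Callable

# Hypothetical stand-ins: in a real harness each entry in `models` would wrap a
# vendor API call, and QUESTIONS would hold the study's eight reasoning items.
QUESTIONS = [
    ("If all A are B and no B are C, can an A be a C?", "no"),
    # ... remaining custom-designed reasoning questions
]

def accuracy(answer_fn: Callable[[str], str]) -> float:
    """Fraction of questions answered correctly (exact-match scoring assumed)."""
    correct = sum(
        answer_fn(question).strip().lower() == expected
        for question, expected in QUESTIONS
    )
    return correct / len(QUESTIONS)

# Placeholder human baseline; the paper measures humans on the same tasks,
# but this number is not taken from its results.
human_baseline = 0.85

models = {"gpt": lambda q: "no", "claude": lambda q: "yes"}  # API-call stubs
for name, fn in models.items():
    print(f"{name}: {accuracy(fn):.0%} (human baseline {human_baseline:.0%})")
```

Scoring free-form LLM answers by exact match is a simplification; a real harness would need a rubric or a judge step for open-ended responses.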
Similar Papers
A Survey on Large Language Models for Mathematical Reasoning
Artificial Intelligence
Helps computers solve math problems like a person.
Reasoning Models Reason Well, Until They Don't
Artificial Intelligence
Makes smart computers better at solving hard problems.
Cognitive Foundations for Reasoning and Their Manifestation in LLMs
Artificial Intelligence
Teaches computers to think more like people.