From Directions to Cones: Exploring Multidimensional Representations of Propositional Facts in LLMs

Published: May 27, 2025 | arXiv ID: 2505.21800v1

By: Stanley Yu, Vaidehi Bulusu, Oscar Yasunaga and more

Potential Business Impact:

Makes AI tell the truth more often.

Business Areas:

Multi-level Marketing Sales and Marketing

Large Language Models (LLMs) exhibit strong conversational abilities but often generate falsehoods. Prior work suggests that the truthfulness of simple propositions can be represented as a single linear direction in a model's internal activations, but this may not fully capture its underlying geometry. In this work, we extend the concept cone framework, recently introduced for modeling refusal, to the domain of truth. We identify multi-dimensional cones that causally mediate truth-related behavior across multiple LLM families. Our results are supported by three lines of evidence: (i) causal interventions reliably flip model responses to factual statements, (ii) learned cones generalize across model architectures, and (iii) cone-based interventions preserve unrelated model behavior. These findings reveal the richer, multidirectional structure governing simple true/false propositions in LLMs and highlight concept cones as a promising tool for probing abstract behaviors.
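To make the contrast between a single direction and a cone concrete, here is a minimal sketch in plain Python. It is not the authors' method: the toy activations, the difference-of-means estimator, and the helper names (`difference_of_means`, `ablate`, `cone_direction`) are assumptions chosen for illustration. It treats a cone as the set of non-negative combinations of a few basis directions, and an intervention as removing an activation's component along one chosen direction.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def difference_of_means(true_acts, false_acts):
    # Single-direction view: unit vector from the "false" mean to the "true" mean.
    mt, mf = mean(true_acts), mean(false_acts)
    return normalize([t - f for t, f in zip(mt, mf)])


def ablate(activation, direction):
    # Remove the component of the activation along a unit-norm direction.
    coeff = dot(activation, direction)
    return [a - coeff * d for a, d in zip(activation, direction)]


def cone_direction(basis, weights):
    # Cone view: any non-negative combination of the basis vectors,
    # rescaled to unit norm, counts as a direction inside the cone.
    if any(w < 0 for w in weights):
        raise ValueError("cone weights must be non-negative")
    combo = [sum(w * b_i for w, b_i in zip(weights, col)) for col in zip(*basis)]
    return normalize(combo)


def toy_activations(rng, n, dim, offset):
    # Hypothetical stand-in for model activations: Gaussian noise,
    # shifted along axis 0 by `offset`.
    return [
        [rng.gauss(0.0, 1.0) + (offset if i == 0 else 0.0) for i in range(dim)]
        for _ in range(n)
    ]


rng = random.Random(0)
true_acts = toy_activations(rng, 50, 4, +2.0)
false_acts = toy_activations(rng, 50, 4, -2.0)

direction = difference_of_means(true_acts, false_acts)
ablated = ablate(true_acts[0], direction)

# A second, hand-picked basis vector purely for illustration.
mixed = cone_direction([direction, [0.0, 1.0, 0.0, 0.0]], [1.0, 0.5])
```

In this toy setup the single direction is one vector, while `cone_direction` yields a whole family of candidate directions from the same basis. The abstract's claim is that such a family, rather than one vector, mediates true/false behavior in real models.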

Probing the Geometry of Truth: Consistency and Generalization of Truth Directions in LLMs Across Logical Transformations and Question Answering Tasks

Computation and Language

Makes computers tell the truth more often.

1 Jun 2025

90%

Exploring the generalization of LLM truth directions on conversational formats

Computation and Language

Helps computers spot lies, even in long talks.

14 May 2025

89%

The Geometry of Refusal in Large Language Models: Concept Cones and Representational Independence

Machine Learning (CS)

Finds many ways AI can be tricked.

24 Feb 2025

Country of Origin

🇺🇸 United States

Page Count

18 pages
