FECT: Factuality Evaluation of Interpretive AI-Generated Claims in Contact Center Conversation Transcripts

Published: July 26, 2025 | arXiv ID: 2508.00889v1

By: Hagyeong Shin, Binoy Robin Dalal, Iwona Bialynicka-Birula, and more

Potential Business Impact:

Checks whether AI-generated claims about customer calls are factually grounded in the conversation transcripts

Large language models (LLMs) are known to hallucinate, producing natural language outputs that are not grounded in the input, reference materials, or real-world knowledge. In enterprise applications where AI features support business decisions, such hallucinations can be particularly detrimental. LLMs that analyze and summarize contact center conversations introduce a unique set of challenges for factuality evaluation, because ground-truth labels often do not exist for analytical interpretations, such as the sentiments captured in a conversation or the root causes of business problems. To remedy this, we first introduce a 3D (Decompose, Decouple, Detach) paradigm in the human annotation guideline and the LLM judges' prompt to ground the factuality labels in linguistically informed evaluation criteria. We then introduce FECT, a novel benchmark dataset for Factuality Evaluation of Interpretive AI-Generated Claims in Contact Center Conversation Transcripts, labeled under our 3D paradigm. Lastly, we report our findings from aligning LLM judges on the 3D paradigm. Overall, our findings contribute a new approach for automatically evaluating the factuality of outputs generated by an AI system for analyzing contact center conversations.
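The sketch below illustrates one plausible reading of how an LLM judge could apply a Decompose, Decouple, Detach style of evaluation to a single interpretive claim. It is a minimal illustration only: the prompt wording, the JSON verdict schema, and the function names are assumptions for this example, not the FECT authors' released annotation guideline or judge prompt.

```python
# Illustrative sketch only: prompt text, verdict schema, and the stubbed judge
# response are assumptions, not artifacts from the FECT paper.
import json

JUDGE_PROMPT_TEMPLATE = """You are evaluating one claim made about a contact center conversation.

Transcript:
{transcript}

Claim under evaluation (judge ONLY this claim, independent of any other claims):
{claim}

Instructions (3D-style):
1. Decompose: restate the claim as its smallest checkable assertions.
2. Decouple: judge each assertion on its own; do not let one assertion's label
   influence another.
3. Detach: use only the transcript above as evidence; ignore outside knowledge
   about the company, product, or customer.

Return JSON: {{"assertions": [{{"text": "...", "label": "supported" | "unsupported"}}]}}
"""


def build_judge_prompt(transcript: str, claim: str) -> str:
    """Fill the template for a single interpretive claim."""
    return JUDGE_PROMPT_TEMPLATE.format(transcript=transcript, claim=claim)


def aggregate_verdict(judge_response: str) -> bool:
    """Count a claim as factual only if every decomposed assertion is supported."""
    parsed = json.loads(judge_response)
    return all(a["label"] == "supported" for a in parsed["assertions"])


if __name__ == "__main__":
    transcript = "Agent: How can I help?\nCustomer: My invoice was charged twice this month."
    claim = "The customer is frustrated about a duplicate billing charge."

    prompt = build_judge_prompt(transcript, claim)

    # Stand-in for an actual LLM call; a real judge would return JSON like this.
    fake_response = json.dumps({
        "assertions": [
            {"text": "The customer reports a duplicate charge.", "label": "supported"},
            {"text": "The customer expresses frustration.", "label": "unsupported"},
        ]
    })
    print(aggregate_verdict(fake_response))  # False: one assertion lacks transcript support
```

In this reading, decomposition turns an interpretive claim into atomic assertions, decoupling keeps each assertion's label independent, and detachment restricts evidence to the transcript itself; the all-or-nothing aggregation at the end is likewise an assumption made for the example.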

Country of Origin
🇨🇦 Canada

Repos / Data Links

Page Count
12 pages

Category
Computer Science:
Computation and Language