Not in Sync: Unveiling Temporal Bias in Audio Chat Models
By: Jiayu Yao, Shenghua Liu, Yiwei Wang, and more
Potential Business Impact:
Reveals and measures AI's timing mistakes in understanding sounds.
Large Audio Language Models (LALMs) are increasingly applied to audio understanding and multimodal reasoning, yet their ability to locate when events occur remains underexplored. We present the first systematic study of temporal bias in LALMs, revealing a key limitation in their timestamp prediction. For example, when asked "At which second does the lecturer introduce the key formula?", models often predict timestamps that are consistently earlier or later than the ground truth. Through controlled experiments on timestamped datasets, we find that temporal bias (i) is prevalent across datasets and models, (ii) increases with audio length, even accumulating to tens of seconds in extended recordings, and (iii) varies across event types and positions. We quantify this effect with the Temporal Bias Index (TBI), which measures systematic misalignment in predicted event timings, and complement it with a visualization framework. Our findings highlight a fundamental limitation in current LALMs and call for the development of temporally robust architectures.
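The abstract does not spell out how the Temporal Bias Index is computed. As a minimal sketch, assuming TBI is the mean signed error between predicted and ground-truth timestamps (a hypothetical definition; the paper's exact formula may differ), it could look like this:

```python
import numpy as np

def temporal_bias_index(predicted, ground_truth):
    """Mean signed error between predicted and true event timestamps (seconds).

    Positive values mean the model systematically answers later than the
    event actually occurs; negative values mean earlier. This is a
    hypothetical formulation, not the paper's confirmed definition.
    """
    predicted = np.asarray(predicted, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)
    return float(np.mean(predicted - ground_truth))

# Example: a model that consistently answers about 2 seconds early.
pred = [10.1, 34.0, 57.8]   # model's answers to "At which second ...?"
true = [12.0, 36.5, 59.9]   # annotated ground-truth timestamps
print(temporal_bias_index(pred, true))  # ~ -2.17 -> systematic early bias
```

A signed mean (rather than absolute error) is what distinguishes a consistent directional bias from mere imprecision: random early/late errors cancel toward zero, while the systematic drift the paper describes does not.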
Similar Papers
TimeAudio: Bridging Temporal Gaps in Large Audio-Language Models
Sound
Helps computers understand exact moments in audio.
When Audio and Text Disagree: Revealing Text Bias in Large Audio-Language Models
Computation and Language
AI ignores sounds when text disagrees.
MedVoiceBias: A Controlled Study of Audio LLM Behavior in Clinical Decision-Making
Computation and Language
Voice changes how computers give medical advice.