How Order-Sensitive Are LLMs? OrderProbe for Deterministic Structural Reconstruction
By: Yingjie He, Zhaolu Kang, Kehan Jiang, and more
Large language models (LLMs) excel at semantic understanding, yet their ability to reconstruct internal structure from scrambled inputs remains underexplored. Sentence-level restoration is ill-posed for automated evaluation because multiple valid word orders often exist. We introduce OrderProbe, a deterministic benchmark for structural reconstruction using fixed four-character expressions in Chinese, Japanese, and Korean, which have a unique canonical order and thus support exact-match scoring. We further propose a diagnostic framework that evaluates models beyond recovery accuracy, including semantic fidelity, logical validity, consistency, robustness sensitivity, and information density. Experiments on twelve widely used LLMs show that structural reconstruction remains difficult even for frontier systems: zero-shot recovery frequently falls below 35%. We also observe a consistent dissociation between semantic recall and structural planning, suggesting that structural robustness is not an automatic byproduct of semantic competence.
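The paper's evaluation harness is not reproduced here, but the abstract describes a simple shuffle-and-compare protocol: because each fixed four-character expression has exactly one canonical character order, the original string is the only correct reconstruction, so exact-match scoring is well-posed. Below is a minimal sketch of that idea in Python. The function names (scramble, exact_match_accuracy) and the example idiom are illustrative assumptions, not taken from the paper.

import random

def scramble(idiom: str, seed: int = 0) -> str:
    """Return a shuffled permutation of a fixed four-character expression.

    Since the canonical order is unique, the unscrambled form is the
    only valid answer, which makes exact-match scoring deterministic.
    """
    chars = list(idiom)
    rng = random.Random(seed)
    for _ in range(100):               # retry so the output actually differs from the input
        rng.shuffle(chars)
        candidate = "".join(chars)
        if candidate != idiom:
            return candidate
    return candidate                   # degenerate case: all characters identical

def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Zero-shot recovery rate: fraction of outputs identical to the canonical form."""
    assert len(predictions) == len(references)
    hits = sum(p.strip() == r for p, r in zip(predictions, references))
    return hits / len(references)

# Illustrative usage with a Chinese four-character expression (chengyu).
idiom = "画蛇添足"
print(scramble(idiom, seed=42))                        # some non-identical permutation of the four characters
print(exact_match_accuracy(["画蛇添足"], [idiom]))      # 1.0

A model would be prompted with the scrambled string and asked to restore the canonical order; exact_match_accuracy then scores its outputs against the references. The paper's finer-grained diagnostics (semantic fidelity, logical validity, consistency, robustness sensitivity, information density) would sit on top of this basic protocol and are not sketched here.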
Similar Papers
From Brute Force to Semantic Insight: Performance-Guided Data Transformation Design with LLMs
CV and Pattern Recognition
Helps computers write better code automatically.
When Words Change the Model: Sensitivity of LLMs for Constraint Programming Modelling
Artificial Intelligence
Computers struggle to solve problems with new words.