CLaS-Bench: A Cross-Lingual Alignment and Steering Benchmark
By: Daniil Gurgurov, Yusser Al Ghussin, Tanja Baeumel, and more
Understanding and controlling the behavior of large language models (LLMs) is an increasingly important topic in multilingual NLP. Beyond prompting or fine-tuning, activation steering, i.e., manipulating internal representations during inference, has emerged as a more efficient and interpretable technique for adapting models to a target language. Yet, no dedicated benchmarks or evaluation protocols exist to quantify the effectiveness of steering techniques. We introduce CLaS-Bench, a lightweight parallel-question benchmark for evaluating language-forcing behavior in LLMs across 32 languages, enabling systematic evaluation of multilingual steering methods. We evaluate a broad array of steering techniques, including residual-stream DiffMean interventions, probe-derived directions, language-specific neurons, PCA/LDA vectors, Sparse Autoencoders, and prompting baselines. Steering performance is measured along two axes, language control and semantic relevance, combined into a single harmonic-mean steering score. We find that, across languages, the simple residual-based DiffMean method consistently outperforms all other methods. Moreover, a layer-wise analysis reveals that language-specific structure emerges predominantly in later layers, and that steering directions cluster by language family. CLaS-Bench is the first standardized benchmark for multilingual steering, enabling both rigorous scientific analysis of language representations and practical evaluation of steering as a low-cost adaptation alternative.
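To make the two core ingredients concrete, the sketch below illustrates (a) a DiffMean direction computed as the difference of mean residual-stream activations between target- and source-language prompts, (b) a forward hook that adds the scaled direction during inference, and (c) the harmonic-mean steering score. This is a minimal PyTorch sketch under assumed conventions, not the paper's implementation: the layer index, the scale `alpha`, and the hook mechanics for a typical HuggingFace decoder-only model (layer outputs as tuples with hidden states first) are all illustrative assumptions.

```python
import torch

def diffmean_direction(acts_target: torch.Tensor, acts_source: torch.Tensor) -> torch.Tensor:
    """DiffMean: difference of mean activations at a chosen layer.

    acts_target / acts_source: (num_prompts, hidden_dim) residual-stream
    activations collected from target- and source-language prompts.
    """
    direction = acts_target.mean(dim=0) - acts_source.mean(dim=0)
    return direction / direction.norm()  # unit-normalize so alpha sets the scale

def make_steering_hook(direction: torch.Tensor, alpha: float = 8.0):
    """Forward hook adding the scaled direction to the residual stream.

    alpha is an illustrative steering strength, not a value from the paper.
    """
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * direction.to(device=hidden.device, dtype=hidden.dtype)
        return (hidden, *output[1:]) if isinstance(output, tuple) else hidden
    return hook

def steering_score(language_control: float, semantic_relevance: float) -> float:
    """Harmonic mean of the two evaluation axes (the combined steering score)."""
    denom = language_control + semantic_relevance
    return 2 * language_control * semantic_relevance / denom if denom else 0.0

# Hypothetical wiring for a HuggingFace decoder-only model
# (the layer index 20 is an assumption, not the benchmark's choice):
# handle = model.model.layers[20].register_forward_hook(make_steering_hook(d))
# ... run generation with the steered model, then:
# handle.remove()
```

Unit-normalizing the direction decouples its geometry from its magnitude, so a single scalar `alpha` can be swept per layer when tuning the trade-off between language control and semantic relevance.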