Precise Attribute Intensity Control in Large Language Models via Targeted Representation Editing

Published: October 14, 2025 | arXiv ID: 2510.12121v1

By: Rongzhi Zhang, Liqin Ye, Yuzhao Heng and more

Potential Business Impact:

Makes AI write exactly what you want.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Precise attribute intensity control, that is, generating Large Language Model (LLM) outputs with specific, user-defined attribute intensities, is crucial for AI systems that must adapt to diverse user expectations. Current LLM alignment methods, however, typically provide only directional or open-ended guidance and fail to reliably achieve exact attribute intensities. We address this limitation with three key designs: (1) reformulating precise attribute intensity control as a target-reaching problem rather than simple maximization; (2) training a lightweight value function via temporal-difference learning to predict final attribute intensity scores from partial generations, thereby steering LLM outputs; and (3) employing gradient-based interventions on hidden representations to navigate the model precisely towards specific attribute intensity targets. Our method enables fine-grained, continuous control over attribute intensities, moving beyond simple directional alignment. Experiments on LLaMA-3.2-3b and Phi-4-mini confirm our method's ability to steer text generation to user-specified attribute intensities with high accuracy. Finally, we demonstrate efficiency enhancements across three downstream tasks: preference data synthesis, Pareto frontier approximation and optimization, and distillation of aligned behaviors for intervention-free inference. Our code is available at https://github.com/Pre-Control/pre-control
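The abstract's three designs can be illustrated with a toy sketch. This is not the authors' implementation, and it does not reflect the code in their repository. It assumes a linear value head over a small hidden vector, and every name in it (`value`, `td0_update`, `steer_to_target`) is hypothetical. It shows one TD(0) update that pulls the value estimate toward the final attribute score, and a gradient-based edit of the hidden vector that minimizes the squared distance to a target score rather than maximizing the score.

```python
from typing import List, Optional, Sequence, Tuple


def value(w: Sequence[float], b: float, h: Sequence[float]) -> float:
    """Hypothetical linear value head: predicted final attribute score for state h."""
    return sum(wi * hi for wi, hi in zip(w, h)) + b


def td0_update(
    w: Sequence[float],
    b: float,
    h_t: Sequence[float],
    h_next: Optional[Sequence[float]] = None,
    final_score: float = 0.0,
    lr: float = 0.05,
) -> Tuple[List[float], float]:
    """One TD(0) step with no intermediate reward.

    Non-terminal step: bootstrap from V(h_next).
    Terminal step (h_next is None): regress toward the observed final score.
    """
    td_target = final_score if h_next is None else value(w, b, h_next)
    delta = td_target - value(w, b, h_t)  # TD error
    new_w = [wi + lr * delta * hi for wi, hi in zip(w, h_t)]
    return new_w, b + lr * delta


def steer_to_target(
    w: Sequence[float],
    b: float,
    h: Sequence[float],
    target: float,
    step: float = 0.1,
    tol: float = 1e-3,
    max_iters: int = 100,
) -> List[float]:
    """Edit h by gradient descent on the target-reaching loss (V(h) - target)^2."""
    h = list(h)
    for _ in range(max_iters):
        err = value(w, b, h) - target
        if abs(err) < tol:
            break
        # For a linear V, d/dh (V(h) - target)^2 = 2 * err * w.
        h = [hi - step * 2.0 * err * wi for hi, wi in zip(h, w)]
    return h


# Toy usage: steer a made-up 3-dimensional hidden state to a target score of 3.0.
w_demo, b_demo = [0.5, -0.25, 1.0], 0.0
h_demo = [1.0, 2.0, 0.5]
h_steered = steer_to_target(w_demo, b_demo, h_demo, target=3.0)
print(value(w_demo, b_demo, h_demo), value(w_demo, b_demo, h_steered))
```

Because the loss is a squared distance to the target, the same routine moves the score up or down depending on which side of the target the current prediction falls. That is the difference between target reaching and simple maximization. In a real model the value head would be a learned network over transformer hidden states and the gradient would come from automatic differentiation. Both are outside the scope of this sketch.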

AttriCtrl: Fine-Grained Control of Aesthetic Attribute Intensity in Diffusion Models

CV and Pattern Recognition

Controls how pretty pictures look, exactly how you want.

4 Aug 2025

86%

Activation Steering for Bias Mitigation: An Interpretable Approach to Safer LLMs

Artificial Intelligence

Fixes AI to stop saying unfair or wrong things.

12 Aug 2025

85%

From Passive to Persuasive: Steering Emotional Nuance in Human-AI Negotiation

Computation and Language

Makes AI sound more happy and personal.

16 Nov 2025

Page Count

28 pages
