IF-CRITIC: Towards a Fine-Grained LLM Critic for Instruction-Following Evaluation

Published: November 2, 2025 | arXiv ID: 2511.01014v1

By: Bosi Wen, Yilin Niu, Cunxiang Wang and more

Potential Business Impact:

Helps computers follow instructions better and faster.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Instruction following is a fundamental ability of Large Language Models (LLMs): generated outputs must satisfy the multiple constraints imposed by the input instructions. Numerous studies have attempted to enhance this ability through preference optimization or reinforcement learning based on reward signals from LLM-as-a-Judge. However, existing evaluation models for instruction following still have notable deficiencies, such as substantial costs and unreliable assessments. To this end, we propose IF-CRITIC, an LLM critic that provides efficient and reliable assessments of whether the constraints in an instruction are followed. We first develop a checklist generator to decompose instructions and generate constraint checklists. With the assistance of the checklists, we collect high-quality critique training data through a multi-stage critique filtering mechanism and employ a constraint-level preference optimization method to train IF-CRITIC. Extensive experiments demonstrate that the evaluation performance of IF-CRITIC can beat strong LLM-as-a-Judge baselines, including DeepSeek-R1 and o4-mini. With the scalable reward signals provided by IF-CRITIC, LLMs can achieve substantial performance gains in instruction-following optimization at lower computational overhead than with strong LLM critic baselines.
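
To make the idea of checklist-based, constraint-level assessment concrete, here is a minimal Python sketch. It is not the authors' implementation. The `Constraint` class, the rule-based checker functions, and the mean-pass-rate `reward` are all assumptions chosen for illustration; in the paper the per-constraint judgments come from a trained LLM critic, and the checklist from a checklist generator.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Constraint:
    """One checklist item: a description plus a hypothetical checker."""
    description: str
    # Stand-in for a critic's per-constraint judgment (assumption: rule-based).
    check: Callable[[str], bool]


def critique(response: str, checklist: List[Constraint]) -> List[bool]:
    """Return one pass/fail judgment per constraint in the checklist."""
    return [c.check(response) for c in checklist]


def reward(judgments: List[bool]) -> float:
    """Aggregate judgments into a scalar (assumption: mean pass rate)."""
    return sum(judgments) / len(judgments) if judgments else 0.0


# A toy checklist for an instruction such as:
# "Explain checklists in at most 20 words, ending with a period."
checklist = [
    Constraint("At most 20 words", lambda r: len(r.split()) <= 20),
    Constraint("Mentions 'checklist'", lambda r: "checklist" in r.lower()),
    Constraint("Ends with a period", lambda r: r.strip().endswith(".")),
]

response = "A checklist splits an instruction into separate constraints"
judgments = critique(response, checklist)
score = reward(judgments)

for c, ok in zip(checklist, judgments):
    print(f"{'PASS' if ok else 'FAIL'}: {c.description}")
print(f"reward = {score:.2f}")
```

Judging each constraint separately shows which requirement failed (here, the missing final period) rather than returning a single opaque score, which is the kind of fine-grained signal the abstract describes.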

Rubric-Based Benchmarking and Reinforcement Learning for Advancing LLM Instruction Following

Computation and Language

Teaches AI to follow tricky, multi-step directions.

13 Nov 2025 1

89%

A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models

Computation and Language

Teaches computers to follow instructions better.

12 May 2025 1

88%

Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions?

Computation and Language

Teaches AI to follow tricky, unexpected orders.

4 Sep 2025 2

Country of Origin

🇨🇳 China

Repos / Data Links

github.com github.com

Page Count

21 pages
