VLegal-Bench: Cognitively Grounded Benchmark for Vietnamese Legal Reasoning of Large Language Models

Published: December 16, 2025 | arXiv ID: 2512.14554v1

By: Nguyen Tien Dong, Minh-Anh Nguyen, Thanh Dat Hoang and more

Potential Business Impact:

Tests AI's understanding of Vietnamese laws.

Business Areas:

Legal Tech Professional Services

The rapid advancement of large language models (LLMs) has enabled new possibilities for applying artificial intelligence within the legal domain. Nonetheless, the complexity, hierarchical organization, and frequent revisions of Vietnamese legislation make it difficult to evaluate how well these models interpret and apply legal knowledge. To address this gap, the authors introduce the Vietnamese Legal Benchmark (VLegal-Bench), the first comprehensive benchmark designed to systematically assess LLMs on Vietnamese legal tasks. Informed by Bloom's cognitive taxonomy, VLegal-Bench covers multiple levels of legal understanding through tasks designed to reflect practical usage scenarios. The benchmark comprises 10,450 samples generated through a rigorous annotation pipeline in which legal experts label and cross-validate each instance using the authors' annotation system. This process ensures that every sample is grounded in authoritative legal documents and mirrors real-world legal assistant workflows, including general legal questions and answers, retrieval-augmented generation, multi-step reasoning, and scenario-based problem solving tailored to Vietnamese law. By providing a standardized, transparent, and cognitively informed evaluation framework, VLegal-Bench establishes a solid foundation for assessing LLM performance in Vietnamese legal contexts and supports the development of more reliable, interpretable, and ethically aligned AI-assisted legal systems.

91%

LaoBench: A Large-Scale Multidimensional Lao Benchmark for Large Language Models

Computation and Language

Tests AI's understanding of the Lao language.

14 Nov 2025 1

90%

VLQA: The First Comprehensive, Large, and High-Quality Vietnamese Dataset for Legal Question Answering

Computation and Language

Helps computers understand Vietnamese laws better.

26 Jul 2025 2

Page Count

21 pages
