Classifying German Language Proficiency Levels Using Large Language Models
By: Elias-Leander Ahlers, Witold Brunsmann, Malte Schilling
Potential Business Impact:
Helps teachers know how well students read German.
Assessing language proficiency is essential for education, as it enables instruction tailored to learners' needs. This paper investigates the use of Large Language Models (LLMs) for automatically classifying German texts into proficiency levels defined by the Common European Framework of Reference for Languages (CEFR). To support robust training and evaluation, we construct a diverse dataset by combining multiple existing CEFR-annotated corpora with synthetic data. We then evaluate prompt-engineering strategies, fine-tuning of a LLaMA-3-8B-Instruct model, and a probing-based approach that uses the internal neural states of the LLM for classification. Our results show a consistent performance improvement over prior methods, highlighting the potential of LLMs for reliable and scalable CEFR classification.
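To make the probing-based approach concrete, here is a minimal sketch of how such a setup is commonly built: a hidden-state representation of each text is extracted from the LLM and a lightweight classifier is trained on top. The layer choice, mean pooling, and logistic-regression probe are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical probing sketch: embed each German text with a frozen LLM,
# then fit a simple classifier that maps embeddings to CEFR labels.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "meta-llama/Meta-Llama-3-8B-Instruct"  # model named in the paper
PROBE_LAYER = 16  # assumption: a mid-depth layer; the paper may probe others

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    torch_dtype=torch.float16,
    device_map="auto",
    output_hidden_states=True,
)
model.eval()

@torch.no_grad()
def embed(text: str) -> torch.Tensor:
    """Mean-pool one layer's hidden states into a single feature vector."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True).to(model.device)
    hidden = model(**inputs).hidden_states[PROBE_LAYER]  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0).float().cpu()

def train_probe(texts: list[str], levels: list[str]) -> LogisticRegression:
    """Fit a probe on (text, CEFR label) pairs, e.g. labels "A1".."C2"."""
    features = torch.stack([embed(t) for t in texts]).numpy()
    return LogisticRegression(max_iter=1000).fit(features, levels)
```

The appeal of this design is that the LLM stays frozen: only the small probe is trained, which is far cheaper than full fine-tuning while still exploiting the model's internal representations of the text.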
Similar Papers
Complementary Learning Approach for Text Classification using Large Language Models
Computation and Language
Helps people and computers work together better.
Testing Low-Resource Language Support in LLMs Using Language Proficiency Exams: the Case of Luxembourgish
Computation and Language
Helps computers understand less common languages better.
CEFR-Annotated WordNet: LLM-Based Proficiency-Guided Semantic Database for Language Learning
Computation and Language
Helps language learners by sorting words by difficulty.