Privacy-Preserved Automated Scoring using Federated Learning for Educational Research
By: Ehsan Latif, Xiaoming Zhai
Potential Business Impact:
Schools can jointly score test answers without sharing student data.
Data privacy remains a critical concern in educational research, requiring strict adherence to ethical standards and regulatory protocols. While traditional approaches rely on anonymization and centralized data collection, they often expose raw student data to security vulnerabilities and impose substantial logistical overhead. In this study, we propose a federated learning (FL) framework for automated scoring of educational assessments that eliminates the need to share sensitive data across institutions. Our approach leverages parameter-efficient fine-tuning of large language models (LLMs) with Low-Rank Adaptation (LoRA), enabling each client (school) to train locally while sharing only optimized model updates. To address data heterogeneity, we implement an adaptive weighted aggregation strategy that considers both client performance and data volume. We benchmark our model against two state-of-the-art FL methods and a centralized learning baseline using NGSS-aligned multi-label science assessment data from nine middle schools. Results show that our model achieves the highest accuracy (94.5%) among FL approaches and performs within 0.5-1.0 percentage points of the centralized model on this metric. Additionally, it achieves comparable rubric-level scoring accuracy, with only a 1.3% difference in rubric match and a lower score deviation (MAE), highlighting its effectiveness in preserving both prediction quality and interpretability.
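To illustrate the aggregation idea described in the abstract, below is a minimal sketch of a server-side step that blends data-volume and client-performance weights when averaging the shared LoRA updates. The function name `aggregate_lora_updates`, the mixing coefficient `alpha`, and the use of local validation accuracy as the performance signal are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def aggregate_lora_updates(client_updates, client_accuracies, client_sizes, alpha=0.5):
    """Hypothetical performance- and size-weighted aggregation of LoRA deltas.

    client_updates: list of dicts mapping parameter name -> LoRA delta (np.ndarray)
    client_accuracies: list of local validation accuracies in [0, 1]
    client_sizes: list of local training-set sizes
    alpha: assumed coefficient mixing data-volume and performance weights
    """
    sizes = np.asarray(client_sizes, dtype=float)
    accs = np.asarray(client_accuracies, dtype=float)

    # Normalize each signal so each set of weights sums to 1.
    size_w = sizes / sizes.sum()
    perf_w = accs / accs.sum()

    # Blend the two weightings; alpha=1 recovers plain FedAvg-style size weighting.
    weights = alpha * size_w + (1.0 - alpha) * perf_w

    # Weighted average of each shared LoRA parameter across clients.
    aggregated = {}
    for name in client_updates[0]:
        aggregated[name] = sum(w * upd[name] for w, upd in zip(weights, client_updates))
    return aggregated
```

In this sketch, only the low-rank adapter parameters are exchanged and averaged, which matches the abstract's point that clients train locally and share only optimized model updates rather than raw student responses.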
Similar Papers
Federated Learning: A Survey on Privacy-Preserving Collaborative Intelligence
Machine Learning (CS)
Trains computers together without sharing private info.
Can Federated Learning Safeguard Private Data in LLM Training? Vulnerabilities, Attacks, and Defense Evaluation
Machine Learning (CS)
Shows how attackers can steal private info from shared AI training.
Experiences Building Enterprise-Level Privacy-Preserving Federated Learning to Power AI for Science
Distributed, Parallel, and Cluster Computing
Lets AI learn from private data safely.