Disentangling Learning from Judgment: Representation Learning for Open Response Analytics
By: Conrad Borchers, Manit Patel, Seiyon M. Lee and more
Potential Business Impact:
Helps computers grade student answers fairly.
Open-ended responses are central to learning, yet automated scoring often conflates what students wrote with how teachers grade. We present an analytics-first framework that separates content signals from rater tendencies, making grading judgments visible and auditable. Using de-identified ASSISTments mathematics responses, we model teacher histories as dynamic priors and derive text representations from sentence embeddings, incorporating centering and residualization to mitigate prompt and teacher confounds. Temporally validated linear models quantify the contribution of each signal, and a projection surfaces model disagreements for qualitative inspection. Results show that teacher priors heavily influence grade predictions; the strongest results arise when priors are combined with content embeddings (AUC 0.815), while content-only models remain above chance but are substantially weaker (AUC 0.626). Adjusting for rater effects sharpens the residual content representation, retaining more informative embedding dimensions and revealing cases where semantic evidence supports understanding rather than surface-level differences in how students respond. The contribution is a practical pipeline that transforms embeddings from mere features into learning analytics for reflection, enabling teachers and researchers to examine where grading practices align (or conflict) with evidence of student reasoning and learning.
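As an illustration only, the three preprocessing ideas named in the abstract (a teacher prior built from past grades, centering by prompt, and residualization against rater signals) might be sketched as below. This is not the authors' implementation: the function names, the smoothing parameters `base_rate` and `strength`, the binary grades, and the random vectors standing in for sentence embeddings are all assumptions made for the example.

```python
import numpy as np


def expanding_teacher_prior(teacher_ids, grades, base_rate=0.5, strength=2.0):
    """Smoothed running mean of each teacher's past grades.

    Rows are assumed to be sorted by time. base_rate and strength are
    hypothetical smoothing parameters, not values from the paper.
    """
    sums, counts = {}, {}
    priors = np.empty(len(grades), dtype=float)
    for i, (t, g) in enumerate(zip(teacher_ids, grades)):
        s, n = sums.get(t, 0.0), counts.get(t, 0)
        # Only grades seen before row i contribute, so the current label never leaks in.
        priors[i] = (s + strength * base_rate) / (n + strength)
        sums[t], counts[t] = s + g, n + 1
    return priors


def center_by_group(X, groups):
    """Subtract each group's mean embedding (here, one group per prompt)."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    out = X.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= X[mask].mean(axis=0)
    return out


def residualize(X, Z):
    """Remove the part of X that is linearly predictable from covariates Z."""
    Z = np.column_stack([np.ones(len(Z)), Z])  # add an intercept column
    beta, *_ = np.linalg.lstsq(Z, X, rcond=None)
    return X - Z @ beta


# Synthetic stand-in data: random vectors play the role of sentence embeddings.
rng = np.random.default_rng(0)
n, d = 200, 8
teacher_ids = rng.integers(0, 5, size=n)
prompt_ids = rng.integers(0, 10, size=n)
grades = rng.integers(0, 2, size=n)
embeddings = rng.normal(size=(n, d))

prior = expanding_teacher_prior(teacher_ids, grades)
centered = center_by_group(embeddings, prompt_ids)
content_resid = residualize(centered, prior.reshape(-1, 1))
```

In this sketch `content_resid` has zero linear correlation with the teacher prior by construction, so a linear model fit on it can only pick up signal that the prior does not already explain. A temporally validated comparison would then fit models on earlier rows and score them on later ones.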
Similar Papers
AI-Enabled grading with near-domain data for scaling feedback with human-level accuracy
Computers and Society
Grades student answers faster than teachers.
Embedding-Based Rankings of Educational Resources based on Learning Outcome Alignment: Benchmarking, Expert Validation, and Learner Performance
Computers and Society
Helps teachers pick best lessons for students.
Scalable and consistent few-shot classification of survey responses using text embeddings
Computation and Language
Helps sort and understand many answers faster.