Fine-Tuning BERT for Domain-Specific Question Answering: Toward Educational NLP Resources at University Scale
By: Aurélie Montfrond
Prior work on scientific question answering has largely emphasized chatbot-style systems, with limited exploration of fine-tuning foundation models for domain-specific reasoning. In this study, we developed a chatbot for the University of Limerick's Department of Electronic and Computer Engineering to provide course information to students. A custom dataset of 1,203 question-answer pairs in SQuAD format was constructed from the university's Book of Modules, supplemented with manually and synthetically generated entries. We fine-tuned BERT (Devlin et al., 2019) using PyTorch and evaluated performance with Exact Match and F1 scores. Results show that even modest fine-tuning improves hypothesis framing and knowledge extraction, demonstrating the feasibility of adapting foundation models to educational domains. While domain-specific BERT variants such as BioBERT and SciBERT exist for biomedical and scientific literature, no foundation model has yet been tailored to university course materials. Our work addresses this gap by showing that fine-tuning BERT on academic QA pairs yields effective results, highlighting the potential to scale toward the first domain-specific QA model for universities and to enable autonomous educational knowledge systems.
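To make the pipeline described above concrete, the sketch below shows one way the three steps could fit together: encoding a SQuAD-format question-answer pair, taking a fine-tuning step with a BERT question-answering head in PyTorch, and scoring a prediction with Exact Match and F1. This is a minimal illustration, not the paper's code; it assumes the Hugging Face Transformers implementation (BertForQuestionAnswering, bert-base-uncased), and the example module text, answer offset, and learning rate are invented for illustration rather than taken from the paper's dataset.

```python
# Minimal sketch (not the paper's code): fine-tuning BERT for extractive QA on a
# SQuAD-format record and scoring with Exact Match / F1. Model name, module text,
# answer offset, and learning rate are illustrative assumptions.
import collections

import torch
from transformers import BertForQuestionAnswering, BertTokenizerFast

# One SQuAD-style entry: a context passage from a book-of-modules page, a question,
# and the answer span given as text plus its character offset into the context.
example = {
    "context": "EE4013 Digital Signal Processing is a 6-credit module offered in semester one.",
    "question": "How many credits is EE4013 worth?",
    "answer_text": "6-credit",
    "answer_start": 38,
}

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForQuestionAnswering.from_pretrained("bert-base-uncased")

def encode(record):
    """Tokenize question + context and map the character-level answer span to token indices."""
    enc = tokenizer(
        record["question"],
        record["context"],
        truncation="only_second",
        max_length=384,
        return_offsets_mapping=True,
        return_tensors="pt",
    )
    offsets = enc.pop("offset_mapping")[0].tolist()
    seq_ids = enc.sequence_ids(0)
    start_char = record["answer_start"]
    end_char = start_char + len(record["answer_text"])
    start_tok = end_tok = 0
    for i, (s, e) in enumerate(offsets):
        if seq_ids[i] != 1:  # skip question and special tokens, keep context tokens
            continue
        if s <= start_char < e:
            start_tok = i
        if s < end_char <= e:
            end_tok = i
    enc["start_positions"] = torch.tensor([start_tok])
    enc["end_positions"] = torch.tensor([end_tok])
    return enc

# Single training step; in practice this loops over the full 1,203-pair dataset for a few epochs.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()
batch = encode(example)
loss = model(**batch).loss
loss.backward()
optimizer.step()

# SQuAD-style metrics (simplified: lower-cased, whitespace-tokenized, no punctuation stripping).
def exact_match(prediction, gold):
    return int(prediction.strip().lower() == gold.strip().lower())

def f1_score(prediction, gold):
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    common = collections.Counter(pred_tokens) & collections.Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("6-credit", "6-credit"), f1_score("a 6-credit module", "6-credit"))
```

A span-prediction head of this kind outputs start and end logits over the context tokens, which is why the character offsets stored in SQuAD-format data must be mapped to token indices before training.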
Similar Papers
FinBERT-QA: Financial Question Answering with pre-trained BERT Language Models
Computation and Language
Helps computers understand money questions better.
Advancing Scientific Text Classification: Fine-Tuned Models with Dataset Expansion and Hard-Voting
Computation and Language
Sorts science papers automatically and more quickly.
Domain-Specific Fine-Tuning and Prompt-Based Learning: A Comparative Study for developing Natural Language-Based BIM Information Retrieval Systems
Information Retrieval
Lets you ask computers about buildings using normal words.