Handwritten Text Recognition of Historical Manuscripts Using Transformer-Based Models
By: Erez Meoded
Potential Business Impact:
Teaches computers to read old handwriting more accurately.
Historical handwritten text recognition (HTR) is essential for unlocking the cultural and scholarly value of archival documents, yet digitization is often hindered by scarce transcriptions, linguistic variation, and highly diverse handwriting styles. In this study, we apply TrOCR, a state-of-the-art transformer-based HTR model, to 16th-century Latin manuscripts authored by Rudolf Gwalther. We investigate targeted image preprocessing and a broad suite of data augmentation techniques, introducing four novel augmentation methods designed specifically for the characteristics of historical handwriting. We also evaluate ensemble learning approaches that combine the complementary strengths of models trained with different augmentations. On the Gwalther dataset, our best single-model augmentation (Elastic) achieves a Character Error Rate (CER) of 1.86, while a top-5 voting ensemble achieves a CER of 1.60, a 50% relative improvement over the best reported TrOCR_BASE result and a 42% improvement over the previous state of the art. These results highlight the impact of domain-specific augmentations and ensemble strategies in advancing HTR performance for historical manuscripts.
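For readers unfamiliar with the metric, the following is a minimal sketch of how a Character Error Rate and a relative improvement are commonly computed. It is not the authors' evaluation code: the function names are hypothetical, the sample strings are made up for illustration, and reporting CER as a percentage of reference characters is an assumption, since the abstract does not state the normalization.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions, and
    substitutions needed to turn hyp into ref."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate as a percentage of reference length
    (assumed convention, not confirmed by the source)."""
    if not ref:
        raise ValueError("reference transcription must be non-empty")
    return 100.0 * levenshtein(ref, hyp) / len(ref)


def relative_improvement(baseline: float, new: float) -> float:
    """Relative reduction in error, in percent."""
    return 100.0 * (baseline - new) / baseline


if __name__ == "__main__":
    # Illustrative strings only, not taken from the Gwalther dataset.
    print(cer("dominus vobiscum", "domimus vobiscum"))
    # Ensemble (1.60) versus best single model (1.86), both from the abstract.
    print(relative_improvement(1.86, 1.60))
```

Applied to the two figures in the abstract, the same formula shows that the ensemble reduces the error of the best single model by roughly 14% in relative terms.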
Similar Papers
Quo Vadis Handwritten Text Generation for Handwritten Text Recognition?
CV and Pattern Recognition
Makes old handwriting easier for computers to read.
HTR-ConvText: Leveraging Convolution and Textual Information for Handwritten Text Recognition
CV and Pattern Recognition
Helps computers read messy handwriting better.
Handwritten Text Recognition for Low Resource Languages
CV and Pattern Recognition
Reads handwritten Hindi and Urdu text better.