SARCH: Multimodal Search for Archaeological Archives
By: Nivedita Sinha, Bharati Khanijo, Sanskar Singh and more
Potential Business Impact:
Finds old book pictures and words faster.
In this paper, we describe a multimodal search system for old archaeological books and reports. This corpus is digitally available as scanned PDFs, but the quality of the scans varies widely. Our pipeline, designed for multimodal archaeological documents, extracts and indexes text, images (classified into maps, photos, layouts, and others), and tables. We evaluated different retrieval strategies, including keyword-based search, embedding-based models, and a hybrid approach that selects optimal results from both. We report and analyze our preliminary results and discuss future work in this exciting vertical.
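The hybrid strategy mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the system selects results from the two retrievers, so everything below is an assumption made for illustration: the toy corpus `DOCS`, the function names, the character-trigram vectors standing in for a learned embedding model, and the rule of keeping each document's better normalized score.

```python
import math
from collections import Counter

# Hypothetical toy corpus; each entry stands in for one indexed text chunk.
DOCS = {
    "d1": "excavation report on the stupa at Sanchi with site map",
    "d2": "pottery fragments found during excavations near the river",
    "d3": "annual report of museum accounts",
}


def tokenize(text):
    return text.lower().split()


def keyword_scores(query, docs):
    """Keyword retrieval: count exact query-term occurrences per document."""
    q_terms = set(tokenize(query))
    scores = {}
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        scores[doc_id] = float(sum(counts[t] for t in q_terms))
    return scores


def trigram_vector(text):
    """Stand-in for a learned embedding (assumption): character-trigram counts."""
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def embedding_scores(query, docs):
    """Vector retrieval: cosine similarity between query and document vectors."""
    q_vec = trigram_vector(query)
    return {doc_id: cosine(q_vec, trigram_vector(text)) for doc_id, text in docs.items()}


def normalize(scores):
    """Scale scores to [0, 1] so the two retrievers are comparable."""
    top = max(scores.values(), default=0.0)
    return {k: (v / top if top > 0 else 0.0) for k, v in scores.items()}


def hybrid_search(query, docs, top_k=2):
    """Assumed selection rule: keep each document's better normalized score."""
    kw = normalize(keyword_scores(query, docs))
    emb = normalize(embedding_scores(query, docs))
    combined = {doc_id: max(kw[doc_id], emb[doc_id]) for doc_id in docs}
    # Sort by descending score, breaking ties by document id for determinism.
    ranked = sorted(combined, key=lambda d: (-combined[d], d))
    return ranked[:top_k]
```

With the query `"excavation"`, exact keyword matching finds only `d1`, while the vector side also surfaces `d2` ("excavations"). This is the kind of complementary behavior a hybrid retriever is meant to exploit. A production system would likely replace the term counter with a ranking function such as BM25 and the trigram vectors with a trained embedding model.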
Similar Papers
Retrieval-Augmented Search for Large-Scale Map Collections with ColPali
Information Retrieval
Find old maps easily with smart search.
Hybrid Retrieval-Augmented Generation for Robust Multilingual Document Question Answering
Digital Libraries
Helps computers answer questions from old, messy papers.
Provenance Analysis of Archaeological Artifacts via Multimodal RAG Systems
Information Retrieval
Helps archaeologists identify ancient objects faster.