Provenance Analysis of Archaeological Artifacts via Multimodal RAG Systems
By: Tuo Zhang, Yuechun Sun, Ruiliang Liu
Potential Business Impact:
Helps archaeologists identify ancient objects faster.
In this work, we present a retrieval-augmented generation (RAG)-based system for provenance analysis of archaeological artifacts, designed to support expert reasoning by integrating multimodal retrieval and large vision-language models (VLMs). The system constructs a dual-modal knowledge base from reference texts and images, enabling raw visual, edge-enhanced, and semantic retrieval to identify stylistically similar objects. Retrieved candidates are synthesized by the VLM to generate structured inferences, including chronological, geographical, and cultural attributions, alongside interpretive justifications. We evaluate the system on a set of Eastern Eurasian Bronze Age artifacts from the British Museum. Expert evaluation demonstrates that the system produces meaningful and interpretable outputs, offering scholars concrete starting points for analysis and significantly alleviating the cognitive burden of navigating vast comparative corpora.
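The abstract describes fusing three retrieval channels (raw visual, edge-enhanced, and semantic/text) over a dual-modal knowledge base before handing candidates to the VLM. The paper does not publish its fusion rule, so the sketch below is a minimal, hypothetical version: each reference artifact stores one embedding per channel, channel similarities are combined with fixed weights, and the top-k stylistic neighbors are returned. The class name `DualModalKB`, the weights, and the weighted-sum fusion are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors (small epsilon avoids /0)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

class DualModalKB:
    """Toy knowledge base: each reference artifact holds a raw-image vector,
    an edge-map vector, and a text (semantic) vector. Hypothetical sketch,
    not the paper's actual system."""

    def __init__(self):
        self.ids = []
        self.entries = []

    def add(self, artifact_id, image_vec, edge_vec, text_vec):
        self.ids.append(artifact_id)
        self.entries.append((image_vec, edge_vec, text_vec))

    def retrieve(self, query, k=3, weights=(0.4, 0.3, 0.3)):
        """Fuse per-channel similarities with fixed weights (an assumed
        fusion rule) and return the top-k (id, score) pairs, best first."""
        q_img, q_edge, q_txt = query
        scores = []
        for img, edge, txt in self.entries:
            s = (weights[0] * cosine(q_img, img)
                 + weights[1] * cosine(q_edge, edge)
                 + weights[2] * cosine(q_txt, txt))
            scores.append(s)
        order = np.argsort(scores)[::-1][:k]
        return [(self.ids[i], scores[i]) for i in order]

# Demo with synthetic embeddings standing in for real encoder outputs.
rng = np.random.default_rng(0)
kb = DualModalKB()
vecs = {name: tuple(rng.normal(size=8) for _ in range(3))
        for name in ["bronze_axe", "belt_plaque", "socketed_spear"]}
for name, (iv, ev, tv) in vecs.items():
    kb.add(name, iv, ev, tv)

# Querying with the belt plaque's own embeddings should rank it first.
top = kb.retrieve(vecs["belt_plaque"], k=2)
print(top[0][0])  # → belt_plaque
```

In a real deployment the three vectors would come from separate encoders (e.g. an image encoder on the raw photograph, the same encoder on an edge-filtered version, and a text encoder on catalog descriptions); the retrieved ids and scores would then be passed to the VLM as context for the structured attribution step.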
Similar Papers
Retrieval-Augmented Generation for Natural Language Art Provenance Searches in the Getty Provenance Index
Computation and Language
Helps people search art provenance records by asking questions in plain language.
OMGM: Orchestrate Multiple Granularities and Modalities for Efficient Multimodal Retrieval
Information Retrieval
Helps computers answer questions about pictures more efficiently.
M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG
Computation and Language
Helps computers answer questions about pictures in many languages.