Training Data Attribution for Image Generation using Ontology-Aligned Knowledge Graphs
By: Theodoros Aivalis, Iraklis A. Klampanos, Antonis Troumpoukis, and more
Potential Business Impact:
Shows which pictures helped AI make new art.
As generative models become more powerful, concerns around transparency, accountability, and copyright violations have intensified. Understanding how specific training data contributes to a model's output is critical. We introduce a framework for interpreting generative outputs through the automatic construction of ontology-aligned knowledge graphs (KGs). While automatic KG construction from natural text has advanced, extracting structured, ontology-consistent representations from visual content remains challenging due to the richness and multi-object nature of images. Leveraging multimodal large language models (LLMs), our method extracts structured triples from images, aligned with a domain-specific ontology. By comparing the KGs of generated and training images, we can trace potential influences, enabling copyright analysis, dataset transparency, and interpretable AI. We validate our method through experiments on locally trained models via unlearning, and on large-scale models through a style-specific experiment. Our framework supports the development of AI systems that foster human collaboration and creativity and stimulate curiosity.
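The core comparison step described above can be sketched in miniature. This is not the authors' implementation; it assumes each image's KG has already been reduced to a set of (subject, predicate, object) triples by a multimodal LLM, and it ranks training images by a simple Jaccard overlap with the generated image's KG. The image names and triples below are hypothetical.

```python
# Illustrative sketch: tracing potential training-image influence by
# comparing knowledge graphs as sets of (subject, predicate, object) triples.

def jaccard(kg_a, kg_b):
    """Jaccard similarity between two collections of triples."""
    a, b = set(kg_a), set(kg_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def rank_influences(generated_kg, training_kgs):
    """Rank training images by KG overlap with the generated image's KG."""
    scores = {name: jaccard(generated_kg, kg)
              for name, kg in training_kgs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical ontology-aligned triples extracted from images.
generated = [("woman", "wears", "hat"),
             ("woman", "standsIn", "field"),
             ("sky", "hasColor", "orange")]
training = {
    "img_001": [("woman", "wears", "hat"), ("woman", "standsIn", "field")],
    "img_002": [("dog", "runsIn", "park")],
}
ranking = rank_influences(generated, training)  # img_001 ranks first
```

A real system would use richer graph similarity (e.g. weighting ontology classes or matching subgraphs), but set overlap conveys the idea: training images whose extracted KGs share more structure with the generated image's KG are flagged as likelier influences.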
Similar Papers
Continuous Monitoring of Large-Scale Generative AI via Deterministic Knowledge Graph Structures
Artificial Intelligence
Checks AI for fake answers and bias.
LLM-empowered knowledge graph construction: A survey
Artificial Intelligence
Helps computers understand and organize information better.
Detecting and Mitigating Bias in LLMs through Knowledge Graph-Augmented Training
Computation and Language
Makes smart computer programs fairer and less biased.