MLLM Machine Unlearning via Visual Knowledge Distillation
By: Yuhang Wang, Zhenxing Niu, Haoxuan Ji and more
Potential Business Impact:
Removes unwanted visual knowledge from AI models while keeping their text knowledge.
Recently, machine unlearning approaches have been proposed to remove sensitive information from well-trained large models. However, most existing methods are tailored for LLMs, while MLLM-oriented unlearning remains at an early stage. Inspired by recent studies exploring the internal mechanisms of MLLMs, we propose to disentangle the visual and textual knowledge embedded within MLLMs and introduce a dedicated approach that selectively erases target visual knowledge while preserving textual knowledge. Unlike previous unlearning methods that rely on output-level supervision, our approach introduces a Visual Knowledge Distillation (VKD) scheme, which leverages intermediate visual representations within the MLLM as supervision signals. This design substantially enhances both unlearning effectiveness and model utility. Moreover, since our method fine-tunes only the visual components of the MLLM, it offers significant efficiency advantages. Extensive experiments demonstrate that our approach outperforms state-of-the-art unlearning methods in terms of both effectiveness and efficiency. In addition, we are the first to evaluate the robustness of MLLM unlearning against relearning attacks.
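The two ideas in the abstract, supervising on intermediate visual features rather than on outputs, and updating only the visual components, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the loss, the reference features used for the forget set, or the module names. The mean-squared-error objective, the function names `feature_mse` and `select_trainable`, and the prefixes `vision_tower.` and `projector.` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def feature_mse(student: Sequence[Vector], teacher: Sequence[Vector]) -> float:
    """Mean squared error between two sets of per-token visual features.

    Assumption: MSE stands in for whatever feature-level distillation
    objective the paper actually uses; the abstract does not say.
    """
    if len(student) != len(teacher):
        raise ValueError("token counts differ")
    total, count = 0.0, 0
    for s_tok, t_tok in zip(student, teacher):
        if len(s_tok) != len(t_tok):
            raise ValueError("feature dimensions differ")
        for s, t in zip(s_tok, t_tok):
            total += (s - t) ** 2
            count += 1
    return total / count if count else 0.0


def select_trainable(
    param_names: List[str], visual_prefixes: Sequence[str]
) -> Dict[str, bool]:
    """Mark a parameter trainable only if it belongs to a visual component.

    Assumption: visual components are identified by hypothetical name
    prefixes; real MLLM checkpoints name their modules differently.
    """
    prefixes = tuple(visual_prefixes)
    return {name: name.startswith(prefixes) for name in param_names}


# Hypothetical parameter names, for illustration only.
names = [
    "vision_tower.layer0.weight",
    "projector.linear.weight",
    "language_model.layer0.weight",
]
trainable = select_trainable(names, ["vision_tower.", "projector."])

# One image token with two feature dimensions: student vs. reference features.
loss = feature_mse([[1.0, 2.0]], [[1.0, 4.0]])
```

In this sketch the language-model parameters stay frozen, which is where the efficiency advantage claimed in the abstract would come from; how the reference features are built for the data to be forgotten is left open because the abstract does not describe it.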
Similar Papers
EM-KD: Distilling Efficient Multimodal Large Language Model with Unbalanced Vision Tokens
CV and Pattern Recognition
Makes AI understand pictures better without using more power.
Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation
Computation and Language
Teaches AI to forget private or bad information.
VKnowU: Evaluating Visual Knowledge Understanding in Multimodal LLMs
CV and Pattern Recognition
Teaches computers to understand how the world works.