MemeBLIP2: A novel lightweight multimodal system to detect harmful memes
By: Jiaqi Liu, Ran Tong, Aowei Shen and more
Potential Business Impact:
Finds mean messages hidden in funny pictures.
Memes often merge visuals with brief text to share humor or opinions, yet some contain harmful messages such as hate speech. In this paper, we introduce MemeBLIP2, a lightweight multimodal system that detects harmful memes by combining image and text features. We build on previous studies by adding modules that align image and text representations in a shared space and fuse them for better classification. Using BLIP-2 as the core vision-language model, we evaluate our system on the PrideMM dataset. The results show that MemeBLIP2 can capture subtle cues in both modalities, even in cases with ironic or culturally specific content, thereby improving the detection of harmful material.
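The align-then-fuse idea described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the class name `ToyMemeClassifier`, the dimensions, the random untrained weights, and the choice of an element-wise product as the fusion step are all assumptions made for illustration. In a real system the input vectors would come from a vision-language model such as BLIP-2; here they are placeholder lists of floats.

```python
import math
import random


def linear(weights, bias, x):
    """Compute y = W x + b, with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyMemeClassifier:
    """Hypothetical align -> fuse -> classify pipeline (untrained)."""

    def __init__(self, image_dim, text_dim, shared_dim, num_classes=2, seed=0):
        rng = random.Random(seed)
        # Alignment: one projection per modality into the same shared space.
        self.w_img = random_matrix(shared_dim, image_dim, rng)
        self.b_img = [0.0] * shared_dim
        self.w_txt = random_matrix(shared_dim, text_dim, rng)
        self.b_txt = [0.0] * shared_dim
        # Classification head applied to the fused representation.
        self.w_cls = random_matrix(num_classes, shared_dim, rng)
        self.b_cls = [0.0] * num_classes

    def predict_proba(self, image_feat, text_feat):
        img = l2_normalize(linear(self.w_img, self.b_img, image_feat))
        txt = l2_normalize(linear(self.w_txt, self.b_txt, text_feat))
        # Fusion (assumed): element-wise product of the aligned vectors.
        fused = [a * b for a, b in zip(img, txt)]
        return softmax(linear(self.w_cls, self.b_cls, fused))


# Placeholder features standing in for image and text encoder outputs.
model = ToyMemeClassifier(image_dim=8, text_dim=6, shared_dim=4)
probs = model.predict_proba([0.1] * 8, [0.2] * 6)
print(probs)  # two class probabilities, e.g. [p_benign, p_harmful]
```

Projecting both modalities to the same dimensionality before fusing is what makes an element-wise combination possible; concatenation or attention-based fusion would be equally valid choices for a sketch like this.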
Similar Papers
Detecting and Mitigating Hateful Content in Multimodal Memes with Vision-Language Models
CV and Pattern Recognition
Changes mean memes into funny ones.
Meme Similarity and Emotion Detection using Multimodal Analysis
CV and Pattern Recognition
Finds what emotions memes make you feel.
Improving Multimodal Hateful Meme Detection Exploiting LMM-Generated Knowledge
CV and Pattern Recognition
Finds mean memes using pictures and words.