Learning Event Completeness for Weakly Supervised Video Anomaly Detection
By: Yu Wang, Shiwei Chen
Potential Business Impact:
Finds anomalous events in videos without needing exact timestamps.
Weakly supervised video anomaly detection (WS-VAD) aims to pinpoint temporal intervals containing anomalous events within untrimmed videos, using only video-level annotations. However, the absence of dense frame-level annotations poses a significant challenge, often leading to incomplete localization in existing WS-VAD methods. To address this issue, we present LEC-VAD, Learning Event Completeness for Weakly Supervised Video Anomaly Detection, a novel method featuring a dual structure that encodes both category-aware and category-agnostic semantics between vision and language. Within LEC-VAD, we devise semantic regularities that leverage an anomaly-aware Gaussian mixture to learn precise event boundaries, thereby yielding more complete event instances. In addition, we develop a novel memory bank-based prototype learning mechanism to enrich the concise text descriptions associated with anomaly-event categories, bolstering the text's expressiveness, which is crucial for advancing WS-VAD. LEC-VAD achieves remarkable improvements over current state-of-the-art methods on two benchmark datasets, XD-Violence and UCF-Crime.
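The memory bank-based prototype learning described above can be illustrated with a minimal sketch. This is not the paper's implementation: the class name, the EMA update rule, and the momentum parameter are assumptions chosen for clarity. The idea shown is generic prototype learning, where each anomaly category keeps a running prototype embedding that is enriched over time by incoming visual or textual features and can be queried by cosine similarity.

```python
import numpy as np

class PrototypeMemoryBank:
    """Illustrative memory bank of per-category prototype embeddings.

    Hypothetical sketch: prototypes are updated with an exponential
    moving average (EMA) and kept L2-normalized so that dot products
    with normalized features equal cosine similarities.
    """

    def __init__(self, num_classes: int, dim: int, momentum: float = 0.9, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.prototypes = rng.normal(size=(num_classes, dim)).astype(np.float32)
        # Normalize each prototype to unit length.
        self.prototypes /= np.linalg.norm(self.prototypes, axis=1, keepdims=True)
        self.momentum = momentum

    def update(self, class_idx: int, feature: np.ndarray) -> None:
        # EMA update toward the new feature, then re-normalize.
        f = feature / (np.linalg.norm(feature) + 1e-8)
        p = self.momentum * self.prototypes[class_idx] + (1.0 - self.momentum) * f
        self.prototypes[class_idx] = p / (np.linalg.norm(p) + 1e-8)

    def similarity(self, feature: np.ndarray) -> np.ndarray:
        # Cosine similarity of a feature to every category prototype.
        f = feature / (np.linalg.norm(feature) + 1e-8)
        return self.prototypes @ f
```

In practice, such similarities could serve as category-aware anomaly scores for video snippets, while the EMA keeps prototypes richer than the short category text descriptions alone.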
Similar Papers
EventVAD: Training-Free Event-Aware Video Anomaly Detection
CV and Pattern Recognition
Finds weird things happening in videos.
Language-guided Open-world Video Anomaly Detection
CV and Pattern Recognition
Teaches computers to spot new, changing bad things.
RefineVAD: Semantic-Guided Feature Recalibration for Weakly Supervised Video Anomaly Detection
CV and Pattern Recognition
Finds weird things in videos by watching motion and meaning.