Novel multimodal contrast learning framework using zero-shot prediction for abnormal behavior recognition
Document Type
Article
Publication Date
1-1-2025
Abstract
Detecting abnormal human behavior is important for ensuring public safety and preventing unwanted incidents. Current recognition systems for abnormal human behavior adopt neural network models and perform standard 1-of-N majority voting procedures. However, recognizing abnormal behaviors remains challenging because video datasets are long and numerous, and because existing methods rely on predefined categories and scenarios. This study proposes a novel method, Visual Text Contrastive Learning (VTCL), for identifying abnormal human behavior in campus settings. The model emphasizes semantic information drawn from automatically labeled property text and videos of abnormal behaviors, moving beyond simple numerical representations. In the visual branch, the method integrates cross-frame and multi-frame methods to improve spatial and temporal performance. In the textual branch, a prompting technique captures the context of abnormal behaviors to enrich supervision with behavioral semantic information. The model then learns visual-text features through contrastive learning. This work also presents a new study of zero-shot campus abnormal behavior recognition (CABR), laying the foundation for highly available and robust CABR across multiple and even new scenarios. The VTCL model achieved a Top-1 accuracy of 86.92% and a Top-5 accuracy of 98.14% on the CABR50 dataset, which comprises fifty abnormal campus behaviors, with competitive computational complexity. Its zero-shot performance was also competitive on additional datasets, including CABRZ6 and UCF-101.
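The abstract does not give the training objective or the inference procedure in detail, so the following is only a minimal sketch of how visual-text contrastive learning and zero-shot prediction are commonly set up. It assumes a CLIP-style symmetric contrastive loss over matched video and text embeddings, which may differ from the objective VTCL actually uses. The function names, the temperature value of 0.07, and the use of plain NumPy arrays in place of real encoder outputs are all illustrative assumptions, not details from the article.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(video_emb, text_emb, temperature=0.07):
    """CLIP-style loss (assumed, not confirmed by the article).

    Row i of video_emb and row i of text_emb form a matched pair;
    every other row in the batch acts as a negative.
    """
    v = l2_normalize(video_emb)
    t = l2_normalize(text_emb)
    logits = v @ t.T / temperature            # (batch, batch) similarity matrix
    idx = np.arange(len(v))
    loss_v2t = -log_softmax(logits, axis=1)[idx, idx].mean()  # video -> text
    loss_t2v = -log_softmax(logits, axis=0)[idx, idx].mean()  # text -> video
    return 0.5 * (loss_v2t + loss_t2v)


def zero_shot_topk(video_emb, class_text_emb, k=5):
    """Rank class prompts by cosine similarity to each video embedding.

    class_text_emb holds one embedding per class prompt; classes unseen
    during training can be added simply by embedding a new prompt.
    """
    sims = l2_normalize(video_emb) @ l2_normalize(class_text_emb).T
    return np.argsort(-sims, axis=1)[:, :k]   # indices of the k best classes
```

Under these assumptions, Top-1 and Top-5 accuracy correspond to checking whether the true class index appears in the first one or first five columns returned by `zero_shot_topk`.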
Keywords
Abnormal behavior, Multimodal learning, Recognition, Semantic information, Zero-shot
Divisions
sch_ecs
Funders
Universiti Malaya, Malaysia (ST018-2023)
Publication Title
Applied Intelligence
Volume
55
Issue
2
Publisher
Springer
Publisher Location
Van Godewijckstraat 30, 3311 GZ Dordrecht, Netherlands