Two-stage complex action recognition framework for real-time surveillance automatic violence detection

Bibliographic record details
Title: Two-stage complex action recognition framework for real-time surveillance automatic violence detection
Authors: Dylan Josh D. Lopez, Cheng‐Chang Lien
Source: Journal of Ambient Intelligence and Humanized Computing. 14:15983-15996
Publisher information: Springer Science and Business Media LLC, 2023.
Publication year: 2023
Subject terms: Pooling, Artificial intelligence, Action (physics), Physics, 02 engineering and technology, 16. Peace & justice, Motion Detection, Computer science, Quantum mechanics, Visual Object Tracking and Person Re-identification, Anomaly Detection in High-Dimensional Data, 5. Gender equality, Artificial Intelligence, Action Recognition, Human Action Recognition and Pose Estimation, Computer Science, Physical Sciences, Multiple Object Tracking, Machine learning, 0202 electrical engineering, electronic engineering, information engineering, Computer vision, Computer Vision and Pattern Recognition, Smoothing
Description: Violent action classification in community-based surveillance is particularly challenging. The ambiguity of violence as a complex action can lead detection models to misclassify violence-related crimes, and it increases the complexity of intelligent surveillance systems, raising operating costs or costing lives. This paper demonstrates a novel approach to automatic violence detection that treats violence as a complex action, mitigating the oversimplification or overgeneralization of detection models. The proposed work supports the notion that violence is a complex action that can be classified by decomposing it into more identifiable actions that human action recognition algorithms recognize easily. A two-stage framework was designed: a two-stream action recognition architecture first detects simple actions that are sub-concepts of violence, and a basic logistic regression layer then classifies those simple actions as complex actions for violence detection. Several configurations were tested, including action silhouettes, varying activation cache sizes, and different pooling methods for post-classification smoothing. The framework was evaluated on accuracy, recall, and operational speed, given its implications for community deployment. The experimental results show that the framework reaches 21 FPS for real-time operation and 11 FPS for non-real-time operation. With the proposed variable caching algorithm, median pooling yields accuracies of 83.08% and 80.50% for non-real-time and real-time operation, respectively. In comparison, max pooling yields recalls of 89.55% and 84.93% for non-real-time and real-time operation, respectively.
This paper shows that complex action decomposition is an appropriate method, as its performance is comparable with existing efforts that have not treated violence as a complex action, implying a new perspective for automatic violence detection in intelligent surveillance systems.
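The post-classification smoothing mentioned in the description can be illustrated with a minimal sketch. It assumes a fixed-size sliding cache of per-frame violence scores, pooled by median or max; the paper's variable caching algorithm is not described in this record, so the function name `smooth_scores`, the cache size, the 0.5 threshold, and the sample scores are all hypothetical.

```python
from collections import deque
from statistics import median


def smooth_scores(scores, cache_size=3, pooling="median"):
    """Pool each new score with the most recent cached scores.

    Assumption: a fixed-size sliding window stands in for the paper's
    variable caching algorithm, which this record does not specify.
    """
    if pooling not in ("median", "max"):
        raise ValueError("pooling must be 'median' or 'max'")
    pool = median if pooling == "median" else max
    cache = deque(maxlen=cache_size)  # oldest score drops out automatically
    smoothed = []
    for score in scores:
        cache.append(score)
        smoothed.append(pool(cache))
    return smoothed


# Hypothetical per-frame violence scores: one isolated spike (index 2),
# then a sustained run of high scores (indices 5 to 7).
scores = [0.1, 0.2, 0.9, 0.1, 0.2, 0.8, 0.9, 0.7]
THRESHOLD = 0.5  # illustrative decision threshold

median_flags = [s >= THRESHOLD for s in smooth_scores(scores, 3, "median")]
max_flags = [s >= THRESHOLD for s in smooth_scores(scores, 3, "max")]
```

In this toy input, median pooling ignores the isolated spike and flags only the sustained run, while max pooling flags every window that contains any high score. That trade-off is at least consistent with the description, where median pooling is reported for accuracy and max pooling for recall.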
Document type: Article
Other literature type
Language: English
ISSN: 1868-5145
1868-5137
DOI: 10.1007/s12652-023-04679-6
DOI: 10.60692/zas43-qma30
DOI: 10.60692/xctcb-6s133
Rights: CC BY
Accession number: edsair.doi.dedup.....dd5f6a19f040367e3a84fd1c93edc71e
Database: OpenAIRE