FAN-TAST-IC: Fast Alarm Notification with Task-Aware Spatio-Temporal Image Classification / Gragnaniello, Diego; Greco, Antonio; Sansone, Carlo; Vento, Bruno. - In: ACM TRANSACTIONS ON MULTIMEDIA COMPUTING, COMMUNICATIONS AND APPLICATIONS. - ISSN 1551-6857. - (2026). [10.1145/3801157]

FAN-TAST-IC: Fast Alarm Notification with Task-Aware Spatio-Temporal Image Classification

Gragnaniello, Diego; Sansone, Carlo; Vento, Bruno
2026

Abstract

Video-based alarm notification systems are essential in safety-critical applications such as fire detection and pedestrian monitoring, because they allow prompt intervention when dangerous situations arise. However, existing approaches often struggle to balance high sensitivity and specificity with real-time performance, particularly in complex scenes. To address these limitations, we propose FAN-TAST-IC, a novel method for fast alarm notification with task-aware spatio-temporal image classification. It is an efficient framework that combines a lightweight task-specific object detector with a pre-trained Vision-Language Model (VLM) encoder and a binary classifier. Unlike end-to-end multimodal systems, FAN-TAST-IC uses the VLM solely as a frozen visual feature extractor, preserving its rich semantic knowledge while ensuring computational efficiency. The object detector performs a coarse but real-time filtering of the frames and spatial positions where objects may be present. This filtering is then refined in one of two ways, depending on the time constraints of the specific application: either by discarding temporally incoherent detections or by confirming those with high confidence. Selecting candidates in this way drastically reduces the input to the VLM and classifier, restricting it to dubious detections only. The classifier is trained on detector-guided positive and negative samples, enabling precise alarm validation that improves specificity and preserves sensitivity without requiring extensive labeled datasets or fine-tuning the VLM. Experiments on fire and pedestrian detection tasks demonstrate the effectiveness of our method: FAN-TAST-IC consistently outperforms all the compared approaches, achieving superior F-scores and precision on challenging benchmarks while maintaining real-time capabilities.
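The cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Detection`, `triage`, and `validate`, the two mode strings, the sliding-window coherence rule, and all threshold values are assumptions made for illustration, and the frozen VLM encoder and binary classifier are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Detection:
    frame: int     # index of the video frame
    score: float   # detector confidence in [0, 1]


def triage(det: Detection, history: Sequence[Detection], mode: str,
           window: int = 5, min_hits: int = 3, high_conf: float = 0.9) -> str:
    """Return 'reject', 'alarm', or 'dubious' (all thresholds are hypothetical)."""
    if mode == "discard_incoherent":
        # Assumed coherence rule: enough distinct frames with a detection
        # inside the last `window` frames, counting the current one.
        frames = {d.frame for d in history
                  if det.frame - window < d.frame <= det.frame}
        frames.add(det.frame)
        return "dubious" if len(frames) >= min_hits else "reject"
    if mode == "confirm_confident":
        # Assumed confirmation rule: very confident detections raise the
        # alarm directly and skip the expensive stage.
        return "alarm" if det.score >= high_conf else "dubious"
    raise ValueError(f"unknown mode: {mode}")


def validate(det: Detection, history: Sequence[Detection], mode: str,
             embed: Callable[[Detection], List[float]],
             classify: Callable[[List[float]], bool]) -> bool:
    """Run the encoder and classifier only on dubious detections."""
    outcome = triage(det, history, mode)
    if outcome == "dubious":
        return bool(classify(embed(det)))
    return outcome == "alarm"


if __name__ == "__main__":
    # Stand-ins for the frozen encoder and the trained binary classifier.
    embed = lambda d: [d.score]
    classify = lambda features: features[0] > 0.6
    history = [Detection(1, 0.5), Detection(2, 0.6)]
    print(validate(Detection(3, 0.7), history, "discard_incoherent", embed, classify))
```

The point of the sketch is the control flow: the cheap detector-side rule decides most cases, so the encoder and classifier run only on the remaining dubious candidates.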

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/1039182