There are numerous applications (e.g., video surveillance, fraud detection, cybersecurity) in which we wish to identify unexplained sets of events. Most related past work has been domain-dependent (e.g., video surveillance, cybersecurity) and has focused on the valuable class of statistical anomalies in which statistically unusual events are considered. In contrast, suppose there is a set A of known activity models (both harmless and harmful) and a log L of time-stamped observations. We define a part L′ ⊆ L of the log to represent an unexplained situation when none of the known activity models can explain L′ with a score exceeding a user-specified threshold. We represent activities via probabilistic penalty graphs (PPGs) and show how a set of PPGs can be combined into one Super-PPG for which we define an index structure. Given a compute cluster of (K+1) nodes (one of which is a master node), we show how to split a Super-PPG into K subgraphs, each of which can be independently processed by a compute node. We provide algorithms for the individual compute nodes to ensure seamless handoffs that maximally leverage parallelism. PADUA is domain-independent and can be applied to many domains (perhaps with some specialization). We conducted detailed experiments with PADUA on two real-world datasets: the ITEA CANDELA video surveillance dataset and a network traffic dataset appropriate for cybersecurity applications. PADUA scales extremely well with the number of processors and significantly outperforms past work in both accuracy and time. Thus, PADUA represents the first parallel architecture and algorithm for identifying unexplained situations in observation data, offering both scalability and accuracy.

PADUA: Parallel architecture to detect unexplained activities / Molinaro, C.; Moscato, Vincenzo; Picariello, Antonio; Pugliese, A.; Rullo, A.; Subrahmanian, V. S. - In: ACM Transactions on Internet Technology. - ISSN 1533-5399. - 14:1(2014). [10.1145/2633685]

PADUA: Parallel architecture to detect unexplained activities

Moscato, Vincenzo; Picariello, Antonio
2014

Files in this record:
TOIT2014.pdf
Type: post-print document
License: private/restricted access (authorized users only)
Size: 1.27 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11588/586485