Assessing Time Coalescence Techniques for the Analysis of Supercomputer Logs / Di Martino, C.; Cinque, Marcello; Cotroneo, Domenico. - (2012), pp. 1-12. (Paper presented at the 42nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’12), held in Boston, USA, in June 2012) [10.1109/DSN.2012.6263946].
Assessing Time Coalescence Techniques for the Analysis of Supercomputer Logs
Cinque, Marcello; Cotroneo, Domenico
2012
Abstract
This paper presents a novel approach to assessing time coalescence techniques. These techniques are widely used to reconstruct the failure process of a system and to estimate dependability measurements from its event logs. The approach is based on automatically generated logs, accompanied by exact knowledge of the ground truth of the failure process. The assessment is conducted by comparing the presumed failure process, reconstructed via coalescence, with the ground truth. We focus on supercomputer logs because of the increasing importance of automatic event log analysis for these systems. Experimental results show that the approach makes it possible to compare different time coalescence techniques and to identify their weaknesses with respect to given system settings. In addition, the results reveal an interesting correlation between errors caused by coalescence and errors in the estimation of dependability measurements.
Index Terms: event log analysis, supercomputer dependability, data coalescence, dependability assessment
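The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not specify which coalescence techniques are assessed, so the sketch assumes the simplest sliding-window scheme: consecutive log entries closer than a window are grouped into one tuple, and each tuple is presumed to be one failure. The function names, the synthetic log, and the two error counts ("truncated" failures split across tuples, "collided" tuples mixing several failures) are hypothetical illustrations, not the paper's tooling or metrics.

```python
from collections import defaultdict


def coalesce(timestamps, window):
    """Assign a tuple index to each sorted timestamp.

    A new tuple starts whenever the gap to the previous entry
    exceeds `window` (assumed sliding-window scheme).
    """
    tuple_ids = []
    current = -1
    previous = None
    for t in timestamps:
        if previous is None or t - previous > window:
            current += 1
        tuple_ids.append(current)
        previous = t
    return tuple_ids


def compare_with_ground_truth(tuple_ids, failure_ids):
    """Compare the presumed failure process with the known one."""
    tuples_per_failure = defaultdict(set)
    failures_per_tuple = defaultdict(set)
    for tup, fail in zip(tuple_ids, failure_ids):
        tuples_per_failure[fail].add(tup)
        failures_per_tuple[tup].add(fail)
    return {
        "true_failures": len(tuples_per_failure),
        "presumed_failures": len(failures_per_tuple),
        # One real failure split across several tuples.
        "truncated_failures": sum(
            1 for s in tuples_per_failure.values() if len(s) > 1
        ),
        # One tuple that merges entries of several real failures.
        "collided_tuples": sum(
            1 for s in failures_per_tuple.values() if len(s) > 1
        ),
    }


# Hypothetical synthetic log: (timestamp in seconds, true failure id).
log = [(0, "A"), (5, "A"), (100, "A"), (300, "B"), (304, "C"), (900, "D")]
timestamps = [t for t, _ in log]
failure_ids = [f for _, f in log]

for window in (10, 200):
    tuple_ids = coalesce(timestamps, window)
    print(window, compare_with_ground_truth(tuple_ids, failure_ids))
```

On this toy log, the short window splits failure A across two tuples, while the long window merges A, B, and C into one tuple and so undercounts failures. This is the kind of window-dependent weakness that a comparison against ground truth can expose.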