Cloud management systems provide abstractions and APIs for programmatically configuring cloud infrastructures. Unfortunately, residual software bugs in these systems can lead to high-severity failures, such as prolonged outages and data loss. In this paper, we investigate the impact of failures in the context of the widespread OpenStack cloud management system, by performing fault injection and by analyzing the resulting failures in terms of fail-stop behavior, failure detection through logging, and failure propagation across components. The analysis shows that most of the failures are not detected and reported in a timely manner; moreover, many of these failures can silently propagate over time and through components of the cloud management system, which calls for more thorough run-time checks and fault containment.

How bad can a bug get? An empirical analysis of software failures in the OpenStack cloud computing platform / Cotroneo, D.; De Simone, L.; Liguori, P.; Natella, R.; Bidokhti, N. - (2019), pp. 200-211. (Paper presented at the 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2019, held in Estonia in 2019) [10.1145/3338906.3338916].

How bad can a bug get? An empirical analysis of software failures in the OpenStack cloud computing platform

Cotroneo, D.; De Simone, L.; Liguori, P.; Natella, R.; Bidokhti, N.
2019

ISBN: 9781450355728


Use this identifier to cite or link to this document: https://hdl.handle.net/11588/766427
Citations
  • PMC: not available
  • Scopus: 45
  • ISI: 42