Enhancing the Analysis of Software Failures in
Cloud Computing Systems with Deep Learning
Domenico Cotroneo, Luigi De Simone, Pietro Liguori, Roberto Natella
DIETI, Università degli Studi di Napoli Federico II, Italy
{cotroneo, luigi.desimone, pietro.liguori, roberto.natella}@unina.it
The 32nd International Symposium on Software Reliability Engineering
ISSRE, October 25 - 28, 2021 pietro.liguori@unina.it - 2
Cloud Computing Infrastructure
• Analyzing how faults can turn into service failures (Failure Mode Analysis) is very difficult and time-consuming, even for expert developers:
  • Huge volumes of data (hundreds of MBs, thousands of events)
  • Large number of fault experiments
  • High complexity, non-determinism
[Figure: clients send service requests to an IaaS cloud; faults (storage, network, software, etc.) turn into failures (data loss, resource unavailable, etc.), and the resulting failure data reaches the sys. admins]
Our case study: OpenStack
[Figure: an instance creation request travels across the OpenStack services Nova, Horizon, Cinder, Neutron, Glance, Keystone, and Swift; workflow steps shown: auth-token validation, get image id, get IP address, volume attachment]
Silent failures occur as omissions, delays, or out-of-order events in these workflows
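The three kinds of silent failure named on this slide can be illustrated with a small check over a recorded workflow. This is only a sketch under stated assumptions: the step order in `EXPECTED_STEPS`, the `(name, timestamp)` event format, the `max_gap` threshold, and the function name `check_workflow` are all hypothetical and are not taken from the authors' tooling.

```python
# Hypothetical step order, assumed from the labels on the slide.
EXPECTED_STEPS = [
    "auth-token validation",
    "get image id",
    "get IP address",
    "volume attachment",
]

def check_workflow(observed, expected=EXPECTED_STEPS, max_gap=5.0):
    """Flag omissions, out-of-order events, and delays in one workflow.

    observed: list of (event_name, timestamp_in_seconds), in arrival order.
    max_gap:  assumed threshold (seconds) between consecutive events.
    """
    names = [name for name, _ in observed]

    # Omission: an expected step never shows up.
    omissions = [step for step in expected if step not in names]

    # Out-of-order: a step arrives after a step that should follow it.
    out_of_order = []
    highest_rank = -1
    for name in names:
        if name not in expected:
            continue
        rank = expected.index(name)
        if rank < highest_rank:
            out_of_order.append(name)
        else:
            highest_rank = rank

    # Delay: the gap from the previous event exceeds the threshold.
    delayed = [
        observed[i][0]
        for i in range(1, len(observed))
        if observed[i][1] - observed[i - 1][1] > max_gap
    ]
    return {"omissions": omissions, "out_of_order": out_of_order, "delayed": delayed}

report = check_workflow([
    ("auth-token validation", 0.0),
    ("get IP address", 1.0),
    ("get image id", 9.0),
])
print(report)
```

In this toy run the volume attachment is omitted, and "get image id" is both late and out of order.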
Events in Fault-Injection Experiments
Contribution
• A novel approach for discovering the classes of failure ("failure modes") of cloud computing systems, using fault injection and deep learning
• Case study on a dataset of thousands of failures of the OpenStack cloud computing platform
• The raw failure data (logs, event traces) are clustered into a few failure modes, for ease of interpretation by developers and sysadmins
Contribution (cont.)
• The failure dataset containing the events collected in OpenStack during our fault-injection experiments is publicly available on GitHub: https://github.com/dessertlab/Failure-Dataset-OpenStack
• The paper is available on ScienceDirect: https://doi.org/10.1016/j.jss.2021.111043
Failure Mode Analysis Based on Plain
Sequences of Events
[Figure: analysis pipeline. Instrumentation of the communication libraries (REST APIs, message queues, …) on each node; execution with fault injection, yielding traces under fault-injected conditions; vector representation of the traces; clustering; visualization of the clusters of failure modes (FAIL #1, FAIL #2, FAIL #3)]
Example: the trace AACABBA has the occurrence vector <A = 4, B = 2, C = 1>, i.e., the events A, B, and C happened 4, 2, and 1 times, respectively, during the failure
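The occurrence-vector example on this slide can be reproduced in a few lines. This is a minimal sketch: the function name `occurrence_vector` and the representation of a trace as a string of single-letter event types are assumptions made for illustration.

```python
from collections import Counter

def occurrence_vector(trace, event_types):
    """Count how many times each event type occurs in a trace.

    trace:       sequence of event types (here, a string of letters).
    event_types: fixed ordering of the event types, so that every
                 trace maps to a vector of the same length.
    """
    counts = Counter(trace)
    return [counts[event] for event in event_types]

print(occurrence_vector("AACABBA", "ABC"))  # [4, 2, 1]
```

Fixing the order of `event_types` across all experiments keeps the vectors comparable, which clustering requires.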
Failure Mode Analysis Based on Anomaly Detection
[Figure: analysis pipeline. Instrumentation of the communication libraries (REST APIs, message queues, …) on each node; fault-free execution, yielding traces under fault-free conditions; execution with fault injection, yielding traces under fault-injected conditions; model training of normal behavior; anomaly detection; clustering; visualization of the clusters of failure modes (FAIL #1, FAIL #2, FAIL #3)]
Example: fault-free traces AABBBBCA, AABBBABCC, AABBABBC; fault-injected trace AACABBA
Anomaly vector: < A = 1, B = 0, C = 1 (spurious anomalies), A = 0, B = 2, C = 1 (missing anomalies) >
Cotroneo, Domenico, et al. "Enhancing failure propagation analysis in cloud computing
systems." 2019 IEEE 30th International Symposium on Software Reliability Engineering
(ISSRE). IEEE, 2019.
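To make the notion of spurious and missing anomalies concrete, the sketch below compares a fault-injected trace against one fault-free trace using a longest common subsequence (LCS). This is a deliberately simplified stand-in, not the trained model of normal behavior referred to on the slide. The function names and the single-reference comparison are assumptions. For the slide's trace AACABBA and the fault-free trace AABBBBCA it yields the same six numbers as the anomaly vector shown above.

```python
from collections import Counter

def lcs(a, b):
    """Return one longest common subsequence of a and b (dynamic programming)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = 1 + dp[i + 1][j + 1]
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out

def anomaly_vector(faulty, reference, event_types):
    """Spurious counts (in faulty, unmatched) followed by missing counts
    (in reference, unmatched), one entry per event type."""
    common = Counter(lcs(faulty, reference))
    spurious = Counter(faulty) - common
    missing = Counter(reference) - common
    return [spurious[e] for e in event_types] + [missing[e] for e in event_types]

print(anomaly_vector("AACABBA", "AABBBBCA", "ABC"))  # [1, 0, 1, 0, 2, 1]
```

A single reference trace cannot absorb the non-determinism of fault-free runs, which is presumably why the pipeline trains a model on several fault-free traces instead.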
Proposed Solution:
Deep Embedded Clustering (DEC)
[Figure: analysis pipeline. Instrumentation of the communication libraries (REST APIs, message queues, …) on each node; execution with fault injection, yielding traces under fault-injected conditions; vector representation of the traces; autoencoder (encoder, decoder, reconstruction error); clustering (the encoder's embedded features feed a cluster layer); visualization of the clusters of failure modes (FAIL #1, FAIL #2, FAIL #3)]
This solution can also be used in combination with anomaly detection, by applying it to anomaly vectors
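The cluster layer in the figure can be illustrated with the soft-assignment step of Deep Embedded Clustering as it is commonly formulated (Student's t similarity to each centroid, a sharpened target distribution, and a KL-divergence loss). The sketch below is plain Python and covers only that step. It assumes the embedded features have already been produced by an encoder; the toy embeddings, centroids, and function names are hypothetical, and the slides do not confirm the authors' exact configuration.

```python
import math

def soft_assignments(embeddings, centroids, alpha=1.0):
    """q[i][j]: Student's t similarity of embedded point i to centroid j,
    normalized so that each row sums to 1."""
    q = []
    for z in embeddings:
        sims = []
        for mu in centroids:
            dist2 = sum((zi - mi) ** 2 for zi, mi in zip(z, mu))
            sims.append((1.0 + dist2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(sims)
        q.append([s / total for s in sims])
    return q

def target_distribution(q):
    """p[i][j]: q squared and divided by the soft cluster frequency,
    then renormalized per row; emphasizes confident assignments."""
    freq = [sum(row[j] for row in q) for j in range(len(q[0]))]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(len(row))]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p

def kl_divergence(p, q):
    """Clustering loss KL(P || Q), summed over points and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0.0
    )

# Toy 2-D embedded features and two centroids (illustrative values only).
embeddings = [[0.5, 0.0], [3.5, 0.0]]
centroids = [[0.0, 0.0], [4.0, 0.0]]
q = soft_assignments(embeddings, centroids)
p = target_distribution(q)
print(q[0], p[0], kl_divergence(p, q))
```

In a full implementation this loss would be minimized with respect to both the encoder weights and the centroids; that training loop is omitted here.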
Experiments
• 2,538 fault-injection experiments on the OpenStack cloud computing platform:
  • 4 fault types
  • 3 workloads (DEPL, NET, STO)

Ground Truth
Failure Mode        DEPL   NET   STO
Instance Failure     224    56   320
Volume Failure       151     -    38
Network Failure       52    30     -
SSH Failure           41   176     -
Cleanup Failure       69     -   157
No Failure           539   299   386
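As a quick consistency check, the ground-truth counts can be summed per workload. The sketch below transcribes the table into a dictionary; treating each "-" cell as zero experiments is an assumption, and the names `GROUND_TRUTH` and `totals_per_workload` are illustrative.

```python
# Ground-truth counts from the table; "-" cells are omitted (assumed zero).
GROUND_TRUTH = {
    "Instance Failure": {"DEPL": 224, "NET": 56, "STO": 320},
    "Volume Failure": {"DEPL": 151, "STO": 38},
    "Network Failure": {"DEPL": 52, "NET": 30},
    "SSH Failure": {"DEPL": 41, "NET": 176},
    "Cleanup Failure": {"DEPL": 69, "STO": 157},
    "No Failure": {"DEPL": 539, "NET": 299, "STO": 386},
}

def totals_per_workload(table):
    """Sum the experiment counts of every failure mode, per workload."""
    totals = {}
    for counts in table.values():
        for workload, n in counts.items():
            totals[workload] = totals.get(workload, 0) + n
    return totals

totals = totals_per_workload(GROUND_TRUTH)
print(totals)                # {'DEPL': 1076, 'NET': 561, 'STO': 901}
print(sum(totals.values()))  # 2538
```

The per-workload totals add up to the 2,538 experiments stated on the slide.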
Clustering without Anomaly Detection
                               Workload
Clustering Approach            DEPL   NET    STO
k-medoids w/o fine-tuning      0.70   0.80   0.80
k-medoids with fine-tuning     0.74   0.85   0.82
DEC                            0.86   0.86   0.92
DEC achieves clusters with higher purity compared to traditional clustering, both without and with manual fine-tuning of feature weights
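Purity, the metric named on this slide, is usually defined as the fraction of samples that belong to the majority ground-truth class of their cluster. The sketch below follows that standard definition. The function name and the toy labels are illustrative, and the slides do not state whether the authors computed it in exactly this way.

```python
from collections import Counter, defaultdict

def purity(cluster_ids, true_labels):
    """Fraction of samples that share the majority true label of their cluster."""
    by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster[cluster][label] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(true_labels)

# Toy example: two clusters over six experiments (labels are illustrative).
clusters = [0, 0, 0, 1, 1, 1]
labels = ["Instance", "Instance", "Volume", "Volume", "Volume", "No Failure"]
print(purity(clusters, labels))  # 4 of 6 samples match their cluster's majority
```

A purity of 1.0 means every cluster contains a single failure mode; note that purity alone does not penalize splitting one failure mode across many clusters.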
Clustering with Anomaly Detection
                               Workload
Clustering Approach            DEPL   NET    STO
k-medoids w/o fine-tuning      0.80   0.78   0.87
k-medoids with fine-tuning     0.94   0.86   0.90
DEC                            0.84   0.83   0.89
DEC approaches the performance of manually-tuned clustering with anomaly detection
Failure Modes Distribution
[Bar chart: vertical axis from 0 to 1800; one group of bars per failure mode (Instance Failure, Volume Failure, Network Failure, SSH Failure, Cleanup Failure, No Failure); series: Ground Truth, k-medoids, k-medoids with fine-tuning, DEC]
Conclusion
• We presented a novel approach for analyzing failure data from cloud systems, using unsupervised learning algorithms and deep learning
• We presented results on failure data from the popular OpenStack cloud computing platform
  • The approach can achieve performance comparable to, or in some cases even better than, that of manually-tuned clustering
  • The approach performs better than unsupervised clustering without feature engineering

Enhancing the Analysis of Software Failures in Cloud Computing Systems with Deep Learning

  • 1. Enhancing the Analysis of Software Failures in Cloud Computing Systems with Deep Learning
       Domenico Cotroneo, Luigi De Simone, Pietro Liguori, Roberto Natella
       DIETI, Università degli Studi di Napoli Federico II, Italy
       {cotroneo, luigi.desimone, pietro.liguori, roberto.natella}@unina.it
       The 32nd International Symposium on Software Reliability Engineering
  • 2. ISSRE, October 25 - 28, 2021 pietro.liguori@unina.it - 2
       Cloud Computing Infrastructure
       Analyzing how faults can turn into service failures (Failure Mode Analysis) is very difficult and time-consuming, even for expert developers:
         • Huge volumes of data (hundreds of MBs, thousands of events)
         • Large number of fault experiments
         • High complexity, non-determinism
       [Diagram labels: Clients, Service requests, IaaS, Faults (storage, network, software, etc.), Failures (data loss, resource unavailable, etc.), Failure Data, Sys. admins]
  • 3. Our case study: OpenStack
       [Diagram: OpenStack components Nova, Horizon, Cinder, Neutron, Glance, Keystone, Swift; workflow events: instance creation request, auth-token validation, get image id, get IP address, volume attachment]
       Silent failures occur as omissions, delays, or out-of-order events in these workflows.
  • 4. Events in Fault-Injection Experiments
  • 5. Contribution
         • A novel approach for discovering the classes of failure ("failure modes") of cloud computing systems, using fault injection and deep learning
         • Case study on a dataset of thousands of failures of the OpenStack cloud computing platform
         • The raw failure data (logs, event traces) are clustered into a few failure modes, which eases interpretation by developers and sysadmins
  • 6. Contribution (cont.)
         • The failure dataset containing the events collected in OpenStack during our fault-injection experiments is publicly available on GitHub: https://github.com/dessertlab/Failure-Dataset-OpenStack
         • The paper is available on ScienceDirect: https://doi.org/10.1016/j.jss.2021.111043
  • 7. Failure Mode Analysis Based on Plain Sequences of Events
       Pipeline stages shown in the diagram: instrumentation of the communication libraries (REST APIs, Message Queues, …) on each node; execution with fault injection, producing traces under fault-injected conditions; vector representation; clustering; visualization.
       Output: clusters of failure modes (FAIL #1, FAIL #2, FAIL #3).
       Example: for the trace AACABBA, the occurrence vector is <A = 4, B = 2, C = 1>, i.e., the events A, B, and C happened 4, 2, and 1 times, respectively, during the failure.
  • 8. Failure Mode Analysis Based on Anomaly Detection
       Pipeline stages shown in the diagram: instrumentation of the communication libraries (REST APIs, Message Queues, …) on each node; fault-free execution, producing traces under fault-free conditions; execution with fault injection, producing traces under fault-injected conditions; model training of normal behavior; anomaly detection; clustering; visualization.
       Example traces shown: AACABBA, AABBBBCA, AABBBABCC, AABBABBC.
       Anomaly vector: < A = 1, B = 0, C = 1, A = 0, B = 2, C = 1 >, with entries for spurious anomalies and for missing anomalies.
       Output: clusters of failure modes (FAIL #1, FAIL #2, FAIL #3).
       Reference: Cotroneo, Domenico, et al. "Enhancing failure propagation analysis in cloud computing systems." 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE). IEEE, 2019.
  • 9. Proposed Solution: Deep Embedded Clustering (DEC)
       Pipeline stages shown in the diagram: instrumentation of the communication libraries (REST APIs, Message Queues, …) on each node; execution with fault injection, producing traces under fault-injected conditions; vector representation; autoencoder (encoder, decoder, reconstruction error); clustering, with a cluster layer applied to the encoder's embedded features; visualization.
       Output: clusters of failure modes (FAIL #1, FAIL #2, FAIL #3).
       This solution can also be used in combination with anomaly detection, by applying it to anomaly vectors.
  • 10. Experiments
       2,538 fault-injection experiments in the OpenStack cloud computing platform:
         • 4 fault types
         • 3 workloads (DEPL, NET, STO)
       Ground truth (experiments per failure mode and workload):
         Failure Mode        DEPL    NET    STO
         Instance Failure     224     56    320
         Volume Failure       151      -     38
         Network Failure       52     30      -
         SSH Failure           41    176      -
         Cleanup Failure       69      -    157
         No Failure           539    299    386
  • 11. Clustering without Anomaly Detection
         Clustering Approach             DEPL    NET    STO
         k-medoids w/o fine-tuning       0.70   0.80   0.80
         k-medoids with fine-tuning      0.74   0.85   0.82
         DEC                             0.86   0.86   0.92
       DEC achieves clusters with higher purity than traditional clustering, both without and with manual fine-tuning of feature weights.
  • 12. Clustering with Anomaly Detection
         Clustering Approach             DEPL    NET    STO
         k-medoids w/o fine-tuning       0.80   0.78   0.87
         k-medoids with fine-tuning      0.94   0.86   0.90
         DEC                             0.84   0.83   0.89
       DEC approaches the performance of manually-tuned clustering with anomaly detection.
  • 13. Failure Modes Distribution
       [Bar chart: number of experiments (axis from 0 to 1800) per failure mode (Instance Failure, Volume Failure, Network Failure, SSH Failure, Cleanup Failure, No Failure); series: Ground Truth, k-medoids, k-medoids with fine-tuning, DEC]
  • 14. Conclusion
         • We presented a novel approach for analyzing failure data from cloud systems, using unsupervised learning algorithms and deep learning
         • We presented results on failure data from the popular OpenStack cloud computing platform:
             • The approach can achieve performance comparable to, or in some cases even better than, that of manually-tuned clustering
             • The approach performs better than unsupervised clustering w/o feature engineering
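
The occurrence-vector example on slide 7 and the purity comparison on slide 11 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `occurrence_vector` and `purity` are hypothetical, the slides do not define purity (the common definition is assumed here: the fraction of samples that fall in the majority ground-truth class of their cluster), and the labels and cluster assignments in the toy example are made up for illustration.

```python
from collections import Counter


def occurrence_vector(trace, alphabet):
    """Count how often each event type in `alphabet` occurs in `trace`.

    `trace` is a sequence of event symbols (here, a string of letters);
    `alphabet` fixes the order of the vector components.
    """
    counts = Counter(trace)
    return [counts[event] for event in alphabet]


def purity(true_labels, cluster_ids):
    """Cluster purity under the assumed standard definition.

    For each cluster, take the size of its most frequent ground-truth
    class; sum these sizes and divide by the total number of samples.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("true_labels and cluster_ids must have equal length")
    if not true_labels:
        raise ValueError("purity is undefined for an empty dataset")

    # Map each cluster id to a histogram of ground-truth labels.
    per_cluster = {}
    for label, cluster in zip(true_labels, cluster_ids):
        per_cluster.setdefault(cluster, Counter())[label] += 1

    majority_total = sum(max(hist.values()) for hist in per_cluster.values())
    return majority_total / len(true_labels)


# Example from slide 7: events A, B, C occur 4, 2 and 1 times.
slide_vector = occurrence_vector("AACABBA", "ABC")

# Toy data (hypothetical, not taken from the dataset): six experiments,
# their ground-truth failure modes, and a made-up cluster assignment.
toy_truth = [
    "Instance Failure", "Instance Failure", "Volume Failure",
    "No Failure", "No Failure", "No Failure",
]
toy_clusters = [0, 0, 0, 1, 1, 1]
toy_purity = purity(toy_truth, toy_clusters)  # 5 of 6 samples match their cluster majority

print(slide_vector)
print(round(toy_purity, 2))
```

Under this definition, a purity of 1.0 means every cluster contains experiments of a single failure mode. Purity tends to rise as the number of clusters grows, so values are only comparable across approaches that use a similar number of clusters.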