SlideShare una empresa de Scribd logo
Failure prediction for
APU's on a Metro
System
Aaryadev Ghosalkar
Agenda
• Introduction
• Problem Statement
• Literature Review
• Existing Systems
• Modelling
• Results
• Conclusion
• Future Scope
Failure prediction for APU's on a Metro System 2
Introduction
Air production unit (APU) is component on most modern metro systems
which circulates compressed air through the metro, our research presents a
comprehensive comparison of PdM models on APU failure detection. Our
goal is to detect failures at least 2 hours in advance, the challenge with APU's
in particular is that APU failures are rare events
In the dataset there are 3 failure within 6 months of operation
Failure prediction for APU's on a Metro System 3
Problem Statement
• Detect failures at least 2 hours in advance
• Create a system that is customizable can expand to more than 2
hours if required (also keeping in mind the side effects that may
arise)
• Provide a way to deploy the model for real time inference
Failure prediction for APU's on a Metro System
4
Literature
Review
Taking a look at what other
researchers have done
Literature Review
• Veloso et al performed the initial data collection, which
involved converting data from the metro company
database into a CSV file
• We used some of the ideas for data preprocessing
presented by the AzureML team at microsoft
• A study on Davari et al motivated many of the algorithms
that we used in this study, there work has been a great
resource in our study
Failure prediction for APU's on a Metro System 6
Literature Review continued
• Chaudhuri et al used SVM to classify vehicles into 3
distinct risk levels, focusing on model interpretability,
model interpretability refers to how easy it is to
understand the choices made by the model and results of
the model
• Various other researchers have used CNN with GAF to
convert timeseries data into images and used CNNs for
classification most notable was done by Silva which
reported an accuracy of 93% in their study
Failure prediction for APU's on a Metro System 7
Literature Review continued
• Nguyen and Medjaher used LSTM to predicted the
probability of failure in a given time window, we drew a lot
of inspiration from their work in terms of balancing the
dataset and the model used
• In terms of early research into RUL estimation a majority
of the work revolves around using statistical models or
models which assume linear degradation pattern, it was
only after 2016 Deep learning models were used in this
field
Failure prediction for APU's on a Metro System 8
Existing
Systems
Predictive maintenance has been
used in a lot of domains such as
elevators and Jet engines.
The Infrastructure abroad
• Uses NLP on text fields where engineers
describe the problem and how the
problem was solved.
• Also predict if a failure is about to happen
and which component will fail
• TFL expects to save £3 million a year
Failure prediction for APU's on a Metro System
10
• TrainDNA collects and analyzes real-
time data from more than 200 trains
across Australia using IBM Maximo
• 51% Increase in realiablity after
introducing the TrainDNA system
• Considers multiple failures
• Large scale system able to process 30
Million message every hour
London Australia and NZL
Modelling
Taking a look at how the data
was processed and what ML
models were tested
Failure prediction for APU's on a Metro System 11
Data Preprocessing
Most of the data collection
and cleaning work on this
data set was done by
researchers that created
the data set
Thus this data set did not
have any null values
Failure prediction for APU's on a Metro System
• Linear Model are a bad
idea
• There is no obvious
pattern which we can
see to distinguish each
class, which further
motivated use of deep
learning
• Balance is key!
Collection EDA
• Discretizing the data to
make this more
suitable for
classification
• Generalizing the model
using a parameter to
control number of
hours before warning.
Preparation
12
Challenges in data processing
• The sheer volume of data presented a significant
challenge, there were more than 10 000 000 (1Cr) rows in
the dataset to solve this we performed carefully changed
the data types of the features to minimize the loss of
information this allowed us to significantly reduce size of
the data with a 91% decrease in storage space and 77%
decrease in RAM usage
Failure prediction for APU's on a Metro System 13
Challenges in Data processing
continued
• The highly imbalanced nature of the data also presented
a huge challenge when training the models, since APU
failures are a rare event and only 3% of the data
constitutes of APU failure a model that predicts all data
points as normal would mathematically have an accuracy
of 97%, to combat this we used Near miss under
sampling and a convenience sampling like strategy for
the LSTM, inspired by Chen et al
Failure prediction for APU's on a Metro System 14
Machine Learning Models
• SVMs take very long to
train
• Results are not
satisfactory and most
other models perform
better
Failure prediction for APU's on a Metro System
• Very quick training time
• The hyper parameters
such as the criterion do
not make any
difference to the
accuracy
SVM Decision Trees
• Provide greater
accuracy than decision
trees due to this being
an ensemble model
• Training time is similar
to decision trees
Random Forests
15
Deep Learning Models
• Used a small neural network of 25k
parameters with AdamW optimizer
• Best results from all tested model
• Does not take into account the time
series nature of the data
• Requires CUDA for efficient
deployment.
Failure prediction for APU's on a Metro System 16
• Can incorporate temporal patterns
• Large model almost 184K
parameters which makes training
difficult.
• Requires 3rd order tensors as input
for training thus balancing data is
hard
• Also requires CUDA for efficient
training and deployment
Neural Network LSTM Network
Results
Accuracy Precision (Class
1)
Recall (Class 1)
SVM 0.57 0.60 0.62
Decision Tree 0.66 0.69 0.70
Random Forest 0.70 0.69 0.78
Neural Network (Adam) 0.72 0.67 0.91
Neural Network (AdamW) 0.76 0.75 0.87
LSTM Network 0.76 0.64 0.85
Failure prediction for APU's on a Metro System 17
Conclusion
• Neural networks produce satisfactory results however the
Random forest model can be used when deploying on
low end hardware or considering an edge solution.
LSTMs perform nearly identical to neural networks with
less data
• Microservices can be considered when deploying as this
will allow different parts of the model to be deployed
separately
• Multi collinearity is present in the data which can makes
it hard to fit any kind of linear model
Failure prediction for APU's on a Metro System 18
Future Scope
• Transformer based architecture can be considered as
they have shown promising results with sequential data in
the case of NLP applications
• Perhaps due to the imbalanced nature and the rarity of
APU failures on a real metro systems an anomaly
detection approach would be better as this would not
need the data to be balanced and a lot more of the
existing data can be used
Failure prediction for APU's on a Metro System 19
Question and
Answer
Aaryadev Ghosalkar
aaryadevg@gmail.com
https://github.com/aaryadevg

Más contenido relacionado

Similar a Failure Prediction for APU on a Metro System

An Approach to Overcome Modeling Inaccuracies for Performance Simulation Sig...
An Approach to Overcome Modeling  Inaccuracies for Performance Simulation Sig...An Approach to Overcome Modeling  Inaccuracies for Performance Simulation Sig...
An Approach to Overcome Modeling Inaccuracies for Performance Simulation Sig...
Pankaj Singh
 
Improving Resource Utilization in Cloud using Application Placement Heuristics
Improving Resource Utilization in Cloud using Application Placement HeuristicsImproving Resource Utilization in Cloud using Application Placement Heuristics
Improving Resource Utilization in Cloud using Application Placement Heuristics
AtakanAral
 
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
Mohsen Sadok
 
Chap2 slides
Chap2 slidesChap2 slides
Chap2 slides
ashishmulchandani
 
Recent and Planned Improvements to the System Advisor Model
Recent and Planned Improvements to the System Advisor ModelRecent and Planned Improvements to the System Advisor Model
Recent and Planned Improvements to the System Advisor Model
Sandia National Laboratories: Energy & Climate: Renewables
 
Ajila (1)
Ajila (1)Ajila (1)
Ajila (1)
akanksha kunwar
 
Approximation techniques used for general purpose algorithms
Approximation techniques used for general purpose algorithmsApproximation techniques used for general purpose algorithms
Approximation techniques used for general purpose algorithms
Sabidur Rahman
 
JOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHM
JOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHMJOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHM
JOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHM
mailjkb
 
Modeling Cardiac Pacemakers With Timed Coloured Petri Nets And Related Tools
Modeling Cardiac Pacemakers With Timed Coloured Petri Nets And Related ToolsModeling Cardiac Pacemakers With Timed Coloured Petri Nets And Related Tools
Modeling Cardiac Pacemakers With Timed Coloured Petri Nets And Related Tools
Mohammed Assiri
 
Lifetime-Aware Scheduling and Power Control for Cellular-based M2M Communicat...
Lifetime-Aware Scheduling and Power Control for Cellular-based M2M Communicat...Lifetime-Aware Scheduling and Power Control for Cellular-based M2M Communicat...
Lifetime-Aware Scheduling and Power Control for Cellular-based M2M Communicat...
amin azari
 
Biomedical Signal and Image Analytics using MATLAB
Biomedical Signal and Image Analytics using MATLABBiomedical Signal and Image Analytics using MATLAB
Biomedical Signal and Image Analytics using MATLAB
CodeOps Technologies LLP
 
Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)
Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)
Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)
Peter Tröger
 
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...
Databricks
 
Real time operating systems
Real time operating systemsReal time operating systems
Real time operating systems
Sri Manakula Vinayagar Engineering College
 
Wide Area Monitoring, Protection and Control (WAMPAC) Application in Transmis...
Wide Area Monitoring, Protection and Control (WAMPAC) Application in Transmis...Wide Area Monitoring, Protection and Control (WAMPAC) Application in Transmis...
Wide Area Monitoring, Protection and Control (WAMPAC) Application in Transmis...
IRJET Journal
 
MCS2SIM - Method Allowing Application of PSA Results in Simulators
MCS2SIM - Method Allowing Application of PSA Results in SimulatorsMCS2SIM - Method Allowing Application of PSA Results in Simulators
MCS2SIM - Method Allowing Application of PSA Results in Simulators
GSE Systems, Inc.
 
KCC2017 28APR2017
KCC2017 28APR2017KCC2017 28APR2017
KCC2017 28APR2017
JEE HYUN PARK
 
Performance of a speculative transmission scheme for scheduling latency reduc...
Performance of a speculative transmission scheme for scheduling latency reduc...Performance of a speculative transmission scheme for scheduling latency reduc...
Performance of a speculative transmission scheme for scheduling latency reduc...
Mumbai Academisc
 
Life in the Fast Lane: A Line-Rate Linear Road
Life in the Fast Lane: A Line-Rate Linear RoadLife in the Fast Lane: A Line-Rate Linear Road
Life in the Fast Lane: A Line-Rate Linear Road
AJAY KHARAT
 
APPROXIMATE ARITHMETIC CIRCUIT DESIGN FOR ERROR RESILIENT APPLICATIONS
APPROXIMATE ARITHMETIC CIRCUIT DESIGN FOR ERROR RESILIENT APPLICATIONSAPPROXIMATE ARITHMETIC CIRCUIT DESIGN FOR ERROR RESILIENT APPLICATIONS
APPROXIMATE ARITHMETIC CIRCUIT DESIGN FOR ERROR RESILIENT APPLICATIONS
VLSICS Design
 

Similar a Failure Prediction for APU on a Metro System (20)

An Approach to Overcome Modeling Inaccuracies for Performance Simulation Sig...
An Approach to Overcome Modeling  Inaccuracies for Performance Simulation Sig...An Approach to Overcome Modeling  Inaccuracies for Performance Simulation Sig...
An Approach to Overcome Modeling Inaccuracies for Performance Simulation Sig...
 
Improving Resource Utilization in Cloud using Application Placement Heuristics
Improving Resource Utilization in Cloud using Application Placement HeuristicsImproving Resource Utilization in Cloud using Application Placement Heuristics
Improving Resource Utilization in Cloud using Application Placement Heuristics
 
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
 
Chap2 slides
Chap2 slidesChap2 slides
Chap2 slides
 
Recent and Planned Improvements to the System Advisor Model
Recent and Planned Improvements to the System Advisor ModelRecent and Planned Improvements to the System Advisor Model
Recent and Planned Improvements to the System Advisor Model
 
Ajila (1)
Ajila (1)Ajila (1)
Ajila (1)
 
Approximation techniques used for general purpose algorithms
Approximation techniques used for general purpose algorithmsApproximation techniques used for general purpose algorithms
Approximation techniques used for general purpose algorithms
 
JOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHM
JOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHMJOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHM
JOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHM
 
Modeling Cardiac Pacemakers With Timed Coloured Petri Nets And Related Tools
Modeling Cardiac Pacemakers With Timed Coloured Petri Nets And Related ToolsModeling Cardiac Pacemakers With Timed Coloured Petri Nets And Related Tools
Modeling Cardiac Pacemakers With Timed Coloured Petri Nets And Related Tools
 
Lifetime-Aware Scheduling and Power Control for Cellular-based M2M Communicat...
Lifetime-Aware Scheduling and Power Control for Cellular-based M2M Communicat...Lifetime-Aware Scheduling and Power Control for Cellular-based M2M Communicat...
Lifetime-Aware Scheduling and Power Control for Cellular-based M2M Communicat...
 
Biomedical Signal and Image Analytics using MATLAB
Biomedical Signal and Image Analytics using MATLABBiomedical Signal and Image Analytics using MATLAB
Biomedical Signal and Image Analytics using MATLAB
 
Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)
Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)
Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)
 
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...
 
Real time operating systems
Real time operating systemsReal time operating systems
Real time operating systems
 
Wide Area Monitoring, Protection and Control (WAMPAC) Application in Transmis...
Wide Area Monitoring, Protection and Control (WAMPAC) Application in Transmis...Wide Area Monitoring, Protection and Control (WAMPAC) Application in Transmis...
Wide Area Monitoring, Protection and Control (WAMPAC) Application in Transmis...
 
MCS2SIM - Method Allowing Application of PSA Results in Simulators
MCS2SIM - Method Allowing Application of PSA Results in SimulatorsMCS2SIM - Method Allowing Application of PSA Results in Simulators
MCS2SIM - Method Allowing Application of PSA Results in Simulators
 
KCC2017 28APR2017
KCC2017 28APR2017KCC2017 28APR2017
KCC2017 28APR2017
 
Performance of a speculative transmission scheme for scheduling latency reduc...
Performance of a speculative transmission scheme for scheduling latency reduc...Performance of a speculative transmission scheme for scheduling latency reduc...
Performance of a speculative transmission scheme for scheduling latency reduc...
 
Life in the Fast Lane: A Line-Rate Linear Road
Life in the Fast Lane: A Line-Rate Linear RoadLife in the Fast Lane: A Line-Rate Linear Road
Life in the Fast Lane: A Line-Rate Linear Road
 
APPROXIMATE ARITHMETIC CIRCUIT DESIGN FOR ERROR RESILIENT APPLICATIONS
APPROXIMATE ARITHMETIC CIRCUIT DESIGN FOR ERROR RESILIENT APPLICATIONSAPPROXIMATE ARITHMETIC CIRCUIT DESIGN FOR ERROR RESILIENT APPLICATIONS
APPROXIMATE ARITHMETIC CIRCUIT DESIGN FOR ERROR RESILIENT APPLICATIONS
 

Último

Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Pitangent Analytics & Technology Solutions Pvt. Ltd
 
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
DanBrown980551
 
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptxPRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
christinelarrosa
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
Ajin Abraham
 
Christine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptxChristine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptx
christinelarrosa
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
Alex Pruden
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
Christine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptxChristine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptx
christinelarrosa
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
DianaGray10
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
c5vrf27qcz
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
Miro Wengner
 
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
saastr
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
Jason Yip
 
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
saastr
 
Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
Antonios Katsarakis
 

Último (20)

Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
 
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
 
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptxPRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
 
Christine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptxChristine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptx
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
Christine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptxChristine's Product Research Presentation.pptx
Christine's Product Research Presentation.pptx
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
 
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
 
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
 
Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
 

Failure Prediction for APU on a Metro System

  • 1. Failure prediction for APU's on a Metro System Aaryadev Ghosalkar
  • 2. Agenda • Introduction • Problem Statement • Literature Review • Existing Systems • Modelling • Results • Conclusion • Future Scope Failure prediction for APU's on a Metro System 2
  • 3. Introduction Air production unit (APU) is component on most modern metro systems which circulates compressed air through the metro, our research presents a comprehensive comparison of PdM models on APU failure detection. Our goal is to detect failures at least 2 hours in advance, the challenge with APU's in particular is that APU failures are rare events In the dataset there are 3 failure within 6 months of operation Failure prediction for APU's on a Metro System 3
  • 4. Problem Statement • Detect failures at least 2 hours in advance • Create a system that is customizable can expand to more than 2 hours if required (also keeping in mind the side effects that may arise) • Provide a way to deploy the model for real time inference Failure prediction for APU's on a Metro System 4
  • 5. Literature Review Taking a look at what other researchers have done
  • 6. Literature Review • Veloso et al performed the initial data collection, which involved converting data from the metro company database into a CSV file • We used some of the ideas for data preprocessing presented by the AzureML team at microsoft • A study on Davari et al motivated many of the algorithms that we used in this study, there work has been a great resource in our study Failure prediction for APU's on a Metro System 6
  • 7. Literature Review continued • Chaudhuri et al used SVM to classify vehicles into 3 distinct risk levels, focusing on model interpretability, model interpretability refers to how easy it is to understand the choices made by the model and results of the model • Various other researchers have used CNN with GAF to convert timeseries data into images and used CNNs for classification most notable was done by Silva which reported an accuracy of 93% in their study Failure prediction for APU's on a Metro System 7
  • 8. Literature Review continued • Nguyen and Medjaher used LSTM to predicted the probability of failure in a given time window, we drew a lot of inspiration from their work in terms of balancing the dataset and the model used • In terms of early research into RUL estimation a majority of the work revolves around using statistical models or models which assume linear degradation pattern, it was only after 2016 Deep learning models were used in this field Failure prediction for APU's on a Metro System 8
  • 9. Existing Systems Predictive maintenance has been used in a lot of domains such as elevators and Jet engines.
  • 10. The Infrastructure abroad • Uses NLP on text fields where engineers describe the problem and how the problem was solved. • Also predict if a failure is about to happen and which component will fail • TFL expects to save £3 million a year Failure prediction for APU's on a Metro System 10 • TrainDNA collects and analyzes real- time data from more than 200 trains across Australia using IBM Maximo • 51% Increase in realiablity after introducing the TrainDNA system • Considers multiple failures • Large scale system able to process 30 Million message every hour London Australia and NZL
  • 11. Modelling Taking a look at how the data was processed and what ML models were tested Failure prediction for APU's on a Metro System 11
  • 12. Data Preprocessing Most of the data collection and cleaning work on this data set was done by researchers that created the data set Thus this data set did not have any null values Failure prediction for APU's on a Metro System • Linear Model are a bad idea • There is no obvious pattern which we can see to distinguish each class, which further motivated use of deep learning • Balance is key! Collection EDA • Discretizing the data to make this more suitable for classification • Generalizing the model using a parameter to control number of hours before warning. Preparation 12
  • 13. Challenges in data processing • The sheer volume of data presented a significant challenge, there were more than 10 000 000 (1Cr) rows in the dataset to solve this we performed carefully changed the data types of the features to minimize the loss of information this allowed us to significantly reduce size of the data with a 91% decrease in storage space and 77% decrease in RAM usage Failure prediction for APU's on a Metro System 13
  • 14. Challenges in Data processing continued • The highly imbalanced nature of the data also presented a huge challenge when training the models, since APU failures are a rare event and only 3% of the data constitutes of APU failure a model that predicts all data points as normal would mathematically have an accuracy of 97%, to combat this we used Near miss under sampling and a convenience sampling like strategy for the LSTM, inspired by Chen et al Failure prediction for APU's on a Metro System 14
  • 15. Machine Learning Models • SVMs take very long to train • Results are not satisfactory and most other models perform better Failure prediction for APU's on a Metro System • Very quick training time • The hyper parameters such as the criterion do not make any difference to the accuracy SVM Decision Trees • Provide greater accuracy than decision trees due to this being an ensemble model • Training time is similar to decision trees Random Forests 15
  • 16. Deep Learning Models • Used a small neural network of 25k parameters with AdamW optimizer • Best results from all tested model • Does not take into account the time series nature of the data • Requires CUDA for efficient deployment. Failure prediction for APU's on a Metro System 16 • Can incorporate temporal patterns • Large model almost 184K parameters which makes training difficult. • Requires 3rd order tensors as input for training thus balancing data is hard • Also requires CUDA for efficient training and deployment Neural Network LSTM Network
  • 17. Results Accuracy Precision (Class 1) Recall (Class 1) SVM 0.57 0.60 0.62 Decision Tree 0.66 0.69 0.70 Random Forest 0.70 0.69 0.78 Neural Network (Adam) 0.72 0.67 0.91 Neural Network (AdamW) 0.76 0.75 0.87 LSTM Network 0.76 0.64 0.85 Failure prediction for APU's on a Metro System 17
  • 18. Conclusion • Neural networks produce satisfactory results however the Random forest model can be used when deploying on low end hardware or considering an edge solution. LSTMs perform nearly identical to neural networks with less data • Microservices can be considered when deploying as this will allow different parts of the model to be deployed separately • Multi collinearity is present in the data which can makes it hard to fit any kind of linear model Failure prediction for APU's on a Metro System 18
  • 19. Future Scope • Transformer based architecture can be considered as they have shown promising results with sequential data in the case of NLP applications • Perhaps due to the imbalanced nature and the rarity of APU failures on a real metro systems an anomaly detection approach would be better as this would not need the data to be balanced and a lot more of the existing data can be used Failure prediction for APU's on a Metro System 19