This document discusses using machine learning and Kalman filters for machine health monitoring. It presents a case study using turbine engine degradation data from NASA to predict remaining useful life. The document walks through loading and preparing the data in H2O, including feature engineering techniques like clustering operating modes and standardizing sensor measurements. Various machine learning models like GLM are trained and evaluated on the data. Cross validation and hyperparameter tuning are also demonstrated to select the best model. Finally, the document mentions using Kalman filters for post-processing model outputs.
10. H2O
H2O
H2O
data.csv
HTTP REST API
request to H2O
has HDFS path
H2O ClusterInitiate distributed
ingest
HDFS
NFS
S3
Request data
from HDFS
STEP 2
2.2
2.3
2.4
Python
h2o.import_file()
2.1
Python function
call
PYTHON (AND R) OBJECTS ARE PROXIES FOR BIG DATA
11. H2O
H2O
H2O
Python
HDFS
NFS
S3
STEP 3
Cluster IP
Cluster Port
Pointer to Data
Return pointer to data
in REST API JSON
Response
HDFS provides
data
3.3
3.4
3.1h2o_df object created
in Python
data.csv
h2o_df
H2O
Frame
3.2
Distributed H2O
Frame in DKV
H2O Cluster
PYTHON (AND R) OBJECTS ARE PROXIES FOR BIG DATA