Learning to adapt to sensor
changes and failures
Craig Knoblock
Yuan Shi
Minh Pham
University of Southern California
Information Sciences Institute
Introduction
• The Internet of Things will contain many sensors
• People will build applications that will rely on these sensors
• But with the large numbers of sensors, there will be failures
• So one important challenge is seamlessly handling these failures
Outline
• Learning to Replace a Failed Sensor
• Learning to Replace a Compound Sensor
• Assessing Adaptation Quality and Detecting Failures
• Related Work, Discussion, and Future Work
Example: Reconstructing a Missing Sensor
Temperature sensor readings (timestamp, latitude, longitude, temperature, pressure):
2015-04-25:15:07 33.292 118.541 35.2 26.2
2015-04-25:15:12 33.274 118.532 34.8 26.0
[Figure: the temperature sensor's Reading and Location records, with a function f connecting a new sensor to them]
Sensor Reconstruction without Overlapping Data
• We replace XK with a new sensor Y
• Learn a reconstruction function trained on the working sensors, though there is no overlapping data between XK and Y
• f(X1, X2, …, XK-1, Y) → XK, where XK is the failed/target sensor, X1, …, XK-1 are the working sensors, and Y is the new sensor
[Figure: time series of X1, X2, X3 over t, with model f(X1, X2, Y)]
Notations of Individual Sensor Changes
• Old sensors S1, S2, S3, …, SK-1, SK are observed at time steps 1, 2, …, N
• Change point: after time step N, SK is replaced by P new sensors: SK+1, …, SK+P
• S1, S2, S3, …, SK-1 and the new sensors SK+1, SK+2, …, SK+P are observed at time steps N+1, N+2, …, N+M
• Source domain: samples X1, X2, …, XN; target domain: samples Z1, Z2, …, ZM
Sensor-level Adaptation to Individual Sensor Changes
SK is replaced by P new sensors SK+1, …, SK+P
Unexplored in previous work
Reconstruction function: f(S1, S2, …, SK-1, SK+1, …, SK+P) → SK
Challenge: no overlapping data between SK and the new sensors!
Intuition: S1, S2, …, SK-1 serve as the bridge
Assumption: S1, S2, …, SK-1 are correlated with SK, as well as with SK+1, …, SK+P
[Figure: sensors S1, …, SK over time steps 1, …, N (samples X1, …, XN) and S1, …, SK-1, SK+1, …, SK+P over time steps N+1, …, N+M (samples Z1, …, ZM)]
Sensor-level Adaptation to Individual Sensor Changes
[Figure: source samples X1, …, XN (time steps 1, …, N) and target samples Z1, …, ZM (time steps N+1, …, N+M), with the function f]
The two domains (source and target) should be distributed similarly
Two sets of samples have similar distributions
Two sets of samples are mixed as much as possible
Consider each source sample's k neighbors in the target domain and each target sample's k neighbors in the source domain
Minimize cross-domain k-nearest neighbor distances
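The cross-domain k-nearest-neighbor criterion above can be illustrated with a small NumPy sketch. This is only one plausible way to score how well two sample sets are mixed, not the optimization used in LUF; the function name `cross_domain_knn_distance` and the choice of Euclidean distance are assumptions.

```python
import numpy as np

def cross_domain_knn_distance(source, target, k=3):
    """Mean distance from each sample to its k nearest neighbors in the other domain.

    source: array of shape (N, d); target: array of shape (M, d).
    Smaller values mean the two sample sets are more thoroughly mixed.
    """
    # Pairwise Euclidean distances between every source and target sample: shape (N, M).
    dists = np.linalg.norm(source[:, None, :] - target[None, :, :], axis=2)
    k_in_target = min(k, target.shape[0])
    k_in_source = min(k, source.shape[0])
    # For each source sample, its k closest target samples (sort along the target axis).
    src_to_tgt = np.sort(dists, axis=1)[:, :k_in_target].mean()
    # For each target sample, its k closest source samples (sort along the source axis).
    tgt_to_src = np.sort(dists, axis=0)[:k_in_source, :].mean()
    return float(src_to_tgt + tgt_to_src)

# Hypothetical data: identical sets score 0, shifted sets score higher.
src = np.array([[0.0, 0.0], [1.0, 1.0]])
same = cross_domain_knn_distance(src, src.copy(), k=1)
shifted = cross_domain_knn_distance(src, src + 10.0, k=1)
```

A learned reconstruction function would be judged good under this criterion when the reconstructed target samples yield a low score against the source samples.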
Relationships in Weather Data
Five individual sensors from WeatherUnderground

ID  Type         Unit  Range
1   Temperature  °C    0.4 – 37.6
2   Dew point    °C    -9.4 – 18.4
3   Humidity     %     11 – 90
4   Wind speed   mph   0 – 38.6
5   Wind gust    mph   0 – 46.7

[Figure: Correlation between individual sensors by month]
Two groups of patterns (Nov-Jan, Feb-Oct), suggesting nonlinear models for modeling relationships among sensors
Sensor-level Adaptation to Individual Sensor Changes
Empirical Study
• Each station has 5-10 individual sensors, producing a sample every 5-10 minutes
• Sensor change simulation: an individual sensor is replaced by the corresponding sensor at a nearby station
• Source domain: Jan 2015-Aug 2015; target domain: Jan 2016-Aug 2016
• Adaptation error: root mean square error between the reconstructed signal and the ground truth
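The adaptation error named above is a standard root mean square error. A minimal sketch follows; the function name `rmse` and the sample values are illustrative only.

```python
import numpy as np

def rmse(reconstructed, ground_truth):
    """Root mean square error between a reconstructed signal and the ground truth."""
    r = np.asarray(reconstructed, dtype=float)
    g = np.asarray(ground_truth, dtype=float)
    return float(np.sqrt(np.mean((r - g) ** 2)))

# Hypothetical readings: a perfect reconstruction has zero error.
perfect = rmse([21.0, 21.5, 22.0], [21.0, 21.5, 22.0])
off = rmse([0.0, 0.0], [3.0, 4.0])  # sqrt((9 + 16) / 2)
```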
Sensor-level Adaptation to Individual Sensor Changes
Empirical Study
Methods compared:
• Ignoring new sensors: regression on the remaining old sensors
• Missing value imputation: predicting the new sensors’ readings on the source domain, then doing regression
• Our approach: Learning with previously Unseen Features (LUF) [Shi and Knoblock, ‘17]
Average improvement: 17.9%
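The two baselines above can be sketched as follows. This is a simplified illustration that assumes linear least-squares regression, although the slides suggest nonlinear models fit this data better. All function names are hypothetical, and LUF itself is not implemented here.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y on X with an intercept column."""
    A = np.column_stack([X, np.ones(len(X))])
    W, *_ = np.linalg.lstsq(A, y, rcond=None)
    return W

def predict_linear(W, X):
    return np.column_stack([X, np.ones(len(X))]) @ W

def ignore_new_sensors(old_src, failed_src, old_tgt):
    """Baseline 1: regress the failed sensor on the remaining old sensors only."""
    return predict_linear(fit_linear(old_src, failed_src), old_tgt)

def impute_then_regress(old_src, failed_src, old_tgt, new_tgt):
    """Baseline 2: impute the new sensors on the source domain, then regress."""
    # Learn old -> new on the target domain, where both are observed.
    g = fit_linear(old_tgt, new_tgt)
    new_src_imputed = predict_linear(g, old_src)
    # Regress the failed sensor on old sensors plus imputed new sensors.
    f = fit_linear(np.column_stack([old_src, new_src_imputed]), failed_src)
    return predict_linear(f, np.column_stack([old_tgt, new_tgt]))

# Hypothetical synthetic data: 2 old sensors, 1 new sensor.
rng = np.random.default_rng(0)
old_src = rng.normal(size=(50, 2))
failed_src = old_src @ np.array([1.0, 2.0]) + 3.0
old_tgt = rng.normal(size=(20, 2))
new_tgt = rng.normal(size=(20, 1))
pred_ignore = ignore_new_sensors(old_src, failed_src, old_tgt)
pred_impute = impute_then_regress(old_src, failed_src, old_tgt, new_tgt)
```

With a purely linear imputation step the imputed columns carry no information beyond the old sensors, which is one reason a nonlinear regressor is the more natural choice in practice.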
Sensor-level Adaptation to Individual Sensor Changes
Empirical Study
[Figure: Wind Gust Sensor Reconstructed from a Nearby Station. Panels: our approach (LUF), which uses new sensors; ignoring new sensors (regression on the remaining old sensors). Plot labels: wind speed, reconstructed, pressure.]
Learning to Replace a Compound Sensor
Example: Learning the device operation
Temperature sensor readings (timestamp, latitude, longitude, temperature, pressure):
2015-04-25:15:07 33.292 118.541 35.2 26.2
2015-04-25:15:12 33.274 118.532 34.8 26.0
[Figure: the temperature sensor's Reading and Location records]
Example: Automatically Adapting to Changes
Weather Station (timestamp, latitude, longitude, temperature, pressure):
2015-04-25:15:07 33.292 118.541 35.2 26
2015-04-25:15:12 33.274 118.532 34.8 27
New Weather Station:
28-Apr-15 16:50:50 118 26 59 E 33 58 33 N 74 37.5
28-Apr-15 16:50:59 118 27 10 E 33 58 45 N 77 38.4
Learn a transformation program T
Challenge: How to Automatically Adapt to a New
Sensor
• Problem
• The output of the new sensor is different than that of the original sensor that
software was designed to process
• Solution
• Synthesize a data adapter that transforms the new data into a format usable by the
software system
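To make the idea of a data adapter concrete, here is a hand-written sketch of what a synthesized adapter for the new weather station records might do. The field order (longitude before latitude, temperature before pressure) is an assumption read off the example rows, the names `dms_to_decimal` and `adapt_record` are hypothetical, and no unit conversion is attempted because the slides do not state the units.

```python
from datetime import datetime

def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value

def adapt_record(line):
    """Map one new-station record onto the original station's field layout."""
    f = line.split()
    # "28-Apr-15 16:50:50" -> "2015-04-28:16:50", the original timestamp format.
    ts = datetime.strptime(f"{f[0]} {f[1]}", "%d-%b-%y %H:%M:%S")
    # Assumed layout: longitude (DMS + hemisphere), then latitude (DMS + hemisphere).
    longitude = dms_to_decimal(int(f[2]), int(f[3]), int(f[4]), f[5])
    latitude = dms_to_decimal(int(f[6]), int(f[7]), int(f[8]), f[9])
    return {
        "timestamp": ts.strftime("%Y-%m-%d:%H:%M"),
        "latitude": round(latitude, 3),
        "longitude": round(longitude, 3),
        "temperature": float(f[10]),  # assumed to be temperature, units unchanged
        "pressure": float(f[11]),     # assumed to be pressure, units unchanged
    }

record = adapt_record("28-Apr-15 16:50:50 118 26 59 E 33 58 33 N 74 37.5")
```

The point of the approach in the slides is that such a program is learned rather than written by hand; the sketch only shows the kind of output format the downstream software expects.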
Identifying the Semantic Types of the Sensor Data
• Use machine learning techniques to learn to recognize different types of data [Pham et al., 2016]
[Figure: values of a known type (Type A: 118.519, 119.117) and of an unknown type (34.6, 33.5) are compared using pairwise similarity features; a random forest answers Yes (same type, so Unknown Type = A) or No (different type)]
Different similarity features of data
• Attribute name similarity: similarity in attribute names, e.g. "firstName" vs. "First Name"
• Value similarity: similarity in values, e.g. Name (Gary Cahill, Juan Mata, De Gea) vs. Player (Juan Quin, Tim Cahill, Metsul Ozeil)
• Value range similarity: similarity in ranges of values, e.g. "# games played" (1, 2, ..., 8) vs. "number of games" (2, 3, ..., 10)
Different similarity features of data
• Histogram similarity: "position" (1, 4, 3, 2) vs. "Player" (GK, MF, DF, FW) shows similarity in histogram but no similarity in value
• Distribution similarity: "# game played" (4, ..., 18, 23) vs. "# goal scored" (3, ..., 11, 22) shows similarity in value but no similarity in distribution
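A few of the pairwise similarity features described above can be sketched in plain Python. These are simple stand-ins (Jaccard overlap and interval overlap) chosen for illustration; they are not claimed to be the exact features of [Pham et al., 2016], and all function names are hypothetical.

```python
import re

def _tokens(text):
    """Lower-cased word tokens, splitting camelCase and punctuation."""
    return {t.lower() for t in re.findall(r"[A-Za-z][a-z]*|\d+", text)}

def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def name_similarity(name_a, name_b):
    """Attribute name similarity as token-set Jaccard overlap."""
    return jaccard(_tokens(name_a), _tokens(name_b))

def value_similarity(values_a, values_b):
    """Value similarity as Jaccard overlap of the tokens in the two columns."""
    tok_a = set().union(*(_tokens(v) for v in values_a))
    tok_b = set().union(*(_tokens(v) for v in values_b))
    return jaccard(tok_a, tok_b)

def range_similarity(nums_a, nums_b):
    """Value range similarity as overlap of the [min, max] intervals."""
    lo, hi = max(min(nums_a), min(nums_b)), min(max(nums_a), max(nums_b))
    span = max(max(nums_a), max(nums_b)) - min(min(nums_a), min(nums_b))
    return max(0.0, hi - lo) / span if span else 1.0

# Examples taken from the slide.
names = name_similarity("firstName", "First Name")
values = value_similarity(["Gary Cahill", "Juan Mata", "De Gea"],
                          ["Juan Quin", "Tim Cahill", "Metsul Ozeil"])
ranges = range_similarity([1, 2, 8], [2, 3, 10])
```

In the pipeline on the previous slide, a vector of such features for a pair of columns would be the input to the random forest that decides whether the two columns share a semantic type.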
Evaluation
MRR performance of our approach on weather data (trained on different domains)

Number of labeled sources  1      2      3
Train on soccer            89.75  95.08  97.73
Train on museum            89.75  95.08  97.73
Train on city              91.86  96.59  97.73
SemanticTyper              85.22  92.04  95.45
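MRR (mean reciprocal rank) averages 1/rank of the correct semantic type over all columns being labeled. A minimal sketch, with hypothetical rankings that are not taken from the evaluation:

```python
def mean_reciprocal_rank(rankings, truths):
    """Average of 1/rank of the correct label; contributes 0 if the label is absent."""
    total = 0.0
    for ranked_labels, truth in zip(rankings, truths):
        if truth in ranked_labels:
            total += 1.0 / (ranked_labels.index(truth) + 1)
    return total / len(truths)

# Hypothetical predictions for two columns: correct label ranked 1st, then 2nd.
mrr = mean_reciprocal_rank(
    [["temperature", "pressure"], ["pressure", "temperature"]],
    ["temperature", "temperature"],
)
```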
Dealing with String Format Changes
• Learn general string transformations: dd MM yyyy => mm/dd/yyyy
• Source examples: 17 May 1983, 14 Jul 1984, 5 Aug 1991, ...
• Target examples: 3/10/1979, 11/22/1982, 7/5/1982, ...
• Steps: template inference, template matching, transforming
• Semantic labeling groups the tokens of each format: source 17, 14, 5 / May, Jul, Aug / 1983, 1984, 1991; target 3, 11, 7 / 10, 22, 5 / 1979, 1982, 1982
• Matched token groups are handled by Replace (two groups in the figure) or Transform (one group)
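The template inference and replacement steps can be illustrated with a small sketch. It infers a token-type template per string and rewrites separators to match a target template when the content tokens line up. Reordering tokens or converting month names would additionally need the semantic labels, which this sketch does not attempt. All names here are hypothetical and this is not the authors' implementation.

```python
import re
from collections import Counter

TOKEN = re.compile(r"\d+|[A-Za-z]+|\s+|[^\sA-Za-z\d]")

def token_type(tok):
    if tok.isdigit():
        return "NUM"
    if tok.isalpha():
        return "WORD"
    if tok.isspace():
        return "SPACE"
    return tok  # punctuation is kept literally in the template

def template(s):
    """Token-type template of a string, e.g. NUM SPACE WORD SPACE NUM."""
    return tuple(token_type(t) for t in TOKEN.findall(s))

def dominant_template(column):
    """Most frequent template among the strings of a column."""
    return Counter(template(s) for s in column).most_common(1)[0][0]

def replace_separators(s, target_template):
    """Rewrite separators to follow target_template; None if content slots differ."""
    content = [t for t in TOKEN.findall(s) if token_type(t) in ("NUM", "WORD")]
    slots = [t for t in target_template if t in ("NUM", "WORD")]
    if [token_type(t) for t in content] != slots:
        return None
    remaining = iter(content)
    out = []
    for t in target_template:
        if t in ("NUM", "WORD"):
            out.append(next(remaining))
        else:
            out.append(" " if t == "SPACE" else t)
    return "".join(out)

target = dominant_template(["3/10/1979", "11/22/1982", "7/5/1982"])
converted = replace_separators("17 05 1983", target)
unmatched = replace_separators("17 May 1983", target)
```

Note that no paired source/target examples are needed: the target template is inferred from target-format strings alone, in the spirit of the slide.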
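Transformation Learning (Future Work)
• Example: 3 => Mar
• Query a web table database for a table that relates the two representations, e.g.
1 January Jan
2 February Feb
... ... ...
• Transformation inference: fill in the unknown counterparts (? Jul, ? May, 7 ?, 3 ?) to obtain 7 Jul, 5 May, 7 Jul, 3 Mar
• Transform: transform data to the correct format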
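One way the lookup idea above could work is sketched below: given a reference table and unpaired sample values from each format, find a column that contains the source values and a different column that contains the target values, then read the mapping off the rows. The function name `infer_lookup` and the in-memory table are assumptions; the slides describe this only as future work.

```python
def infer_lookup(table, source_values, target_values):
    """Return a source -> target mapping from the first matching column pair."""
    n_cols = len(table[0])
    columns = [{row[i] for row in table} for i in range(n_cols)]
    for i in range(n_cols):
        for j in range(n_cols):
            if i == j:
                continue
            # Column i must cover the source samples, column j the target samples.
            if set(source_values) <= columns[i] and set(target_values) <= columns[j]:
                return {row[i]: row[j] for row in table}
    return None

# Hypothetical reference table: month number, full name, abbreviation.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
table = [(str(i + 1), name, name[:3]) for i, name in enumerate(MONTHS)]

mapping = infer_lookup(table, source_values=["7", "3"], target_values=["Jul", "May"])
```

The sample values do not need to be paired: "Jul" rules out the full-name column, so the abbreviation column is selected even though "May" appears in both.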
Preliminary result for format changes
• Evaluation:
• 38 datasets including date/time, names, street addresses, telephone numbers, dimensions
• Only contains cases that can be solved with just replacement
• Measurements: accuracy, average edit distance (compared with ground truth)

Accuracy | Avg edit distance | Original avg edit distance | Improvement on edit distance
0.58 | 3.5 | 18.21 | 81%

• Some examples that work well:

Format change | Accuracy | Avg edit distance | Original avg edit distance | Improvement on edit distance
dd mm yyyy => dd.mm.yy | 1 | 0 | 4 | 100%
[middle_name] last_name; first_name [(c)] => first_name [middle_name] last_name [(c)] | 0.862 | 0.91 | 5.8 | 84%
height” [H] x weight” [W] x [depth” [D]] => weight | 0.967 | 0.11 | 17.79 | 99%
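The edit-distance measurements can be reproduced in spirit with a standard Levenshtein distance and a relative improvement. The sketch below assumes Levenshtein distance is the edit distance meant and that improvement is (original - adapted) / original, which is consistent with the reported 18.21 to 3.5 giving about 81%.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]

def improvement(original_distance, adapted_distance):
    """Relative reduction in edit distance after adaptation."""
    return (original_distance - adapted_distance) / original_distance

# Hypothetical date strings in the dd mm yyyy and dd.mm.yy formats.
before = levenshtein("17 05 1983", "17.05.83")
overall = improvement(18.21, 3.5)
```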
Assessing Adaptation Quality and Detecting Failures
Adaptation Performance Estimation and Sensor Change Detection
How Good is An Adaptation?
• Provide upper-layer software with an estimate of the adaptation error
• Select the optimal adaptation strategy
Approach
• Simulate sensor failures
• Simulate failures of one or multiple sensors at random time points from historical data
• Compute the adaptation error for each adaptation strategy and store it in a library
• New sensor failure: match the most similar case from the library
Adaptation strategies:
f1(S1) → S3
f2(S2) → S3
f3(S1, S2) → S3
Adaptation Performance Estimation and Sensor Change Detection
Adaptation strategies:
f1(S1) → S3
f2(S2) → S3
f3(S1, S2) → S3
Example: $S_3 = 2S_1 + 3S_2 - 0.5$, error $= 0.2$
An error bound can be derived:
$(2S_1 + 3S_2 - S_3 - 0.5)^2 < 0.2^2$
Can we use it to detect sensor changes?
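The error-bound check above can be sketched as follows, using the slide's example strategy S3 = 2·S1 + 3·S2 − 0.5 with error 0.2. The linear least-squares fit and the function names are assumptions for illustration; the actual adaptation strategies may be nonlinear.

```python
import numpy as np

def fit_linear_strategy(inputs, target):
    """Fit target ≈ inputs @ w + b; returns the coefficients with b last."""
    A = np.column_stack([inputs, np.ones(len(inputs))])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return coef

def violates_bound(coef, inputs, target, error):
    """True where the squared residual is not below error**2."""
    A = np.column_stack([inputs, np.ones(len(inputs))])
    residual = A @ coef - target
    return residual ** 2 >= error ** 2

# Hypothetical historical data that follows the slide's example exactly.
rng = np.random.default_rng(0)
S12 = rng.uniform(0, 10, size=(100, 2))
coef = fit_linear_strategy(S12, 2 * S12[:, 0] + 3 * S12[:, 1] - 0.5)

# New readings with S1 = S2 = 1, so the strategy predicts S3 = 4.5.
within = violates_bound(coef, np.array([[1.0, 1.0]]), np.array([4.6]), 0.2)[0]
outside = violates_bound(coef, np.array([[1.0, 1.0]]), np.array([5.0]), 0.2)[0]
```

A reading that violates the bound signals that at least one of S1, S2, S3 no longer behaves as it did in the historical data.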
Adaptation Performance Estimation and Sensor Change Detection
Sensor Change Detection
• Change or not? Check the error bounds derived from adaptation strategies
• Bounds are derived over subsets of sensors: S1, S2, S3; (S1, S2), (S1, S3), (S2, S3); (S1, S2, S3)
• Violated: at least one sensor was changed
• Using logical inference: S1 changes, or both S2 and S3 change; the explanation with fewer changed sensors is simpler
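The logical inference step can be read as a minimal hitting set problem: every violated bound must contain at least one changed sensor, and smaller explanations are preferred. The brute-force sketch below is one interpretation of the slide, not the authors' algorithm; `explanations` is a hypothetical name.

```python
from itertools import combinations

def explanations(sensors, violated_bounds):
    """Inclusion-minimal sets of sensors that intersect every violated bound.

    Results are ordered by size, so the simplest explanation comes first.
    """
    found = []
    for size in range(1, len(sensors) + 1):
        for candidate in combinations(sensors, size):
            cand = set(candidate)
            hits_all = all(cand & bound for bound in violated_bounds)
            is_minimal = not any(prev <= cand for prev in found)
            if hits_all and is_minimal:
                found.append(cand)
    return found

# Hypothetical case: the bounds over (S1, S2) and (S1, S3) are violated.
result = explanations(["S1", "S2", "S3"], [{"S1", "S2"}, {"S1", "S3"}])
```

For this input the candidates are that S1 changed, or that both S2 and S3 changed, matching the two alternatives on the slide.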
Related Work, Discussion, and Future Work
Related Work
• Detecting Sensor Failures and Changes
• Change point detection [Aminikhanghahi and Cook ‘16] [Pimentel et al., ‘14]
• Distribution-based [Kawahara and Sugiyama, ‘12] [Harchaoui et al., ‘09] [Yamanishi and Takeuchi, ‘02]
• Reconstruction-based [Crook et al., ‘02] [Singh and Markou, ‘04] [Ide and Tsuda, ‘07] [Chatzigiannakis et al., ‘06]
• Probabilistic [Adams and MacKay, ‘07] [Saatci et al., ‘10] [Dereszynski and Dietterich, ‘12] [Dietterich et al. ‘12]
• Distance-based [Angiulli and Pizzuti, ‘02] [Bay and Schwabacher, ‘03] [Chawla and Sun, ‘06] [Keogh et al., ‘01] [Budalakoti et al., ‘06] [Chen et al., ‘15]
• Reconstruction of Sensor Readings
• Most detection methods do not address how to automatically recover
• Some probabilistic methods [Dereszynski and Dietterich, ‘12] [Dietterich et al. ‘12] can be used to reconstruct a changed sensor, but cannot leverage new sensors
• FFX [McConaghy ‘11] is applied to extract sensor-specific transformations
Our approach explores multiple nonlinear relationships among sensors, and can potentially detect sensor changes with significantly higher accuracy
Our approach can adapt to new sensors, which is not possible with existing approaches
Related Work:
• String transformation: Most existing approaches require one-to-one mappings in the training data to work.
• Singh, Rishabh, and Sumit Gulwani. "Transforming spreadsheet data types using examples." ACM SIGPLAN Notices. Vol. 51. No. 1. ACM, 2016.
• Wu, Bo, and Craig A. Knoblock. "An Iterative Approach to Synthesize Data Transformation Programs." IJCAI. 2015.
• Semi-automatic data cleaning: Most existing approaches require human interaction to provide training data and curate the generated results.
• Scaffidi, Christopher. Topes: Enabling end-user programmers to validate and reformat data. Diss. University of Nebraska-Lincoln, 2009.
• Raman, Vijayshankar, and Joseph M. Hellerstein. "Potter's wheel: An interactive data cleaning system." VLDB. Vol. 1. 2001.
Discussion
• Presented techniques for
• Reconstructing numeric sensor values
• Reconstructing a failed compound sensor from a new sensor
• Assessing the accuracy of a reconstructed sensor and identifying failures
• Many applications where these techniques could be applied
• Geoscientists collecting data about the earth
• Medical devices where information is missing
• Sensors on mobile phones where sensors may be too costly to run
• Etc…
Conference Talk on Thursday
• Learning with Previously Unseen Features
Yuan Shi & Craig A. Knoblock
• Thursday, August 24 16:30-18:00 (Yuan will talk at 17:15)
• ML-TAML3 – Transfer, Adaptation, Multi-Task Learning 3 (212)
Thanks!

08448380779 Call Girls In Friends Colony Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Learning to Adapt to Sensor Changes and Failures

  • 1. Learning to adapt to sensor changes and failures Craig Knoblock Yuan Shi Minh Pham University of Southern California Information Sciences Institute
  • 2. Introduction • The Internet of Things will contain many sensors • People will build applications that will rely on these sensors • But with the large numbers of sensors, there will be failures • So one important challenge is seamlessly handling these failures
  • 3. Outline • Learning to Replace a Failed Sensor • Learning to Replace a Compound Sensor • Assessing Adaptation Quality and Detecting Failures • Related Work, Discussion, and Future Work
  • 4. Example: Reconstructing a Missing Sensor • Temperature sensor • [Diagram: Reading and Location nodes with fields timestamp, latitude, longitude, temperature, pressure] • Sample readings: 2015-04-25:15:07 33.292 118.541 35.2 26.2; 2015-04-25:15:12 33.274 118.532 34.8 26.0
  • 5. Example: Reconstructing a Missing Sensor • Temperature sensor • [Diagram: Reading and Location nodes with fields timestamp, latitude, longitude, temperature, pressure; a function f connects a new sensor to the diagram] • Sample readings: 2015-04-25:15:07 33.292 118.541 35.2 26.2; 2015-04-25:15:12 33.274 118.532 34.8 26.0
  • 6. Sensor Reconstruction without Overlapping Data • We replace Xk with a new sensor Y • Learn a reconstruction function trained on the working sensors, though there is no overlapping data between X and Y • Model: f(X1, X2, …, XK-1, Y) → Xk, where Xk is the failed/target sensor, X1, …, XK-1 are the working sensors, and Y is the new sensor • [Figure: timeline t with sensors X1, X2, X3 and model f(X1, X2, Y)]
  • 7. Notations of Individual Sensor Changes • [Figure: timeline t with samples 1, 2, …, N up to the change point; old sensors S1, S2, S3, …, SK-1, SK]
  • 8. Notations of Individual Sensor Changes • [Figure: timeline t with samples 1, 2, …, N before the change point (old sensors S1, S2, S3, …, SK-1, SK) and samples N+1, N+2, …, N+M after it (sensors S1, S2, S3, …, SK-1 plus new sensors SK+1, SK+2, …, SK+P)] • SK is replaced by P new sensors: SK+1, …, SK+P
  • 9. Notations of Individual Sensor Changes • [Figure: same timeline as the previous slide] • SK is replaced by P new sensors: SK+1, …, SK+P • Source domain (before the change point): samples X1, X2, …, XN • Target domain (after the change point): samples Z1, Z2, …, ZM
  • 10. Sensor-level Adaptation to Individual Sensor Changes • SK is replaced by P new sensors SK+1, …, SK+P • Unexplored in previous work • Reconstruction function: f(S1, S2, …, SK-1, SK+1, …, SK+P) → SK • [Figure: timeline t with source samples X1, X2, …, XN (times 1 … N) and target samples Z1, Z2, …, ZM (times N+1 … N+M) over sensors S1, S2, …, SK-1, SK, SK+1, …, SK+P]
  • 11. Sensor-level Adaptation to Individual Sensor Changes • SK is replaced by P new sensors SK+1, …, SK+P • Unexplored in previous work • Reconstruction function: f(S1, S2, …, SK-1, SK+1, …, SK+P) → SK • Challenge: no overlap between SK and the new sensors! • [Figure: same timeline as the previous slide]
  • 12. Sensor-level Adaptation to Individual Sensor Changes • SK is replaced by P new sensors SK+1, …, SK+P • Unexplored in previous work • Reconstruction function: f(S1, S2, …, SK-1, SK+1, …, SK+P) → SK • Challenge: no overlap between SK and the new sensors! • Intuition: use S1, S2, …, SK-1 as the bridge • Assumption: S1, S2, …, SK-1 are correlated with SK, as well as with SK+1, …, SK+P • [Figure: same timeline as the previous slide]
  • 13. Sensor-level Adaptation to Individual Sensor Changes • [Figure: timeline t with source samples X1, X2, …, XN (times 1 … N) and target samples Z1, Z2, …, ZM (times N+1 … N+M)]
  • 14. Sensor-level Adaptation to Individual Sensor Changes • [Figure: same timeline, with a function f shown alongside the source samples X1, X2, …, XN]
  • 15. Sensor-level Adaptation to Individual Sensor Changes • The two domains (source and target) are similarly distributed • The two sets of samples have similar distributions • [Figure: same timeline, with the function f and the source and target samples]
  • 16. Sensor-level Adaptation to Individual Sensor Changes • The two sets of samples (source and target) have similar distributions • The two sets of samples are mixed as much as possible • Consider Xs's k neighbors in the target domain, and each target sample's k neighbors in the source domain • Minimize cross-domain k-nearest neighbor distances
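The "mixed as much as possible" criterion can be illustrated with a small score. The following is a minimal sketch, not the authors' objective: it assumes Euclidean distance, and the function names and toy data are hypothetical. It averages, over both directions, the distance from each sample to its k nearest neighbors in the other domain, so a lower score means the two sample sets are better mixed.

```python
import math

def knn_mean_distance(query, pool, k):
    """Mean Euclidean distance from `query` to its k nearest points in `pool`."""
    dists = sorted(math.dist(query, p) for p in pool)
    return sum(dists[:k]) / min(k, len(dists))

def cross_domain_knn_distance(source, target, k=1):
    """Average k-NN distance measured across domains, in both directions."""
    s_to_t = [knn_mean_distance(x, target, k) for x in source]
    t_to_s = [knn_mean_distance(z, source, k) for z in target]
    return (sum(s_to_t) + sum(t_to_s)) / (len(source) + len(target))

# Toy 2-D samples (made up for illustration only).
source = [(0.0, 0.0), (1.0, 0.0)]
well_mixed = [(0.0, 0.1), (1.0, 0.1)]   # target samples close to the source
shifted = [(5.0, 0.0), (6.0, 0.0)]      # target samples far from the source

mixed_score = cross_domain_knn_distance(source, well_mixed)
shifted_score = cross_domain_knn_distance(source, shifted)
```

In this toy case the well-mixed target yields a much smaller score than the shifted one, which is the quantity an adaptation method would try to drive down.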
  • 17. Relationships in Weather Data • Five individual sensors from WeatherUnderground (ID, type, unit, range): 1, Temperature, °C, 0.4 – 37.6; 2, Dew point, °C, -9.4 – 18.4; 3, Humidity, %, 11 – 90; 4, Wind speed, mph, 0 – 38.6; 5, Wind gust, mph, 0 – 46.7 • [Figure: correlation between individual sensors by month] • Two groups of patterns (Nov-Jan, Feb-Oct), suggesting nonlinear models for the relationships among sensors
  • 18. Sensor-level Adaptation to Individual Sensor Changes: Empirical Study • Each station has 5-10 individual sensors, producing a sample every 5-10 minutes • Sensor change simulation: an individual sensor is replaced by the same sensor at a nearby station • Source domain: Jan 2015-Aug 2015; target domain: Jan 2016-Aug 2016 • Adaptation errors: root mean square error between reconstructed signal and ground truth
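The adaptation error named above is the standard root mean square error. A minimal sketch follows; the function name and the sample values are illustrative assumptions, not data from the study.

```python
import math

def rmse(reconstructed, ground_truth):
    """Root mean square error between two equally long signals."""
    if len(reconstructed) != len(ground_truth):
        raise ValueError("signals must have the same length")
    squared = [(r - g) ** 2 for r, g in zip(reconstructed, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up reconstructed and true readings for illustration.
example_error = rmse([20.0, 21.0, 23.0], [20.0, 22.0, 21.0])
```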
  • 19. Sensor-level Adaptation to Individual Sensor Changes: Empirical Study • Ignoring new sensors: regression on the remaining old sensors • Missing value imputation: predict the new sensors' readings on the source domain, then perform regression • Our approach: Learning with previously Unseen Features (LUF) [Shi and Knoblock, ‘17] • Average improvement: 17.9%
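The "ignoring new sensors" baseline can be sketched as an ordinary regression from the remaining old sensors to the failed sensor, fit on the source domain and applied to the target domain. The sketch below assumes a single working sensor and a simple least-squares line; the data are synthetic and the helper name is hypothetical. It does not implement LUF.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Source domain: a working sensor S1 and the sensor SK that later fails.
# Synthetic values chosen so that SK = 2 * S1 + 1.
s1_source = [1.0, 2.0, 3.0, 4.0]
sk_source = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(s1_source, sk_source)

# Target domain: SK is gone, so reconstruct it from S1 alone.
s1_target = [5.0, 6.0]
sk_reconstructed = [slope * x + intercept for x in s1_target]
```

This baseline never looks at the new sensors, which is the information the compared approaches try to exploit.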
  • 20. Sensor-level Adaptation to Individual Sensor Changes: Empirical Study • [Plots, with labels "wind speed" and "reconstructed pressure": our approach (LUF), which uses new sensors, compared with ignoring new sensors (regression on the remaining old sensors)]
  • 21. Wind Gust Sensor Reconstructed from a Nearby Station
  • 22. Outline • Learning to Replace a Failed Sensor • Learning to Replace a Compound Sensor • Assessing Adaptation Quality and Detecting Failures • Related Work, Discussion, and Future Work
  • 23. Example: Learning the device operation • Temperature sensor • [Diagram: Reading and Location nodes with fields timestamp, latitude, longitude, temperature, pressure] • Sample readings: 2015-04-25:15:07 33.292 118.541 35.2 26.2; 2015-04-25:15:12 33.274 118.532 34.8 26.0
  • 24. Example: Automatically Adapting to Changes • Weather Station • [Diagram: Reading and Location nodes with fields timestamp, latitude, longitude, temperature, pressure] • Sample readings: 2015-04-25:15:07 33.292 118.541 35.2 26; 2015-04-25:15:12 33.274 118.532 34.8 27 • New Weather Station readings: 28-Apr-15 16:50:50 118 26 59 E 33 58 33 N 74 37.5; 28-Apr-15 16:50:59 118 27 10 E 33 58 45 N 77 38.4
  • 25. Example: Automatically Adapting to Changes • Weather Station • [Diagram: Reading and Location nodes with fields timestamp, latitude, longitude, temperature, pressure] • Sample readings: 2015-04-25:15:07 33.292 118.541 35.2 26; 2015-04-25:15:12 33.274 118.532 34.8 27 • New Weather Station readings: 28-Apr-15 16:50:50 118 26 59 E 33 58 33 N 74 37.5; 28-Apr-15 16:50:59 118 27 10 E 33 58 45 N 77 38.4 • Learn a transformation program T
  • 26. Challenge: How to Automatically Adapt to a New Sensor • Problem • The output of the new sensor is different than that of the original sensor that software was designed to process • Solution • Synthesize a data adapter that transforms the new data into a format usable by the software system
  • 27. Identifying the Semantic Types of the Sensor Data • Use machine learning techniques to learn to recognize different types of data [Pham et al., 2016] • [Diagram: values of Type A (118.519, 119.117) and values of an unknown type (34.6, 33.5) are compared using pairwise similarity features; a Random Forest answers Yes (same type) or No; here "Unknown Type = A" is marked X]
  • 28. Different similarity features of data • Attribute name similarity: similarity in attribute names, e.g. "firstName" and "First Name" • Value similarity: similarity in values, e.g. a "Name" column (Gary Cahill, Juan Mata, De Gea) and a "Player" column (Juan Quin, Tim Cahill, Metsul Ozeil) • Value range similarity: similarity in ranges of values, e.g. "# games played" (1, 2, ..., 8) and "number of games" (2, 3, ..., 10)
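The three features on this slide can be approximated with simple set and interval overlaps. The sketch below is only an illustration under assumptions: the exact feature definitions of [Pham et al., 2016] are not given in the slides, so character trigrams, token-level Jaccard similarity, and interval overlap are stand-ins, and all function names are hypothetical.

```python
def name_similarity(a, b):
    """Jaccard similarity of character trigrams of normalized attribute names."""
    def trigrams(s):
        s = "".join(ch.lower() for ch in s if ch.isalnum())
        return {s[i:i + 3] for i in range(max(len(s) - 2, 1))}
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

def value_similarity(values_a, values_b):
    """Jaccard similarity of the lowercase tokens appearing in two columns."""
    tokens_a = {tok.lower() for v in values_a for tok in v.split()}
    tokens_b = {tok.lower() for v in values_b for tok in v.split()}
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

def range_similarity(xs, ys):
    """Overlap of the two numeric ranges divided by their combined span."""
    lo = max(min(xs), min(ys))
    hi = min(max(xs), max(ys))
    span = max(max(xs), max(ys)) - min(min(xs), min(ys))
    if span == 0:
        return 1.0
    return max(hi - lo, 0) / span

# Examples taken from the slide.
name_score = name_similarity("firstName", "First Name")
value_score = value_similarity(
    ["Gary Cahill", "Juan Mata", "De Gea"],
    ["Juan Quin", "Tim Cahill", "Metsul Ozeil"],
)
range_score = range_similarity([1, 2, 8], [2, 3, 10])
```

Scores like these, computed for a pair of columns, would form the pairwise feature vector that a classifier consumes.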
  • 29. Different similarity features of data • Histogram similarity: similarity in histogram, e.g. "position" (1, 4, 3, 2) and "Player" (GK, MF, DF, FW), with no similarity in value • Distribution similarity: "# games played" (4, ..., 18, 23) and "# goals scored" (3, ..., 11, 22) show similarity in value but no similarity in distribution
  • 30. Evaluation • MRR performance of our approach on weather data (trained on different domains), by number of labeled sources (1, 2, 3): Train on soccer: 89.75, 95.08, 97.73; Train on museum: 89.75, 95.08, 97.73; Train on city: 91.86, 96.59, 97.73; SemanticTyper: 85.22, 92.04, 95.45
  • 31. Dealing with String Format Changes • Learn general string transformations: dd MM yyyy => mm/dd/yyyy • Source-format values: 17 May 1983; 14 Jul 1984; 5 Aug 1991; ... • Target-format values: 3/10/1979; 11/22/1982; 7/5/1982; ... • Steps: semantic labeling (split values into day, month, and year tokens), template inference, template matching, transforming (replace tokens and emit the target format)
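To make the labeling and template steps concrete, here is a minimal sketch of the end result for the date example. It is hand-written rather than learned: the labeling rules, the month table, and the function names are assumptions for illustration, and the unpadded output follows the target examples on the slide (e.g. 7/5/1982).

```python
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

def label_tokens(text):
    """Assign each whitespace-separated token a semantic label."""
    labeled = {}
    for token in text.split():
        if token in MONTHS:
            labeled["month"] = str(MONTHS[token])   # replace name by number
        elif token.isdigit() and len(token) == 4:
            labeled["year"] = token
        elif token.isdigit():
            labeled["day"] = str(int(token))        # drop any leading zero
        else:
            raise ValueError(f"cannot label token: {token!r}")
    return labeled

def transform(text, template=("month", "day", "year"), sep="/"):
    """Fill the target template with the labeled tokens."""
    labeled = label_tokens(text)
    return sep.join(labeled[field] for field in template)

converted = [transform(s) for s in ["17 May 1983", "14 Jul 1984", "5 Aug 1991"]]
```

In the approach described on the slide, the target template would be inferred from target-format data rather than passed in by hand.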
  • 32. Transformation Learning (Future Work) • Example: 3 => Mar • Steps: query a Webtable database (e.g. a table with rows 1, January, Jan; 2, February, Feb; ...), infer the transformation, then transform the data to the correct format • Inferred pairs: 7 => Jul; 5 => May; 3 => Mar
  • 33. Preliminary results for format changes • Evaluation: 38 datasets including date/time, names, street addresses, telephone numbers, dimensions • Only contains cases that can be solved with just replacement • Measurements: accuracy, average edit distance (compared with ground truth) • Results: accuracy 0.58; avg edit distance 3.5; original avg edit distance 18.21; improvement on edit distance 81% • Some examples that work well (format change: accuracy, avg edit distance, original avg edit distance, improvement on edit distance): dd mm yyyy => dd.mm.yy: 1, 0, 4, 100%; [middle_name] last_name; first_name [(c)] => first_name [middle_name] last_name [(c)]: 0.862, 0.91, 5.8, 84%; height” [H] x weight” [W] x [depth” [D]] => weight: 0.967, 0.11, 17.79, 99%
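The two measurements can be sketched as follows. This assumes that "edit distance" means Levenshtein distance and that the improvement is computed as 1 minus the ratio of the new to the original average distance; the second assumption is not stated in the slides but reproduces the reported percentages. Function names and sample strings are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def evaluate(outputs, truths):
    """Return (accuracy, average edit distance) against the ground truth."""
    n = len(truths)
    accuracy = sum(o == t for o, t in zip(outputs, truths)) / n
    avg_dist = sum(edit_distance(o, t) for o, t in zip(outputs, truths)) / n
    return accuracy, avg_dist

def improvement(original_avg, new_avg):
    """Relative reduction in average edit distance (assumed definition)."""
    return 1 - new_avg / original_avg

accuracy, avg_dist = evaluate(["5/17/1983", "7/14/84"],
                              ["5/17/1983", "7/14/1984"])
overall_gain = improvement(18.21, 3.5)
```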
  • 34. Outline • Learning to Replace a Failed Sensor • Learning to Replace a Compound Sensor • Assessing Adaptation Quality and Detecting Failures • Related Work, Discussion, and Future Work
  • 35. Adaptation Performance Estimation and Sensor Change Detection • How good is an adaptation? • Provide upper-layer software with an estimate of the adaptation error • Select the optimal adaptation strategy • Approach: simulate sensor failures • Simulate failures of one or multiple sensors at random time points from historical data • Compute the adaptation error for each adaptation strategy and store it in a library • New sensor failure: match the most similar case from the library • Adaptation strategies: f1(S1) → S3; f2(S2) → S3; f3(S1, S2) → S3
  • 36. Adaptation Performance Estimation and Sensor Change Detection • Adaptation strategies: f1(S1) → S3; f2(S2) → S3; f3(S1, S2) → S3 • Example: S3 = 2S1 + 3S2 – 0.5, error = 0.2 • An error bound can be derived: (2S1 + 3S2 – S3 – 0.5)² < 0.2² • Can we use it to detect sensor changes?
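The bound on this slide translates directly into a check on incoming readings. A minimal sketch, using the slide's example coefficients; the function name and the test readings are made up for illustration.

```python
def violates_bound(s1, s2, s3, error=0.2):
    """True when readings break the bound (2*S1 + 3*S2 - S3 - 0.5)^2 < error^2."""
    residual = 2 * s1 + 3 * s2 - s3 - 0.5
    return residual ** 2 >= error ** 2

# With S1 = 1 and S2 = 1 the strategy predicts S3 = 4.5.
consistent = violates_bound(1.0, 1.0, 4.6)   # residual of about -0.1
suspicious = violates_bound(1.0, 1.0, 5.0)   # residual of -0.5
```

A violated bound signals that at least one of the sensors involved no longer behaves as it did when the strategy was learned.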
  • 37. Adaptation Performance Estimation and Sensor Change Detection • Sensor change detection: did a sensor change or not? • Error bounds are derived from adaptation strategies over sensor subsets: S1; S2; S3; (S1, S2); (S1, S3); (S2, S3); (S1, S2, S3) • A violated bound means at least one of its sensors was changed • Using logical inference: between the explanations "S1 changes" and "both S2 and S3 change", the former is simpler
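One way to read the inference step is as a search for the smallest set of sensors that accounts for every violated bound. The sketch below makes that assumption explicit: it treats each violated bound as "at least one of these sensors changed" and returns the smallest sensor sets that touch all violated bounds. This minimal hitting-set formulation and the function name are illustrative choices, not necessarily the authors' inference procedure.

```python
from itertools import combinations

def simplest_explanations(sensors, violated_bounds):
    """Smallest sets of changed sensors that explain every violated bound.

    Each element of `violated_bounds` is the set of sensors used by one
    violated error bound; an explanation must include at least one of them.
    """
    for size in range(len(sensors) + 1):
        hits = [set(combo) for combo in combinations(sensors, size)
                if all(set(combo) & bound for bound in violated_bounds)]
        if hits:
            return hits
    return []

sensors = ["S1", "S2", "S3"]
# Suppose the bounds over (S1, S2), (S1, S3) and (S1, S2, S3) are violated
# while the bound over (S2, S3) still holds.
violated = [{"S1", "S2"}, {"S1", "S3"}, {"S1", "S2", "S3"}]
explanations = simplest_explanations(sensors, violated)
```

Here "S1 changed" explains all three violations with a single sensor, whereas the alternative "S2 and S3 both changed" needs two, so the single-sensor explanation is preferred.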
  • 38. Outline • Learning to Replace a Failed Sensor • Learning to Replace a Compound Sensor • Assessing Adaptation Quality and Detecting Failures • Related Work, Discussion, and Future Work
  • 39. Related Work • Detecting Sensor Failures and Changes • Change point detection [Aminikhanghahi and Cook ‘16] [Pimentel et al., ‘14] • Distribution-based [Kawahara and Sugiyama, ‘12] [Harchaoui et al., ‘09] [Yamanishi and Takeuchi, ‘02] • Reconstruction-based [Crook et al., ‘02] [Singh and Markou, ‘04] [Ide and Tsuda, ‘07] [Chatzigiannakis et al., ‘06] • Probabilistic [Adams and MacKay, ‘07] [Saatci et al., ‘10] [Dereszynski and Dietterich, ‘12] [Dietterich et al. ‘12] • Distance-based [Angiulli and Pizzuti, ‘02] [Bay and Schwabacher, ‘03] [Chawla and Sun, ‘06] [Keogh et al., ‘01] [Budalakoti et al., ‘06] [Chen et al., ‘15] • Our approach explores multiple nonlinear relationships among sensors, and can potentially detect sensor changes with significantly higher accuracy • Reconstruction of Sensor Readings • Most detection methods do not address how to automatically recover • Some probabilistic methods [Dereszynski and Dietterich, ‘12] [Dietterich et al. ‘12] can be used to reconstruct a changed sensor, but cannot leverage new sensors • FFX [McConaghy ‘11] is applied to extract sensor-specific transformations • Our approach can adapt to new sensors, which is not possible with existing approaches
  • 40. Related Work • String transformation: most existing approaches require a one-to-one mapping in the training data to work. • Singh, Rishabh, and Sumit Gulwani. "Transforming spreadsheet data types using examples." ACM SIGPLAN Notices. Vol. 51. No. 1. ACM, 2016. • Wu, Bo, and Craig A. Knoblock. "An Iterative Approach to Synthesize Data Transformation Programs." IJCAI. 2015. • Semi-automatic data cleaning: most existing approaches require human interaction to provide training data and curate the generated results. • Scaffidi, Christopher. Topes: Enabling end-user programmers to validate and reformat data. Diss. University of Nebraska-Lincoln, 2009. • Raman, Vijayshankar, and Joseph M. Hellerstein. "Potter's wheel: An interactive data cleaning system." VLDB. Vol. 1. 2001.
  • 41. Discussion • Presented techniques for • Reconstructing numeric sensor values • Reconstructing a failed compound sensor from a new sensor • Assessing the accuracy of a reconstructed sensor and identifying failures • Many applications where these techniques could be applied • Geoscientists collecting data about the earth • Medical devices where information is missing • Sensors on mobile phones where sensors may be too costly to run • Etc.
  • 42. Conference Talk on Thursday • Learning with Previously Unseen Features Yuan Shi & Craig A. Knoblock • Thursday, August 24 16:30-18:00 (Yuan will talk at 17:15) • ML-TAML3 – Transfer, Adaptation, Multi-Task Learning 3 (212)