30th AAAI Conference on Artificial Intelligence, Phoenix, Arizona - 2016
Differential Privacy Preservation for
Deep Auto-Encoders: an Application of
Human Behavior Prediction
NhatHai Phan1, Yue Wang2, Xintao Wu3, and Dejing Dou1
1 University of Oregon, 2 University of North Carolina Charlotte,
3University of Arkansas
{haiphan,dou}@cs.uoregon.edu, ywang91@uncc.edu,
xintaowu@uark.edu
Outline
• Deep Learning and Deep Auto-Encoders
• Differential Privacy Preservation for Deep Auto-
Encoders
– Deep Private Auto-encoders (dPA)
• Application
– YesiWell Health Social Network
– Human Behavior Prediction
• Conclusions and Future Works
Deep Learning
• Pixels → 1st Layer: “Edges” → 2nd Layer: “Object parts” → 3rd Layer: “Objects” [Andrew Ng]
[Figure: deep neural network with input v, hidden layers h1, h2, h3, output ŷ, and weights W1 to W4]
Deep Auto-Encoders
• Data reconstruction
• Softmax layer
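[Figure: an auto-encoder with input v, hidden layer h1, weights W0 and the transpose W0^T, and reconstruction ṽ; a deep auto-encoder stacking hidden layers h1, h2, ..., h(k) with weights W0, W1, ..., W(k) and output y]
$$R(D, W) = -\sum_{i=1}^{|D|} \sum_{j=1}^{d} \left[ v_{ij} \log \tilde{v}_{ij} + (1 - v_{ij}) \log(1 - \tilde{v}_{ij}) \right]$$
$$C(Y_T, \theta) = -\sum_{i=1}^{|Y_T|} \left[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \right]$$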
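As an illustration of the data reconstruction error on this slide, the following minimal Python sketch computes the cross-entropy reconstruction error of a single auto-encoder. It assumes sigmoid units, tied weights (the decoder reuses the transpose of W), and no bias terms; the function names and the toy data are hypothetical and are not taken from the slides.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reconstruct(v, W):
    """Encode v into hidden units h, then decode with the transposed weights.

    W has one row per hidden unit and one column per input unit (tied weights).
    Bias terms are omitted for brevity (an assumption of this sketch).
    """
    h = [sigmoid(sum(w * x for w, x in zip(row, v))) for row in W]
    return [
        sigmoid(sum(W[k][j] * h[k] for k in range(len(W))))
        for j in range(len(v))
    ]


def reconstruction_error(D, W):
    """Cross-entropy between each input v and its reconstruction, summed over D."""
    total = 0.0
    for v in D:
        v_tilde = reconstruct(v, W)
        for v_ij, vt_ij in zip(v, v_tilde):
            total -= v_ij * math.log(vt_ij) + (1.0 - v_ij) * math.log(1.0 - vt_ij)
    return total


# Hypothetical toy data: two binary records with d = 3 input units.
D = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
W = [[0.5, -0.5, 0.5], [-0.5, 0.5, -0.5]]
err = reconstruction_error(D, W)
```

With all weights set to zero every reconstructed value is 0.5, so the error reduces to |D| × d × log 2, which gives a quick sanity check on the implementation.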
Motivation
• Deep learning
– Social media, social network analysis,
bioinformatics, medicine and healthcare.
• Privacy issues?
– Users' personal and highly sensitive data, such as
clinical records, user profiles, photo, etc.
• Differential privacy
– Deep Private Auto-Encoders
- Differential Privacy Definition
• The goal of a privacy-preserving statistical
database is to
– learn properties of the population as a whole,
– while protecting the privacy of the individuals in the
sample
• Differential privacy (preserving algorithm)
– maximize the accuracy of queries from statistical
databases
– minimize the chances of identifying its records
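To make the definition concrete, here is a minimal sketch of the classic Laplace mechanism applied to an average test grade. It is not the dPA algorithm from these slides. It assumes grades are clipped to a known range [lo, hi], that the number of records is public, and that neighboring databases differ in exactly one record; all names are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(grades, lo, hi, epsilon, rng):
    """Release the mean of `grades` with epsilon-differential privacy."""
    n = len(grades)
    clipped = [min(max(g, lo), hi) for g in grades]
    # Replacing one record moves the mean by at most (hi - lo) / n.
    sensitivity = (hi - lo) / n
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical grades of five people in the room.
grades = [70, 85, 90, 60, 95]
noisy_avg = private_mean(grades, 0.0, 100.0, epsilon=1.0, rng=random.Random(42))
```

A smaller epsilon gives stronger privacy but a noisier answer, which is the accuracy versus identifiability trade-off named on this slide.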
Challenges
• Unprecedented work
• A non-trivial task
– R(D,W) is complicated
– The algorithm must be
efficient on large
datasets
• Guarantee the potential
to use unlabeled data in
a dPA model
[Figure: performance versus amount of data, contrasting most learning algorithms with new AI methods (deep learning)] [Andrew Ng, 2015]
Deep Private Auto-Encoders
• Functional Mechanism
– injecting Laplace noise Lap(Δ/ε) into
polynomial coefficients of polynomial
functions
[Figure: Deep Private Auto-encoder, with input v, hidden layers h1, h2, ..., weights W0, W1, ..., W(k), and output ŷ]
Polynomial Approximation
$$R(D, W) = -\sum_{i=1}^{|D|} \sum_{j=1}^{d} \left[ v_{ij} \log \tilde{v}_{ij} + (1 - v_{ij}) \log(1 - \tilde{v}_{ij}) \right]$$
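The core step of the Functional Mechanism, adding Laplace noise Lap(Δ/ε) to each polynomial coefficient, can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the objective is a hypothetical one-dimensional quadratic stored as a degree-to-coefficient dictionary, and the sensitivity Δ is simply passed in rather than derived from the data.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_coefficients(coeffs, sensitivity, epsilon, rng):
    """Add independent Lap(sensitivity / epsilon) noise to every coefficient."""
    scale = sensitivity / epsilon
    return {degree: c + laplace_noise(scale, rng) for degree, c in coeffs.items()}


def evaluate(coeffs, w):
    """Evaluate the polynomial sum_k coeffs[k] * w**k at the point w."""
    return sum(c * w ** degree for degree, c in coeffs.items())


# Hypothetical objective f(w) = 3 - 4w + 2w^2, stored as {degree: coefficient}.
coeffs = {0: 3.0, 1: -4.0, 2: 2.0}
noisy = perturb_coefficients(coeffs, sensitivity=1.0, epsilon=1.0,
                             rng=random.Random(0))
```

Training then minimizes the perturbed polynomial instead of the original one, so the raw data is only touched when the coefficients are computed.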
• Apply Functional Mechanism to inject Laplace noise Lap(Δ/ε)
Polynomial Approximation
Taylor Expansion [Arfken 1985]
Arfken, G. 1985. In Mathematical Methods for Physicists (Third Edition). Academic Press.
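$$R(D, W) = -\sum_{i=1}^{|D|} \sum_{j=1}^{d} \left[ v_{ij} \log \tilde{v}_{ij} + (1 - v_{ij}) \log(1 - \tilde{v}_{ij}) \right]$$
$$\hat{R}(D, W) = \sum_{i=1}^{|D|} \sum_{j=1}^{d} \left[ \sum_{l=1}^{2} f_{lj}^{(0)}(0) + \left( \sum_{l=1}^{2} f_{lj}^{(1)}(0) \right) W_j h_i + \left( \sum_{l=1}^{2} \frac{f_{lj}^{(2)}(0)}{2!} \right) (W_j h_i)^2 \right]$$
Taylor Expansion Error?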
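The Taylor expansion question can be explored numerically. Assuming a sigmoid reconstruction ṽ = σ(z), the two log terms of the cross-entropy become log(1 + e^{-z}) and log(1 + e^{z}). The sketch below compares log(1 + e^{z}) with its second-order Taylor polynomial around 0; the interval [-1, 1] and the helper names are assumptions made for illustration.

```python
import math


def softplus(z):
    """log(1 + e^z), which equals -log(1 - sigmoid(z))."""
    return math.log(1.0 + math.exp(z))


def softplus_taylor2(z):
    """Second-order Taylor polynomial at 0: f(0)=log 2, f'(0)=1/2, f''(0)=1/4."""
    return math.log(2.0) + z / 2.0 + z * z / 8.0


def max_abs_error(lo=-1.0, hi=1.0, steps=2001):
    """Largest gap between the function and its approximation on a grid."""
    worst = 0.0
    for k in range(steps):
        z = lo + (hi - lo) * k / (steps - 1)
        worst = max(worst, abs(softplus(z) - softplus_taylor2(z)))
    return worst


gap = max_abs_error()
```

The mirrored term log(1 + e^{-z}) is covered by evaluating both helpers at -z. On [-1, 1] the gap stays below 0.005 per term and does not depend on how many records are summed over.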
Approximation Error Bounds
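• Approximation error bounds
$$\tilde{W} = \arg\min_{W} \tilde{R}(D, W); \quad \hat{W} = \arg\min_{W} \hat{R}(D, W)$$
$$\tilde{R}(D, \hat{W}) - \tilde{R}(D, \tilde{W}) \le \frac{e^2 + 2e - 1}{e(1 + e)^2} \times d$$
$$\tilde{C}(Y_T, \hat{\theta}) - \tilde{C}(Y_T, \tilde{\theta}) \le \frac{e^2 + 2e - 1}{e(1 + e)^2}$$
  (d: number of input units)
• Our algorithm can be applied on large datasets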
Semantic Mining of Activity, Social, and Health Data
(NIH/NIGMS Funded in 2013, R01 GM103309) (PI: Dou)
Human Behavior Prediction
[Figure: a social network of users 1 to 7 shown at timestamps t−1, t, and t+1; node colors indicate “Decrease exercise” or “Increase exercise”]
Dataset, Features, and Task
• YesiWell dataset
– 254 users
– Oct 2010 – Aug 2011
• BMI: $\mathrm{BMI} = \mathrm{mass\,(kg)} / (\mathrm{height\,(m)})^2$
• Wellness Score: [Equation: score y combining the terms U(BMI), U(TG/HDL), U(LDL), and U(HbA1c)]
• Prediction Task: Try to predict whether a YesiWell user will
increase or decrease exercises in the next week compared
with the current week.
dPA-based Human Behavior Prediction
(dPAH)
[Figure: dPAH model; Individual Features, Individual Past Features, and Social Correlations form the input to hidden layer h1]
Human Behavior Prediction
Experimental Results
• Do not enforce differential
privacy
– CRBM, SctRBM
– Deep Auto-Encoder (dA)
– Truncated Deep Auto-Encoder
(TdA)
• Do enforce differential
privacy
– Functional Mechanism (FM)
– DPME, Filter-Priority (FP)
• dPAH: 83.39%
– (ε, sampling rate) = (1, 0.4)
Conclusions
• Deep private auto-encoders
– Human behavior prediction: 83.392%
• The proposed algorithm can work for
– Deep Belief Networks
– Convolutional Neural Networks
• Extract sensitive information from a deep
private auto-encoder
SMASH Project: http://aimlab.cs.uoregon.edu/smash/
YesiWell Health Social Network
Thank you!


Editor's notes

  1. Good morning, everyone! I am Hai Phan. It’s my pleasure to introduce my current work, YesiWell: human behavior modeling in health social networks. If you have any questions during my presentation, please feel free to ask.
  2. I will give an overview of deep learning and deep auto-encoders. Then I will introduce our differential privacy preservation for deep auto-encoders, the Deep Private Auto-Encoder. We apply our model to a health social network for human behavior prediction. Finally, I will give conclusions and future work.
  3. Deep Learning is (unsupervised) learning of multiple levels of features or representations of the data. This is an example of a deep neural network. Given pixels of images as input v, we can learn edges in the first hidden layer of the neural network h1. Then, we can learn object parts in the second hidden layer, and objects in the third hidden layer. We can use the hidden features learned by the neural network to do prediction, classification, etc.
  4. Deep Auto-Encoder is a fundamental deep learning model. This is the structure of an auto-encoder: v is the input, h is a hidden layer, and ṽ is the data reconstruction layer. We train the auto-encoder by minimizing the data reconstruction function, which is the cross-entropy error function. The better we can reconstruct the data, the better the model is. We can stack multiple auto-encoders on top of each other to construct a deep auto-encoder. The softmax layer contains only a binomial random variable for a binomial prediction task. To train the softmax layer, we use back-propagation to minimize the cross-entropy error function and fine-tune all the parameters together.
  5. In recent years, deep learning has spread across both academia and industry with many exciting real-world applications, such as image processing, speech recognition, natural language processing, social media, social network analysis, bioinformatics, medicine, and healthcare. This raises obvious issues about protecting privacy in deep learning models when they are built on users' personal and highly sensitive data, such as clinical records, user profiles, and photos. In our SMASH project, we have also proposed several deep learning models for human behavior prediction. Therefore, one of our motivations is to develop solutions that preserve privacy in deep learning. First, I will introduce the definition of differential privacy, then our deep private auto-encoders.
  6. The goal of a privacy-preserving statistical database is to learn properties of the population as a whole, while protecting the privacy of the individuals in the sample. For instance, say you want to figure out the average grade on a test for the people in the room without revealing anything about your own grade other than what is inherent in the answer.
  7. This is a non-trivial task, since the data reconstruction function is complicated compared with simple regression functions. The error of the approximation method must be bounded and independent of the data size |D|. This guarantees the potential to use a large amount of unlabeled data in a deep private auto-encoder model, since deep learning is powerful with large datasets: its performance improves with the amount of data.
  8. We preserve differential privacy by applying the Functional Mechanism, which injects Laplace noise into the polynomial coefficients of polynomial functions. For instance, this figure illustrates the original function f(W) and the perturbed function f̄(W). To apply the Functional Mechanism, we derive the polynomial form of the data reconstruction function in the deep auto-encoder. Then we inject Laplace noise into its polynomial coefficients. There are two main parts in our algorithm. The first part, from step 1 to step 3, preserves differential privacy for an auto-encoder, which we call a private auto-encoder. Since we use the Functional Mechanism, we first derive a polynomial approximation of the data reconstruction function. The approximation function is then perturbed by injecting noise into its polynomial coefficients, and we train on the perturbed function. To stack private auto-encoders, we add a normalization layer to each private auto-encoder. We are then ready to stack multiple private auto-encoders to construct a deep private auto-encoder. The output is a binomial approximation function. We also derive and perturb the polynomial approximation of the cross-entropy error. Then back-propagation is used to train the perturbed deep private auto-encoder.
  9. By applying Taylor Expansion, we can rewrite the data reconstruction function like this formula based on four functions g1, g2, f1, f2. We can apply the Functional mechanism to inject Laplace noise into the approximation function. Delta is the global sensitivity of R_hat over the database D.
  10. The approximation error bounds are given in this analysis. The absolute difference between the two training functions is always bounded by the product of a small constant and the number of attributes d. The error is completely independent of the data size. This guarantees that our approximation of the auto-encoder can be applied to large datasets. We have a similar result for the cross-entropy error function. We apply our deep private auto-encoder to predict human behavior in our health social network.
  11. I will talk about overweight/obesity, introduce our YesiWell health social network, and explain the motivation of our SMASH project. Then I will introduce two interesting research topics: human behavior prediction with our novel Social Restricted Boltzmann Machine (SRBM), and our privacy-preserving layer, which protects the privacy of human subjects during the data mining process. I will close with conclusions and our future work.
  12. Our project, Semantic Mining of Activity, Social, and Health Data has been funded by NIH NIGMS since 2013 as an R01 grant. We collect multi-dimensional data from biomarkers, social activities, and physical activities. We then design a Health Ontology to represent the data. We apply different techniques such as data mining, intervention approaches, and privacy preserving models to extract knowledge from our data and health ontology.
  13. Human behavior prediction is a general problem in social network analysis. In the social network at a specific timestamp, each node is a user, and there is an edge between two nodes if they are socially connected, for instance through a friend connection. Different colors denote different user behaviors. In the YesiWell social network, blue nodes indicate users who increased their exercise compared with the previous timestamp, and orange nodes indicate users who decreased their exercise compared with the previous timestamp. Given the social network in M timestamps, we would like to predict the status of all the users in the next timestamp.
  14. We have 254 users. We consider 30 individual features in total including physical activities, social communications, and biomarkers. In this study, we try to predict whether a YesiWell user will increase or decrease exercises in the next week compared with the current week.
  15. To model human behaviors, we concatenate individual features, individual features in the past, and social correlations as input of a deep private auto-encoder. The output is a binomial variable: “active” or “inactive” in doing exercise.
  16. We compare our model with various competitive models, both those that do not enforce differential privacy and those that do: CRBM, SctRBM, the deep auto-encoder (dA), the Truncated Deep Auto-Encoder (TdA), the Functional Mechanism (FM), DPME, and Filter-Priority (FP). This figure shows the prediction accuracy of each algorithm as a function of the dataset cardinality. There is a gap between the prediction accuracy of dPAH and that of dA and its truncated version TdA, but the gap shrinks rapidly as the dataset cardinality increases. With a very small sampling rate, the performance of the dPAH model is slightly lower than that of SctRBM, which is a non-privacy-enforcing deep learning model. However, dPAH is significantly better than SctRBM once the sampling rate goes just a bit higher, i.e., > 0.3. This is a significant result, since 0.3 is a small sampling rate. The other figure illustrates the relationship between the privacy budget epsilon and the prediction accuracy. Our model is relatively robust to changes in epsilon. The dPAH model is competitive even with non-privacy-enforcing models.
  17. The three most significant lessons learned are: “Online Social Network has a significant impact on physical activity,” “Social communication can propagate physical activities,” and “Good relationships make us healthier.” We have proposed a novel Social Restricted Boltzmann Machine for human behavior prediction. The SRBM outperforms the state-of-the-art models, and we are able to predict human behaviors, e.g., physical exercise level, with up to 88.7 percent accuracy. We have developed the first differential privacy preserving algorithm in deep learning, in particular for deep auto-encoders. We also proved that the algorithm can work with other models such as RBMs and deep belief networks.