Preserving differential privacy for deep learning, particularly deep auto-encoders, with an application to human behavior prediction in a health social network.
Differential Privacy Preservation for Deep Auto-Encoders
1. 30th AAAI Conference on Artificial Intelligence, Phoenix, Arizona - 2016
Differential Privacy Preservation for
Deep Auto-Encoders: an Application of
Human Behavior Prediction
NhatHai Phan1, Yue Wang2, Xintao Wu3, and Dejing Dou1
1 University of Oregon, 2 University of North Carolina Charlotte,
3University of Arkansas
{haiphan,dou}@cs.uoregon.edu, ywang91@uncc.edu,
xintaowu@uark.edu
2. Outline
• Deep Learning and Deep Auto-Encoders
• Differential Privacy Preservation for Deep Auto-Encoders
– Deep Private Auto-encoders (dPA)
• Application
– YesiWell Health Social Network
– Human Behavior Prediction
• Conclusions and Future Work
4. Deep Auto-Encoders
• Data reconstruction
• Softmax layer
[Figure: an auto-encoder — input v, hidden layer h1 with weights W0, reconstruction via W0^T — and a deep auto-encoder stacking hidden layers h1, h2, …, h(k) with weights W0, W1, …, W(k)]

Data reconstruction (cross-entropy) error, where \tilde{v} is the reconstruction of the input v:

R(D, W) = -\sum_{i=1}^{|D|} \sum_{j=1}^{d} \left[ v_{ij} \log \tilde{v}_{ij} + (1 - v_{ij}) \log(1 - \tilde{v}_{ij}) \right]

Cross-entropy error of the softmax layer over the labeled data Y_T, where \hat{y} is the prediction:

C(Y_T, W) = -\sum_{i=1}^{|Y_T|} \left[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \right]
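As a minimal sketch (not the authors' code), the reconstruction error above can be minimized with gradient descent on a tiny tied-weight sigmoid auto-encoder; the data, sizes, and learning rate below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruction_error(V, V_rec):
    # R(D, W) = -sum_ij [ v_ij log v~_ij + (1 - v_ij) log(1 - v~_ij) ]
    eps = 1e-12
    return -np.sum(V * np.log(V_rec + eps) + (1 - V) * np.log(1 - V_rec + eps))

# Toy binary data: |D| = 20 records, d = 8 input units.
V = (rng.random((20, 8)) > 0.5).astype(float)

d, n_hidden = 8, 4
W = 0.1 * rng.standard_normal((d, n_hidden))  # encoder weights W0 (tied)
b1 = np.zeros(n_hidden)
b2 = np.zeros(d)

def forward(V, W, b1, b2):
    H = sigmoid(V @ W + b1)        # hidden layer h1
    V_rec = sigmoid(H @ W.T + b2)  # tied-weight reconstruction v~
    return H, V_rec

H, V_rec = forward(V, W, b1, b2)
loss_before = reconstruction_error(V, V_rec)

# One gradient-descent step on the cross-entropy error (tied weights).
d_out = V_rec - V                      # error signal at the output layer
d_hid = (d_out @ W) * H * (1 - H)      # back-propagated to the hidden layer
grad_W = V.T @ d_hid + d_out.T @ H     # both uses of the shared W
lr = 0.05
W -= lr * grad_W
b2 -= lr * d_out.sum(axis=0)
b1 -= lr * d_hid.sum(axis=0)

_, V_rec2 = forward(V, W, b1, b2)
loss_after = reconstruction_error(V, V_rec2)
```

After the step, the reconstruction error should decrease, matching the slide's point that a better reconstruction means a better model.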
5. Motivation
• Deep learning
– Social media, social network analysis,
bioinformatics, medicine and healthcare.
• Privacy issues?
– Users' personal and highly sensitive data, such as clinical records, user profiles, photos, etc.
• Differential privacy
– Deep Private Auto-Encoders
6. Differential Privacy Definition
• The goal of a privacy-preserving statistical
database is to
– learn properties of the population as a whole,
– while protecting the privacy of the individuals in the
sample
• Differential privacy (preserving algorithm)
– maximize the accuracy of queries from statistical
databases
– minimize the chances of identifying its records
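A minimal sketch of this trade-off, using the classic Laplace mechanism: a counting query with sensitivity 1 is answered with Laplace noise of scale 1/ε, so individual records are hidden while the aggregate stays accurate. The data and predicate below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

def private_count(records, predicate, epsilon, sensitivity=1.0):
    # epsilon-differentially-private counting query: the true count is
    # released with Laplace noise Lap(sensitivity / epsilon).
    true_count = sum(1 for r in records if predicate(r))
    noise = rng.laplace(0.0, sensitivity / epsilon)
    return true_count + noise

ages = [23, 35, 44, 29, 61, 52, 38]
answer = private_count(ages, lambda a: a >= 40, epsilon=1.0)
```

A smaller ε (stronger privacy) means a larger noise scale, i.e., less accurate answers — exactly the accuracy-vs-identifiability trade-off on this slide.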
7. Challenges
• Unprecedented work
• A non-trivial task
– R(D,W) is complicated
– The algorithm must be
efficient on large
datasets
• Guarantee the potential
to use unlabeled data in
a dPA model
[Figure: performance vs. amount of data — most learning algorithms plateau, while new AI methods (deep learning) keep improving with more data (Andrew Ng, 2015)]
8. Deep Private Auto-Encoders
• Functional Mechanism
– injecting Laplace noise Lap(Δ/ε) into
polynomial coefficients of polynomial
functions
[Figure: deep private auto-encoder — input v, hidden layers h1, h2, …, h(k) with weights W0, W1, …, W(k), and output ŷ]

Polynomial Approximation

The data reconstruction function to be approximated, where \tilde{v} is the reconstruction of the input v:

R(D, W) = -\sum_{i=1}^{|D|} \sum_{j=1}^{d} \left[ v_{ij} \log \tilde{v}_{ij} + (1 - v_{ij}) \log(1 - \tilde{v}_{ij}) \right]
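The coefficient-perturbation step can be sketched on a toy quadratic objective L(w) = c2·w² + c1·w + c0: noise is injected once into the coefficients, and training then proceeds on the noisy objective. The sensitivity Δ here is a placeholder, not the paper's derived value.

```python
import numpy as np

rng = np.random.default_rng(7)

def perturb_coefficients(coeffs, delta, epsilon):
    # Functional mechanism sketch: one-shot Laplace noise Lap(delta/epsilon)
    # on each polynomial coefficient of the objective function.
    return coeffs + rng.laplace(0.0, delta / epsilon, size=len(coeffs))

# True objective 2*w^2 - 4*w + 1, minimized at w = 1.
coeffs = np.array([1.0, -4.0, 2.0])       # [c0, c1, c2]
c0n, c1n, c2n = perturb_coefficients(coeffs, delta=1.0, epsilon=1.0)

# Minimizer of the perturbed quadratic (assuming c2n > 0):
w_private = -c1n / (2 * c2n)
```

Because the noise enters the objective rather than the final model, any amount of post-hoc optimization on the perturbed function costs no additional privacy budget; with a very large ε the noise vanishes and the private minimizer approaches the non-private one.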
9. Polynomial Approximation
• Taylor Expansion [Arfken 1985]
• Apply the Functional Mechanism to inject Laplace noise Lap(Δ/ε)

\hat{R}(D, W) = \sum_{i=1}^{|D|} \sum_{j=1}^{d} \sum_{l=1}^{2} \left[ f_{lj}(0) + f^{(1)}_{lj}(0)\,(W_j h_i) + \frac{f^{(2)}_{lj}(0)}{2!}\,(W_j h_i)^2 \right]

which approximates the data reconstruction function

R(D, W) = -\sum_{i=1}^{|D|} \sum_{j=1}^{d} \left[ v_{ij} \log \tilde{v}_{ij} + (1 - v_{ij}) \log(1 - \tilde{v}_{ij}) \right]

Taylor Expansion Error?

Arfken, G. 1985. In Mathematical Methods for Physicists (Third Edition). Academic Press.
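The kind of term this truncation handles can be illustrated on g(z) = log(1 + e^z), a standard building block of sigmoid cross-entropy losses (used here as an illustrative stand-in for the paper's f_lj): its second-order Maclaurin polynomial is log 2 + z/2 + z²/8.

```python
import math

def g(z):
    # log(1 + e^z), a term of sigmoid cross-entropy losses.
    return math.log(1.0 + math.exp(z))

def g_taylor2(z):
    # Second-order Taylor expansion at 0:
    # g(0) = log 2, g'(0) = 1/2, g''(0) = 1/4  =>  log 2 + z/2 + z^2/8.
    return math.log(2.0) + z / 2.0 + z * z / 8.0

err = abs(g(0.5) - g_taylor2(0.5))  # small for pre-activations near 0
```

The approximation is tight near 0, which is where normalized inputs keep the pre-activations; the truncation error is what the next slide bounds.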
10. Approximation Error Bounds
• Approximation error bounds
• Our algorithm can be applied on large datasets

\tilde{W} = \arg\min_W \tilde{R}(D, W), \qquad \hat{W} = \arg\min_W \hat{R}(D, W)

|\tilde{R}(D, \tilde{W}) - \tilde{R}(D, \hat{W})| \le d \cdot \frac{e^2 + 2e - 1}{e(1 + e)^2}

|\tilde{C}(Y_T, \tilde{W}_T) - \tilde{C}(Y_T, \hat{W}_T)| \le \frac{e^2 + 2e - 1}{e(1 + e)^2}

The bounds grow only with the number of input units d; they are independent of the data size |D|.
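The per-unit constant (e² + 2e − 1)/(e(1 + e)²) in these bounds can be evaluated numerically; the key point is that the total bound scales with d alone, never with |D|.

```python
import math

# Error-bound constant from the slide: (e^2 + 2e - 1) / (e * (1 + e)^2).
e = math.e
const = (e**2 + 2 * e - 1) / (e * (1 + e)**2)

def error_bound(d):
    # Total bound for an auto-encoder with d input units; note that the
    # dataset size |D| does not appear anywhere.
    return d * const
```

The constant is roughly 0.315 per input unit, so even for hundreds of input units the bound stays modest, which is what justifies applying the method to large datasets.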
11. Outline
• Deep Learning and Deep Auto-Encoders
• Differential Privacy Preservation for Deep Auto-Encoders
– Deep Private Auto-encoders (dPA)
• Application
– YesiWell Health Social Network
– Human Behavior Prediction
• Conclusions and Future Work
12. Semantic Mining of Activity, Social, and Health Data
(NIH/NIGMS Funded in 2013, R01 GM103309) (PI: Dou)
14. Dataset, Features, and Task
• YesiWell dataset
– 254 users
– Oct 2010 – Aug 2011
• BMI
• Wellness Score
• Prediction Task: predict whether a YesiWell user will increase or decrease exercise in the next week compared with the current week.
\text{BMI} = \frac{\text{mass (kg)}}{(\text{height (m)})^2}

Wellness Score: y = U_1(\text{BMI}) + U_2(\text{TG}/\text{HDL}) + U_3(\text{LDL}) + U_4(\text{HbA1c})
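The two features can be sketched as follows; the BMI formula is standard, while the wellness-score functions U1..U4 are unspecified on the slide, so simple linear weights are used here purely as hypothetical placeholders.

```python
def bmi(mass_kg, height_m):
    # BMI = mass (kg) / height (m)^2
    return mass_kg / height_m ** 2

def wellness_score(bmi_v, tg_hdl, ldl, hba1c, U=(0.25, 0.25, 0.25, 0.25)):
    # y = U1(BMI) + U2(TG/HDL) + U3(LDL) + U4(HbA1c); each U_i is modeled
    # as a linear weight only for illustration — the real U_i are not
    # specified on the slide.
    return U[0] * bmi_v + U[1] * tg_hdl + U[2] * ldl + U[3] * hba1c

example_bmi = bmi(70.0, 1.75)  # ≈ 22.86
```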
15. dPA-based Human Behavior Prediction (dPAH)

[Figure: dPAH model — Individual Features, Individual Past Features, and Social Correlations are concatenated as input to the hidden layer h1]
16. Human Behavior Prediction
Experimental Results
• Do not enforce differential
privacy
– CRBM, SctRBM
– Deep Auto-Encoder (dA)
– Truncated Deep Auto-Encoder
(TdA)
• Do enforce differential
privacy
– Functional Mechanism (FM)
– DPME, Filter-Priority (FP)
• dPAH: 83.39%
– (ε, sampling rate) = (1, 0.4)
[Figure: prediction accuracy of each algorithm vs. amount of data]
17. Conclusions
• Deep private auto-encoders
– Human behavior prediction: 83.392%
• The proposed algorithm can work for
– Deep Belief Networks
– Convolutional Neural Networks
• Extract sensitive information from a deep
private auto-encoder
18. SMASH Project: http://aimlab.cs.uoregon.edu/smash/
YesiWell Health Social Network
Thank you!
Editor's notes
Good morning everyone! I am Hai Phan.
It’s my pleasure to introduce my current work: YesiWell, human behavior modeling in health social networks.
During my presentation, if you have any question, please feel free to ask.
I will give an overview of deep learning and deep Auto-Encoders
Then, I will introduce our differential privacy preservation for Deep Auto-Encoders, Deep Private Auto-Encoders.
We apply our model in a health social network for human behavior prediction
Finally, I will give conclusions and future work.
Deep Learning is (unsupervised) learning of multiple levels of features or representations of the data.
This is an example of a deep neural network.
Given pixels of images as input v, we can learn edges in the first hidden layer of the neural network h1.
Then, we can learn object parts in the second hidden layer, and objects in the third hidden layer.
We can use the hidden features learned by the neural network to do prediction, classification, etc.
Deep Auto-Encoder is a fundamental deep learning model.
This is the structure of an auto-encoder: v is the input, h is the hidden layer, and there is a data reconstruction layer.
We train the auto-encoder by minimizing the data reconstruction function, which is actually the cross-entropy error function.
The better we can reconstruct the data, the better the model is.
We can stack multiple auto-encoders on top of each other to construct a deep auto-encoder.
The softmax layer only contains a binomial random variable for a binomial prediction task.
To train the softmax layer we use back-propagation to minimize the cross-entropy error function and fine-tune all the parameters together.
In recent years, deep learning has spread across both academia and industry, with many exciting real-world applications, such as image processing, speech recognition, natural language processing, social media, social network analysis, bioinformatics, medicine and healthcare, etc.
This raises obvious privacy issues when deep learning models are built on users' personal and highly sensitive data, such as clinical records, user profiles, photos, etc.
In our SMASH project, we also have proposed several deep learning models for human behavior prediction
Therefore, one of our motivations is developing solutions to preserve privacy in deep learning.
First, I will introduce the definition of differential privacy, then our deep private auto-encoders.
The goal of a privacy-preserving statistical database is to learn properties of the population as a whole, while protecting the privacy of the individuals in the sample.
For instance, say you want to figure out the average grade on a test of the people in the room, without revealing anything about your own grade other than what is inherent in the answer.
This is a non-trivial task since
The function is complicated compared with simple regression functions.
The error of the approximation method must be bounded and independent of the data size D.
This is to guarantee the potential to use a large amount of unlabeled data in a deep private auto-encoder model.
Since deep learning is powerful with large datasets.
The performance of deep learning is proportional to the amount of data.
We preserve differential privacy by applying the Functional Mechanism, which aims at injecting Laplace noise into the polynomial coefficients of polynomial functions.
For instance, this figure illustrates the original function f(W) and the perturbed function f̄(W).
To apply Functional Mechanism, we derive the polynomial form of the data reconstruction function in the deep auto-encoders.
Then, we inject Laplace noise into its polynomial coefficients.
There are two main parts in our algorithm.
The first part is from step 1 to step 3, which preserves differential privacy for an auto-encoder, called a private auto-encoder.
Since we use the functional mechanism, we first derive a polynomial approximation of the data reconstruction function.
Then the approximation function is perturbed by injecting noise into its polynomial coefficients.
Then we train the perturbed function.
To stack private auto-encoders, we add a normalization layer into the private auto-encoder.
Then we are ready to stack multiple private auto-encoders to construct a deep private auto-encoder.
The output layer is a binomial variable for prediction.
We also derive and perturb the polynomial approximation of the cross-entropy error.
Then back-propagation is used to train the perturbed deep private auto-encoder.
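The stacking step described above can be sketched as follows, assuming a simple min-max normalization between layers (the paper's exact normalization layer may differ) and random weights standing in for already-trained private auto-encoders.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def normalize(H):
    # Normalization layer: rescale each hidden unit to [0, 1] so the next
    # private auto-encoder sees inputs in the range its polynomial
    # approximation was derived for (min-max is an assumption here).
    lo, hi = H.min(axis=0), H.max(axis=0)
    return (H - lo) / (hi - lo + 1e-12)

# Stand-ins for the weights of two trained private auto-encoders and a
# logistic (binomial softmax) output layer.
X = rng.random((16, 10))
W1 = 0.1 * rng.standard_normal((10, 6))
W2 = 0.1 * rng.standard_normal((6, 4))
w_out = 0.1 * rng.standard_normal(4)

H1 = normalize(sigmoid(X @ W1))   # private auto-encoder 1 + normalization
H2 = normalize(sigmoid(H1 @ W2))  # private auto-encoder 2 + normalization
y_hat = sigmoid(H2 @ w_out)       # binomial prediction per record
```

Since the noise was injected into each layer's training objective, this forward pass (and the back-propagation fine-tuning on the perturbed cross-entropy) consumes no extra privacy budget.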
By applying Taylor expansion, we can rewrite the data reconstruction function as this formula, based on four functions g1, g2, f1, and f2.
We can apply the Functional mechanism to inject Laplace noise into the approximation function.
Delta is the global sensitivity of R_hat over the database D.
The approximation error bounds are given in this analysis.
The absolute difference between the optima of the two functions is always bounded by the product of a small constant and the number of input units d.
The error is completely independent of the data size.
This guarantees that our approximation of the auto-encoder can be applied to large datasets.
We have a similar result with the cross-entropy error function.
We apply our deep private auto-encoder to predict human behavior in our health social network.
I will talk about overweight/obesity and introduce our YesiWell health social network.
the motivation of our SMASH project.
Then, I will introduce two interesting research topics.
Human behavior prediction with our novel Social Restricted Boltzmann Machine (SRBM).
Then, I will present our privacy-preserving layer to protect the privacy of human subjects during the data mining process.
Finally, conclusions and our future work.
Our project, Semantic Mining of Activity, Social, and Health Data has been funded by NIH NIGMS since 2013 as an R01 grant.
We collect multi-dimensional data from biomarkers, social activities, and physical activities.
We then design a Health Ontology to represent the data.
We apply different techniques such as data mining, intervention approaches, and privacy preserving models to extract knowledge from our data and health ontology.
Human behavior prediction is a general problem in social network analysis.
Given the social network at a specific timestamp, each node is a user, there is an edge between 2 nodes if they are socially connected. For instance, friend connections.
Different colors are used to denote different behaviors of users.
In YesiWell social network, the blue nodes indicate that the users increase their exercise compared with the previous timestamp, the orange nodes indicate that the users decrease their exercise compared with the previous timestamp.
Given the social network in M timestamps, we would like to predict the status of all the users in the next timestamp.
We have 254 users.
We consider 30 individual features in total including physical activities, social communications, and biomarkers.
In this study, we try to predict whether a YesiWell user will increase or decrease exercises in the next week compared with the current week.
To model human behaviors, we concatenate individual features, individual features in the past, and social correlations as input of a deep private auto-encoder.
The output is a binomial variable: “active” or “inactive” in doing exercise.
We compare our model with various competitive models, including non-enforcing differential privacy model and enforcing differential privacy model.
CRBM, SctRBM, deep auto-encoder, Truncated Deep Auto-Encoder, Functional Mechanism (FM), DPME, Filter-Priority (FP).
This Figure shows the prediction accuracy of each algorithm as a function of the dataset cardinality.
There is a gap between the prediction accuracy of dPAH and that of dA and the truncated version, TdA.
But the gap gets smaller rapidly with the increase of dataset cardinality.
With very small sampling rate, the performance of the dPAH model is slightly lower than the SctRBM, which is a non-privacy-enforcing deep learning model.
However, the dPAH is significantly better than the SctRBM when the sampling rate goes just a bit higher, i.e., > 0.3.
This is a significant result, since 0.3 is a small sampling rate.
The other Figure illustrates the privacy budget epsilon and the prediction accuracy.
Our model is relatively robust against the change of epsilon.
The dPAH model is competitive even with privacy non-enforcing models.
The three most significant lessons learned are: “Online Social Network has a significant impact on physical activity,” “Social communication can propagate physical activities,” and “Good relationships make us healthier.”
We have proposed a novel Social Restricted Boltzmann Machine for human behavior prediction.
The SRBM outperforms the state-of-the-art models, and we are able to predict human behaviors, e.g., physical exercise level, up to 88.7 percent.
We have developed the first differential privacy preserving algorithm in deep learning.
In particular, deep auto-encoders. We also proved that the algorithm can work with other models, such as RBMs and deep belief networks.