Differential Privacy
And Machine Learning
Aaron Roth
March 24, 2017
Protecting Privacy is Important
Class action lawsuit accuses AOL of violating the Electronic Communications
Privacy Act, seeks $5,000 in damages per user. AOL’s director of research is fired.
Protecting Privacy is Important
Class action lawsuit (Doe v. Netflix) accuses Netflix of violating the Video Privacy
Protection Act, seeks $2,000 in compensation for each of Netflix’s 2,000,000
subscribers. Settled for undisclosed sum, 2nd Netflix Challenge is cancelled.
Protecting Privacy is Important
The National Human Genome Research Institute (NHGRI) immediately restricted
pooled genomic data that had previously been publicly available.
So What is Differential Privacy?
• Differential Privacy is about promising people freedom from
harm.
“An analysis of a dataset D is differentially private if the data analyst
knows almost no more about Alice after the analysis than he would
have known had he conducted the same analysis on an identical data
set with Alice’s data removed.”
Differential Privacy
[Dwork-McSherry-Nissim-Smith 06]
[Figure: dataset 𝐷 (rows: Alice, Bob, Chris, Donna, Ernie, Xavier) → Algorithm → Pr[r]; ratio bounded]
Differential Privacy
𝑋: The data universe.
𝐷 ⊂ 𝑋: The dataset (one element per person)
Definition: Two datasets 𝐷, 𝐷′ ⊂ 𝑋 are neighbors if they differ in the data of a single individual.
Differential Privacy
𝑋: The data universe.
𝐷 ⊂ 𝑋: The dataset (one element per person)
Definition: An algorithm 𝑀 is 𝜖-differentially private if for all pairs of neighboring datasets 𝐷, 𝐷′, and for all outputs 𝑥:
Pr[𝑀(𝐷) = 𝑥] ≤ (1 + 𝜖) ⋅ Pr[𝑀(𝐷′) = 𝑥]
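To make the ratio condition concrete, here is a minimal sketch using randomized response on a single bit. Randomized response is not covered in these slides; it is used here only because its output probabilities can be written down exactly. The function names and the choice p = (1 + 𝜖)/(2 + 𝜖) are illustrative assumptions, and the check uses the (1 + 𝜖) form of the bound stated above.

```python
def randomized_response_probs(bit, p_truth):
    """Return {output: probability} when `bit` is reported truthfully w.p. p_truth."""
    return {bit: p_truth, 1 - bit: 1.0 - p_truth}


def max_probability_ratio(p_truth):
    """Largest Pr[M(D) = x] / Pr[M(D') = x] over outputs x and both neighbor orderings.

    Assumption: the dataset is one person's bit, so its only neighbor is the flipped bit.
    """
    worst = 0.0
    for bit in (0, 1):
        probs_d = randomized_response_probs(bit, p_truth)
        probs_d_prime = randomized_response_probs(1 - bit, p_truth)
        for x in (0, 1):
            worst = max(worst, probs_d[x] / probs_d_prime[x])
    return worst


def truth_probability_for(epsilon):
    """Largest p_truth whose ratio p / (1 - p) stays within 1 + epsilon."""
    return (1.0 + epsilon) / (2.0 + epsilon)


epsilon = 0.1  # illustrative value
p = truth_probability_for(epsilon)
ratio = max_probability_ratio(p)
print(f"p_truth={p:.4f}, worst-case ratio={ratio:.4f}, bound={1 + epsilon:.4f}")
```

With this choice of p the worst-case ratio sits exactly at the bound; any smaller p (closer to 1/2) satisfies the definition with room to spare.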
Some Useful Properties
Theorem (Postprocessing): If 𝑀(𝐷) is 𝜖-private, and 𝑓 is any (randomized) function, then 𝑓(𝑀(𝐷)) is 𝜖-private.
So…
𝑥 = [image]
Definition: An algorithm 𝑀 is 𝜖-differentially private if for all pairs of neighboring datasets 𝐷, 𝐷′, and for all outputs 𝑥:
Pr[𝑀(𝐷) = 𝑥] ≤ (1 + 𝜖) ⋅ Pr[𝑀(𝐷′) = 𝑥]
Some Useful Properties
Theorem (Composition): If 𝑀_1, …, 𝑀_𝑘 are 𝜖-private, then:
𝑀(𝐷) ≡ (𝑀_1(𝐷), …, 𝑀_𝑘(𝐷))
is 𝑘𝜖-private.
So…
You can go about designing algorithms as you normally would.
Just access the data using differentially private “subroutines”,
and keep track of your “privacy budget” as a resource.
Private algorithm design, like regular algorithm design, can be
modular.
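One way to picture the "privacy budget as a resource" idea is a small accountant object that every private subroutine call must charge. This is a hypothetical sketch: the class name `PrivacyBudget` and its interface are assumptions, and it applies only the basic composition rule above (the 𝜖 values of successive calls add up).

```python
class PrivacyBudget:
    """Tracks total epsilon spent under basic composition (epsilons add up)."""

    def __init__(self, total_epsilon):
        self.total_epsilon = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon):
        """Charge one epsilon-private call; refuse if it would exceed the budget."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        # Small tolerance so floating-point round-off does not reject an exact fit.
        if self.spent + epsilon > self.total_epsilon + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon
        return self.total_epsilon - self.spent


budget = PrivacyBudget(total_epsilon=1.0)
for _ in range(4):
    remaining = budget.spend(0.25)  # four 0.25-private calls compose to 1.0
print(budget.spent, remaining)
```

An algorithm built from private subroutines would call `spend` before each data access, so the overall guarantee can be read off `budget.spent` at the end.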
Some simple operations:
Answering Numeric Queries
Def: A numeric function 𝑓 has sensitivity 𝑐 if for all neighboring 𝐷, 𝐷′:
|𝑓(𝐷) − 𝑓(𝐷′)| ≤ 𝑐
Write 𝑠(𝑓) ≡ 𝑐
• e.g. “How many software engineers are in the room?” has sensitivity 1.
• “What fraction of people in the room are software engineers?” has sensitivity 1/𝑛.
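The two sensitivity claims can be checked by brute force on a toy room. The sketch below assumes "neighboring" means replacing one person's record (so 𝑛 stays fixed); the helper names and the five-person room are made up for illustration. Note that it only explores the neighbors of one particular dataset, so in general it gives a lower bound on the true sensitivity.

```python
import itertools


def count_engineers(room):
    return sum(1 for person in room if person == "engineer")


def fraction_engineers(room):
    return count_engineers(room) / len(room)


def empirical_sensitivity(f, room, universe):
    """Max |f(D) - f(D')| over all D' that replace one person's record in `room`."""
    worst = 0.0
    for i, replacement in itertools.product(range(len(room)), universe):
        neighbor = room[:i] + [replacement] + room[i + 1:]
        worst = max(worst, abs(f(room) - f(neighbor)))
    return worst


universe = ["engineer", "other"]
room = ["engineer", "other", "other", "engineer", "other"]
n = len(room)

count_sensitivity = empirical_sensitivity(count_engineers, room, universe)
fraction_sensitivity = empirical_sensitivity(fraction_engineers, room, universe)
print(count_sensitivity, fraction_sensitivity, 1 / n)
```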
Some simple operations:
Answering Numeric Queries
The Laplace Mechanism:
𝑀_Lap(𝐷, 𝑓, 𝜖) = 𝑓(𝐷) + Lap(𝑠(𝑓)/𝜖)
Theorem: 𝑀_Lap(⋅, 𝑓, 𝜖) is 𝜖-private.
Theorem: The expected error is 𝑠(𝑓)/𝜖
(can answer “what fraction of people in the room are software engineers?” with error 0.2%)
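A minimal NumPy sketch of the Laplace mechanism, together with an empirical check that the mean absolute error is close to 𝑠(𝑓)/𝜖. The room of 100 people, the value 𝜖 = 0.5, and the function names are assumptions chosen for illustration; they are not the numbers behind the 0.2% figure.

```python
import numpy as np


def laplace_mechanism(data, f, sensitivity, epsilon, rng):
    """Return f(data) plus Laplace noise with scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    return f(data) + rng.laplace(loc=0.0, scale=scale)


def fraction_engineers(room):
    """Fraction of 1-entries, where 1 marks a software engineer."""
    return float(np.mean(room))


rng = np.random.default_rng(0)
room = np.array([1] * 30 + [0] * 70)   # hypothetical room: 30 engineers out of 100
n = len(room)
epsilon = 0.5                          # illustrative privacy parameter
sensitivity = 1.0 / n                  # sensitivity of a fraction query

answers = np.array([
    laplace_mechanism(room, fraction_engineers, sensitivity, epsilon, rng)
    for _ in range(20000)
])
mean_abs_error = float(np.mean(np.abs(answers - fraction_engineers(room))))
expected_error = sensitivity / epsilon
print(mean_abs_error, expected_error)
```

Because the error scales as 1/(𝑛𝜖), larger rooms allow the same accuracy at a smaller 𝜖.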
Some simple operations:
Answering Non-numeric Queries
“What is the modal eye color in the room?”
𝑅 = {Blue, Green, Brown, Red}
• If you can define a function that determines how “good” each
outcome is for a fixed input:
– E.g. 𝑞(𝐷, Red) = “fraction of people in 𝐷 with red eyes”
Some simple operations:
Answering Non-numeric Queries
𝑀_Exp(𝐷, 𝑅, 𝑞, 𝜖):
Output 𝑟 ∈ 𝑅 w.p. ∝ 𝑒^{2𝜖⋅𝑞(𝐷,𝑟)}
Theorem: 𝑀_Exp(𝐷, 𝑅, 𝑞, 𝜖) is 𝑠(𝑞)⋅𝜖-private, and outputs 𝑟 ∈ 𝑅 such that:
𝐸[|𝑞(𝐷, 𝑟) − max_{𝑟*∈𝑅} 𝑞(𝐷, 𝑟*)|] ≤ (2𝑠(𝑞)/𝜖) ⋅ ln|𝑅|
(can find a color that has frequency within 0.5% of the modal color in the room)
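The eye-color example can be sketched in a few lines of NumPy. As an assumption, this sketch uses the common textbook scaling, sampling 𝑟 with probability proportional to exp(𝜖⋅𝑞(𝐷,𝑟) / (2⋅𝑠(𝑞))), which is 𝜖-private; the constant in the exponent therefore differs from the one written on the slide. The room composition and all names are made up for illustration.

```python
import numpy as np


def exponential_mechanism(data, outcomes, quality, sensitivity, epsilon, rng):
    """Sample an outcome w.p. proportional to exp(epsilon * q / (2 * sensitivity))."""
    scores = np.array([quality(data, r) for r in outcomes], dtype=float)
    # Subtract the max score before exponentiating for numerical stability;
    # this leaves the normalized probabilities unchanged.
    logits = epsilon * (scores - scores.max()) / (2.0 * sensitivity)
    probs = np.exp(logits)
    probs /= probs.sum()
    choice = rng.choice(len(outcomes), p=probs)
    return outcomes[choice], probs


def eye_color_fraction(room, color):
    """Quality score: fraction of people in the room with the given eye color."""
    return sum(1 for c in room if c == color) / len(room)


rng = np.random.default_rng(0)
colors = ["Blue", "Green", "Brown", "Red"]
room = ["Brown"] * 55 + ["Blue"] * 30 + ["Green"] * 14 + ["Red"] * 1  # hypothetical

picked, probs = exponential_mechanism(
    room, colors, eye_color_fraction,
    sensitivity=1.0 / len(room), epsilon=1.0, rng=rng,
)
print(picked, dict(zip(colors, probs.round(6))))
```

Outcomes with higher quality scores are exponentially more likely, so the mode dominates once 𝑛𝜖 is large relative to the gaps between frequencies.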
So what can we do with that?
Empirical Risk Minimization:
*i.e. almost all of supervised learning
Find 𝜃 to minimize:
𝐿(𝜃) = ∑_{𝑖=1}^{𝑛} ℓ(𝜃, 𝑥_𝑖, 𝑦_𝑖)
So what can we do with that?
Empirical Risk Minimization:
Simple Special Case: Linear Regression
Find 𝜃 ∈ ℝ^𝑑 to minimize:
∑_{𝑖=1}^{𝑛} (⟨𝜃, 𝑥_𝑖⟩ − 𝑦_𝑖)^2
So what can we do with that?
• First Attempt: Try the exponential mechanism!
• Define 𝑞(𝜃, 𝐷) = −(𝑋𝜃 − 𝑦)^𝑇 (𝑋𝜃 − 𝑦)
• Output 𝜃 ∈ ℝ^𝑑 w.p. ∝ 𝑒^{2𝜖𝑞(𝜃,𝐷)}
– How?
• In this case, reduces to:
Output 𝜃* + 𝑁(0, (𝑋^𝑇 𝑋)^{−1} / (2𝜖))
Where 𝜃* = arg min ∑_{𝑖=1}^{𝑛} (⟨𝜃, 𝑥_𝑖⟩ − 𝑦_𝑖)^2
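The reduction to Gaussian noise comes from completing the square: a density proportional to exp(−c⋅‖𝑋𝜃 − 𝑦‖²) is a Gaussian centered at the least-squares solution 𝜃* with covariance (𝑋^𝑇𝑋)^{−1}/(2c). The sketch below samples from that Gaussian. Here `c` is an assumed stand-in for whatever constant multiplies 𝑞 in the exponent, the synthetic data is made up, and the code makes no privacy claim on its own, since that additionally depends on the sensitivity of 𝑞.

```python
import numpy as np


def sample_exp_mechanism_regression(X, y, c, rng):
    """Sample theta with density proportional to exp(-c * ||X theta - y||^2).

    This density is Gaussian with mean theta_star (the least-squares solution)
    and covariance (X^T X)^{-1} / (2 c).
    """
    theta_star, *_ = np.linalg.lstsq(X, y, rcond=None)
    covariance = np.linalg.inv(X.T @ X) / (2.0 * c)
    theta = rng.multivariate_normal(theta_star, covariance)
    return theta, theta_star, covariance


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                    # synthetic design matrix
true_theta = np.array([1.0, -2.0, 0.5])          # hypothetical ground truth
y = X @ true_theta + 0.1 * rng.normal(size=200)

theta_sample, theta_star, covariance = sample_exp_mechanism_regression(X, y, c=1.0, rng=rng)
print(theta_star, theta_sample)
```

This answers the "How?" on the slide for the regression case: no generic sampler over ℝ^𝑑 is needed, only a least-squares solve and one Gaussian draw.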
So what can we do with that?
Empirical Risk Minimization:
The General Case (e.g. deep learning)
Find 𝜃 to minimize:
𝐿(𝜃) = ∑_{𝑖=1}^{𝑛} ℓ(𝜃, 𝑥_𝑖, 𝑦_𝑖)
Stochastic Gradient Descent
Convergence depends on the fact that at each round: 𝔼[𝑔_𝑡] = ∇𝐿(𝜃)
Let 𝜃_1 = 0^𝑑
For 𝑡 = 1 to 𝑇:
    Pick 𝑖 at random. Let 𝑔_𝑡 ← ∇ℓ(𝜃_𝑡, 𝑥_𝑖, 𝑦_𝑖)
    Let 𝜃_{𝑡+1} ← 𝜃_𝑡 − 𝜂 ⋅ 𝑔_𝑡
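The unbiasedness property can be checked numerically for the squared loss. One detail worth making explicit: when 𝐿 is written as a sum over 𝑛 examples and 𝑖 is drawn uniformly, the expected stochastic gradient equals ∇𝐿(𝜃)/𝑛, which points in the same direction as ∇𝐿(𝜃); the factor 1/𝑛 can be absorbed into the step size 𝜂. The data and function names below are illustrative assumptions.

```python
import numpy as np


def per_example_gradient(theta, x_i, y_i):
    """Gradient of the squared loss (<theta, x_i> - y_i)^2 with respect to theta."""
    return 2.0 * (x_i @ theta - y_i) * x_i


def full_gradient(theta, X, y):
    """Gradient of the summed loss L(theta) = sum_i (<theta, x_i> - y_i)^2."""
    return 2.0 * X.T @ (X @ theta - y)


rng = np.random.default_rng(0)
n, d = 50, 4
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
theta = rng.normal(size=d)

# The expectation over a uniformly random index i is the plain average over i.
expected_g = np.mean(
    [per_example_gradient(theta, X[i], y[i]) for i in range(n)], axis=0
)
print(np.allclose(n * expected_g, full_gradient(theta, X, y)))
```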
Private Stochastic Gradient Descent
Still have: 𝔼[𝑔_𝑡] = ∇𝐿(𝜃)!
(Can still prove convergence theorems, and run the algorithm…)
Privacy guarantees can be computed from:
1) The privacy of the Laplace mechanism
2) Preservation of privacy under post-processing, and
3) Composition of privacy guarantees.
Let 𝜃_1 = 0^𝑑
For 𝑡 = 1 to 𝑇:
    Pick 𝑖 at random. Let 𝑔_𝑡 ← ∇ℓ(𝜃_𝑡, 𝑥_𝑖, 𝑦_𝑖) + Lap(𝜎)^𝑑
    Let 𝜃_{𝑡+1} ← 𝜃_𝑡 − 𝜂 ⋅ 𝑔_𝑡
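The loop above translates almost line for line into NumPy. The sketch below runs it on a synthetic least-squares problem and treats the noise scale 𝜎 as a free parameter. That is an assumption: calibrating 𝜎 to a target 𝜖 requires a bound on how much one person's data can change a gradient (for example by clipping), which the slide does not specify, so this code illustrates the mechanics and the zero-mean noise, not a calibrated privacy guarantee. All names and hyperparameters are illustrative.

```python
import numpy as np


def private_sgd(X, y, num_steps, learning_rate, noise_scale, rng):
    """SGD on squared loss with Laplace(noise_scale) noise on every gradient coordinate."""
    n, d = X.shape
    theta = np.zeros(d)                              # theta_1 = 0^d
    for _ in range(num_steps):
        i = rng.integers(n)                          # pick i uniformly at random
        gradient = 2.0 * (X[i] @ theta - y[i]) * X[i]
        # Zero-mean noise keeps the gradient estimate unbiased.
        noisy_gradient = gradient + rng.laplace(loc=0.0, scale=noise_scale, size=d)
        theta = theta - learning_rate * noisy_gradient
    return theta


rng = np.random.default_rng(0)
n, d = 1000, 3
X = rng.normal(size=(n, d))                          # synthetic features
true_theta = np.array([1.0, -1.0, 0.5])              # hypothetical ground truth
y = X @ true_theta

theta_noisy = private_sgd(
    X, y, num_steps=5000, learning_rate=0.01, noise_scale=0.5, rng=rng
)
print(theta_noisy)
```

Because the added noise has mean zero, the iterates still drift toward the minimizer; the noise shows up as extra variance around it, which shrinks with a smaller step size.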
What else can we do?
• Statistical Estimation
• Graph Analysis
• Combinatorial Optimization
• Spectral Analysis of Matrices
• Anomaly Detection/Analysis of Data Streams
• Convex Optimization
• Equilibrium computation
• Computation of optimal 1-sided and 2-sided matchings
• Pareto Optimal Exchanges
• …
Differential Privacy ⇒ Learning
Theorem*: An 𝜖-differentially private algorithm cannot overfit its
training set by more than 𝜖.
*Lots of interesting details missing! See our paper in Science.
Thresholdout [DFHPRR15]
thresholdout.py:
import numpy as np

def Thresholdout(sample, holdout, q, sigma, threshold):
    # Empirical mean of the query q on the training sample and on the holdout.
    sample_mean = np.mean([q(x) for x in sample])
    holdout_mean = np.mean([q(x) for x in holdout])
    # Compare the gap between the two means against a noisy threshold.
    if abs(sample_mean - holdout_mean) < np.random.normal(threshold, sigma):
        # q does not overfit
        return sample_mean
    else:
        # q overfits
        return holdout_mean + np.random.normal(0, sigma)
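A short usage sketch for Thresholdout, with the function repeated so the block runs on its own. The synthetic data, the query (product of one feature with the label), and the values of `sigma` and `threshold` are all assumptions picked for illustration.

```python
import numpy as np


def Thresholdout(sample, holdout, q, sigma, threshold):
    sample_mean = np.mean([q(x) for x in sample])
    holdout_mean = np.mean([q(x) for x in holdout])
    if abs(sample_mean - holdout_mean) < np.random.normal(threshold, sigma):
        return sample_mean                      # q does not overfit
    return holdout_mean + np.random.normal(0, sigma)   # q overfits


rng = np.random.default_rng(0)
n, d = 500, 5


def make_rows(num_rows):
    """Rows of d Gaussian features followed by a random label in {-1, 1}."""
    features = rng.normal(size=(num_rows, d))
    labels = rng.choice([-1.0, 1.0], size=(num_rows, 1))
    return np.hstack([features, labels])


def correlation_query(row):
    """Product of feature 0 and the label (last column)."""
    return row[0] * row[-1]


sample, holdout = make_rows(n), make_rows(n)

np.random.seed(0)  # Thresholdout draws its noise from NumPy's global generator
estimate = Thresholdout(sample, holdout, correlation_query, sigma=0.01, threshold=0.04)

# When sample and holdout agree exactly, the gap is 0 and the sample mean is returned.
same = Thresholdout(sample, sample, correlation_query, sigma=1e-6, threshold=0.04)
print(estimate, same)
```

The analyst only ever sees the returned value, never the raw holdout mean, which is what lets the holdout be reused across many adaptively chosen queries.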
Reusable holdout example
• Data set with 2n = 20,000 rows and d = 10,000
variables. Class labels in {-1,1}
• Analyst performs stepwise variable selection:
1. Split data into training/holdout of size n
2. Select “best” k variables on training data
3. Only use variables also good on holdout
4. Build linear predictor out of k variables
5. Find best k = 10,20,30,…
Classification after feature selection
No signal: data are random Gaussians;
labels are drawn independently at random from {-1,1}
Thresholdout correctly detects overfitting!
Classification after feature selection
Strong signal: 20 features are mildly correlated with the target;
remaining attributes are uncorrelated
Thresholdout correctly detects the right model size!
So…
• Differential privacy provides:
– A rigorous, provable guarantee with strong privacy semantics.
– A set of tools and composition theorems that allow for modular, easy
design of privacy-preserving algorithms.
– Protection against overfitting even when privacy is not a concern.
Thanks!
To learn more:
• Our textbook on differential privacy:
– Available for free on my website: http://www.cis.upenn.edu/~aaroth
• Connections between Privacy and Overfitting:
– Dwork, Feldman, Hardt, Pitassi, Reingold, Roth, “The Reusable Holdout: Preserving Validity in Adaptive Data Analysis”,
Science, August 7 2015.
– Dwork, Feldman, Hardt, Pitassi, Reingold, Roth, “Preserving Statistical Validity in Adaptive Data Analysis”, STOC 2015.
– Bassily, Nissim, Stemmer, Smith, Steinke, Ullman, “Algorithmic Stability for Adaptive Data Analysis”, STOC 2016.
– Rogers, Roth, Smith, Thakkar, “Max Information, Differential Privacy, and Post-Selection Hypothesis Testing”, FOCS 2016.
– Cummings, Ligett, Nissim, Roth, Wu, “Adaptive Learning with Robust Generalization Guarantees”, COLT 2016.

Aaron Roth, Associate Professor, University of Pennsylvania, at MLconf NYC 2017

 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldMLconf
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...MLconf
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...MLconf
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...MLconf
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeMLconf
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...MLconf
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareMLconf
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesMLconf
 

Más de MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Último

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

Aaron Roth, Associate Professor, University of Pennsylvania, at MLconf NYC 2017

  • 1. Differential Privacy And Machine Learning Aaron Roth March 24, 2017
  • 2. Protecting Privacy is Important Class action lawsuit accuses AOL of violating the Electronic Communications Privacy Act, seeks $5,000 in damages per user. AOL’s director of research is fired.
  • 3. Protecting Privacy is Important Class action lawsuit (Doe v. Netflix) accuses Netflix of violating the Video Privacy Protection Act, seeks $2,000 in compensation for each of Netflix’s 2,000,000 subscribers. Settled for undisclosed sum, 2nd Netflix Challenge is cancelled.
  • 4. Protecting Privacy is Important The National Human Genome Research Institute (NHGRI) immediately restricted pooled genomic data that had previously been publicly available.
  • 5. So What is Differential Privacy?
  • 6. So What is Differential Privacy? • Differential Privacy is about promising people freedom from harm. “An analysis of a dataset D is differentially private if the data analyst knows almost no more about Alice after the analysis than he would have known had he conducted the same analysis on an identical data set with Alice’s data removed.”
  • 7. Differential Privacy [Dwork-McSherry-Nissim-Smith 06] [Figure: dataset D (rows: Alice, Bob, Chris, Donna, Ernie, Xavier) → Algorithm → output distribution Pr[r]; ratio bounded.]
  • 8. Differential Privacy 𝑋: The data universe. 𝐷 ⊂ 𝑋: The dataset (one element per person) Definition: Two datasets 𝐷, 𝐷′ ⊂ 𝑋 are neighbors if they differ in the data of a single individual.
  • 9. Differential Privacy $X$: The data universe. $D \subset X$: The dataset (one element per person) Definition: An algorithm $M$ is $\epsilon$-differentially private if for all pairs of neighboring datasets $D, D'$, and for all outputs $x$: $\Pr[M(D) = x] \le (1 + \epsilon)\,\Pr[M(D') = x]$
  • 10. Some Useful Properties Theorem (Postprocessing): If $M(D)$ is $\epsilon$-private, and $f$ is any (randomized) function, then $f(M(D))$ is $\epsilon$-private.
  • 11. So… $x =$ Definition: An algorithm $M$ is $\epsilon$-differentially private if for all pairs of neighboring datasets $D, D'$, and for all outputs $x$: $\Pr[M(D) = x] \le (1 + \epsilon)\,\Pr[M(D') = x]$
  • 12. So… $x =$ Definition: An algorithm $M$ is $\epsilon$-differentially private if for all pairs of neighboring datasets $D, D'$, and for all outputs $x$: $\Pr[M(D) = x] \le (1 + \epsilon)\,\Pr[M(D') = x]$
  • 13. So… $x =$ Definition: An algorithm $M$ is $\epsilon$-differentially private if for all pairs of neighboring datasets $D, D'$, and for all outputs $x$: $\Pr[M(D) = x] \le (1 + \epsilon)\,\Pr[M(D') = x]$
  • 14. Some Useful Properties Theorem (Composition): If $M_1, \ldots, M_k$ are $\epsilon$-private, then: $M(D) \equiv (M_1(D), \ldots, M_k(D))$ is $k\epsilon$-private.
  • 15. So… You can go about designing algorithms as you normally would. Just access the data using differentially private “subroutines”, and keep track of your “privacy budget” as a resource. Private algorithm design, like regular algorithm design, can be modular.
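The "privacy budget" bookkeeping on this slide can be sketched in a few lines. This is only an illustration, assuming basic composition (the losses of successive private subroutines simply add up, as in the composition theorem on slide 14). The `PrivacyBudget` class and its method names are hypothetical and do not come from the talk.

```python
class PrivacyBudget:
    """Hypothetical tracker: total privacy loss under basic composition."""

    def __init__(self, total_epsilon):
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.total_epsilon = total_epsilon
        self.spent = 0.0

    @property
    def remaining(self):
        return self.total_epsilon - self.spent

    def charge(self, epsilon):
        """Record one call to an epsilon-private subroutine."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        # Small slack guards against floating-point round-off.
        if self.spent + epsilon > self.total_epsilon + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon


budget = PrivacyBudget(total_epsilon=1.0)
budget.charge(0.25)  # e.g. one numeric query
budget.charge(0.25)  # e.g. one non-numeric query
print(budget.remaining)
```

An analysis would call `charge` before each access to the data and stop (or fail loudly) once the budget is used up.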
  • 16. Some simple operations: Answering Numeric Queries Def: A numeric function $f$ has sensitivity $c$ if for all neighboring $D, D'$: $|f(D) - f(D')| \le c$. Write $s(f) \equiv c$. • e.g. “How many software engineers are in the room?” has sensitivity 1. • “What fraction of people in the room are software engineers?” has sensitivity $1/n$.
  • 17. Some simple operations: Answering Numeric Queries The Laplace Mechanism: $M_{Lap}(D, f, \epsilon) = f(D) + \mathrm{Lap}\left(\frac{s(f)}{\epsilon}\right)$ Theorem: $M_{Lap}(\cdot, f, \epsilon)$ is $\epsilon$-private.
  • 18. Some simple operations: Answering Numeric Queries The Laplace Mechanism: $M_{Lap}(D, f, \epsilon) = f(D) + \mathrm{Lap}\left(\frac{s(f)}{\epsilon}\right)$ Theorem: The expected error is $\frac{s(f)}{\epsilon}$ (can answer “what fraction of people in the room are software engineers?” with error 0.2%)
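A minimal sketch of the Laplace mechanism as stated above, using NumPy. The function name `laplace_mechanism`, its argument order, and the toy room data are assumptions made for illustration. The caller supplies the sensitivity $s(f)$; the code does not verify it.

```python
import numpy as np


def laplace_mechanism(data, f, sensitivity, epsilon, rng):
    """Return f(data) plus Laplace noise of scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    return f(data) + rng.laplace(loc=0.0, scale=scale)


rng = np.random.default_rng(0)

# Toy data (assumed): 1 marks a software engineer, 0 anyone else.
room = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]

# Counting query: one person changes the count by at most 1.
noisy_count = laplace_mechanism(room, sum, sensitivity=1.0, epsilon=0.5, rng=rng)

# Fraction query: one person changes the fraction by at most 1/n.
n = len(room)
noisy_fraction = laplace_mechanism(
    room, lambda d: sum(d) / len(d), sensitivity=1.0 / n, epsilon=0.5, rng=rng
)
print(noisy_count, noisy_fraction)
```

The expected absolute value of a Laplace variable with scale $b$ is $b$, which is where the expected error $s(f)/\epsilon$ on the slide comes from.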
  • 19. Some simple operations: Answering Non-numeric Queries “What is the modal eye color in the room?” $R = \{\text{Blue}, \text{Green}, \text{Brown}, \text{Red}\}$ • If you can define a function that determines how “good” each outcome is for a fixed input: – E.g. $q(D, \text{Red})$ = “fraction of people in $D$ with red eyes”
  • 20. Some simple operations: Answering Non-numeric Queries $M_{Exp}(D, R, q, \epsilon)$: Output $r \in R$ w.p. $\propto e^{2\epsilon \cdot q(D, r)}$ Theorem: $M_{Exp}(D, R, q, \epsilon)$ is $s(q) \cdot \epsilon$-private, and outputs $r \in R$ such that: $E\left[\,\left|q(D, r) - \max_{r^* \in R} q(D, r^*)\right|\,\right] \le \frac{2 s(q)}{\epsilon} \cdot \ln |R|$ (can find a color that has frequency within 0.5% of the modal color in the room)
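The eye-color example can be sketched as below. Note the assumption: this sketch uses the common textbook normalization of the exponential mechanism, with weights proportional to $\exp(\epsilon \cdot q(D, r) / (2 s(q)))$, whose constants differ from the expression on the slide. The names `exponential_mechanism` and `eye_color_fraction`, and the toy room, are hypothetical.

```python
import numpy as np


def exponential_mechanism(data, outcomes, q, sensitivity, epsilon, rng):
    """Sample an outcome with probability proportional to
    exp(epsilon * q(data, r) / (2 * sensitivity))."""
    scores = np.array([q(data, r) for r in outcomes], dtype=float)
    logits = epsilon * scores / (2.0 * sensitivity)
    logits -= logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return outcomes[rng.choice(len(outcomes), p=probs)]


def eye_color_fraction(data, color):
    """Quality score: fraction of people in the data with this eye color."""
    return sum(1 for c in data if c == color) / len(data)


rng = np.random.default_rng(1)
room = ["Brown"] * 700 + ["Blue"] * 200 + ["Green"] * 100  # toy data (assumed)
colors = ["Blue", "Green", "Brown", "Red"]

# Changing one person's data moves each fraction by at most 1/n.
answer = exponential_mechanism(
    room, colors, eye_color_fraction, sensitivity=1.0 / len(room), epsilon=1.0, rng=rng
)
print(answer)
```

With a large room the gap between the modal color and the rest dominates the noise, so the mode is returned with overwhelming probability; as $\epsilon$ shrinks, the output approaches a uniform draw from $R$.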
  • 21. So what can we do with that? Empirical Risk Minimization* (*i.e. almost all of supervised learning): Find $\theta$ to minimize: $L(\theta) = \sum_{i=1}^{n} \ell(\theta, x_i, y_i)$
  • 22. So what can we do with that? Empirical Risk Minimization: Simple Special Case: Linear Regression Find $\theta \in \mathbb{R}^d$ to minimize: $\sum_{i=1}^{n} (\langle \theta, x_i \rangle - y_i)^2$
  • 23. So what can we do with that? • First Attempt: Try the exponential mechanism! • Define $q(\theta, D) = -(X\theta - y)^T (X\theta - y)$ • Output $\theta \in \mathbb{R}^d$ w.p. $\propto e^{2\epsilon q(\theta, D)}$ – How? • In this case, reduces to: Output $\theta^* + N\left(0, \frac{(X^T X)^{-1}}{2\epsilon}\right)$ where $\theta^* = \arg\min_\theta \sum_{i=1}^{n} (\langle \theta, x_i \rangle - y_i)^2$
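The closed form on this slide (least-squares solution plus Gaussian noise shaped by $(X^T X)^{-1}$) might be sampled as in the sketch below. The function name `perturbed_least_squares` and the synthetic data are assumptions. The sketch only draws from the stated distribution: it does not establish a privacy guarantee, which would also need a bound on how much one person's data can change $q$.

```python
import numpy as np


def perturbed_least_squares(X, y, epsilon, rng):
    """Sample theta* + N(0, (X^T X)^{-1} / (2 * epsilon))."""
    # theta* is the ordinary least-squares minimizer.
    theta_star, *_ = np.linalg.lstsq(X, y, rcond=None)
    cov = np.linalg.inv(X.T @ X) / (2.0 * epsilon)
    cov = (cov + cov.T) / 2.0  # remove round-off asymmetry
    return rng.multivariate_normal(theta_star, cov)


rng = np.random.default_rng(2)
X = rng.standard_normal((100, 3))  # synthetic features (assumed)
y = X @ np.array([1.0, -2.0, 0.5])  # noiseless synthetic targets
print(perturbed_least_squares(X, y, epsilon=1.0, rng=rng))
```

Larger $\epsilon$ shrinks the covariance, so the output concentrates on $\theta^*$; smaller $\epsilon$ spreads it out.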
  • 24. So what can we do with that? Empirical Risk Minimization: The General Case (e.g. deep learning) Find $\theta$ to minimize: $L(\theta) = \sum_{i=1}^{n} \ell(\theta, x_i, y_i)$
  • 25. Stochastic Gradient Descent Convergence depends on the fact that at each round: $\mathbb{E}[g_t] = \nabla L(\theta)$ Let $\theta_1 = 0^d$ For $t = 1$ to $T$: Pick $i$ at random. Let $g_t \leftarrow \nabla \ell(\theta_t, x_i, y_i)$ Let $\theta_{t+1} \leftarrow \theta_t - \eta \cdot g_t$
  • 26. Private Stochastic Gradient Descent Still have: $\mathbb{E}[g_t] = \nabla L(\theta)$! (Can still prove convergence theorems, and run the algorithm…) Privacy guarantees can be computed from: 1) The privacy of the Laplace mechanism 2) Preservation of privacy under post-processing, and 3) Composition of privacy guarantees. Let $\theta_1 = 0^d$ For $t = 1$ to $T$: Pick $i$ at random. Let $g_t \leftarrow \nabla \ell(\theta_t, x_i, y_i) + \mathrm{Lap}(\sigma)^d$ Let $\theta_{t+1} \leftarrow \theta_t - \eta \cdot g_t$
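The private SGD loop above can be sketched for the squared loss of slide 22. This is a minimal illustration, not the speaker's implementation: the function name `private_sgd`, the synthetic data, and the values of $\eta$, $T$ and $\sigma$ are assumptions. An actual privacy guarantee would additionally require a bound on each per-example gradient (its sensitivity) and a $\sigma$ calibrated to that bound and to the number of rounds; neither is done here.

```python
import numpy as np


def private_sgd(X, y, eta, T, sigma, rng):
    """SGD on squared loss with Laplace(sigma) noise added to each gradient coordinate."""
    n, d = X.shape
    theta = np.zeros(d)  # theta_1 = 0^d
    for _ in range(T):
        i = rng.integers(n)  # pick i at random
        # Gradient of (<theta, x_i> - y_i)^2 with respect to theta.
        grad = 2.0 * (X[i] @ theta - y[i]) * X[i]
        # Zero-mean noise leaves the expected gradient unchanged.
        g = grad + rng.laplace(loc=0.0, scale=sigma, size=d)
        theta = theta - eta * g
    return theta


rng = np.random.default_rng(4)
X = rng.uniform(-1.0, 1.0, size=(200, 2))  # synthetic features (assumed)
y = X @ np.array([1.0, -2.0])  # noiseless synthetic targets
print(private_sgd(X, y, eta=0.05, T=5000, sigma=0.01, rng=rng))
```

Because the added noise has mean zero, the update direction is still an unbiased estimate, which is the property the slide relies on for convergence.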
  • 27. What else can we do? • Statistical Estimation • Graph Analysis • Combinatorial Optimization • Spectral Analysis of Matrices • Anomaly Detection/Analysis of Data Streams • Convex Optimization • Equilibrium computation • Computation of optimal 1-sided and 2-sided matchings • Pareto Optimal Exchanges • …
  • 28. Differential Privacy ⇒ Learning Theorem*: An 𝜖-differentially private algorithm cannot overfit its training set by more than 𝜖. *Lots of interesting details missing! See our paper in Science.
  • 29. thresholdout.py: Thresholdout [DFHPRR15]

      import numpy as np

      def Thresholdout(sample, holdout, q, sigma, threshold):
          # Empirical mean of the query q on the training sample and on the holdout.
          sample_mean = np.mean([q(x) for x in sample])
          holdout_mean = np.mean([q(x) for x in holdout])
          # Compare the gap against a noisy threshold.
          if abs(sample_mean - holdout_mean) < np.random.normal(threshold, sigma):
              # q does not overfit
              return sample_mean
          else:
              # q overfits
              return holdout_mean + np.random.normal(0, sigma)
  • 30. Reusable holdout example • Data set with 2n = 20,000 rows and d = 10,000 variables. Class labels in {-1,1} • Analyst performs stepwise variable selection: 1. Split data into training/holdout of size n 2. Select “best” k variables on training data 3. Only use variables also good on holdout 4. Build linear predictor out of k variables 5. Find best k = 10,20,30,…
  • 31. Classification after feature selection No signal: data are random Gaussians; labels are drawn independently at random from {-1,1}. Thresholdout correctly detects overfitting!
  • 32. Classification after feature selection Strong signal: 20 features are mildly correlated with the target; the remaining attributes are uncorrelated. Thresholdout correctly detects the right model size!
  • 33. So… • Differential privacy provides: – A rigorous, provable guarantee with a strong privacy semantics. – A set of tools and composition theorems that allow for modular, easy design of privacy preserving algorithms. – Protection against overfitting even when privacy is not a concern.
  • 34. Thanks! To learn more: • Our textbook on differential privacy: – Available for free on my website: http://www.cis.upenn.edu/~aaroth • Connections between Privacy and Overfitting: – Dwork, Feldman, Hardt, Pitassi, Reingold, Roth, “The Reusable Holdout: Preserving Validity in Adaptive Data Analysis”, Science, August 7 2015. – Dwork, Feldman, Hardt, Pitassi, Reingold, Roth, “Preserving Statistical Validity in Adaptive Data Analysis”, STOC 2015. – Bassily, Nissim, Stemmer, Smith, Steinke, Ullman, “Algorithmic Stability for Adaptive Data Analysis”, STOC 2016. – Rogers, Roth, Smith, Thakkar, “Max Information, Differential Privacy, and Post-Selection Hypothesis Testing”, FOCS 2016. – Cummings, Ligett, Nissim, Roth, Wu, “Adaptive Learning with Robust Generalization Guarantees”, COLT 2016.