SlideShare una empresa de Scribd logo
1 de 109
Descargar para leer sin conexión
Digging into the Dirichlet
Max Sklar
@maxsklar

New York Machine Learning Meetup
December 19th, 2013
Dedication

Meyer Marks
1925 - 2013
The Dirichlet Distribution
Let’s start with something simpler
Let’s start with something simpler
A Pie Chart!
Let’s start with something simpler
A Pie Chart!
AKA
Discrete Distribution
Multinomial Distribution
Let’s start with something simpler
A Pie Chart!
K = The number of
categories.
K=5
Examples of Multinomial Distributions
Examples of Multinomial Distributions
Examples of Multinomial Distributions
Examples of Multinomial Distributions
Examples of Multinomial Distributions
What does the raw data look like?
What does the raw data look like?
id

# dislikes

1

231

23

2

Counts!

# likes

81

40

3

67

9

4

121

14

5

9

31

6

18

0

7

1

1
What does the raw data look like?
More specifically:
- K columns of counts
- N rows of data

id

# likes

# dislikes

1

231

23

2

81

40

3

67

9

4

121

14

5

9

31

6

18

0

7

1

1
BUT...

Counts != Multinomial Distribution
BUT...
We can estimate the multinomial
distribution with the counts, using
the maximum likelihood estimate

366

181

203
BUT...
We can estimate the multinomial
distribution with the counts, using
the maximum likelihood estimate

Sum =
366 + 181 + 203 =
750

366

181

203
BUT...
We can estimate the multinomial
distribution with the counts, using
the maximum likelihood estimate

366 / 750
181 / 750
203 / 750

366

181

203
BUT...
We can estimate the multinomial
distribution with the counts, using
the maximum likelihood estimate

48.8%
24.1%
27.1%

366

181

203
BUT...
366

203

1

Uh Oh

181
2

1
BUT...
366

This column
will be all
Yellow
right?

181

203

1

2

1

0

1

0
BUT...
366

203

1

Panic!!!!

181
2

1

0

1

0

0

0

0
Bayesian Statistics to the Rescue
Bayesian Statistics to the Rescue
Still assume each row was
generated by a
multinomial distribution
Bayesian Statistics to the Rescue
Still assume each row was
generated by a
multinomial distribution
We just don’t know
which one!
The Dirichlet Distribution
Is a probability
distribution over all
possible multinomial
distributions, p.
The Dirichlet Distribution

?

Represents our
uncertainty over the
actual distribution
that created the row.

?
The Dirichlet Distribution

p: represents a multinomial distribution
alpha: the parameters of the dirichlet
K: the number of categories
Bayesian Updates
Bayesian Updates

Also a Dirichlet!
Bayesian Updates

Also a Dirichlet!
(Conjugate Prior)
Bayesian Updates
Bayesian Updates
Bayesian Updates

+1
Why Does
this Work?
Let’s look at it
again.
Entropy
Entropy
Information Content
Entropy
Information Content
Energy
Entropy
Information Content
Energy
Log Likelihood
Entropy
Information Content
Energy
Log Likelihood
The Dirichlet Distribution
The Dirichlet Distribution

Normalizing Constant
The Dirichlet Distribution

Normalizing Constant
The Dirichlet Distribution
The Dirichlet Distribution
The Dirichlet Distribution
The Dirichlet Distribution

Linear
The Dirichlet MACHINE

Prior

1.2

3.0

0.3
The Dirichlet MACHINE

Prior

1.2

3.0

0.3
The Dirichlet MACHINE

Update

2.2

3.0

0.3
The Dirichlet MACHINE

Prior

2.2

3.0

0.3
The Dirichlet MACHINE

Update

2.2

3.0

1.3
The Dirichlet MACHINE

Prior

2.2

3.0

1.3
The Dirichlet MACHINE

Update

2.2

3.0

2.3
Interpreting the Parameters
1

3

2

What does this alpha
vector really mean?
Interpreting the Parameters
1

normalized

1/6

3/6

3

2

sum

2/6

6
Interpreting the Parameters
1

Expected
Value

3

2

6
Interpreting the Parameters
1

Expected
Value

3

2

6

Weight
ANALOGY: Normal Distribution

Precision =
1 / variance
ANALOGY: Normal Distribution
High precision:
data is close to
the mean
Low precision:
far away from
the mean
Interpreting the Parameters
1

Expected
Value

3

2

6

Precision
High Weight Dirichlet
0.4

0.4

0.2
Low Weight Dirichlet
0.4

0.4

0.2
Urn Model
At each step,
pick a ball from
the urn..
Replace it, and
add another ball
of that color into
the urn
Urn Model
At each step,
pick a ball from
the urn..
Replace it, and
add another ball
of that color into
the urn
Urn Model
At each step,
pick a ball from
the urn..
Replace it, and
add another ball
of that color into
the urn
Urn Model
At each step,
pick a ball from
the urn..
Replace it, and
add another ball
of that color into
the urn
Urn Model
At each step,
pick a ball from
the urn..
Replace it, and
add another ball
of that color into
the urn
Urn Model
Rich get richer...
Urn Model
Rich get richer...
Urn Model
Finally yellow
catches a break
Urn Model
Finally yellow
catches a break
Urn Model
But it’s too late...
Urn Model
As the urn gets
more populated,
the distribution
gets “stuck” in
place.
Urn Model
Once lots of data
has been collected,
or the dirichlet has
high precision, it’s
hard to overturn that
with new data
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
The expected
infinite
distribution
(mean) is
exponential.
# of white balls
controls the
exponent
So coming back to the count data:
20

0

2

What Dirichlet
parameters explain
the data?

0
1

17

14

6

0

15

5

0

0

20

0

0

14

6
So coming back to the count data:
20

0

2

Newton’s Method:
Requires Gradient
+ Hessian

0
1

17

14

6

0

15

5

0

0

20

0

0

14

6
So coming back to the count data:
20

0

2

Reads all of the
data...

0
1

17

14

6

0

15

5

0

0

20

0

0

14

6
So coming back to the count data:
20

https://github.
com/maxsklar/Baye
sPy/tree/master/Con
jugatePriorTools

0

0

2

1

17

14

6

0

15

5

0

0

20

0

0

14

6
So coming back to the count data:
Compress the data
into a Matrix and a
Vector:
Works for lots of
sparsely populated
rows

20

0

0

2

1

17

14

6

0

15

5

0

0

20

0

0

14

6
The Compression MACHINE

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0
The Compression MACHINE
K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0
The Compression MACHINE
1

0

0

K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0
The Compression MACHINE
K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

1

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

1

0

0

0

0

0
The Compression MACHINE
1

3

2

K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

1

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

1

0

0

0

0

0
The Compression MACHINE
K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

2

0

0

0

0

0

1

1

1

0

0

0

1

1

0

0

0

0

2

1

1

1

1

1
The Compression MACHINE
0

6

0

K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

2

0

0

0

0

0

1

1

1

0

0

0

1

1

0

0

0

0

2

1

1

1

1

1
The Compression MACHINE
K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

2

0

0

0

0

0

2

2

2

1

1

1

1

1

0

0

0

0

3

2

2

2

2

2
The Compression MACHINE
2

1

1

K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

2

0

0

0

0

0

2

2

2

1

1

1

1

1

0

0

0

0

3

2

2

2

2

2
The Compression MACHINE
K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

3

1

0

0

0

0

3

2

2

1

1

1

2

1

0

0

0

0

4

3

3

3

2

2
DEMO
Our Popularity Prior
Dirichlet Mixture Models
Anything you can
go with a
Gaussian, you
can also do with a
Dirichlet
Dirichlet Mixture Models
Example:
Mixture of
Gaussians using
ExpectationMaximization
Dirichlet Mixture Models
Assume each row is
a mixture of
multinomials.
And the parameters
of that mixture are
pulled from a
Dirichlet.

?
Dirichlet Mixture Models
Latent Dirichlet
Allocation
Topic Model
Questions

Más contenido relacionado

La actualidad más candente

word2vec, LDA, and introducing a new hybrid algorithm: lda2vec
word2vec, LDA, and introducing a new hybrid algorithm: lda2vecword2vec, LDA, and introducing a new hybrid algorithm: lda2vec
word2vec, LDA, and introducing a new hybrid algorithm: lda2vec👋 Christopher Moody
 
Lect5 principal component analysis
Lect5 principal component analysisLect5 principal component analysis
Lect5 principal component analysishktripathy
 
Principal Component Analysis (PCA) and LDA PPT Slides
Principal Component Analysis (PCA) and LDA PPT SlidesPrincipal Component Analysis (PCA) and LDA PPT Slides
Principal Component Analysis (PCA) and LDA PPT SlidesAbhishekKumar4995
 
PCA (Principal component analysis)
PCA (Principal component analysis)PCA (Principal component analysis)
PCA (Principal component analysis)Learnbay Datascience
 
Multi-Layer Perceptrons
Multi-Layer PerceptronsMulti-Layer Perceptrons
Multi-Layer PerceptronsESCOM
 
Neural Networks: Model Building Through Linear Regression
Neural Networks: Model Building Through Linear RegressionNeural Networks: Model Building Through Linear Regression
Neural Networks: Model Building Through Linear RegressionMostafa G. M. Mostafa
 
K means Clustering
K means ClusteringK means Clustering
K means ClusteringEdureka!
 
Random Forest and KNN is fun
Random Forest and KNN is funRandom Forest and KNN is fun
Random Forest and KNN is funZhen Li
 
Data mining with differential privacy
Data mining with differential privacy Data mining with differential privacy
Data mining with differential privacy Wei-Yuan Chang
 
PR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
PR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo SupervisionPR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
PR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo SupervisionSungchul Kim
 
Scikit Learn intro
Scikit Learn introScikit Learn intro
Scikit Learn intro9xdot
 
Data Visualization in Exploratory Data Analysis
Data Visualization in Exploratory Data AnalysisData Visualization in Exploratory Data Analysis
Data Visualization in Exploratory Data AnalysisEva Durall
 
Resilient Distributed DataSets - Apache SPARK
Resilient Distributed DataSets - Apache SPARKResilient Distributed DataSets - Apache SPARK
Resilient Distributed DataSets - Apache SPARKTaposh Roy
 
Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2izahn
 
Introduction to Rstudio
Introduction to RstudioIntroduction to Rstudio
Introduction to RstudioOlga Scrivner
 

La actualidad más candente (20)

word2vec, LDA, and introducing a new hybrid algorithm: lda2vec
word2vec, LDA, and introducing a new hybrid algorithm: lda2vecword2vec, LDA, and introducing a new hybrid algorithm: lda2vec
word2vec, LDA, and introducing a new hybrid algorithm: lda2vec
 
Lect5 principal component analysis
Lect5 principal component analysisLect5 principal component analysis
Lect5 principal component analysis
 
Principal Component Analysis (PCA) and LDA PPT Slides
Principal Component Analysis (PCA) and LDA PPT SlidesPrincipal Component Analysis (PCA) and LDA PPT Slides
Principal Component Analysis (PCA) and LDA PPT Slides
 
PCA (Principal component analysis)
PCA (Principal component analysis)PCA (Principal component analysis)
PCA (Principal component analysis)
 
Multi-Layer Perceptrons
Multi-Layer PerceptronsMulti-Layer Perceptrons
Multi-Layer Perceptrons
 
Neural Networks: Model Building Through Linear Regression
Neural Networks: Model Building Through Linear RegressionNeural Networks: Model Building Through Linear Regression
Neural Networks: Model Building Through Linear Regression
 
K means Clustering
K means ClusteringK means Clustering
K means Clustering
 
Random Forest and KNN is fun
Random Forest and KNN is funRandom Forest and KNN is fun
Random Forest and KNN is fun
 
Data mining with differential privacy
Data mining with differential privacy Data mining with differential privacy
Data mining with differential privacy
 
PR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
PR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo SupervisionPR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
PR-343: Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
 
K means Clustering Algorithm
K means Clustering AlgorithmK means Clustering Algorithm
K means Clustering Algorithm
 
Scikit Learn intro
Scikit Learn introScikit Learn intro
Scikit Learn intro
 
Data Visualization in Exploratory Data Analysis
Data Visualization in Exploratory Data AnalysisData Visualization in Exploratory Data Analysis
Data Visualization in Exploratory Data Analysis
 
Resilient Distributed DataSets - Apache SPARK
Resilient Distributed DataSets - Apache SPARKResilient Distributed DataSets - Apache SPARK
Resilient Distributed DataSets - Apache SPARK
 
Presentation on K-Means Clustering
Presentation on K-Means ClusteringPresentation on K-Means Clustering
Presentation on K-Means Clustering
 
Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2
 
Principal Component Analysis
Principal Component AnalysisPrincipal Component Analysis
Principal Component Analysis
 
K means clustring @jax
K means clustring @jaxK means clustring @jax
K means clustring @jax
 
Introduction to Rstudio
Introduction to RstudioIntroduction to Rstudio
Introduction to Rstudio
 
Data Mining: Association Rules Basics
Data Mining: Association Rules BasicsData Mining: Association Rules Basics
Data Mining: Association Rules Basics
 

Destacado

A Linked Data Dataset for Madrid Transport Authority's Datasets
A Linked Data Dataset for Madrid Transport Authority's DatasetsA Linked Data Dataset for Madrid Transport Authority's Datasets
A Linked Data Dataset for Madrid Transport Authority's DatasetsOscar Corcho
 
NIPS machine learning in computational biology presentation
NIPS machine learning in computational biology presentationNIPS machine learning in computational biology presentation
NIPS machine learning in computational biology presentationKieran Campbell
 
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14Daniel Lewis
 
אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל
 אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל
אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראלElad Goldenberg
 
온라인 서비스 개선을 데이터 활용법 - 김진영 (How We Use Data)
온라인 서비스 개선을 데이터 활용법  - 김진영 (How We Use Data)온라인 서비스 개선을 데이터 활용법  - 김진영 (How We Use Data)
온라인 서비스 개선을 데이터 활용법 - 김진영 (How We Use Data)Jin Young Kim
 
Image segmentation hj_cho
Image segmentation hj_choImage segmentation hj_cho
Image segmentation hj_choHyungjoo Cho
 
Normalization 방법
Normalization 방법 Normalization 방법
Normalization 방법 홍배 김
 
[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민NAVER D2
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks남주 김
 
Expectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture ModelsExpectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture Modelspetitegeek
 
Squeezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile PhonesSqueezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile PhonesAnirudh Koul
 

Destacado (11)

A Linked Data Dataset for Madrid Transport Authority's Datasets
A Linked Data Dataset for Madrid Transport Authority's DatasetsA Linked Data Dataset for Madrid Transport Authority's Datasets
A Linked Data Dataset for Madrid Transport Authority's Datasets
 
NIPS machine learning in computational biology presentation
NIPS machine learning in computational biology presentationNIPS machine learning in computational biology presentation
NIPS machine learning in computational biology presentation
 
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
 
אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל
 אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל
אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל
 
온라인 서비스 개선을 데이터 활용법 - 김진영 (How We Use Data)
온라인 서비스 개선을 데이터 활용법  - 김진영 (How We Use Data)온라인 서비스 개선을 데이터 활용법  - 김진영 (How We Use Data)
온라인 서비스 개선을 데이터 활용법 - 김진영 (How We Use Data)
 
Image segmentation hj_cho
Image segmentation hj_choImage segmentation hj_cho
Image segmentation hj_cho
 
Normalization 방법
Normalization 방법 Normalization 방법
Normalization 방법
 
[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks
 
Expectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture ModelsExpectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture Models
 
Squeezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile PhonesSqueezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile Phones
 

Similar a Dirichlet Distribution Explained

Number system
Number systemNumber system
Number systemAmit Shaw
 
Reliable multimedia transmission under noisy condition
Reliable multimedia transmission under noisy conditionReliable multimedia transmission under noisy condition
Reliable multimedia transmission under noisy conditionShahrukh Ali Khan
 
2012 2013 3rd 9 weeks midterm review
2012 2013 3rd 9 weeks midterm review2012 2013 3rd 9 weeks midterm review
2012 2013 3rd 9 weeks midterm reviewarinedge
 
Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016Spencer Fox
 
Representation Of Data
Representation Of DataRepresentation Of Data
Representation Of Datagavhays
 
Ezgi Karaesmen - Data Cleaning and Manipulation with R
Ezgi Karaesmen - Data Cleaning and Manipulation with REzgi Karaesmen - Data Cleaning and Manipulation with R
Ezgi Karaesmen - Data Cleaning and Manipulation with RRehgan Avon
 
Dee 2034 chapter 1 number and code system (Baia)
Dee 2034 chapter 1 number and code system (Baia)Dee 2034 chapter 1 number and code system (Baia)
Dee 2034 chapter 1 number and code system (Baia)SITI SABARIAH SALIHIN
 
Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...
Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...
Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...Prashant Borkar
 
Data repersentation.
Data repersentation.Data repersentation.
Data repersentation.Ritesh Saini
 
Chapter 1 number and code system sss
Chapter 1 number and code system sssChapter 1 number and code system sss
Chapter 1 number and code system sssBaia Salihin
 
Data Representation
Data RepresentationData Representation
Data RepresentationRick Jamil
 
An introduction to probabilistic data structures
An introduction to probabilistic data structuresAn introduction to probabilistic data structures
An introduction to probabilistic data structuresMiguel Ping
 
Beyond PFCount: Shrif Nada
Beyond PFCount: Shrif NadaBeyond PFCount: Shrif Nada
Beyond PFCount: Shrif NadaRedis Labs
 
Year 12 Maths A Textbook - Chapter 10
Year 12 Maths A Textbook - Chapter 10Year 12 Maths A Textbook - Chapter 10
Year 12 Maths A Textbook - Chapter 10westy67968
 
Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...
Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...
Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...REYBETH RACELIS
 

Similar a Dirichlet Distribution Explained (20)

Dynamic Itemset Counting
Dynamic Itemset CountingDynamic Itemset Counting
Dynamic Itemset Counting
 
Dynamic Itemset Counting
Dynamic Itemset CountingDynamic Itemset Counting
Dynamic Itemset Counting
 
Number system
Number systemNumber system
Number system
 
Cis435 week06
Cis435 week06Cis435 week06
Cis435 week06
 
Reliable multimedia transmission under noisy condition
Reliable multimedia transmission under noisy conditionReliable multimedia transmission under noisy condition
Reliable multimedia transmission under noisy condition
 
2012 2013 3rd 9 weeks midterm review
2012 2013 3rd 9 weeks midterm review2012 2013 3rd 9 weeks midterm review
2012 2013 3rd 9 weeks midterm review
 
Number system
Number systemNumber system
Number system
 
Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016
 
Representation Of Data
Representation Of DataRepresentation Of Data
Representation Of Data
 
Ezgi Karaesmen - Data Cleaning and Manipulation with R
Ezgi Karaesmen - Data Cleaning and Manipulation with REzgi Karaesmen - Data Cleaning and Manipulation with R
Ezgi Karaesmen - Data Cleaning and Manipulation with R
 
Dee 2034 chapter 1 number and code system (Baia)
Dee 2034 chapter 1 number and code system (Baia)Dee 2034 chapter 1 number and code system (Baia)
Dee 2034 chapter 1 number and code system (Baia)
 
Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...
Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...
Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...
 
Data repersentation.
Data repersentation.Data repersentation.
Data repersentation.
 
Chapter 1 number and code system sss
Chapter 1 number and code system sssChapter 1 number and code system sss
Chapter 1 number and code system sss
 
Data Representation
Data RepresentationData Representation
Data Representation
 
An introduction to probabilistic data structures
An introduction to probabilistic data structuresAn introduction to probabilistic data structures
An introduction to probabilistic data structures
 
Beyond PFCount: Shrif Nada
Beyond PFCount: Shrif NadaBeyond PFCount: Shrif Nada
Beyond PFCount: Shrif Nada
 
Year 12 Maths A Textbook - Chapter 10
Year 12 Maths A Textbook - Chapter 10Year 12 Maths A Textbook - Chapter 10
Year 12 Maths A Textbook - Chapter 10
 
Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...
Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...
Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...
 
Number system
Number systemNumber system
Number system
 

Más de Hakka Labs

Always Valid Inference (Ramesh Johari, Stanford)
Always Valid Inference (Ramesh Johari, Stanford)Always Valid Inference (Ramesh Johari, Stanford)
Always Valid Inference (Ramesh Johari, Stanford)Hakka Labs
 
DataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series searchDataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series searchHakka Labs
 
DataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data ScienceDataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data ScienceHakka Labs
 
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataHakka Labs
 
DataEngConf SF16 - Recommendations at Instacart
DataEngConf SF16 - Recommendations at InstacartDataEngConf SF16 - Recommendations at Instacart
DataEngConf SF16 - Recommendations at InstacartHakka Labs
 
DataEngConf SF16 - Running simulations at scale
DataEngConf SF16 - Running simulations at scaleDataEngConf SF16 - Running simulations at scale
DataEngConf SF16 - Running simulations at scaleHakka Labs
 
DataEngConf SF16 - Deriving Meaning from Wearable Sensor Data
DataEngConf SF16 - Deriving Meaning from Wearable Sensor DataDataEngConf SF16 - Deriving Meaning from Wearable Sensor Data
DataEngConf SF16 - Deriving Meaning from Wearable Sensor DataHakka Labs
 
DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale Hakka Labs
 
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQDataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQHakka Labs
 
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...Hakka Labs
 
DataEngConf SF16 - Three lessons learned from building a production machine l...
DataEngConf SF16 - Three lessons learned from building a production machine l...DataEngConf SF16 - Three lessons learned from building a production machine l...
DataEngConf SF16 - Three lessons learned from building a production machine l...Hakka Labs
 
DataEngConf SF16 - Scalable and Reliable Logging at Pinterest
DataEngConf SF16 - Scalable and Reliable Logging at PinterestDataEngConf SF16 - Scalable and Reliable Logging at Pinterest
DataEngConf SF16 - Scalable and Reliable Logging at PinterestHakka Labs
 
DataEngConf SF16 - Bridging the gap between data science and data engineering
DataEngConf SF16 - Bridging the gap between data science and data engineeringDataEngConf SF16 - Bridging the gap between data science and data engineering
DataEngConf SF16 - Bridging the gap between data science and data engineeringHakka Labs
 
DataEngConf SF16 - Multi-temporal Data Structures
DataEngConf SF16 - Multi-temporal Data StructuresDataEngConf SF16 - Multi-temporal Data Structures
DataEngConf SF16 - Multi-temporal Data StructuresHakka Labs
 
DataEngConf SF16 - Entity Resolution in Data Pipelines Using Spark
DataEngConf SF16 - Entity Resolution in Data Pipelines Using SparkDataEngConf SF16 - Entity Resolution in Data Pipelines Using Spark
DataEngConf SF16 - Entity Resolution in Data Pipelines Using SparkHakka Labs
 
DataEngConf SF16 - Beginning with Ourselves
DataEngConf SF16 - Beginning with OurselvesDataEngConf SF16 - Beginning with Ourselves
DataEngConf SF16 - Beginning with OurselvesHakka Labs
 
DataEngConf SF16 - Routing Billions of Analytics Events with High Deliverability
DataEngConf SF16 - Routing Billions of Analytics Events with High DeliverabilityDataEngConf SF16 - Routing Billions of Analytics Events with High Deliverability
DataEngConf SF16 - Routing Billions of Analytics Events with High DeliverabilityHakka Labs
 
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...Hakka Labs
 
DataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedInDataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedInHakka Labs
 
DataEngConf SF16 - Spark SQL Workshop
DataEngConf SF16 - Spark SQL WorkshopDataEngConf SF16 - Spark SQL Workshop
DataEngConf SF16 - Spark SQL WorkshopHakka Labs
 

Más de Hakka Labs (20)

Always Valid Inference (Ramesh Johari, Stanford)
Always Valid Inference (Ramesh Johari, Stanford)Always Valid Inference (Ramesh Johari, Stanford)
Always Valid Inference (Ramesh Johari, Stanford)
 
DataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series searchDataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series search
 
DataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data ScienceDataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data Science
 
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
 
DataEngConf SF16 - Recommendations at Instacart
DataEngConf SF16 - Recommendations at InstacartDataEngConf SF16 - Recommendations at Instacart
DataEngConf SF16 - Recommendations at Instacart
 
DataEngConf SF16 - Running simulations at scale
DataEngConf SF16 - Running simulations at scaleDataEngConf SF16 - Running simulations at scale
DataEngConf SF16 - Running simulations at scale
 
DataEngConf SF16 - Deriving Meaning from Wearable Sensor Data
DataEngConf SF16 - Deriving Meaning from Wearable Sensor DataDataEngConf SF16 - Deriving Meaning from Wearable Sensor Data
DataEngConf SF16 - Deriving Meaning from Wearable Sensor Data
 
DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale
 
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQDataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
 
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
 
DataEngConf SF16 - Three lessons learned from building a production machine l...
DataEngConf SF16 - Three lessons learned from building a production machine l...DataEngConf SF16 - Three lessons learned from building a production machine l...
DataEngConf SF16 - Three lessons learned from building a production machine l...
 
DataEngConf SF16 - Scalable and Reliable Logging at Pinterest
DataEngConf SF16 - Scalable and Reliable Logging at PinterestDataEngConf SF16 - Scalable and Reliable Logging at Pinterest
DataEngConf SF16 - Scalable and Reliable Logging at Pinterest
 
DataEngConf SF16 - Bridging the gap between data science and data engineering
DataEngConf SF16 - Bridging the gap between data science and data engineeringDataEngConf SF16 - Bridging the gap between data science and data engineering
DataEngConf SF16 - Bridging the gap between data science and data engineering
 
DataEngConf SF16 - Multi-temporal Data Structures
DataEngConf SF16 - Multi-temporal Data StructuresDataEngConf SF16 - Multi-temporal Data Structures
DataEngConf SF16 - Multi-temporal Data Structures
 
DataEngConf SF16 - Entity Resolution in Data Pipelines Using Spark
DataEngConf SF16 - Entity Resolution in Data Pipelines Using SparkDataEngConf SF16 - Entity Resolution in Data Pipelines Using Spark
DataEngConf SF16 - Entity Resolution in Data Pipelines Using Spark
 
DataEngConf SF16 - Beginning with Ourselves
DataEngConf SF16 - Beginning with OurselvesDataEngConf SF16 - Beginning with Ourselves
DataEngConf SF16 - Beginning with Ourselves
 
DataEngConf SF16 - Routing Billions of Analytics Events with High Deliverability
DataEngConf SF16 - Routing Billions of Analytics Events with High DeliverabilityDataEngConf SF16 - Routing Billions of Analytics Events with High Deliverability
DataEngConf SF16 - Routing Billions of Analytics Events with High Deliverability
 
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...
 
DataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedInDataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedIn
 
DataEngConf SF16 - Spark SQL Workshop
DataEngConf SF16 - Spark SQL WorkshopDataEngConf SF16 - Spark SQL Workshop
DataEngConf SF16 - Spark SQL Workshop
 

Último

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 

Último (20)

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 

Dirichlet Distribution Explained