Data mining
Assignment week 2
Exercise 1: Probabilities
How can Bayes' rule be derived from simpler definitions, such as the definition of conditional
probability, symmetry of joint probability, and the chain rule? Give a step-wise derivation,
mentioning which rule you applied at each step.
We have a set of possible outcomes for values of x and y:


  x = { x1, x2, …, xn }
  y = { y1, y2, …, yn }



We need to show how Bayes' rule follows from these definitions. Bayes' rule states:


  P( X = x | Y = y ) = P( Y = y | X = x ) * P(X = x) / P(Y = y)



We proceed step by step:

Step 1 (chain rule / definition of conditional probability). The joint probability factorises
as conditional times marginal:


  P(X = x, Y = y) = P(Y = y | X = x) * P(X = x)


Step 2 (symmetry of joint probability). Since P(X = x, Y = y) = P(Y = y, X = x), the joint can
equally be factorised the other way around:


  P(X = x, Y = y) = P(X = x | Y = y) * P(Y = y)


Step 3 (equating the two factorisations):


  P(X = x | Y = y) * P(Y = y) = P(Y = y | X = x) * P(X = x)


Step 4 (dividing both sides by P(Y = y), assuming P(Y = y) > 0):


  P( X = x | Y = y ) = P( Y = y | X = x ) * P(X = x) / P(Y = y)


This is exactly Bayes' rule.
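As a numerical sanity check of the derivation, here is a minimal Python sketch; the 2x2 joint
distribution is made up purely for illustration and is not part of the exercise:


  import numpy as np

  # Hypothetical 2x2 joint distribution P(X, Y): rows index x-values,
  # columns index y-values. Any non-negative table summing to 1 works.
  joint = np.array([[0.10, 0.30],
                    [0.25, 0.35]])

  p_x = joint.sum(axis=1)   # marginal P(X = x)
  p_y = joint.sum(axis=0)   # marginal P(Y = y)

  for i in range(2):
      for j in range(2):
          lhs = joint[i, j] / p_y[j]                      # P(X = x | Y = y), by definition
          rhs = (joint[i, j] / p_x[i]) * p_x[i] / p_y[j]  # P(Y = y | X = x) * P(X = x) / P(Y = y)
          assert np.isclose(lhs, rhs)

  print("Bayes' rule holds for every (x, y) pair")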
Exercise 2: Entropy
2.1 Assume a variable X with three possible values: a, b, and c. If p(a) = 0.4 and
p(b) = 0.25, what is the entropy of X, i.e., what is H(X)?

To find P(c), we use the fact that the probabilities of all possible values must sum to 1:


  P(total)   =   1
  P(a)       =   0.4
  P(b)       =   0.25
  P(c)       =   P(total) – P(a) – P(b)
  P(c)       =   0.35



Now we calculate the entropy using all three probabilities:


H(X) = –( 0.4 log2(0.4) + 0.25 log2(0.25) + 0.35 log2(0.35) )
H(X) ≈ 1.5589 bits
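The same calculation as a short Python sketch, using the probabilities found above:


  import math

  p = {'a': 0.4, 'b': 0.25, 'c': 0.35}

  # H(X) = -(sum over x of p(x) * log2(p(x)))
  H = -sum(px * math.log2(px) for px in p.values())
  print(round(H, 4))  # 1.5589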



2.2 Assume a variable X with three possible values: a, b, and c. What is the probability
distribution with the highest entropy? Which one(s) has/have the lowest one? Explain in a
sentence or two, in your own words, why these distributions have the highest and lowest
entropies.

We need to determine which probability distribution gives the highest entropy (i.e., the maximum uncertainty).

If we know nothing about the values ‘a’, ‘b’ and ‘c’, then we can make no prediction about the
chance of observing any of them. In that case the values are indistinguishable: the chance of
observing an ‘a’ equals that of a ‘b’ or a ‘c’. This is called the uniform distribution.


  P(a) = P(b) = P(c)
  P(a) + P(b) + P(c) = P(total) = 1

  P(a) = P(b) = P(c) = 1/3



The lowest entropy occurs when we know beforehand which value the outcome will be, i.e., when
one of the values ‘a’, ‘b’ or ‘c’ has a probability of 100% (for example P(a) = 1,
P(b) = P(c) = 0). In that case H(X) = 0.
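A minimal Python sketch comparing the two extremes; the entropy helper is my own illustration,
written from the formula in 2.1, with the usual convention that outcomes with p = 0 contribute
nothing:


  import math

  def entropy(probs):
      # H(X) = sum of p * log2(1/p), equivalent to -(sum of p * log2(p));
      # outcomes with p = 0 contribute nothing (0 * log2(0) := 0).
      return sum(p * math.log2(1 / p) for p in probs if p > 0)

  print(entropy([1/3, 1/3, 1/3]))  # uniform: ~1.585 bits, the maximum for 3 values
  print(entropy([1.0, 0.0, 0.0]))  # one certain outcome: 0.0 bits, the minimum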

2.3 In general, if a variable X has n possible values, what is the maximum entropy?

The maximum entropy is reached by the uniform distribution over the n values:


  P(xi) = 1/n,   i = 1, 2, …, n

  H(X) = –Σ (1/n) log2(1/n) = log2(n)      (summing over all n values)
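A minimal Python check of this formula; the helper uniform_entropy is my own illustration, not
part of the exercise:


  import math

  def uniform_entropy(n):
      # H(X) for n equally likely values: -(sum over i of (1/n) * log2(1/n))
      return -sum((1 / n) * math.log2(1 / n) for _ in range(n))

  for n in (2, 3, 8):
      assert math.isclose(uniform_entropy(n), math.log2(n))
      print(n, round(uniform_entropy(n), 4))  # 2 -> 1.0, 3 -> 1.585, 8 -> 3.0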
