"Random variables", "stochastic Processes" graduate course.
Lecture notes of Prof. H.Amindavar.
Professor of Electrical engineering at Amirkabir university of technology.
Bernoulli trials

If a set has n elements, then the total number of its subsets consisting of k elements each equals

$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$

Example:
We place at random n points in the interval [0, T]. What is the probability that k of these points are in the interval (t1, t2)?

Solution:
A = {a point is in the interval (t1, t2)} ⇒ $P(A) = \frac{t_2 - t_1}{T} = p$.

A occurring k times means that k of the n points lie in the interval (t1, t2) and the rest, the n − k remaining points, fall outside this interval with probability q = 1 − p:

$\text{Prob}\{A \text{ occurs } k \text{ times}\} = \binom{n}{k}\, p^k\, q^{n-k}$
Example:
An order of 10^4 parts is received. The probability of the event A that a part is defective equals 0.1. Find the probability that in 10^4 trials, A will occur at most 1000 times.

Solution:
p = 0.1, n = 10^4,

$\text{Prob}\{0 \le k \le 1000\} = \sum_{k=0}^{1000} \binom{10^4}{k} (0.1)^k (0.9)^{10^4 - k}$
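The sum above can be evaluated numerically. A minimal Python sketch (the helper name and the log-space evaluation are our own, not part of the notes); lgamma keeps the binomial terms from overflowing:

```python
import math

def log_binom_pmf(n, k, p):
    """log of C(n,k) p^k (1-p)^(n-k), via lgamma for numerical stability."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

n, p = 10_000, 0.1
prob = sum(math.exp(log_binom_pmf(n, k, p)) for k in range(1001))
print(prob)   # about 0.5: k <= 1000 covers half the mass around np = 1000
```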
The normal density and distribution:

$g(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad G(x) = \int_{-\infty}^{x} g(y)\, dy$

$G(\infty) = 1, \quad G(0) = 0.5, \quad G(-\infty) = 0, \quad G(-x) = 1 - G(x)$

$\frac{1}{\sigma\sqrt{2\pi}} \int_{x_1}^{x_2} \exp\left(\frac{-(x-\mu)^2}{2\sigma^2}\right) dx = G\!\left(\frac{x_2-\mu}{\sigma}\right) - G\!\left(\frac{x_1-\mu}{\sigma}\right)$

$\mathrm{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_{0}^{x} e^{-y^2/2}\, dy = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0} e^{-y^2/2}\, dy = G(x) - 0.5$

$\mathrm{erfc}(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-y^2/2}\, dy = 1 - G(x) = 0.5 - \mathrm{erf}(x)$

(Note that these definitions are normalized to the standard normal density; they differ from the classical error function $\mathrm{erf}_{\mathrm{std}}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\, dt$, which appears below in the relation for Q(x).)
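These functions are available in the standard library through the classical erf. A small sketch expressing G and the tail probability Q in terms of it:

```python
import math

def G(x):
    """Standard normal CDF: G(x) = 0.5 * (1 + erf_std(x / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Q(x):
    """Gaussian tail Q(x) = 1 - G(x) = 0.5 * erfc_std(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

print(G(0.0), G(1.0), Q(1.0))   # 0.5, ~0.8413, ~0.1587
print(G(-1.0), 1 - G(1.0))      # symmetry: G(-x) = 1 - G(x)
```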
An asymptotic expansion of the Gaussian tail. Write

$\int e^{-x^2/2}\, dx = \int \left(-\frac{1}{x}\right) \left(e^{-x^2/2}\right)'\, dx$

and use integration by parts, $\int u\, dv = uv - \int v\, du$, repeatedly:

$Q(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-y^2/2}\, dy = \frac{e^{-x^2/2}}{x\sqrt{2\pi}} \left(1 - \frac{1}{x^2} + \frac{1\cdot 3}{x^4} - \frac{1\cdot 3\cdot 5}{x^6} + \cdots\right)$

$Q(x) = 0.5\, \mathrm{erfc}_{\mathrm{std}}(x/\sqrt{2})$
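A quick sketch comparing the truncated expansion with the exact tail (the function names and the choice of four terms are ours); the agreement improves as x grows, as expected of an asymptotic series:

```python
import math

def Q(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def Q_asymptotic(x, terms=4):
    """Partial sums of Q(x) ~ exp(-x^2/2)/(x sqrt(2 pi)) (1 - 1/x^2 + 3/x^4 - ...)."""
    s, coef = 0.0, 1.0
    for n in range(terms):
        s += coef / x ** (2 * n)
        coef *= -(2 * n + 1)          # builds (-1)^n * 1*3*...*(2n-1)
    return math.exp(-x * x / 2) / (x * math.sqrt(2 * math.pi)) * s

for x in (2.0, 4.0, 6.0):
    print(x, Q(x), Q_asymptotic(x))
```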
Watson's lemma

$I(x) = \int_{0}^{b} f(t)\, e^{-xt}\, dt, \qquad b > 0$

Watson's lemma gives the full asymptotic expansion of I(x), provided f(t) is continuous on the interval 0 ≤ t ≤ b and f(t) has the asymptotic series expansion

$f(t) \sim t^{\alpha} \sum_{n=0}^{\infty} a_n\, t^{\beta n}, \qquad t \to 0^{+}$

(with α > −1 and β > 0, so that the integral converges) ⇒

$I(x) \sim \sum_{n=0}^{\infty} a_n\, \frac{\Gamma(\alpha + \beta n + 1)}{x^{\alpha + \beta n + 1}}, \qquad x \to \infty$
Another route to the tail expansion: substitute u = y − x,

$Q(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-y^2/2}\, dy = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \int_{0}^{\infty} e^{-u^2/2}\, e^{-xu}\, du$

$= \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \sum_{n=0}^{\infty} \frac{(-1)^n}{n!\, 2^n} \int_{0}^{\infty} u^{2n}\, e^{-ux}\, du$

Related topics: Laplace's and Fourier's methods, steepest descent and saddle-point approximation.
DeMoivre-Laplace theorem

If npq ≫ 1 and $np - \sqrt{npq} \le k \le np + \sqrt{npq}$, then

$\binom{n}{k}\, p^k\, q^{n-k} \approx \frac{1}{\sqrt{2\pi npq}} \exp\left(\frac{-(k - np)^2}{2npq}\right)$

Example:
A fair coin is tossed 1000 times. Find the probability that heads show 510 times.

Solution:
n = 1000, p = 0.5, np = 500, $\sqrt{npq} = 5\sqrt{10} = 15.81$

$\binom{1000}{510} (0.5)^{510} (0.5)^{490} \approx \frac{e^{-100/500}}{\sqrt{2\pi npq}} = 0.0207$
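A one-off numerical check of this example (exact binomial term versus the Gaussian approximation; variable names are ours):

```python
import math

n, p, k = 1000, 0.5, 510
exact = math.comb(n, k) * p**k * (1 - p)**(n - k)
npq = n * p * (1 - p)                       # here npq = 250
approx = math.exp(-(k - n*p)**2 / (2*npq)) / math.sqrt(2*math.pi*npq)
print(exact, approx)                        # both about 0.0207
```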
Example:
A fair coin is tossed 10^4 times. Find the probability that the number of heads is between 4995 and 5005.

Solution:
n = 10^4, np = 5000, npq = 2500, $\sqrt{npq} = 50$

$\text{Prob}\{4995 \le k \le 5005\} = \sum_{k=4995}^{5005} \binom{10^4}{k} (0.5)^k (0.5)^{10^4 - k}$

$\sum_{k=k_1}^{k_2} \frac{1}{\sqrt{npq}}\, g\!\left(\frac{k - np}{\sqrt{npq}}\right) \approx \int_{k_1}^{k_2} \frac{1}{\sqrt{npq}}\, g\!\left(\frac{x - np}{\sqrt{npq}}\right) dx = G\!\left(\frac{k_2 - np}{\sqrt{npq}}\right) - G\!\left(\frac{k_1 - np}{\sqrt{npq}}\right)$

$= G(0.1) - G(-0.1) = 2G(0.1) - 1 \approx 0.08$
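A sketch comparing the exact binomial sum with the Gaussian integral (the continuity-corrected variant in the last line is our addition):

```python
import math

n, k1, k2 = 10_000, 4995, 5005
exact = sum(math.comb(n, k) for k in range(k1, k2 + 1)) / 2**n
mu, s = n * 0.5, math.sqrt(n * 0.25)        # np = 5000, sqrt(npq) = 50
G = lambda t: 0.5 * (1 + math.erf(t / math.sqrt(2)))
print(exact)                                # about 0.088
print(G((k2 - mu) / s) - G((k1 - mu) / s))  # about 0.080
print(G((k2 + 0.5 - mu) / s) - G((k1 - 0.5 - mu) / s))  # corrected: about 0.088
```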
Law of large numbers

An event A with Prob{A} = p occurs k times in n trials ⇒ k ≃ np; this is a heuristic statement. Let A denote an event whose probability of occurrence in a single trial is p. If k denotes the number of occurrences of A in n independent trials, then

$\lim_{n\to\infty} \text{Prob}\{k = np\} \approx \frac{1}{\sqrt{2\pi npq}} \to 0 \qquad \text{Never occurs!}$

The approximation k ≃ np means that the ratio k/n is close to p in the sense that, for any ε > 0,

$\lim_{n\to\infty} \text{Prob}\left\{\left|\frac{k}{n} - p\right| \le \epsilon\right\} = 1, \quad \text{since} \quad \text{Prob}\{n(p-\epsilon) \le k \le n(p+\epsilon)\} \approx 2G\!\left(\epsilon\sqrt{\frac{n}{pq}}\right) - 1 \to 1$

Example: p = 0.5, ε = 0.05

$n(p - \epsilon) = 0.45n, \quad n(p + \epsilon) = 0.55n, \quad \epsilon\sqrt{\frac{n}{pq}} = 0.1\sqrt{n}$

Solution:

n               | 100   | 900
0.1√n           | 1     | 3
2G(0.1√n) − 1   | 0.682 | 0.997

The last row indicates that after 900 independent trials we may have some confidence in accepting k/n ≈ p.
Generalized Bernoulli trials

U = {A1 occurs k1 times, A2 occurs k2 times, ..., Ar occurs kr times}

The number of orderings in which U can occur is

$\frac{n!}{k_1!\, k_2! \cdots k_r!}, \qquad n = \sum_{i=1}^{r} k_i$

Since the trials are independent, the probability of each such sequence of outcomes is $p_1^{k_1} p_2^{k_2} \cdots p_r^{k_r}$ ⇒

$\text{Prob}\{U\} = \frac{n!}{k_1!\, k_2! \cdots k_r!}\, p_1^{k_1} p_2^{k_2} \cdots p_r^{k_r}$
Example:
A fair die is rolled 10 times. Determine the probability that a one shows 3 times and an even number shows 6 times.

Solution:
A1 = {1}, A2 = {2, 4, 6}, A3 = {3, 5} ⇒

$p_1 = \frac{1}{6}, \quad p_2 = \frac{3}{6}, \quad p_3 = \frac{2}{6}, \qquad k_1 = 3, \quad k_2 = 6, \quad k_3 = 1$

$\text{Prob}\{U\} = \frac{10!}{3!\, 6!\, 1!} \left(\frac{1}{6}\right)^3 \left(\frac{1}{2}\right)^6 \left(\frac{1}{3}\right)^1 = 0.0203$
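A small sketch of the multinomial probability (the function name is ours); the integer divisions are exact because multinomial coefficients are integers:

```python
import math

def multinomial_pmf(ks, ps):
    """P{A_i occurs k_i times, i = 1..r} for generalized Bernoulli trials."""
    coef = math.factorial(sum(ks))
    for k in ks:
        coef //= math.factorial(k)
    prob = float(coef)
    for k, p in zip(ks, ps):
        prob *= p ** k
    return prob

print(multinomial_pmf([3, 6, 1], [1/6, 3/6, 2/6]))   # ~0.0203
```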
Poisson theorem

$\text{Prob}\{\text{an event } A \text{ occurs } k \text{ times in } n \text{ trials}\} = \binom{n}{k}\, p^k\, q^{n-k}$

If p ≪ 1 and n → ∞, then np ≈ npq; when npq ≫ 1 the Gaussian approximation applies. However, if np is of order 1, the Gaussian approximation is no longer valid. We use the following instead:

$\binom{n}{k}\, p^k\, q^{n-k} \approx e^{-np}\, \frac{(np)^k}{k!} \qquad \text{(Poisson theorem)}$

If k is of order np, then k ≪ n and kp ≪ 1, so

$n(n-1)(n-2) \cdots (n-k+1) \approx n \cdot n \cdot n \cdots n = n^k$

and $q = 1 - p \approx e^{-p}$, hence $q^{n-k} \approx e^{-(n-k)p} \approx e^{-np}$. Therefore

$\binom{n}{k}\, p^k\, q^{n-k} \approx e^{-np}\, \frac{(np)^k}{k!}$
The same limit via Stirling's formula, $n! \approx \sqrt{2\pi n}\, n^{n}\, e^{-n}$. As n → ∞, p → 0 with np = λ:

$\binom{n}{k}\, p^k\, q^{n-k} = \frac{n!}{(n-k)!\, k!}\, \frac{\lambda^k}{n^k} \left(1 - \frac{\lambda}{n}\right)^{n-k}$

where

$\frac{n!}{(n-k)!} \approx \frac{\sqrt{2\pi n}\, n^{n}\, e^{-n}}{\sqrt{2\pi (n-k)}\, (n-k)^{n-k}\, e^{-n+k}} \to n^k, \qquad \left(1 - \frac{\lambda}{n}\right)^{n-k} \to e^{-\lambda}$

so that

$\binom{n}{k}\, p^k\, q^{n-k} \to \frac{\lambda^k}{k!}\, e^{-\lambda}$
$n \to \infty,\ p \to 0,\ np \to a \ \Rightarrow\ \binom{n}{k}\, p^k\, q^{n-k} \to e^{-a}\, \frac{a^k}{k!}$

Example:
A system contains 1000 components. Each component fails independently of the others, and the probability of its failure in one month equals 10^{-3}. Find the probability that the system will function (no component fails) at the end of one month.

Solution: This can be considered as a problem in repeated trials with p = 10^{-3}, q = 0.999, n = 1000, k = 0.

$\text{Prob}\{k = 0\} = \binom{1000}{0}\, p^0\, q^{1000} = 0.999^{1000} = 0.3677 \qquad \text{(exact)}$

$\text{Prob}\{k = 0\} \approx e^{-np}\, \frac{(np)^0}{0!} = e^{-1} = 0.368$

Applying the same idea as before:

$\text{Prob}\{k_1 \le k \le k_2\} = \sum_{k=k_1}^{k_2} \binom{n}{k}\, p^k\, q^{n-k} \approx e^{-np} \sum_{k=k_1}^{k_2} \frac{(np)^k}{k!}$
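A sketch checking the Poisson approximation against the exact binomial for this example (variable names are ours):

```python
import math

n, p = 1000, 1e-3
a = n * p                                   # a = np = 1
print((1 - p) ** n, math.exp(-a))           # P{k = 0}: 0.3677 vs 0.3679

# partial sums compare just as well, e.g. P{0 <= k <= 2}:
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3))
poisson = math.exp(-a) * sum(a**k / math.factorial(k) for k in range(3))
print(exact, poisson)
```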
Generalization of the Poisson theorem

Let us assume A1, A2, ..., A_{m+1} are the m + 1 events of a partition with Prob{Ai} = pi and $p_{m+1} = 1 - \sum_{i=1}^{m} p_i$. We can show that

$\frac{n!}{k_1! \cdots k_{m+1}!}\, p_1^{k_1} \cdots p_{m+1}^{k_{m+1}} \approx \left(e^{-a_1}\, \frac{a_1^{k_1}}{k_1!}\right) \cdots \left(e^{-a_m}\, \frac{a_m^{k_m}}{k_m!}\right)$

where $a_i = n p_i$. The reason for having m terms on the right-hand side but m + 1 terms on the left-hand side is that $p_{m+1} = 1 - \sum_{i=1}^{m} p_i$ is close to 1, so the (m+1)st factor is absorbed in the limit.
Random Poisson points

n random points are placed in the interval (−T/2, T/2).

$\text{Prob}\{k \text{ points in an interval of length } t_a = t_2 - t_1\} = \binom{n}{k}\, p^k\, q^{n-k}, \qquad p = \frac{t_a}{T}$

If n, T → ∞:

$\text{Prob}\{k \text{ points in } t_a\} \approx e^{-n t_a / T}\, \frac{(n t_a / T)^k}{k!}$

If λ = n/T, the rate at which the points occur, is held constant, the resulting process is an infinite set of points covering the entire t axis from −∞ to ∞, and

$\text{Prob}\{k \text{ points in } t_a\} = e^{-\lambda t_a}\, \frac{(\lambda t_a)^k}{k!}$
Points in non-overlapping intervals

Consider the interval (−T/2, T/2) containing n points, and two non-overlapping intervals of lengths ta and tb:

$\text{Prob}\{k_a \text{ points in } t_a,\ k_b \text{ points in } t_b\} = \frac{n!}{k_a!\, k_b!\, k_c!} \left(\frac{t_a}{T}\right)^{k_a} \left(\frac{t_b}{T}\right)^{k_b} \left(1 - \frac{t_a}{T} - \frac{t_b}{T}\right)^{k_c}$

where $k_c = n - k_a - k_b$. Suppose now that λ = n/T is held constant as n, T → ∞, so that $n t_a / T = \lambda t_a$ and $n t_b / T = \lambda t_b$. We can conclude that

$\text{Prob}\{k_a \text{ points in } t_a,\ k_b \text{ points in } t_b\} \approx e^{-\lambda t_a}\, \frac{(\lambda t_a)^{k_a}}{k_a!}\; e^{-\lambda t_b}\, \frac{(\lambda t_b)^{k_b}}{k_b!}$
Since

Prob{ka points in ta, kb points in tb} = Prob{ka points in ta} · Prob{kb points in tb},

the events {ka in ta} and {kb in tb} are independent. These outcomes are called random Poisson points.

Properties:
1. $\text{Prob}\{k_a \text{ points in } t_a\} = e^{-\lambda t_a}\, \frac{(\lambda t_a)^{k_a}}{k_a!}$
2. If two intervals (t1, t2) and (t3, t4) are non-overlapping, then the numbers of points in these intervals are independent.

Examples: telephone calls, cars crossing a bridge, shot noise, ...
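A minimal simulation sketch of the limit (the rate lambda_, the horizon T, and the trial counts are assumed values of our own): n uniform points on a long interval, counted in a short window, reproduce the Poisson law.

```python
import math, random

lambda_, T, trials = 2.0, 100.0, 5000
ta = 1.5                                   # length of the interval (0, ta)
n = int(lambda_ * T)                       # n points uniform in (-T/2, T/2)

counts = {}
for _ in range(trials):
    k = sum(1 for _ in range(n) if 0.0 <= random.uniform(-T/2, T/2) < ta)
    counts[k] = counts.get(k, 0) + 1

a = lambda_ * ta
for k in sorted(counts):
    empirical = counts[k] / trials
    poisson = math.exp(-a) * a**k / math.factorial(k)
    print(k, round(empirical, 4), round(poisson, 4))
```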
Bayes' theorem

Let us assume we have a pile of m coins. The probability of "heads" for the ith coin equals pi. We select one coin from this pile (each with probability 1/m) and toss it n times. We observe that heads show k times. On the basis of this observation, we find the probability Xr that we selected the rth coin:

$X_r = \text{Prob}\{r\text{th coin} \mid k \text{ heads}\} = \frac{\text{Prob}\{k \text{ heads} \mid r\text{th coin}\}\, \text{Prob}\{r\text{th coin}\}}{\text{Prob}\{k \text{ heads}\}} = \frac{p_r^k (1 - p_r)^{n-k}}{\sum_{i=1}^{m} p_i^k (1 - p_i)^{n-k}}$

$\text{Prob}\{k \text{ heads}\} = \sum_{i=1}^{m} \text{Prob}\{k \text{ heads} \mid i\text{th coin}\}\, \text{Prob}\{i\text{th coin}\}$

(The equal priors 1/m cancel in the ratio.)
Example:
Number of heads shown, k = 490; number of tosses, n = 1000; number of coins, m = 10; the specific coin, r = 5.

Xr = Prob{5th coin out of 10 coins | 490 heads showed up in 1000 tosses}

Solution:
p1 = p2 = ... = p10 = 0.5, Prob{ith coin} = 0.1

$X_r = \frac{p_5^{490} (1 - p_5)^{510} \cdot \frac{1}{10}}{\sum_{i=1}^{10} p_i^{490} (1 - p_i)^{510} \cdot \frac{1}{10}} = 0.1$

Since all the coins are identical, the observation is uninformative and the posterior equals the prior.
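A sketch of the posterior computation (function name and the log-space trick are ours); the second call uses a set of distinct coins, an assumed variation on the example:

```python
import math

def coin_posterior(ps, k, n):
    """Posterior P{coin r | k heads in n tosses}, uniform prior over the coins.
    Likelihoods are computed in log space to avoid underflow for large n."""
    logs = [k * math.log(p) + (n - k) * math.log(1 - p) for p in ps]
    top = max(logs)
    ws = [math.exp(L - top) for L in logs]
    s = sum(ws)
    return [w / s for w in ws]

# identical coins: posterior stays 1/m = 0.1 for every coin
print(coin_posterior([0.5] * 10, k=490, n=1000))
# distinct coins: k/n = 0.49 strongly favors the coin with p closest to 0.49
print(coin_posterior([0.1 * i for i in range(1, 10)], k=490, n=1000))
```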
Random variable

A random variable is a number assigned to every outcome of an experiment.

Prob{x ≤ x} of the event {x ≤ x} is a number that depends on x. This number is denoted by Fx(x) and is called the CDF of the RV x.

Properties of the CDF:
1. F(∞) = 1, F(−∞) = 0
2. x1 ≤ x2 ⇒ F(x1) ≤ F(x2)
3. Prob{x > x} = 1 − Fx(x)
4. F(x⁺) = F(x): F(x) is continuous from the right
5. Prob{x1 < x ≤ x2} = F(x2) − F(x1)
We say the statistics of an RV are known if we can determine Prob{x ∈ S} for any set S.
We say that an RV x is of continuous type if Fx(x) is continuous.
We say that an RV x is of discrete type if Fx(x) is a staircase function.
We say that an RV x is of mixed type if Fx(x) is a combination of a continuous and a staircase function.

$f_x(x) = \frac{d}{dx} F_x(x)$ is the PDF for a continuous random variable; for a discrete random variable,

$f(x) = \sum_i p_i\, \delta(x - x_i), \qquad p_i = \text{Prob}\{x = x_i\}$
Properties:
1. $f(x) \ge 0$, $F(x) = \int_{-\infty}^{x} f(\xi)\, d\xi$, $F(x_2) - F(x_1) = \int_{x_1}^{x_2} f(x)\, dx$
2. $\text{Prob}\{x \le x \le x + \Delta x\} \approx f(x)\, \Delta x$
3. $f(x) = \lim_{\Delta x \to 0} \frac{\text{Prob}\{x \le x \le x + \Delta x\}}{\Delta x}$

The mode, or most likely value of x, is where f(x) is maximum. An RV is unimodal if it has only a single mode.
Special RVs

Normal: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(\frac{-(x-\eta)^2}{2\sigma^2}\right)$

Uniform: $f(x) = \frac{1}{x_2 - x_1}$ for $x_1 \le x \le x_2$; 0 otherwise

Rayleigh: $f(x) = \frac{x}{\sigma^2}\, e^{-x^2/2\sigma^2}, \quad x > 0$

Lognormal: $f(x) = \frac{1}{\sigma x \sqrt{2\pi}}\, e^{-(\ln x - \eta)^2 / 2\sigma^2}, \quad x > 0$

Cauchy: $f(x) = \frac{1}{\pi (x^2 + 1)}$

Gamma: $f(x) = \frac{c^{b+1}}{\Gamma(b+1)}\, x^{b}\, e^{-cx}, \quad x > 0$, with $\Gamma(b+1) = b\, \Gamma(b)$; if b is an integer, it is called the Erlang density.
Laplace: $f(x) = 0.5\, e^{-|x|}$

Chi and chi-square: $\chi = \sqrt{\sum_{i=1}^{n} x_i^2}$, $y = \chi^2$,

$f(\chi) = 2a\, \chi^{n-1}\, e^{-\chi^2/2\sigma^2}, \qquad f(y) = a\, y^{n/2 - 1}\, e^{-y/2\sigma^2}, \qquad a = \frac{1}{\Gamma(n/2)\, (\sigma\sqrt{2})^{n}}$

Geometric: $\text{Prob}\{x = k\} = p\, q^k, \quad k = 0, 1, \ldots$

Binomial: $\text{Prob}\{x = k\} = \binom{n}{k}\, p^k (1-p)^{n-k}, \quad k = 0, 1, \ldots, n$

x is of lattice type and its density is a sum of impulses,

$f(x) = \sum_{k=0}^{n} \binom{n}{k}\, p^k (1-p)^{n-k}\, \delta(x - k)$
Negative binomial: $\text{Prob}\{x = k\} = \binom{n+k-1}{k}\, p^n (1-p)^k, \quad k = 0, 1, \ldots$

Poisson: $\text{Prob}\{x = k\} = e^{-a}\, \frac{a^k}{k!}, \quad k = 0, 1, \ldots$; the density function is

$f(x) = e^{-a} \sum_{k=0}^{\infty} \frac{a^k}{k!}\, \delta(x - k)$

Example 1:
Given a constant t0, we define an RV n whose value equals the number of Poisson points in the interval (0, t0). Find the probability that the number of points in this interval is k.

Solution:
$\text{Prob}\{n = k\} = e^{-\lambda t_0}\, \frac{(\lambda t_0)^k}{k!}$
Example 2:
If t1 is the first random point to the right of the fixed point t0, and we define the RV x as the distance from t0 to t1, determine the PDF and CDF of x.

Solution:
F(x) is the probability that there is at least one point between t0 and t0 + x, so 1 − F(x) is the probability that there are no points there:

$1 - F(x) = \text{Prob}\{n = 0\} = e^{-\lambda x} \ \Rightarrow\ F(x) = 1 - e^{-\lambda x}, \qquad f(x) = \lambda\, e^{-\lambda x}\, u(x)$
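A minimal simulation sketch of this waiting-time result (lambda_, T, the trial count, and x0 are assumed values); the minimum of n uniform points plays the role of the first point to the right of t0 = 0:

```python
import math, random

lambda_, T, trials, x0 = 1.5, 200.0, 4000, 0.8
n = int(lambda_ * T)                  # n points uniform in (0, T)

hits = 0
for _ in range(trials):
    first = min(random.uniform(0, T) for _ in range(n))
    if first <= x0:                   # distance from t0 = 0 to the first point
        hits += 1

print(hits / trials, 1 - math.exp(-lambda_ * x0))   # both about 0.70
```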
Conditional distribution

Bayes' rule: $\text{Prob}\{A|B\} = \frac{\text{Prob}\{AB\}}{\text{Prob}\{B\}}$, with Prob{B} ≠ 0, gives the conditional CDF:

$F(x|B) = \text{Prob}\{x \le x \mid B\} = \frac{\text{Prob}\{x \le x,\, B\}}{\text{Prob}\{B\}}$

where {x ≤ x, B} is the intersection of {x ≤ x} and B.

$F(\infty|B) = 1, \qquad F(-\infty|B) = 0$

$\text{Prob}\{x_1 < x \le x_2 \mid B\} = F(x_2|B) - F(x_1|B) = \frac{\text{Prob}\{x_1 < x \le x_2,\, B\}}{\text{Prob}(B)}$

Conditional PDF: $f(x|B) = \frac{d}{dx} F(x|B)$. To find F(x|B) in general we must know the underlying experiment. However, if B can be expressed in terms of x, then knowledge of F(x) is enough to determine F(x|B).
Important cases

B = {x ≤ a}: $F(x|B) = \frac{\text{Prob}\{x \le x,\, x \le a\}}{\text{Prob}\{x \le a\}}$

If x ≥ a ⇒ {x ≤ x, x ≤ a} = {x ≤ a} ⇒ F(x|B) = 1, x ≥ a.

If x < a ⇒ {x ≤ x, x ≤ a} = {x ≤ x} ⇒ $F(x \mid x \le a) = \frac{F(x)}{F(a)}$, x < a.

$f(x \mid x \le a) = \begin{cases} \frac{d}{dx}\, F(x \mid x \le a) = \frac{f(x)}{F(a)}, & x < a \\ 0, & x > a \end{cases}$

Example: Determine f(x | |x − η| ≤ kσ) for x ∼ N(η; σ).

Solution:
For η − kσ ≤ x ≤ η + kσ:

$f(x \mid |x - \eta| \le k\sigma) = \frac{f(x)}{\text{Prob}\{|x - \eta| \le k\sigma\}} = \frac{N(\eta; \sigma)}{G(k) - G(-k)}$

and f(x | |x − η| ≤ kσ) = 0 for |x − η| > kσ.
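A sketch of this truncated-normal conditional density (function names are ours):

```python
import math

def G(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def truncated_normal_pdf(x, eta, sigma, k):
    """f(x | |x - eta| <= k*sigma) for x ~ N(eta; sigma): the normal density
    renormalized by the mass G(k) - G(-k) inside the window, zero outside."""
    if abs(x - eta) > k * sigma:
        return 0.0
    g = math.exp(-((x - eta) / sigma) ** 2 / 2) / (sigma * math.sqrt(2 * math.pi))
    return g / (G(k) - G(-k))

print(truncated_normal_pdf(0.0, 0.0, 1.0, 2.0))   # slightly above g(0) = 0.3989
```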
Total probability

If {A1, A2, ..., An} are disjoint and partition the whole space:

$\text{Prob}\{x \le x\} = \sum_{i=1}^{n} \text{Prob}\{x \le x \mid A_i\}\, \text{Prob}(A_i)$

$F(x) = \sum_{i=1}^{n} F(x|A_i)\, \text{Prob}(A_i), \qquad f(x) = \sum_{i=1}^{n} f(x|A_i)\, \text{Prob}(A_i)$
Gaussian mixture

Binary case: with p = Prob{B},

$f(x|B) = N(\eta_1; \sigma_1),\quad f(x|\bar{B}) = N(\eta_2; \sigma_2) \ \Rightarrow\ f(x) = p\, N(\eta_1; \sigma_1) + (1-p)\, N(\eta_2; \sigma_2)$

f(x) can be a multimodal distribution. Generally, we can have

$f(x) = \sum_{i=1}^{n} p_i\, N(\eta_i; \sigma_i), \qquad \sum_{i=1}^{n} p_i = 1$
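A sketch of evaluating and sampling such a mixture (the bimodal parameter values are assumed, not from the notes):

```python
import math, random

def normal_pdf(x, eta, sigma):
    return math.exp(-((x - eta) / sigma) ** 2 / 2) / (sigma * math.sqrt(2 * math.pi))

def mixture_pdf(x, weights, etas, sigmas):
    """f(x) = sum_i p_i N(eta_i; sigma_i); the weights must sum to 1."""
    return sum(p * normal_pdf(x, e, s) for p, e, s in zip(weights, etas, sigmas))

def mixture_sample(weights, etas, sigmas):
    """Pick component i with probability p_i, then draw from N(eta_i, sigma_i)."""
    i = random.choices(range(len(weights)), weights=weights)[0]
    return random.gauss(etas[i], sigmas[i])

w, e, s = [0.4, 0.6], [-2.0, 3.0], [1.0, 0.5]   # a bimodal example
print(mixture_pdf(0.0, w, e, s), mixture_sample(w, e, s))
```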
Prob{A | x = x} cannot be defined directly (for a continuous RV, Prob{x = x} = 0), but it can be defined as a limit:

$\text{Prob}\{A \mid x_1 \le x \le x_2\} = \frac{\text{Prob}\{x_1 \le x \le x_2 \mid A\}\, \text{Prob}\{A\}}{\text{Prob}\{x_1 \le x \le x_2\}} = \frac{F(x_2|A) - F(x_1|A)}{F(x_2) - F(x_1)}\, \text{Prob}\{A\} \quad (1)$

Let x1 = x and x2 = x + ∆x, divide the numerator and denominator of (1) by ∆x, and let ∆x → 0; then we have

$\text{Prob}\{A \mid x = x\} = \frac{f(x|A)}{f(x)}\, \text{Prob}\{A\} \quad (2)$
From (2), we have

$f(x|A) = \frac{\text{Prob}\{A \mid x = x\}}{\text{Prob}\{A\}}\, f(x) = \frac{\text{Prob}\{A \mid x = x\}\, f(x)}{\int_{-\infty}^{\infty} \text{Prob}\{A \mid x = x\}\, f(x)\, dx}$

Example:
A = {k heads in n tosses, in a specific order}, where the probability of a head showing, p, is an RV with PDF f(p). What is f(p|A)?

Solution:
$\text{Prob}\{A \mid P = p\} = p^k (1-p)^{n-k}$, where P is an RV with density f(p).
From (2), we have

$f(p|A) = \frac{p^k (1-p)^{n-k}\, f(p)}{\int_{0}^{1} p^k (1-p)^{n-k}\, f(p)\, dp}$

f(p|A) is called the a posteriori density, and f(p) the a priori density, of the RV P.
For large n, $p^k (1-p)^{n-k}$ has a sharp maximum at p = k/n, so $f(p)\, p^k (1-p)^{n-k}$ is highly concentrated near p = k/n. If f(p) has a sharp peak at p = 0.5 (the coin is reasonably fair), then for moderate values of n, $f(p)\, p^k (1-p)^{n-k}$ has two peaks: one near p = k/n and the other near p = 0.5. As n increases, the sharpness of $p^k (1-p)^{n-k}$ prevails and the resulting a posteriori density f(p|A) has its maximum near k/n.
Suppose the probability of heads in a coin-tossing experiment is not a number, but an RV P with density f(p). In the experiment of tossing a randomly selected coin, show that $\text{Prob}\{\text{head}\} = \int_0^1 p\, f(p)\, dp$.

Solution:
A = {head} ⇒ the conditional probability of A is the probability of heads if the coin with P = p is tossed. In other words, Prob{A | P = p} = p, so

$\int_0^1 \text{Prob}\{A \mid P = p\}\, f(p)\, dp = \int_0^1 p\, f(p)\, dp = \text{Prob}\{A\}$

This is the probability that heads will show at the next toss.
Example:
If P is a uniform RV, determine the a posteriori density.

Solution:
A = {k heads in n tosses, in a specific order}

$f(p|A) = \frac{p^k (1-p)^{n-k}}{\int_0^1 p^k (1-p)^{n-k}\, dp} = \frac{(n+1)!}{k!\,(n-k)!}\, p^k (1-p)^{n-k} \qquad \text{(Beta density)}$
Example:
Assuming that the coin was tossed n times and heads showed k times, what is the probability that at the next toss heads will show?

Solution:
$\int_0^1 p\, f(p|A)\, dp = \frac{(n+1)!}{k!\,(n-k)!} \int_0^1 p \cdot p^k (1-p)^{n-k}\, dp = \frac{k+1}{n+2}$

which is close to the common-sense estimate k/n. This is called the law of succession.
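An exact-arithmetic sketch of this computation (function names are ours), using the identity $\int_0^1 p^a (1-p)^b\, dp = \frac{a!\, b!}{(a+b+1)!}$ for integer a, b:

```python
from fractions import Fraction
import math

def beta_integral(a, b):
    """Exact value of the integral of p^a (1-p)^b over [0, 1]."""
    return Fraction(math.factorial(a) * math.factorial(b), math.factorial(a + b + 1))

def prob_next_head(k, n):
    """Posterior mean of p under a uniform prior: ratio of two Beta integrals."""
    return beta_integral(k + 1, n - k) / beta_integral(k, n - k)

print(prob_next_head(7, 10), Fraction(7 + 1, 10 + 2))   # both 2/3 = (k+1)/(n+2)
```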
Function of an RV

y = g(x), where x is an RV.

$F(y) = \text{Prob}\{y \le y\} = \text{Prob}\{g(x) \le y\}$

Example:
y = ax + b, x ∼ f(x):

$F_y(y) = \text{Prob}\{ax + b \le y\} = \begin{cases} \text{Prob}\left\{x \le \frac{y-b}{a}\right\} = F_x\!\left(\frac{y-b}{a}\right), & a > 0 \\ 1 - F_x\!\left(\frac{y-b}{a}\right), & a < 0 \end{cases}$
Example:
y = x², x ∼ fx(x):

$F_y(y) = \text{Prob}\{x^2 \le y\} = \text{Prob}\{-\sqrt{y} \le x \le \sqrt{y}\} = F_x(\sqrt{y}) - F_x(-\sqrt{y}), \quad y > 0$

and Fy(y) = 0 for y < 0.
Example: Hard limiter

$y = g(x) = \begin{cases} 1, & x > 0 \\ -1, & x \le 0 \end{cases}$

$\text{Prob}\{y = 1\} = \text{Prob}\{x > 0\} = 1 - F_x(0), \qquad \text{Prob}\{y = -1\} = \text{Prob}\{x \le 0\} = F_x(0)$

Example: Quantization

$y = g(x) = ns, \qquad (n-1)s < x \le ns$

$\text{Prob}\{y = ns\} = \text{Prob}\{(n-1)s < x \le ns\} = F_x(ns) - F_x((n-1)s)$
PDF determination

$y = g(x), \qquad f_y(y) = \sum_{i=1}^{n} \frac{f_x(x_i)}{|g'(x_i)|}$

where the xi are the roots of y = g(x).

Example:
y = e^x, x ∼ N(0; σ²). There is only one root, x = ln y:

$y = e^x \ \Rightarrow\ g'(x) = e^x = y \ \Rightarrow\ f_y(y) = \frac{f_x(\ln y)}{y}, \quad y > 0$

which is the lognormal density.
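A Monte Carlo sketch checking this formula (sigma = 1 and the bin parameters are assumed values): a histogram bin of y = exp(x) should match $f_x(\ln y)/y$.

```python
import math, random

sigma, N = 1.0, 200_000
samples = [math.exp(random.gauss(0, sigma)) for _ in range(N)]

def fy(y):
    fx = math.exp(-(math.log(y) / sigma) ** 2 / 2) / (sigma * math.sqrt(2 * math.pi))
    return fx / y

y0, dy = 1.5, 0.1
empirical = sum(1 for v in samples if y0 <= v < y0 + dy) / (N * dy)
print(empirical, fy(y0 + dy / 2))   # bin height vs. the formula, both ~0.24
```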
If x is an arbitrary RV with continuous distribution Fx(x), and y = Fx(x), then y is a uniform RV on the interval [0, 1].

If 0 < y < 1, then y = Fx(x) has a single solution x1. Since g'(x) = F'x(x) = fx(x),

$f_y(y) = \frac{f_x(x_1)}{|g'(x_1)|} = \frac{f_x(x_1)}{f_x(x_1)} = 1, \qquad 0 < y < 1$

If y < 0 or y > 1, then y = Fx(x) has no real solution, so fy(y) = 0.
Example
Now suppose we are given two distribution functions F1(x) and F2(y). Find a monotonically increasing function g(x) such that, if y = g(x) and Fx(x) = F1(x), then Fy(y) = F2(y).

Solution
We maintain that g(x) must be such that F2[g(x)] = F1(x), since

$F_y(g(x)) = \text{Prob}\{y \le g(x)\} = \text{Prob}\{g(x) \le g(x)\} = \text{Prob}\{x \le x\} = F_x(x)$

Therefore, if a particular CDF Fy(y) is given, an RV with that CDF can be constructed as follows: since Fx(x) of any continuous RV is uniform on [0, 1], take x ∼ Unif[0, 1] and set $y = F_y^{-1}(x)$ (inverse-transform sampling).
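A sketch of inverse-transform sampling for an exponential target (the rate lam is an assumed value):

```python
import math, random

lam = 2.0

def exp_inverse_cdf(u):
    """F^{-1}(u) for F(y) = 1 - exp(-lam*y): y = -ln(1 - u)/lam."""
    return -math.log(1.0 - u) / lam

samples = [exp_inverse_cdf(random.random()) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(mean, 1 / lam)   # sample mean close to E{y} = 1/lam
```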
Expected value. For a continuous RV:

$E(x) = \eta = \int_{-\infty}^{\infty} x\, f_x(x)\, dx$

For the discrete type:

$f_x(x) = \sum_i p_i\, \delta(x - x_i), \qquad E(x) = \sum_i p_i\, x_i$

Conditional mean:

$E(x|M) = \int_{-\infty}^{\infty} x\, f(x|M)\, dx$
Mean of a function of an RV:

$y = g(x), \qquad E(y) = \int_{-\infty}^{\infty} y\, f_y(y)\, dy = \int_{-\infty}^{\infty} g(x)\, f_x(x)\, dx$

For a continuous RV, the variance:

$\sigma^2 = \int_{-\infty}^{\infty} (x - \eta)^2\, f_x(x)\, dx$

For the discrete type:

$\sigma^2 = \sum_i p_i\, (x_i - \eta)^2$
Tchebycheff inequality:

$\text{Prob}\{|x - \eta| \ge \epsilon\} \le \frac{\sigma^2}{\epsilon^2}$

Proof: $\text{Prob}\{|x - \eta| \ge \epsilon\} = \int_{|x-\eta| \ge \epsilon} f_x(x)\, dx$, and by definition $\sigma^2 = \int_{-\infty}^{\infty} (x - \eta)^2 f_x(x)\, dx$. Since $|x - \eta| \ge \epsilon$ on the region of integration,

$\sigma^2 \ge \int_{|x-\eta| \ge \epsilon} (x - \eta)^2\, f_x(x)\, dx \ge \epsilon^2 \int_{|x-\eta| \ge \epsilon} f_x(x)\, dx = \epsilon^2\, \text{Prob}\{|x - \eta| \ge \epsilon\}$
Characteristic function:

$\Phi(\omega) = E(e^{-j\omega x}) = \int_{-\infty}^{\infty} f_x(x)\, e^{-j\omega x}\, dx, \qquad |\Phi(\omega)| \le \Phi(0) = 1$

Moment generating function:

$\Phi(s) = \int_{-\infty}^{\infty} f_x(x)\, e^{-sx}\, dx$

Second moment generating function:

$\Psi(s) = \ln \Phi(s)$
$\Phi^{(n)}(s) = E\{(-1)^n\, x^n\, e^{-sx}\} \ \Rightarrow\ (-1)^n\, \Phi^{(n)}(0) = E\{x^n\}$

$\Phi(s) = \sum_{n=0}^{\infty} (-1)^n\, \frac{E(x^n)}{n!}\, s^n, \qquad s \to 0$

This is true if the moments are finite, so that the series converges absolutely near s = 0.

Cumulants:

$\gamma_n = (-1)^n\, \frac{d^n}{ds^n} \Psi(s)\Big|_{s=0}, \qquad \Psi(s) = \sum_{n=1}^{\infty} (-1)^n\, \frac{\gamma_n}{n!}\, s^n$
For a discrete RV, the characteristic function is

$\Phi(\omega) = E(e^{-j\omega x}) = \sum_i p_i\, e^{-j\omega x_i}$

the (discrete-time) Fourier transform of the sequence pi.

If n is an RV of lattice type:

$\Gamma(z) = E(z^{n}) = \sum_{n=-\infty}^{\infty} p_n\, z^n$

then Γ(1/z) is the z-transform of the sequence pn = Prob{n = n}.
Example
For the binomial and Poisson RVs, find Γ(z).

Solution:

$p_k = \binom{n}{k}\, p^k\, q^{n-k} \ \Rightarrow\ \Gamma(z) = (pz + q)^n$

$p_k = e^{-\lambda}\, \frac{\lambda^k}{k!} \ \Rightarrow\ \Gamma(z) = e^{\lambda(z-1)}$

Factorial moments:

$E\{k(k-1)\cdots(k-n+1)\} = \Gamma^{(n)}(z)\Big|_{z=1}$
Example

$f(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$

Determine Prob{x² + y² ≤ z²}.

Solution:

$\text{Prob}\{x^2 + y^2 \le z^2\} = \iint_D f(x, y)\, dx\, dy, \qquad D = \{(x, y): x^2 + y^2 \le z^2\}$

With x = r cos θ, y = r sin θ:

$\text{Prob}\{x^2 + y^2 \le z^2\} = \frac{1}{2\pi\sigma^2} \int_0^z \int_0^{2\pi} e^{-r^2/2\sigma^2}\, r\, dr\, d\theta = 1 - e^{-z^2/2\sigma^2} \qquad \text{(Rayleigh)}$
Example

$f(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{(x - \eta_x)^2 + (y - \eta_y)^2}{2\sigma^2}\right)$

Determine Prob{x² + y² ≤ z²}.

Solution:

$\text{Prob}\{x^2 + y^2 \le z^2\} = \iint_D f(x, y)\, dx\, dy, \qquad D = \{(x, y): x^2 + y^2 \le z^2\}$

With x = v cos θ, y = v sin θ, and $\eta = \sqrt{\eta_x^2 + \eta_y^2}$, ηx = η cos φ, ηy = η sin φ:

$F_z(z) = \frac{1}{2\pi\sigma^2} \int_0^z \int_0^{2\pi} \exp\left(\frac{v^2 - 2v\eta_x \cos\theta - 2v\eta_y \sin\theta + \eta^2}{-2\sigma^2}\right) v\, dv\, d\theta$
Differentiating with respect to z:

$f_z(z) = \frac{z}{2\pi\sigma^2} \exp\left(-\frac{z^2 + \eta^2}{2\sigma^2}\right) \int_0^{2\pi} \exp\left(\frac{z\eta \cos(\theta - \varphi)}{\sigma^2}\right) d\theta$

$f_z(z) = \frac{z}{\sigma^2} \exp\left(-\frac{z^2 + \eta^2}{2\sigma^2}\right) \int_0^{2\pi} \exp\left(\frac{z\eta \cos\omega}{\sigma^2}\right) \frac{d\omega}{2\pi}$

$f_z(z) = \frac{z}{\sigma^2} \exp\left(-\frac{z^2 + \eta^2}{2\sigma^2}\right) I_0\!\left(\frac{z\eta}{\sigma^2}\right) \qquad \text{(Rician)}$

where I0 is the modified Bessel function of the first kind, order zero. As η → 0, the Rician RV approaches a Rayleigh RV.
Two functions of two RVs: z = g(x, y), w = h(x, y).

Joint CDF:

$F_{zw}(z, w) = \text{Prob}\{(x, y) \in D_{zw}\} = \iint_{D_{zw}} f_{xy}(x, y)\, dx\, dy$

Example:

$z = \sqrt{x^2 + y^2}, \qquad w = \frac{y}{x}$

Joint PDF: with (xn, yn) the roots of z = g(x, y), w = h(x, y),

$f_{zw}(z, w) = \sum_n \frac{f_{xy}(x_n, y_n)}{|J(x_n, y_n)|}$

where J is the Jacobian of the transformation.
Let X ∼ U(0, 1) and Y ∼ U(0, 1) be independent. Define Z = X + Y, W = X − Y. Show that Z and W are dependent, but uncorrelated, RVs.

Solution:

$x = \frac{z + w}{2}, \qquad y = \frac{z - w}{2}$

The image of the unit square is 0 < z < 2, −1 < w < 1, z + w ≤ 2, z − w ≤ 2, z > |w|, with |J(z, w)| = 1/2, so

$f_{ZW}(z, w) = \begin{cases} 1/2, & 0 < z < 2,\ -1 < w < 1,\ z + w \le 2,\ z - w \le 2,\ |w| < z \\ 0, & \text{otherwise} \end{cases}$
[Figure: the support of fZW is the square with vertices (0, 0), (1, 1), (2, 0), (1, −1) in the (z, w) plane.]

$f_Z(z) = \int f_{ZW}(z, w)\, dw = \begin{cases} \int_{-z}^{z} \frac{1}{2}\, dw = z, & 0 < z < 1 \\ \int_{z-2}^{2-z} \frac{1}{2}\, dw = 2 - z, & 1 < z < 2 \end{cases}$
Or, by convolution:

$f_Z(z) = f_X(z) \otimes f_Y(z) = \begin{cases} z, & 0 < z < 1 \\ 2 - z, & 1 < z < 2 \\ 0, & \text{otherwise} \end{cases}$

Clearly fZW(z, w) ≠ fZ(z) fW(w), so Z and W are not independent. However,

$E(ZW) = E[(X + Y)(X - Y)] = E(X^2) - E(Y^2) = 0, \qquad E(W) = E(X - Y) = 0$

$\text{Cov}(Z, W) = E(ZW) - E(Z)\,E(W) = 0$
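A quick Monte Carlo sketch of this example (sample sizes and thresholds are assumed values): the covariance is near zero, yet conditioning on Z visibly constrains W.

```python
import random

N = 200_000
xs = [random.random() for _ in range(N)]
ys = [random.random() for _ in range(N)]
zs = [x + y for x, y in zip(xs, ys)]
ws = [x - y for x, y in zip(xs, ys)]

mz, mw = sum(zs) / N, sum(ws) / N
cov = sum((z - mz) * (w - mw) for z, w in zip(zs, ws)) / N
print(cov)                               # near 0: uncorrelated

# dependence: given z > 1.9, necessarily |w| < 0.1, while W alone spans (-1, 1)
big_z_w = [abs(w) for z, w in zip(zs, ws) if z > 1.9]
print(max(big_z_w))                      # < 0.1
```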
For z = g(x, y):

Mean: $E\{z\} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, f(x, y)\, dx\, dy$

Covariance: $C = E\{(x - \eta_x)(y - \eta_y)\}$

Correlation coefficient: $r = C / (\sigma_x \sigma_y)$, with $|r| \le 1$

Uncorrelatedness: C = 0, equivalently r = 0

Orthogonality: $E\{xy\} = 0 \Leftrightarrow x \perp y$

Moments: $E\{x^k y^r\} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^k y^r\, f(x, y)\, dx\, dy$

Joint MGF: $\Phi(s, u) = E\{e^{-(sx + uy)}\}$, $s, u \in \mathbb{C}$

Marginal MGFs: $\Phi_x(s) = \Phi(s, 0)$, $\Phi_y(u) = \Phi(0, u)$
Joint characteristic functions (i.e., the joint MGF) are useful in determining the PDF of linear combinations of RVs.

Example:
With X and Y independent Poisson RVs with parameters λ1 and λ2 respectively, and Z = X + Y, find the distribution of Z.

Solution:

$Z = X + Y \ \Rightarrow\ \Phi_Z(\omega) = \Phi_X(\omega)\, \Phi_Y(\omega)$

$\Phi_X(\omega) = e^{\lambda_1 (e^{-j\omega} - 1)}, \quad \Phi_Y(\omega) = e^{\lambda_2 (e^{-j\omega} - 1)} \ \Rightarrow\ \Phi_Z(\omega) = e^{(\lambda_1 + \lambda_2)(e^{-j\omega} - 1)} \ \Rightarrow\ Z \sim \text{Poiss}(\lambda_1 + \lambda_2)$
$F_y(y \mid x_1 \le x \le x_2) = \frac{\text{Prob}\{x_1 \le x \le x_2,\ y \le y\}}{\text{Prob}\{x_1 \le x \le x_2\}} = \frac{F(x_2, y) - F(x_1, y)}{F(x_2) - F(x_1)}$

where $F(x_2, y) - F(x_1, y) = \int_{-\infty}^{y} \int_{x_1}^{x_2} f(x, \beta)\, dx\, d\beta$. Differentiating both sides with respect to y, we have:

$f_y(y \mid x_1 \le x \le x_2) = \frac{\int_{x_1}^{x_2} f(x, y)\, dx}{F(x_2) - F(x_1)}$

As x1 → x2 we have:

$f(y \mid x = x) = \frac{f(x, y)}{f_x(x)} = f(y|x)$
Similarly, we have

$f(x, y) = f(x|y)\, f_y(y) = f(y|x)\, f_x(x)$

If x and y are independent we have:

$f(x|y) = f_x(x), \qquad f(y|x) = f_y(y)$

Bayes' theorem for PDFs:

$f(x|y) = \frac{f(x, y)}{f_y(y)} = \frac{f(y|x)\, f_x(x)}{\int_{-\infty}^{\infty} f(y|x)\, f_x(x)\, dx}$
Example: determine f(x|y) and f(y|x) for

$f_{XY}(x, y) = \begin{cases} k, & 0 < x < y < 1 \\ 0, & \text{otherwise} \end{cases}$

$\iint f_{XY}(x, y)\, dx\, dy = \int_0^1 \int_0^y k\, dx\, dy = \int_0^1 k y\, dy = \frac{k}{2} = 1 \ \Rightarrow\ k = 2$

$f_X(x) = \int f_{XY}(x, y)\, dy = \int_x^1 k\, dy = k(1 - x), \quad 0 < x < 1$

$f_Y(y) = \int f_{XY}(x, y)\, dx = \int_0^y k\, dx = k y, \quad 0 < y < 1$
$f_{X|Y}(x|y) = \frac{f_{XY}(x, y)}{f_Y(y)} = \frac{1}{y}, \quad 0 < x < y < 1$

$f_{Y|X}(y|x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{1}{1 - x}, \quad 0 < x < y < 1$

Both conditional densities are uniform.
Example: Poisson sum of Bernoulli random variables. Let Xi, i = 1, 2, 3, ..., represent independent, identically distributed Bernoulli random variables with

$P(X_i = 1) = p, \qquad P(X_i = 0) = 1 - p = q$

and N a Poisson random variable with parameter λ that is independent of all the Xi. Consider the random variables

$Y = \sum_{i=1}^{N} X_i, \qquad Z = N - Y$

Show that Y and Z are independent Poisson random variables.
Solution: To determine the joint probability mass function of Y and Z, consider

$P(Y = m, Z = n) = P(Y = m, N - Y = n) = P(Y = m, N = m + n)$
$= P(Y = m \mid N = m + n)\, P(N = m + n) = P\left(\sum_{i=1}^{m+n} X_i = m\right) P(N = m + n)$

(Note that $\sum_{i=1}^{m+n} X_i \sim B(m+n, p)$ and the Xi are independent of N.) Hence

$P(Y = m, Z = n) = \binom{m+n}{m}\, p^m q^n \cdot e^{-\lambda}\, \frac{\lambda^{m+n}}{(m+n)!} = \left(e^{-\lambda p}\, \frac{(\lambda p)^m}{m!}\right) \left(e^{-\lambda q}\, \frac{(\lambda q)^n}{n!}\right)$

so Y ∼ Poiss(λp) and Z ∼ Poiss(λq), and they are independent.
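A simulation sketch of this "Poisson thinning" result (lam, p, the trial count, and the sampler are assumptions of ours, not part of the notes):

```python
import math, random

lam, p, trials = 4.0, 0.3, 50_000

def poisson_sample(a):
    """Inverse-CDF Poisson sampler, adequate for moderate a."""
    u, k, prob, cum = random.random(), 0, math.exp(-a), math.exp(-a)
    while u > cum:
        k += 1
        prob *= a / k
        cum += prob
    return k

ys, zs = [], []
for _ in range(trials):
    n = poisson_sample(lam)
    y = sum(1 for _ in range(n) if random.random() < p)
    ys.append(y); zs.append(n - y)

my, mz = sum(ys) / trials, sum(zs) / trials
print(my, lam * p)            # E{Y} = lambda p
print(mz, lam * (1 - p))      # E{Z} = lambda q
cov = sum(y * z for y, z in zip(ys, zs)) / trials - my * mz
print(cov)                    # near 0, consistent with independence
```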
Conditional expectation:

φ(x) = E{y|x} = a function of the value x (3)
φ(x) = E{y|x} = a function of the RV x (4)

For (4), we have:

$E\{\varphi(x)\} = \int_{-\infty}^{\infty} \varphi(x)\, f(x)\, dx = \int_{-\infty}^{\infty} f(x) \underbrace{\int_{-\infty}^{\infty} y\, f(y|x)\, dy}_{\varphi(x)}\, dx = E\{\underbrace{E\{y|x\}}_{\varphi(x)}\}$

Therefore, we have:

$E_x\{E\{y|x\}\} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y\, f(x, y)\, dy\, dx = E\{y\}$
This result can be generalized. E{g(x, y)|x} is a function of x, and

$E\{E\{g(x, y)|x\}\} = \int_{-\infty}^{\infty} f(x) \int_{-\infty}^{\infty} g(x, y)\, f(y|x)\, dy\, dx \quad (5)$

The inner integral is obtained via

$E\{g(x, y)|M\} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, f(x, y|M)\, dy\, dx$

and (5) equals

$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, f(x, y)\, dy\, dx = E\{g(x, y)\} \quad (6)$
Because of (6), we have

$E\{E\{g(x, y)|x\}\} = E\{g(x, y)\}$

Note that the following is an extension of the above:

$E\{g_1(x)\, g_2(y)\} = E\{E\{g_1(x)\, g_2(y)|x\}\} = E_x\{g_1(x)\, E_y\{g_2(y)|x\}\}$
Mean square estimation

Is it possible to estimate an RV? The answer is yes, but some estimation tools are required. One of the important estimation tools is the mean-square-error (MSE) principle.

If an RV y is to be estimated by a constant c based on the MSE principle, we have the following:

$e = E\{(y - c)^2\} = \int_{-\infty}^{\infty} (y - c)^2\, f(y)\, dy$

We then minimize e with respect to the unknown c:

$\frac{\partial e}{\partial c} = -2 \int_{-\infty}^{\infty} (y - c)\, f(y)\, dy = 0 \ \Rightarrow\ c = \int_{-\infty}^{\infty} y\, f(y)\, dy \qquad \text{(the mean value)}$
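A tiny numeric sketch of this fact (the data distribution is an assumed example): shifting c away from the sample mean in either direction increases the empirical MSE.

```python
import random

ys = [random.gauss(3.0, 2.0) for _ in range(10_000)]
mean = sum(ys) / len(ys)

def mse(c):
    return sum((y - c) ** 2 for y in ys) / len(ys)

for c in (mean - 0.5, mean, mean + 0.5):
    print(round(c, 3), round(mse(c), 3))   # the middle row is the smallest
```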
If an RV y is to be estimated by a function of another RV x based on the MSE principle, we have the following:

$e = E\{(y - c(x))^2\} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (y - c(x))^2\, f(x, y)\, dy\, dx = \int_{-\infty}^{\infty} f(x) \underbrace{\int_{-\infty}^{\infty} (y - c(x))^2\, f(y|x)\, dy}_{\ge\, 0}\, dx$

The integral is minimum if the inner integral is minimum for every x. This can only occur if

$\min_{c(x)} J = \int_{-\infty}^{\infty} (y - c(x))^2\, f(y|x)\, dy, \quad \forall x$

By the previous result applied to the density f(y|x), the minimizing c(x) is the conditional mean, c(x) = E{y|x}.
Covariance and correlation matrices:

$C_{ij} = E\{(x_i - \eta_i)(x_j - \eta_j)\}, \qquad R_{ij} = E\{x_i x_j\}, \qquad C = R - \eta^T \eta$

The correlation matrix is positive semidefinite: the eigenvalues of R are nonnegative.

Characteristic function of a random vector:

$\Phi(\Omega) = E\{e^{-j\Omega x^T}\}, \qquad x = [x_1, \cdots, x_n], \quad \Omega = [\omega_1, \cdots, \omega_n]$
Central limit theorem: Suppose x1, x2, ..., xn are a set of zero-mean independent, identically distributed (IID) random variables with some common distribution. Consider their scaled sum

$x = \frac{x_1 + x_2 + \cdots + x_n}{\sqrt{n}}$

Then asymptotically, as n → ∞, x ∼ N(0, σ²).
Proof: Although the theorem is true under even more general conditions, we shall prove it here under the independence assumption. Let σ² represent the common variance. Since

$E\{x_i\} = 0 \ \Rightarrow\ E\{x_i^2\} = \sigma^2$

we have

$\Phi_x(u) = E\{e^{-jux}\} = \prod_{i=1}^{n} E\{e^{-ju x_i / \sqrt{n}}\} = \left[\Phi_{x_i}(u/\sqrt{n})\right]^n$

$E(e^{-j x_i u / \sqrt{n}}) = E\left\{1 - \frac{j x_i u}{\sqrt{n}} + \frac{j^2 x_i^2 u^2}{2!\, n} - \frac{j^3 x_i^3 u^3}{3!\, n^{3/2}} + \cdots\right\}$
$= 1 - \frac{\sigma^2 u^2}{2n} + o\!\left(\frac{1}{n^{3/2}}\right)$

$\Phi_x(u) = \left(1 - \frac{\sigma^2 u^2}{2n} + o\!\left(\frac{1}{n^{3/2}}\right)\right)^n, \qquad \lim_{n\to\infty} \left(1 - \frac{z}{n}\right)^n = e^{-z}$

$\lim_{n\to\infty} \Phi_x(u) = e^{-\sigma^2 u^2 / 2}$

The central limit theorem states that a large sum of independent random variables, each with finite variance, tends to behave like a normal random variable.
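A simulation sketch of the theorem (the uniform summands, n, and the test point are assumed choices): the scaled sum's empirical CDF matches the Gaussian prediction.

```python
import math, random

n, trials = 50, 20_000
sigma2 = 1.0 / 12.0                     # variance of U(-0.5, 0.5)

sums = [sum(random.uniform(-0.5, 0.5) for _ in range(n)) / math.sqrt(n)
        for _ in range(trials)]

emp = sum(1 for s in sums if s <= 0.2) / trials
G = lambda t: 0.5 * (1 + math.erf(t / math.sqrt(2)))
print(emp, G(0.2 / math.sqrt(sigma2)))  # both about 0.756
```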
Thus the individual PDFs become unimportant in analyzing the behavior of the collective sum. If we model a noise phenomenon as the sum of a large number of independent random variables (e.g., electron motion in resistive components), then this theorem allows us to conclude that noise behaves like a Gaussian RV.
Caution: It may be remarked that the finite-variance assumption is necessary for the theorem to hold. To see its importance, consider the RVs to be Cauchy distributed, with

$\Phi_{x_i}(u) = e^{-\alpha |u|}, \qquad x_i \sim C(\alpha)$

$\Phi_x(u) = \prod_{i=1}^{n} \Phi_{x_i}(u/\sqrt{n}) = \left(e^{-\alpha |u| / \sqrt{n}}\right)^n = e^{-\alpha \sqrt{n}\, |u|} \ \sim\ C(\alpha\sqrt{n})$
which shows that x is still Cauchy, now with parameter α√n. In other words, the central limit theorem does not hold for a set of Cauchy RVs, as their variances are undefined.
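A sketch illustrating the failure (the sampler, n, the trial count, and the threshold are assumed choices): scaled sums of Cauchy samples stay heavy-tailed instead of concentrating like a Gaussian.

```python
import math, random

def cauchy():
    """Standard Cauchy via inverse-CDF: tan(pi (u - 0.5))."""
    return math.tan(math.pi * (random.random() - 0.5))

n, trials = 500, 4000
sums = [sum(cauchy() for _ in range(n)) / math.sqrt(n) for _ in range(trials)]

# For a Gaussian limit, |x| > 10 would be vanishingly rare; here it is common,
# consistent with x ~ C(sqrt(n)).
print(sum(1 for s in sums if abs(s) > 10.0) / trials)   # roughly 0.7
```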