"Random variables", "stochastic Processes" graduate course.
Lecture notes of Prof. H.Amindavar.
Professor of Electrical engineering at Amirkabir university of technology.
Bernoulli trials

If a set has n elements, then the total number of its subsets consisting of k elements each equals

$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$

Example:
We place at random n points in the interval [0, T]. What is the probability that k of these points are in the interval (t1, t2)?

Solution:
A = {a point is in the interval (t1, t2)} ⇒ $P(A) = \frac{t_2 - t_1}{T} = p$.

A occurring k times means that k of the n points lie in the interval (t1, t2) and the rest, the n − k remaining points, fall outside this interval with probability q = 1 − p:

$\text{Prob}\{A \text{ occurs } k \text{ times}\} = \binom{n}{k}\, p^k\, q^{n-k}$
Example:
An order of 10^4 parts is received. The probability of the event A that a part is defective equals 0.1. Find the probability that in 10^4 trials, A will occur at most 1000 times.

Solution:
p = 0.1, n = 10^4,

$\text{Prob}\{0 \le k \le 1000\} = \sum_{k=0}^{1000} \binom{10^4}{k} (0.1)^k (0.9)^{10^4 - k}$
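The sum above can be evaluated numerically. A minimal Python sketch (the helper name and the log-space evaluation are our own, not part of the notes); lgamma keeps the binomial terms from overflowing:

```python
import math

def log_binom_pmf(n, k, p):
    """log of C(n,k) p^k (1-p)^(n-k), via lgamma for numerical stability."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

n, p = 10_000, 0.1
prob = sum(math.exp(log_binom_pmf(n, k, p)) for k in range(1001))
print(prob)   # about 0.5: k <= 1000 covers half the mass around np = 1000
```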
The normal density and distribution:

$g(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad G(x) = \int_{-\infty}^{x} g(y)\, dy$

$G(\infty) = 1, \quad G(0) = 0.5, \quad G(-\infty) = 0, \quad G(-x) = 1 - G(x)$

$\frac{1}{\sigma\sqrt{2\pi}} \int_{x_1}^{x_2} \exp\left(\frac{-(x-\mu)^2}{2\sigma^2}\right) dx = G\!\left(\frac{x_2-\mu}{\sigma}\right) - G\!\left(\frac{x_1-\mu}{\sigma}\right)$

$\mathrm{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_{0}^{x} e^{-y^2/2}\, dy = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0} e^{-y^2/2}\, dy = G(x) - 0.5$

$\mathrm{erfc}(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-y^2/2}\, dy = 1 - G(x) = 0.5 - \mathrm{erf}(x)$

(Note that these definitions are normalized to the standard normal density; they differ from the classical error function $\mathrm{erf}_{\mathrm{std}}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\, dt$, which appears below in the relation for Q(x).)
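These functions are available in the standard library through the classical erf. A small sketch expressing G and the tail probability Q in terms of it:

```python
import math

def G(x):
    """Standard normal CDF: G(x) = 0.5 * (1 + erf_std(x / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Q(x):
    """Gaussian tail Q(x) = 1 - G(x) = 0.5 * erfc_std(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

print(G(0.0), G(1.0), Q(1.0))   # 0.5, ~0.8413, ~0.1587
print(G(-1.0), 1 - G(1.0))      # symmetry: G(-x) = 1 - G(x)
```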
An asymptotic expansion of the Gaussian tail. Write

$\int e^{-x^2/2}\, dx = \int \left(-\frac{1}{x}\right) \left(e^{-x^2/2}\right)'\, dx$

and use integration by parts, $\int u\, dv = uv - \int v\, du$, repeatedly:

$Q(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-y^2/2}\, dy = \frac{e^{-x^2/2}}{x\sqrt{2\pi}} \left(1 - \frac{1}{x^2} + \frac{1\cdot 3}{x^4} - \frac{1\cdot 3\cdot 5}{x^6} + \cdots\right)$

$Q(x) = 0.5\, \mathrm{erfc}_{\mathrm{std}}(x/\sqrt{2})$
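A quick sketch comparing the truncated expansion with the exact tail (the function names and the choice of four terms are ours); the agreement improves as x grows, as expected of an asymptotic series:

```python
import math

def Q(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def Q_asymptotic(x, terms=4):
    """Partial sums of Q(x) ~ exp(-x^2/2)/(x sqrt(2 pi)) (1 - 1/x^2 + 3/x^4 - ...)."""
    s, coef = 0.0, 1.0
    for n in range(terms):
        s += coef / x ** (2 * n)
        coef *= -(2 * n + 1)          # builds (-1)^n * 1*3*...*(2n-1)
    return math.exp(-x * x / 2) / (x * math.sqrt(2 * math.pi)) * s

for x in (2.0, 4.0, 6.0):
    print(x, Q(x), Q_asymptotic(x))
```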
Watson's lemma

$I(x) = \int_{0}^{b} f(t)\, e^{-xt}\, dt, \qquad b > 0$

Watson's lemma gives the full asymptotic expansion of I(x), provided f(t) is continuous on the interval 0 ≤ t ≤ b and f(t) has the asymptotic series expansion

$f(t) \sim t^{\alpha} \sum_{n=0}^{\infty} a_n\, t^{\beta n}, \qquad t \to 0^{+}$

(with α > −1 and β > 0, so that the integral converges) ⇒

$I(x) \sim \sum_{n=0}^{\infty} a_n\, \frac{\Gamma(\alpha + \beta n + 1)}{x^{\alpha + \beta n + 1}}, \qquad x \to \infty$
Another route to the tail expansion: substitute u = y − x,

$Q(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-y^2/2}\, dy = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \int_{0}^{\infty} e^{-u^2/2}\, e^{-xu}\, du$

$= \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \sum_{n=0}^{\infty} \frac{(-1)^n}{n!\, 2^n} \int_{0}^{\infty} u^{2n}\, e^{-ux}\, du$

Related topics: Laplace's and Fourier's methods, steepest descent and saddle-point approximation.
DeMoivre-Laplace theorem

If npq ≫ 1 and $np - \sqrt{npq} \le k \le np + \sqrt{npq}$, then

$\binom{n}{k}\, p^k\, q^{n-k} \approx \frac{1}{\sqrt{2\pi npq}} \exp\left(\frac{-(k - np)^2}{2npq}\right)$

Example:
A fair coin is tossed 1000 times. Find the probability that heads show 510 times.

Solution:
n = 1000, p = 0.5, np = 500, $\sqrt{npq} = 5\sqrt{10} = 15.81$

$\binom{1000}{510} (0.5)^{510} (0.5)^{490} \approx \frac{e^{-100/500}}{\sqrt{2\pi npq}} = 0.0207$
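A one-off numerical check of this example (exact binomial term versus the Gaussian approximation; variable names are ours):

```python
import math

n, p, k = 1000, 0.5, 510
exact = math.comb(n, k) * p**k * (1 - p)**(n - k)
npq = n * p * (1 - p)                       # here npq = 250
approx = math.exp(-(k - n*p)**2 / (2*npq)) / math.sqrt(2*math.pi*npq)
print(exact, approx)                        # both about 0.0207
```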
Example:
A fair coin is tossed 10^4 times. Find the probability that the number of heads is between 4995 and 5005.

Solution:
n = 10^4, np = 5000, npq = 2500, $\sqrt{npq} = 50$

$\text{Prob}\{4995 \le k \le 5005\} = \sum_{k=4995}^{5005} \binom{10^4}{k} (0.5)^k (0.5)^{10^4 - k}$

$\sum_{k=k_1}^{k_2} \frac{1}{\sqrt{npq}}\, g\!\left(\frac{k - np}{\sqrt{npq}}\right) \approx \int_{k_1}^{k_2} \frac{1}{\sqrt{npq}}\, g\!\left(\frac{x - np}{\sqrt{npq}}\right) dx = G\!\left(\frac{k_2 - np}{\sqrt{npq}}\right) - G\!\left(\frac{k_1 - np}{\sqrt{npq}}\right)$

$= G(0.1) - G(-0.1) = 2G(0.1) - 1 \approx 0.08$
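A sketch comparing the exact binomial sum with the Gaussian integral (the continuity-corrected variant in the last line is our addition):

```python
import math

n, k1, k2 = 10_000, 4995, 5005
exact = sum(math.comb(n, k) for k in range(k1, k2 + 1)) / 2**n
mu, s = n * 0.5, math.sqrt(n * 0.25)        # np = 5000, sqrt(npq) = 50
G = lambda t: 0.5 * (1 + math.erf(t / math.sqrt(2)))
print(exact)                                # about 0.088
print(G((k2 - mu) / s) - G((k1 - mu) / s))  # about 0.080
print(G((k2 + 0.5 - mu) / s) - G((k1 - 0.5 - mu) / s))  # corrected: about 0.088
```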
Law of large numbers

An event A with Prob{A} = p occurs k times in n trials ⇒ k ≃ np; this is a heuristic statement. Let A denote an event whose probability of occurrence in a single trial is p. If k denotes the number of occurrences of A in n independent trials, then

$\lim_{n\to\infty} \text{Prob}\{k = np\} \approx \frac{1}{\sqrt{2\pi npq}} \to 0 \qquad \text{Never occurs!}$

The approximation k ≃ np means that the ratio k/n is close to p in the sense that, for any ε > 0,

$\lim_{n\to\infty} \text{Prob}\left\{\left|\frac{k}{n} - p\right| \le \epsilon\right\} = 1, \quad \text{since} \quad \text{Prob}\{n(p-\epsilon) \le k \le n(p+\epsilon)\} \approx 2G\!\left(\epsilon\sqrt{\frac{n}{pq}}\right) - 1 \to 1$

Example: p = 0.5, ε = 0.05

$n(p - \epsilon) = 0.45n, \quad n(p + \epsilon) = 0.55n, \quad \epsilon\sqrt{\frac{n}{pq}} = 0.1\sqrt{n}$

Solution:

n               | 100   | 900
0.1√n           | 1     | 3
2G(0.1√n) − 1   | 0.682 | 0.997

The last row indicates that after 900 independent trials we may have some confidence in accepting k/n ≈ p.
Generalized Bernoulli trials

U = {A1 occurs k1 times, A2 occurs k2 times, ..., Ar occurs kr times}

The number of orderings in which U can occur is

$\frac{n!}{k_1!\, k_2! \cdots k_r!}, \qquad n = \sum_{i=1}^{r} k_i$

Since the trials are independent, the probability of each such sequence of outcomes is $p_1^{k_1} p_2^{k_2} \cdots p_r^{k_r}$ ⇒

$\text{Prob}\{U\} = \frac{n!}{k_1!\, k_2! \cdots k_r!}\, p_1^{k_1} p_2^{k_2} \cdots p_r^{k_r}$
Example:
A fair die is rolled 10 times. Determine the probability that a one shows 3 times and an even number shows 6 times.

Solution:
A1 = {1}, A2 = {2, 4, 6}, A3 = {3, 5} ⇒

$p_1 = \frac{1}{6}, \quad p_2 = \frac{3}{6}, \quad p_3 = \frac{2}{6}, \qquad k_1 = 3, \quad k_2 = 6, \quad k_3 = 1$

$\text{Prob}\{U\} = \frac{10!}{3!\, 6!\, 1!} \left(\frac{1}{6}\right)^3 \left(\frac{1}{2}\right)^6 \left(\frac{1}{3}\right)^1 = 0.0203$
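A small sketch of the multinomial probability (the function name is ours); the integer divisions are exact because multinomial coefficients are integers:

```python
import math

def multinomial_pmf(ks, ps):
    """P{A_i occurs k_i times, i = 1..r} for generalized Bernoulli trials."""
    coef = math.factorial(sum(ks))
    for k in ks:
        coef //= math.factorial(k)
    prob = float(coef)
    for k, p in zip(ks, ps):
        prob *= p ** k
    return prob

print(multinomial_pmf([3, 6, 1], [1/6, 3/6, 2/6]))   # ~0.0203
```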
Poisson theorem

$\text{Prob}\{\text{an event } A \text{ occurs } k \text{ times in } n \text{ trials}\} = \binom{n}{k}\, p^k\, q^{n-k}$

If p ≪ 1 and n → ∞, then np ≈ npq; when npq ≫ 1 the Gaussian approximation applies. However, if np is of order 1, the Gaussian approximation is no longer valid. We use the following instead:

$\binom{n}{k}\, p^k\, q^{n-k} \approx e^{-np}\, \frac{(np)^k}{k!} \qquad \text{(Poisson theorem)}$

If k is of order np, then k ≪ n and kp ≪ 1, so

$n(n-1)(n-2) \cdots (n-k+1) \approx n \cdot n \cdot n \cdots n = n^k$

and $q = 1 - p \approx e^{-p}$, hence $q^{n-k} \approx e^{-(n-k)p} \approx e^{-np}$. Therefore

$\binom{n}{k}\, p^k\, q^{n-k} \approx e^{-np}\, \frac{(np)^k}{k!}$
The same limit via Stirling's formula, $n! \approx \sqrt{2\pi n}\, n^{n}\, e^{-n}$. As n → ∞, p → 0 with np = λ:

$\binom{n}{k}\, p^k\, q^{n-k} = \frac{n!}{(n-k)!\, k!}\, \frac{\lambda^k}{n^k} \left(1 - \frac{\lambda}{n}\right)^{n-k}$

where

$\frac{n!}{(n-k)!} \approx \frac{\sqrt{2\pi n}\, n^{n}\, e^{-n}}{\sqrt{2\pi (n-k)}\, (n-k)^{n-k}\, e^{-n+k}} \to n^k, \qquad \left(1 - \frac{\lambda}{n}\right)^{n-k} \to e^{-\lambda}$

so that

$\binom{n}{k}\, p^k\, q^{n-k} \to \frac{\lambda^k}{k!}\, e^{-\lambda}$
$n \to \infty,\ p \to 0,\ np \to a \ \Rightarrow\ \binom{n}{k}\, p^k\, q^{n-k} \to e^{-a}\, \frac{a^k}{k!}$

Example:
A system contains 1000 components. Each component fails independently of the others, and the probability of its failure in one month equals 10^{-3}. Find the probability that the system will function (no component fails) at the end of one month.

Solution: This can be considered as a problem in repeated trials with p = 10^{-3}, q = 0.999, n = 1000, k = 0.

$\text{Prob}\{k = 0\} = \binom{1000}{0}\, p^0\, q^{1000} = 0.999^{1000} = 0.3677 \qquad \text{(exact)}$

$\text{Prob}\{k = 0\} \approx e^{-np}\, \frac{(np)^0}{0!} = e^{-1} = 0.368$

Applying the same idea as before:

$\text{Prob}\{k_1 \le k \le k_2\} = \sum_{k=k_1}^{k_2} \binom{n}{k}\, p^k\, q^{n-k} \approx e^{-np} \sum_{k=k_1}^{k_2} \frac{(np)^k}{k!}$
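A sketch checking the Poisson approximation against the exact binomial for this example (variable names are ours):

```python
import math

n, p = 1000, 1e-3
a = n * p                                   # a = np = 1
print((1 - p) ** n, math.exp(-a))           # P{k = 0}: 0.3677 vs 0.3679

# partial sums compare just as well, e.g. P{0 <= k <= 2}:
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3))
poisson = math.exp(-a) * sum(a**k / math.factorial(k) for k in range(3))
print(exact, poisson)
```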
Generalization of the Poisson theorem

Let us assume A1, A2, ..., A_{m+1} are the m + 1 events of a partition with Prob{Ai} = pi and $p_{m+1} = 1 - \sum_{i=1}^{m} p_i$. We can show that

$\frac{n!}{k_1! \cdots k_{m+1}!}\, p_1^{k_1} \cdots p_{m+1}^{k_{m+1}} \approx \left(e^{-a_1}\, \frac{a_1^{k_1}}{k_1!}\right) \cdots \left(e^{-a_m}\, \frac{a_m^{k_m}}{k_m!}\right)$

where $a_i = n p_i$. The reason for having m terms on the right-hand side but m + 1 terms on the left-hand side is that $p_{m+1} = 1 - \sum_{i=1}^{m} p_i$ is close to 1, so the (m+1)st factor is absorbed in the limit.
Random Poisson points

n random points are placed in the interval (−T/2, T/2).

$\text{Prob}\{k \text{ points in an interval of length } t_a = t_2 - t_1\} = \binom{n}{k}\, p^k\, q^{n-k}, \qquad p = \frac{t_a}{T}$

If n, T → ∞:

$\text{Prob}\{k \text{ points in } t_a\} \approx e^{-n t_a / T}\, \frac{(n t_a / T)^k}{k!}$

If λ = n/T, the rate at which the points occur, is held constant, the resulting process is an infinite set of points covering the entire t axis from −∞ to ∞, and

$\text{Prob}\{k \text{ points in } t_a\} = e^{-\lambda t_a}\, \frac{(\lambda t_a)^k}{k!}$
Points in non-overlapping intervals

Consider the interval (−T/2, T/2) containing n points, and two non-overlapping intervals of lengths ta and tb:

$\text{Prob}\{k_a \text{ points in } t_a,\ k_b \text{ points in } t_b\} = \frac{n!}{k_a!\, k_b!\, k_c!} \left(\frac{t_a}{T}\right)^{k_a} \left(\frac{t_b}{T}\right)^{k_b} \left(1 - \frac{t_a}{T} - \frac{t_b}{T}\right)^{k_c}$

where $k_c = n - k_a - k_b$. Suppose now that λ = n/T is held constant as n, T → ∞, so that $n t_a / T = \lambda t_a$ and $n t_b / T = \lambda t_b$. We can conclude that

$\text{Prob}\{k_a \text{ points in } t_a,\ k_b \text{ points in } t_b\} \approx e^{-\lambda t_a}\, \frac{(\lambda t_a)^{k_a}}{k_a!}\; e^{-\lambda t_b}\, \frac{(\lambda t_b)^{k_b}}{k_b!}$
Since

Prob{ka points in ta, kb points in tb} = Prob{ka points in ta} · Prob{kb points in tb},

the events {ka in ta} and {kb in tb} are independent. These outcomes are called random Poisson points.

Properties:
1. $\text{Prob}\{k_a \text{ points in } t_a\} = e^{-\lambda t_a}\, \frac{(\lambda t_a)^{k_a}}{k_a!}$
2. If two intervals (t1, t2) and (t3, t4) are non-overlapping, then the numbers of points in these intervals are independent.

Examples: telephone calls, cars crossing a bridge, shot noise, ...
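A minimal simulation sketch of the limit (the rate lambda_, the horizon T, and the trial counts are assumed values of our own): n uniform points on a long interval, counted in a short window, reproduce the Poisson law.

```python
import math, random

lambda_, T, trials = 2.0, 100.0, 5000
ta = 1.5                                   # length of the interval (0, ta)
n = int(lambda_ * T)                       # n points uniform in (-T/2, T/2)

counts = {}
for _ in range(trials):
    k = sum(1 for _ in range(n) if 0.0 <= random.uniform(-T/2, T/2) < ta)
    counts[k] = counts.get(k, 0) + 1

a = lambda_ * ta
for k in sorted(counts):
    empirical = counts[k] / trials
    poisson = math.exp(-a) * a**k / math.factorial(k)
    print(k, round(empirical, 4), round(poisson, 4))
```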
Bayes' theorem

Let us assume we have a pile of m coins. The probability of "heads" for the ith coin equals pi. We select one coin from this pile (each with probability 1/m) and toss it n times. We observe that heads show k times. On the basis of this observation, we find the probability Xr that we selected the rth coin:

$X_r = \text{Prob}\{r\text{th coin} \mid k \text{ heads}\} = \frac{\text{Prob}\{k \text{ heads} \mid r\text{th coin}\}\, \text{Prob}\{r\text{th coin}\}}{\text{Prob}\{k \text{ heads}\}} = \frac{p_r^k (1 - p_r)^{n-k}}{\sum_{i=1}^{m} p_i^k (1 - p_i)^{n-k}}$

$\text{Prob}\{k \text{ heads}\} = \sum_{i=1}^{m} \text{Prob}\{k \text{ heads} \mid i\text{th coin}\}\, \text{Prob}\{i\text{th coin}\}$

(The equal priors 1/m cancel in the ratio.)
Example:
Number of heads shown, k = 490; number of tosses, n = 1000; number of coins, m = 10; the specific coin, r = 5.

Xr = Prob{5th coin out of 10 coins | 490 heads showed up in 1000 tosses}

Solution:
p1 = p2 = ... = p10 = 0.5, Prob{ith coin} = 0.1

$X_r = \frac{p_5^{490} (1 - p_5)^{510} \cdot \frac{1}{10}}{\sum_{i=1}^{10} p_i^{490} (1 - p_i)^{510} \cdot \frac{1}{10}} = 0.1$

Since all the coins are identical, the observation is uninformative and the posterior equals the prior.
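A sketch of the posterior computation (function name and the log-space trick are ours); the second call uses a set of distinct coins, an assumed variation on the example:

```python
import math

def coin_posterior(ps, k, n):
    """Posterior P{coin r | k heads in n tosses}, uniform prior over the coins.
    Likelihoods are computed in log space to avoid underflow for large n."""
    logs = [k * math.log(p) + (n - k) * math.log(1 - p) for p in ps]
    top = max(logs)
    ws = [math.exp(L - top) for L in logs]
    s = sum(ws)
    return [w / s for w in ws]

# identical coins: posterior stays 1/m = 0.1 for every coin
print(coin_posterior([0.5] * 10, k=490, n=1000))
# distinct coins: k/n = 0.49 strongly favors the coin with p closest to 0.49
print(coin_posterior([0.1 * i for i in range(1, 10)], k=490, n=1000))
```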
Random variable

A random variable is a number assigned to every outcome of an experiment.

Prob{x ≤ x} of the event {x ≤ x} is a number that depends on x. This number is denoted by Fx(x) and is called the CDF of the RV x.

Properties of the CDF:
1. F(∞) = 1, F(−∞) = 0
2. x1 ≤ x2 ⇒ F(x1) ≤ F(x2)
3. Prob{x > x} = 1 − Fx(x)
4. F(x⁺) = F(x): F(x) is continuous from the right
5. Prob{x1 < x ≤ x2} = F(x2) − F(x1)
We say the statistics of an RV are known if we can determine Prob{x ∈ S} for any set S.
We say that an RV x is of continuous type if Fx(x) is continuous.
We say that an RV x is of discrete type if Fx(x) is a staircase function.
We say that an RV x is of mixed type if Fx(x) is a combination of a continuous and a staircase function.

$f_x(x) = \frac{d}{dx} F_x(x)$ is the PDF for a continuous random variable; for a discrete random variable,

$f(x) = \sum_i p_i\, \delta(x - x_i), \qquad p_i = \text{Prob}\{x = x_i\}$
Properties:
1. $f(x) \ge 0$, $F(x) = \int_{-\infty}^{x} f(\xi)\, d\xi$, $F(x_2) - F(x_1) = \int_{x_1}^{x_2} f(x)\, dx$
2. $\text{Prob}\{x \le x \le x + \Delta x\} \approx f(x)\, \Delta x$
3. $f(x) = \lim_{\Delta x \to 0} \frac{\text{Prob}\{x \le x \le x + \Delta x\}}{\Delta x}$

The mode, or most likely value of x, is where f(x) is maximum. An RV is unimodal if it has only a single mode.
Special RVs

Normal: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(\frac{-(x-\eta)^2}{2\sigma^2}\right)$

Uniform: $f(x) = \frac{1}{x_2 - x_1}$ for $x_1 \le x \le x_2$; 0 otherwise

Rayleigh: $f(x) = \frac{x}{\sigma^2}\, e^{-x^2/2\sigma^2}, \quad x > 0$

Lognormal: $f(x) = \frac{1}{\sigma x \sqrt{2\pi}}\, e^{-(\ln x - \eta)^2 / 2\sigma^2}, \quad x > 0$

Cauchy: $f(x) = \frac{1}{\pi (x^2 + 1)}$

Gamma: $f(x) = \frac{c^{b+1}}{\Gamma(b+1)}\, x^{b}\, e^{-cx}, \quad x > 0$, with $\Gamma(b+1) = b\, \Gamma(b)$; if b is an integer, it is called the Erlang density.
Laplace: $f(x) = 0.5\, e^{-|x|}$

Chi and chi-square: $\chi = \sqrt{\sum_{i=1}^{n} x_i^2}$, $y = \chi^2$,

$f(\chi) = 2a\, \chi^{n-1}\, e^{-\chi^2/2\sigma^2}, \qquad f(y) = a\, y^{n/2 - 1}\, e^{-y/2\sigma^2}, \qquad a = \frac{1}{\Gamma(n/2)\, (\sigma\sqrt{2})^{n}}$

Geometric: $\text{Prob}\{x = k\} = p\, q^k, \quad k = 0, 1, \ldots$

Binomial: $\text{Prob}\{x = k\} = \binom{n}{k}\, p^k (1-p)^{n-k}, \quad k = 0, 1, \ldots, n$

x is of lattice type and its density is a sum of impulses,

$f(x) = \sum_{k=0}^{n} \binom{n}{k}\, p^k (1-p)^{n-k}\, \delta(x - k)$
Negative binomial: $\text{Prob}\{x = k\} = \binom{n+k-1}{k}\, p^n (1-p)^k, \quad k = 0, 1, \ldots$

Poisson: $\text{Prob}\{x = k\} = e^{-a}\, \frac{a^k}{k!}, \quad k = 0, 1, \ldots$; the density function is

$f(x) = e^{-a} \sum_{k=0}^{\infty} \frac{a^k}{k!}\, \delta(x - k)$

Example 1:
Given a constant t0, we define an RV n whose value equals the number of Poisson points in the interval (0, t0). Find the probability that the number of points in this interval is k.

Solution:
$\text{Prob}\{n = k\} = e^{-\lambda t_0}\, \frac{(\lambda t_0)^k}{k!}$
Example 2:
If t1 is the first random point to the right of the fixed point t0, and we define the RV x as the distance from t0 to t1, determine the PDF and CDF of x.

Solution:
F(x) is the probability that there is at least one point between t0 and t0 + x, so 1 − F(x) is the probability that there are no points there:

$1 - F(x) = \text{Prob}\{n = 0\} = e^{-\lambda x} \ \Rightarrow\ F(x) = 1 - e^{-\lambda x}, \qquad f(x) = \lambda\, e^{-\lambda x}\, u(x)$
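A minimal simulation sketch of this waiting-time result (lambda_, T, the trial count, and x0 are assumed values); the minimum of n uniform points plays the role of the first point to the right of t0 = 0:

```python
import math, random

lambda_, T, trials, x0 = 1.5, 200.0, 4000, 0.8
n = int(lambda_ * T)                  # n points uniform in (0, T)

hits = 0
for _ in range(trials):
    first = min(random.uniform(0, T) for _ in range(n))
    if first <= x0:                   # distance from t0 = 0 to the first point
        hits += 1

print(hits / trials, 1 - math.exp(-lambda_ * x0))   # both about 0.70
```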
Conditional distribution

Bayes' rule: $\text{Prob}\{A|B\} = \frac{\text{Prob}\{AB\}}{\text{Prob}\{B\}}$, with Prob{B} ≠ 0, gives the conditional CDF:

$F(x|B) = \text{Prob}\{x \le x \mid B\} = \frac{\text{Prob}\{x \le x,\, B\}}{\text{Prob}\{B\}}$

where {x ≤ x, B} is the intersection of {x ≤ x} and B.

$F(\infty|B) = 1, \qquad F(-\infty|B) = 0$

$\text{Prob}\{x_1 < x \le x_2 \mid B\} = F(x_2|B) - F(x_1|B) = \frac{\text{Prob}\{x_1 < x \le x_2,\, B\}}{\text{Prob}(B)}$

Conditional PDF: $f(x|B) = \frac{d}{dx} F(x|B)$. To find F(x|B) in general we must know the underlying experiment. However, if B can be expressed in terms of x, then knowledge of F(x) is enough to determine F(x|B).
Important cases

B = {x ≤ a}: $F(x|B) = \frac{\text{Prob}\{x \le x,\, x \le a\}}{\text{Prob}\{x \le a\}}$

If x ≥ a ⇒ {x ≤ x, x ≤ a} = {x ≤ a} ⇒ F(x|B) = 1, x ≥ a.

If x < a ⇒ {x ≤ x, x ≤ a} = {x ≤ x} ⇒ $F(x \mid x \le a) = \frac{F(x)}{F(a)}$, x < a.

$f(x \mid x \le a) = \begin{cases} \frac{d}{dx}\, F(x \mid x \le a) = \frac{f(x)}{F(a)}, & x < a \\ 0, & x > a \end{cases}$

Example: Determine f(x | |x − η| ≤ kσ) for x ∼ N(η; σ).

Solution:
For η − kσ ≤ x ≤ η + kσ:

$f(x \mid |x - \eta| \le k\sigma) = \frac{f(x)}{\text{Prob}\{|x - \eta| \le k\sigma\}} = \frac{N(\eta; \sigma)}{G(k) - G(-k)}$

and f(x | |x − η| ≤ kσ) = 0 for |x − η| > kσ.
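A sketch of this truncated-normal conditional density (function names are ours):

```python
import math

def G(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def truncated_normal_pdf(x, eta, sigma, k):
    """f(x | |x - eta| <= k*sigma) for x ~ N(eta; sigma): the normal density
    renormalized by the mass G(k) - G(-k) inside the window, zero outside."""
    if abs(x - eta) > k * sigma:
        return 0.0
    g = math.exp(-((x - eta) / sigma) ** 2 / 2) / (sigma * math.sqrt(2 * math.pi))
    return g / (G(k) - G(-k))

print(truncated_normal_pdf(0.0, 0.0, 1.0, 2.0))   # slightly above g(0) = 0.3989
```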
Total probability

If {A1, A2, ..., An} are disjoint and partition the whole space:

$\text{Prob}\{x \le x\} = \sum_{i=1}^{n} \text{Prob}\{x \le x \mid A_i\}\, \text{Prob}(A_i)$

$F(x) = \sum_{i=1}^{n} F(x|A_i)\, \text{Prob}(A_i), \qquad f(x) = \sum_{i=1}^{n} f(x|A_i)\, \text{Prob}(A_i)$
Gaussian mixture

Binary case: with p = Prob{B},

$f(x|B) = N(\eta_1; \sigma_1),\quad f(x|\bar{B}) = N(\eta_2; \sigma_2) \ \Rightarrow\ f(x) = p\, N(\eta_1; \sigma_1) + (1-p)\, N(\eta_2; \sigma_2)$

f(x) can be a multimodal distribution. Generally, we can have

$f(x) = \sum_{i=1}^{n} p_i\, N(\eta_i; \sigma_i), \qquad \sum_{i=1}^{n} p_i = 1$
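A sketch of evaluating and sampling such a mixture (the bimodal parameter values are assumed, not from the notes):

```python
import math, random

def normal_pdf(x, eta, sigma):
    return math.exp(-((x - eta) / sigma) ** 2 / 2) / (sigma * math.sqrt(2 * math.pi))

def mixture_pdf(x, weights, etas, sigmas):
    """f(x) = sum_i p_i N(eta_i; sigma_i); the weights must sum to 1."""
    return sum(p * normal_pdf(x, e, s) for p, e, s in zip(weights, etas, sigmas))

def mixture_sample(weights, etas, sigmas):
    """Pick component i with probability p_i, then draw from N(eta_i, sigma_i)."""
    i = random.choices(range(len(weights)), weights=weights)[0]
    return random.gauss(etas[i], sigmas[i])

w, e, s = [0.4, 0.6], [-2.0, 3.0], [1.0, 0.5]   # a bimodal example
print(mixture_pdf(0.0, w, e, s), mixture_sample(w, e, s))
```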
Prob{A | x = x} cannot be defined directly (for a continuous RV, Prob{x = x} = 0), but it can be defined as a limit:

$\text{Prob}\{A \mid x_1 \le x \le x_2\} = \frac{\text{Prob}\{x_1 \le x \le x_2 \mid A\}\, \text{Prob}\{A\}}{\text{Prob}\{x_1 \le x \le x_2\}} = \frac{F(x_2|A) - F(x_1|A)}{F(x_2) - F(x_1)}\, \text{Prob}\{A\} \quad (1)$

Let x1 = x and x2 = x + ∆x, divide the numerator and denominator of (1) by ∆x, and let ∆x → 0; then we have

$\text{Prob}\{A \mid x = x\} = \frac{f(x|A)}{f(x)}\, \text{Prob}\{A\} \quad (2)$
From (2), we have

$f(x|A) = \frac{\text{Prob}\{A \mid x = x\}}{\text{Prob}\{A\}}\, f(x) = \frac{\text{Prob}\{A \mid x = x\}\, f(x)}{\int_{-\infty}^{\infty} \text{Prob}\{A \mid x = x\}\, f(x)\, dx}$

Example:
A = {k heads in n tosses, in a specific order}, where the probability of a head showing, p, is an RV with PDF f(p). What is f(p|A)?

Solution:
$\text{Prob}\{A \mid P = p\} = p^k (1-p)^{n-k}$, where P is an RV with density f(p).
From (2), we have

$f(p|A) = \frac{p^k (1-p)^{n-k}\, f(p)}{\int_{0}^{1} p^k (1-p)^{n-k}\, f(p)\, dp}$

f(p|A) is called the a posteriori density, and f(p) the a priori density, of the RV P.
For large n, $p^k (1-p)^{n-k}$ has a sharp maximum at p = k/n, so $f(p)\, p^k (1-p)^{n-k}$ is highly concentrated near p = k/n. If f(p) has a sharp peak at p = 0.5 (the coin is reasonably fair), then for moderate values of n, $f(p)\, p^k (1-p)^{n-k}$ has two peaks: one near p = k/n and the other near p = 0.5. As n increases, the sharpness of $p^k (1-p)^{n-k}$ prevails and the resulting a posteriori density f(p|A) has its maximum near k/n.
Suppose the probability of heads in a coin-tossing experiment is not a number, but an RV P with density f(p). In the experiment of tossing a randomly selected coin, show that $\text{Prob}\{\text{head}\} = \int_0^1 p\, f(p)\, dp$.

Solution:
A = {head} ⇒ the conditional probability of A is the probability of heads if the coin with P = p is tossed. In other words, Prob{A | P = p} = p, so

$\int_0^1 \text{Prob}\{A \mid P = p\}\, f(p)\, dp = \int_0^1 p\, f(p)\, dp = \text{Prob}\{A\}$

This is the probability that heads will show at the next toss.
Example:
If P is a uniform RV, determine the a posteriori density.

Solution:
A = {k heads in n tosses, in a specific order}

$f(p|A) = \frac{p^k (1-p)^{n-k}}{\int_0^1 p^k (1-p)^{n-k}\, dp} = \frac{(n+1)!}{k!\,(n-k)!}\, p^k (1-p)^{n-k} \qquad \text{(Beta density)}$
Example:
Assuming that the coin was tossed n times and heads showed k times, what is the probability that at the next toss heads will show?

Solution:
$\int_0^1 p\, f(p|A)\, dp = \frac{(n+1)!}{k!\,(n-k)!} \int_0^1 p \cdot p^k (1-p)^{n-k}\, dp = \frac{k+1}{n+2}$

which is close to the common-sense estimate k/n. This is called the law of succession.
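An exact-arithmetic sketch of this computation (function names are ours), using the identity $\int_0^1 p^a (1-p)^b\, dp = \frac{a!\, b!}{(a+b+1)!}$ for integer a, b:

```python
from fractions import Fraction
import math

def beta_integral(a, b):
    """Exact value of the integral of p^a (1-p)^b over [0, 1]."""
    return Fraction(math.factorial(a) * math.factorial(b), math.factorial(a + b + 1))

def prob_next_head(k, n):
    """Posterior mean of p under a uniform prior: ratio of two Beta integrals."""
    return beta_integral(k + 1, n - k) / beta_integral(k, n - k)

print(prob_next_head(7, 10), Fraction(7 + 1, 10 + 2))   # both 2/3 = (k+1)/(n+2)
```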
Function of an RV

y = g(x), where x is an RV.

$F(y) = \text{Prob}\{y \le y\} = \text{Prob}\{g(x) \le y\}$

Example:
y = ax + b, x ∼ f(x):

$F_y(y) = \text{Prob}\{ax + b \le y\} = \begin{cases} \text{Prob}\left\{x \le \frac{y-b}{a}\right\} = F_x\!\left(\frac{y-b}{a}\right), & a > 0 \\ 1 - F_x\!\left(\frac{y-b}{a}\right), & a < 0 \end{cases}$
Example:
y = x², x ∼ fx(x):

$F_y(y) = \text{Prob}\{x^2 \le y\} = \text{Prob}\{-\sqrt{y} \le x \le \sqrt{y}\} = F_x(\sqrt{y}) - F_x(-\sqrt{y}), \quad y > 0$

and Fy(y) = 0 for y < 0.
Example: Hard limiter

$y = g(x) = \begin{cases} 1, & x > 0 \\ -1, & x \le 0 \end{cases}$

$\text{Prob}\{y = 1\} = \text{Prob}\{x > 0\} = 1 - F_x(0), \qquad \text{Prob}\{y = -1\} = \text{Prob}\{x \le 0\} = F_x(0)$

Example: Quantization

$y = g(x) = ns, \qquad (n-1)s < x \le ns$

$\text{Prob}\{y = ns\} = \text{Prob}\{(n-1)s < x \le ns\} = F_x(ns) - F_x((n-1)s)$
PDF determination

$y = g(x), \qquad f_y(y) = \sum_{i=1}^{n} \frac{f_x(x_i)}{|g'(x_i)|}$

where the xi are the roots of y = g(x).

Example:
y = e^x, x ∼ N(0; σ²). There is only one root, x = ln y:

$y = e^x \ \Rightarrow\ g'(x) = e^x = y \ \Rightarrow\ f_y(y) = \frac{f_x(\ln y)}{y}, \quad y > 0$

which is the lognormal density.
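A Monte Carlo sketch checking this formula (sigma = 1 and the bin parameters are assumed values): a histogram bin of y = exp(x) should match $f_x(\ln y)/y$.

```python
import math, random

sigma, N = 1.0, 200_000
samples = [math.exp(random.gauss(0, sigma)) for _ in range(N)]

def fy(y):
    fx = math.exp(-(math.log(y) / sigma) ** 2 / 2) / (sigma * math.sqrt(2 * math.pi))
    return fx / y

y0, dy = 1.5, 0.1
empirical = sum(1 for v in samples if y0 <= v < y0 + dy) / (N * dy)
print(empirical, fy(y0 + dy / 2))   # bin height vs. the formula, both ~0.24
```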
If x is an arbitrary RV with continuous distribution Fx(x), and y = Fx(x), then y is a uniform RV on the interval [0, 1].

If 0 < y < 1, then y = Fx(x) has a single solution x1. Since g'(x) = F'x(x) = fx(x),

$f_y(y) = \frac{f_x(x_1)}{|g'(x_1)|} = \frac{f_x(x_1)}{f_x(x_1)} = 1, \qquad 0 < y < 1$

If y < 0 or y > 1, then y = Fx(x) has no real solution, so fy(y) = 0.
Example
Now suppose we are given two distribution functions F1(x) and F2(y). Find a monotonically increasing function g(x) such that, if y = g(x) and Fx(x) = F1(x), then Fy(y) = F2(y).

Solution
We maintain that g(x) must be such that F2[g(x)] = F1(x), since

$F_y(g(x)) = \text{Prob}\{y \le g(x)\} = \text{Prob}\{g(x) \le g(x)\} = \text{Prob}\{x \le x\} = F_x(x)$

Therefore, if a particular CDF Fy(y) is given, an RV with that CDF can be constructed as follows: since Fx(x) of any continuous RV is uniform on [0, 1], take x ∼ Unif[0, 1] and set $y = F_y^{-1}(x)$ (inverse-transform sampling).
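A sketch of inverse-transform sampling for an exponential target (the rate lam is an assumed value):

```python
import math, random

lam = 2.0

def exp_inverse_cdf(u):
    """F^{-1}(u) for F(y) = 1 - exp(-lam*y): y = -ln(1 - u)/lam."""
    return -math.log(1.0 - u) / lam

samples = [exp_inverse_cdf(random.random()) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(mean, 1 / lam)   # sample mean close to E{y} = 1/lam
```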
Expected value. For a continuous RV:

$E(x) = \eta = \int_{-\infty}^{\infty} x\, f_x(x)\, dx$

For the discrete type:

$f_x(x) = \sum_i p_i\, \delta(x - x_i), \qquad E(x) = \sum_i p_i\, x_i$

Conditional mean:

$E(x|M) = \int_{-\infty}^{\infty} x\, f(x|M)\, dx$
Mean of a function of an RV:

$y = g(x), \qquad E(y) = \int_{-\infty}^{\infty} y\, f_y(y)\, dy = \int_{-\infty}^{\infty} g(x)\, f_x(x)\, dx$

For a continuous RV, the variance:

$\sigma^2 = \int_{-\infty}^{\infty} (x - \eta)^2\, f_x(x)\, dx$

For the discrete type:

$\sigma^2 = \sum_i p_i\, (x_i - \eta)^2$
Tchebycheff inequality:

$\text{Prob}\{|x - \eta| \ge \epsilon\} \le \frac{\sigma^2}{\epsilon^2}$

Proof: $\text{Prob}\{|x - \eta| \ge \epsilon\} = \int_{|x-\eta| \ge \epsilon} f_x(x)\, dx$, and by definition $\sigma^2 = \int_{-\infty}^{\infty} (x - \eta)^2 f_x(x)\, dx$. Since $|x - \eta| \ge \epsilon$ on the region of integration,

$\sigma^2 \ge \int_{|x-\eta| \ge \epsilon} (x - \eta)^2\, f_x(x)\, dx \ge \epsilon^2 \int_{|x-\eta| \ge \epsilon} f_x(x)\, dx = \epsilon^2\, \text{Prob}\{|x - \eta| \ge \epsilon\}$
Characteristic function:

$\Phi(\omega) = E(e^{-j\omega x}) = \int_{-\infty}^{\infty} f_x(x)\, e^{-j\omega x}\, dx, \qquad |\Phi(\omega)| \le \Phi(0) = 1$

Moment generating function:

$\Phi(s) = \int_{-\infty}^{\infty} f_x(x)\, e^{-sx}\, dx$

Second moment generating function:

$\Psi(s) = \ln \Phi(s)$
$\Phi^{(n)}(s) = E\{(-1)^n\, x^n\, e^{-sx}\} \ \Rightarrow\ (-1)^n\, \Phi^{(n)}(0) = E\{x^n\}$

$\Phi(s) = \sum_{n=0}^{\infty} (-1)^n\, \frac{E(x^n)}{n!}\, s^n, \qquad s \to 0$

This is true if the moments are finite, so that the series converges absolutely near s = 0.

Cumulants:

$\gamma_n = (-1)^n\, \frac{d^n}{ds^n} \Psi(s)\Big|_{s=0}, \qquad \Psi(s) = \sum_{n=1}^{\infty} (-1)^n\, \frac{\gamma_n}{n!}\, s^n$
For a discrete RV, the characteristic function is

$\Phi(\omega) = E(e^{-j\omega x}) = \sum_i p_i\, e^{-j\omega x_i}$

the (discrete-time) Fourier transform of the sequence pi.

If n is an RV of lattice type:

$\Gamma(z) = E(z^{n}) = \sum_{n=-\infty}^{\infty} p_n\, z^n$

then Γ(1/z) is the z-transform of the sequence pn = Prob{n = n}.
Example
For the binomial and Poisson RVs, find Γ(z).

Solution:

$p_k = \binom{n}{k}\, p^k\, q^{n-k} \ \Rightarrow\ \Gamma(z) = (pz + q)^n$

$p_k = e^{-\lambda}\, \frac{\lambda^k}{k!} \ \Rightarrow\ \Gamma(z) = e^{\lambda(z-1)}$

Factorial moments:

$E\{k(k-1)\cdots(k-n+1)\} = \Gamma^{(n)}(z)\Big|_{z=1}$
Example

$f(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$

Determine Prob{x² + y² ≤ z²}.

Solution:

$\text{Prob}\{x^2 + y^2 \le z^2\} = \iint_D f(x, y)\, dx\, dy, \qquad D = \{(x, y): x^2 + y^2 \le z^2\}$

With x = r cos θ, y = r sin θ:

$\text{Prob}\{x^2 + y^2 \le z^2\} = \frac{1}{2\pi\sigma^2} \int_0^z \int_0^{2\pi} e^{-r^2/2\sigma^2}\, r\, dr\, d\theta = 1 - e^{-z^2/2\sigma^2} \qquad \text{(Rayleigh)}$
Example

$f(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{(x - \eta_x)^2 + (y - \eta_y)^2}{2\sigma^2}\right)$

Determine Prob{x² + y² ≤ z²}.

Solution:

$\text{Prob}\{x^2 + y^2 \le z^2\} = \iint_D f(x, y)\, dx\, dy, \qquad D = \{(x, y): x^2 + y^2 \le z^2\}$

With x = v cos θ, y = v sin θ, and $\eta = \sqrt{\eta_x^2 + \eta_y^2}$, ηx = η cos φ, ηy = η sin φ:

$F_z(z) = \frac{1}{2\pi\sigma^2} \int_0^z \int_0^{2\pi} \exp\left(\frac{v^2 - 2v\eta_x \cos\theta - 2v\eta_y \sin\theta + \eta^2}{-2\sigma^2}\right) v\, dv\, d\theta$
Differentiating with respect to z:

$f_z(z) = \frac{z}{2\pi\sigma^2} \exp\left(-\frac{z^2 + \eta^2}{2\sigma^2}\right) \int_0^{2\pi} \exp\left(\frac{z\eta \cos(\theta - \varphi)}{\sigma^2}\right) d\theta$

$f_z(z) = \frac{z}{\sigma^2} \exp\left(-\frac{z^2 + \eta^2}{2\sigma^2}\right) \int_0^{2\pi} \exp\left(\frac{z\eta \cos\omega}{\sigma^2}\right) \frac{d\omega}{2\pi}$

$f_z(z) = \frac{z}{\sigma^2} \exp\left(-\frac{z^2 + \eta^2}{2\sigma^2}\right) I_0\!\left(\frac{z\eta}{\sigma^2}\right) \qquad \text{(Rician)}$

where I0 is the modified Bessel function of the first kind, order zero. As η → 0, the Rician RV approaches a Rayleigh RV.
Two functions of two RVs: z = g(x, y), w = h(x, y).

Joint CDF:

$F_{zw}(z, w) = \text{Prob}\{(x, y) \in D_{zw}\} = \iint_{D_{zw}} f_{xy}(x, y)\, dx\, dy$

Example:

$z = \sqrt{x^2 + y^2}, \qquad w = \frac{y}{x}$

Joint PDF: with (xn, yn) the roots of z = g(x, y), w = h(x, y),

$f_{zw}(z, w) = \sum_n \frac{f_{xy}(x_n, y_n)}{|J(x_n, y_n)|}$

where J is the Jacobian of the transformation.
Let X ∼ U(0, 1) and Y ∼ U(0, 1) be independent. Define Z = X + Y, W = X − Y. Show that Z and W are dependent, but uncorrelated, RVs.

Solution:

$x = \frac{z + w}{2}, \qquad y = \frac{z - w}{2}$

The image of the unit square is 0 < z < 2, −1 < w < 1, z + w ≤ 2, z − w ≤ 2, z > |w|, with |J(z, w)| = 1/2, so

$f_{ZW}(z, w) = \begin{cases} 1/2, & 0 < z < 2,\ -1 < w < 1,\ z + w \le 2,\ z - w \le 2,\ |w| < z \\ 0, & \text{otherwise} \end{cases}$
[Figure: the support of fZW is the square with vertices (0, 0), (1, 1), (2, 0), (1, −1) in the (z, w) plane.]

$f_Z(z) = \int f_{ZW}(z, w)\, dw = \begin{cases} \int_{-z}^{z} \frac{1}{2}\, dw = z, & 0 < z < 1 \\ \int_{z-2}^{2-z} \frac{1}{2}\, dw = 2 - z, & 1 < z < 2 \end{cases}$
Or, by convolution:

$f_Z(z) = f_X(z) \otimes f_Y(z) = \begin{cases} z, & 0 < z < 1 \\ 2 - z, & 1 < z < 2 \\ 0, & \text{otherwise} \end{cases}$

Clearly fZW(z, w) ≠ fZ(z) fW(w), so Z and W are not independent. However,

$E(ZW) = E[(X + Y)(X - Y)] = E(X^2) - E(Y^2) = 0, \qquad E(W) = E(X - Y) = 0$

$\text{Cov}(Z, W) = E(ZW) - E(Z)\,E(W) = 0$
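A quick Monte Carlo sketch of this example (sample sizes and thresholds are assumed values): the covariance is near zero, yet conditioning on Z visibly constrains W.

```python
import random

N = 200_000
xs = [random.random() for _ in range(N)]
ys = [random.random() for _ in range(N)]
zs = [x + y for x, y in zip(xs, ys)]
ws = [x - y for x, y in zip(xs, ys)]

mz, mw = sum(zs) / N, sum(ws) / N
cov = sum((z - mz) * (w - mw) for z, w in zip(zs, ws)) / N
print(cov)                               # near 0: uncorrelated

# dependence: given z > 1.9, necessarily |w| < 0.1, while W alone spans (-1, 1)
big_z_w = [abs(w) for z, w in zip(zs, ws) if z > 1.9]
print(max(big_z_w))                      # < 0.1
```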
For z = g(x, y):

Mean: $E\{z\} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, f(x, y)\, dx\, dy$

Covariance: $C = E\{(x - \eta_x)(y - \eta_y)\}$

Correlation coefficient: $r = C / (\sigma_x \sigma_y)$, with $|r| \le 1$

Uncorrelatedness: C = 0, equivalently r = 0

Orthogonality: $E\{xy\} = 0 \Leftrightarrow x \perp y$

Moments: $E\{x^k y^r\} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^k y^r\, f(x, y)\, dx\, dy$

Joint MGF: $\Phi(s, u) = E\{e^{-(sx + uy)}\}$, $s, u \in \mathbb{C}$

Marginal MGFs: $\Phi_x(s) = \Phi(s, 0)$, $\Phi_y(u) = \Phi(0, u)$
Joint characteristic functions (i.e., the joint MGF) are useful in determining the PDF of linear combinations of RVs.

Example:
With X and Y independent Poisson RVs with parameters λ1 and λ2 respectively, and Z = X + Y, find the distribution of Z.

Solution:

$Z = X + Y \ \Rightarrow\ \Phi_Z(\omega) = \Phi_X(\omega)\, \Phi_Y(\omega)$

$\Phi_X(\omega) = e^{\lambda_1 (e^{-j\omega} - 1)}, \quad \Phi_Y(\omega) = e^{\lambda_2 (e^{-j\omega} - 1)} \ \Rightarrow\ \Phi_Z(\omega) = e^{(\lambda_1 + \lambda_2)(e^{-j\omega} - 1)} \ \Rightarrow\ Z \sim \text{Poiss}(\lambda_1 + \lambda_2)$
$F_y(y \mid x_1 \le x \le x_2) = \frac{\text{Prob}\{x_1 \le x \le x_2,\ y \le y\}}{\text{Prob}\{x_1 \le x \le x_2\}} = \frac{F(x_2, y) - F(x_1, y)}{F(x_2) - F(x_1)}$

where $F(x_2, y) - F(x_1, y) = \int_{-\infty}^{y} \int_{x_1}^{x_2} f(x, \beta)\, dx\, d\beta$. Differentiating both sides with respect to y, we have:

$f_y(y \mid x_1 \le x \le x_2) = \frac{\int_{x_1}^{x_2} f(x, y)\, dx}{F(x_2) - F(x_1)}$

As x1 → x2 we have:

$f(y \mid x = x) = \frac{f(x, y)}{f_x(x)} = f(y|x)$
Similarly, we have

$f(x, y) = f(x|y)\, f_y(y) = f(y|x)\, f_x(x)$

If x and y are independent we have:

$f(x|y) = f_x(x), \qquad f(y|x) = f_y(y)$

Bayes' theorem for PDFs:

$f(x|y) = \frac{f(x, y)}{f_y(y)} = \frac{f(y|x)\, f_x(x)}{\int_{-\infty}^{\infty} f(y|x)\, f_x(x)\, dx}$
Example: determine f(x|y) and f(y|x) for

$f_{XY}(x, y) = \begin{cases} k, & 0 < x < y < 1 \\ 0, & \text{otherwise} \end{cases}$

$\iint f_{XY}(x, y)\, dx\, dy = \int_0^1 \int_0^y k\, dx\, dy = \int_0^1 k y\, dy = \frac{k}{2} = 1 \ \Rightarrow\ k = 2$

$f_X(x) = \int f_{XY}(x, y)\, dy = \int_x^1 k\, dy = k(1 - x), \quad 0 < x < 1$

$f_Y(y) = \int f_{XY}(x, y)\, dx = \int_0^y k\, dx = k y, \quad 0 < y < 1$
$f_{X|Y}(x|y) = \frac{f_{XY}(x, y)}{f_Y(y)} = \frac{1}{y}, \quad 0 < x < y < 1$

$f_{Y|X}(y|x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{1}{1 - x}, \quad 0 < x < y < 1$

Both conditional densities are uniform.
Example: Poisson sum of Bernoulli random variables. Let Xi, i = 1, 2, 3, ..., represent independent, identically distributed Bernoulli random variables with

$P(X_i = 1) = p, \qquad P(X_i = 0) = 1 - p = q$

and N a Poisson random variable with parameter λ that is independent of all the Xi. Consider the random variables

$Y = \sum_{i=1}^{N} X_i, \qquad Z = N - Y$

Show that Y and Z are independent Poisson random variables.
Solution: To determine the joint probability mass function of Y and Z, consider

$P(Y = m, Z = n) = P(Y = m, N - Y = n) = P(Y = m, N = m + n)$
$= P(Y = m \mid N = m + n)\, P(N = m + n) = P\left(\sum_{i=1}^{m+n} X_i = m\right) P(N = m + n)$

(Note that $\sum_{i=1}^{m+n} X_i \sim B(m+n, p)$ and the Xi are independent of N.) Hence

$P(Y = m, Z = n) = \binom{m+n}{m}\, p^m q^n \cdot e^{-\lambda}\, \frac{\lambda^{m+n}}{(m+n)!} = \left(e^{-\lambda p}\, \frac{(\lambda p)^m}{m!}\right) \left(e^{-\lambda q}\, \frac{(\lambda q)^n}{n!}\right)$

so Y ∼ Poiss(λp) and Z ∼ Poiss(λq), and they are independent.
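A simulation sketch of this "Poisson thinning" result (lam, p, the trial count, and the sampler are assumptions of ours, not part of the notes):

```python
import math, random

lam, p, trials = 4.0, 0.3, 50_000

def poisson_sample(a):
    """Inverse-CDF Poisson sampler, adequate for moderate a."""
    u, k, prob, cum = random.random(), 0, math.exp(-a), math.exp(-a)
    while u > cum:
        k += 1
        prob *= a / k
        cum += prob
    return k

ys, zs = [], []
for _ in range(trials):
    n = poisson_sample(lam)
    y = sum(1 for _ in range(n) if random.random() < p)
    ys.append(y); zs.append(n - y)

my, mz = sum(ys) / trials, sum(zs) / trials
print(my, lam * p)            # E{Y} = lambda p
print(mz, lam * (1 - p))      # E{Z} = lambda q
cov = sum(y * z for y, z in zip(ys, zs)) / trials - my * mz
print(cov)                    # near 0, consistent with independence
```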
Conditional expectation:

φ(x) = E{y|x} = a function of the value x (3)
φ(x) = E{y|x} = a function of the RV x (4)

For (4), we have:

$E\{\varphi(x)\} = \int_{-\infty}^{\infty} \varphi(x)\, f(x)\, dx = \int_{-\infty}^{\infty} f(x) \underbrace{\int_{-\infty}^{\infty} y\, f(y|x)\, dy}_{\varphi(x)}\, dx = E\{\underbrace{E\{y|x\}}_{\varphi(x)}\}$

Therefore, we have:

$E_x\{E\{y|x\}\} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y\, f(x, y)\, dy\, dx = E\{y\}$
This result can be generalized. E{g(x, y)|x} is a function of x, and

$E\{E\{g(x, y)|x\}\} = \int_{-\infty}^{\infty} f(x) \int_{-\infty}^{\infty} g(x, y)\, f(y|x)\, dy\, dx \quad (5)$

The inner integral is obtained via

$E\{g(x, y)|M\} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, f(x, y|M)\, dy\, dx$

and (5) equals

$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, f(x, y)\, dy\, dx = E\{g(x, y)\} \quad (6)$
Because of (6), we have

$E\{E\{g(x, y)|x\}\} = E\{g(x, y)\}$

Note that the following is an extension of the above:

$E\{g_1(x)\, g_2(y)\} = E\{E\{g_1(x)\, g_2(y)|x\}\} = E_x\{g_1(x)\, E_y\{g_2(y)|x\}\}$
Mean square estimation

Is it possible to estimate an RV? The answer is yes, but some estimation tools are required. One of the important estimation tools is the mean-square-error (MSE) principle.

If an RV y is to be estimated by a constant c based on the MSE principle, we have the following:

$e = E\{(y - c)^2\} = \int_{-\infty}^{\infty} (y - c)^2\, f(y)\, dy$

We then minimize e with respect to the unknown c:

$\frac{\partial e}{\partial c} = -2 \int_{-\infty}^{\infty} (y - c)\, f(y)\, dy = 0 \ \Rightarrow\ c = \int_{-\infty}^{\infty} y\, f(y)\, dy \qquad \text{(the mean value)}$
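A tiny numeric sketch of this fact (the data distribution is an assumed example): shifting c away from the sample mean in either direction increases the empirical MSE.

```python
import random

ys = [random.gauss(3.0, 2.0) for _ in range(10_000)]
mean = sum(ys) / len(ys)

def mse(c):
    return sum((y - c) ** 2 for y in ys) / len(ys)

for c in (mean - 0.5, mean, mean + 0.5):
    print(round(c, 3), round(mse(c), 3))   # the middle row is the smallest
```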
If an RV y is to be estimated by a function of another RV x based on the MSE principle, we have the following:

$e = E\{(y - c(x))^2\} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (y - c(x))^2\, f(x, y)\, dy\, dx = \int_{-\infty}^{\infty} f(x) \underbrace{\int_{-\infty}^{\infty} (y - c(x))^2\, f(y|x)\, dy}_{\ge\, 0}\, dx$

The integral is minimum if the inner integral is minimum for every x. This can only occur if

$\min_{c(x)} J = \int_{-\infty}^{\infty} (y - c(x))^2\, f(y|x)\, dy, \quad \forall x$

By the previous result applied to the density f(y|x), the minimizing c(x) is the conditional mean, c(x) = E{y|x}.
Covariance and correlation matrices:

$C_{ij} = E\{(x_i - \eta_i)(x_j - \eta_j)\}, \qquad R_{ij} = E\{x_i x_j\}, \qquad C = R - \eta^T \eta$

The correlation matrix is positive semidefinite: the eigenvalues of R are nonnegative.

Characteristic function of a random vector:

$\Phi(\Omega) = E\{e^{-j\Omega x^T}\}, \qquad x = [x_1, \cdots, x_n], \quad \Omega = [\omega_1, \cdots, \omega_n]$
Central limit theorem: Suppose x1, x2, ..., xn are a set of zero-mean independent, identically distributed (IID) random variables with some common distribution. Consider their scaled sum

$x = \frac{x_1 + x_2 + \cdots + x_n}{\sqrt{n}}$

Then asymptotically, as n → ∞, x ∼ N(0, σ²).
Proof: Although the theorem is true under even more general conditions, we shall prove it here under the independence assumption. Let σ² represent the common variance. Since

$E\{x_i\} = 0 \ \Rightarrow\ E\{x_i^2\} = \sigma^2$

we have

$\Phi_x(u) = E\{e^{-jux}\} = \prod_{i=1}^{n} E\{e^{-ju x_i / \sqrt{n}}\} = \left[\Phi_{x_i}(u/\sqrt{n})\right]^n$

$E(e^{-j x_i u / \sqrt{n}}) = E\left\{1 - \frac{j x_i u}{\sqrt{n}} + \frac{j^2 x_i^2 u^2}{2!\, n} - \frac{j^3 x_i^3 u^3}{3!\, n^{3/2}} + \cdots\right\}$
$= 1 - \frac{\sigma^2 u^2}{2n} + o\!\left(\frac{1}{n^{3/2}}\right)$

$\Phi_x(u) = \left(1 - \frac{\sigma^2 u^2}{2n} + o\!\left(\frac{1}{n^{3/2}}\right)\right)^n, \qquad \lim_{n\to\infty} \left(1 - \frac{z}{n}\right)^n = e^{-z}$

$\lim_{n\to\infty} \Phi_x(u) = e^{-\sigma^2 u^2 / 2}$

The central limit theorem states that a large sum of independent random variables, each with finite variance, tends to behave like a normal random variable.
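A simulation sketch of the theorem (the uniform summands, n, and the test point are assumed choices): the scaled sum's empirical CDF matches the Gaussian prediction.

```python
import math, random

n, trials = 50, 20_000
sigma2 = 1.0 / 12.0                     # variance of U(-0.5, 0.5)

sums = [sum(random.uniform(-0.5, 0.5) for _ in range(n)) / math.sqrt(n)
        for _ in range(trials)]

emp = sum(1 for s in sums if s <= 0.2) / trials
G = lambda t: 0.5 * (1 + math.erf(t / math.sqrt(2)))
print(emp, G(0.2 / math.sqrt(sigma2)))  # both about 0.756
```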
Thus the individual PDFs become unimportant in analyzing the behavior of the collective sum. If we model a noise phenomenon as the sum of a large number of independent random variables (e.g., electron motion in resistive components), then this theorem allows us to conclude that noise behaves like a Gaussian RV.
Caution: It may be remarked that the finite-variance assumption is necessary for the theorem to hold. To see its importance, consider the RVs to be Cauchy distributed, with

$\Phi_{x_i}(u) = e^{-\alpha |u|}, \qquad x_i \sim C(\alpha)$

$\Phi_x(u) = \prod_{i=1}^{n} \Phi_{x_i}(u/\sqrt{n}) = \left(e^{-\alpha |u| / \sqrt{n}}\right)^n = e^{-\alpha \sqrt{n}\, |u|} \ \sim\ C(\alpha\sqrt{n})$
which shows that x is still Cauchy, now with parameter α√n. In other words, the central limit theorem does not hold for a set of Cauchy RVs, as their variances are undefined.
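A sketch illustrating the failure (the sampler, n, the trial count, and the threshold are assumed choices): scaled sums of Cauchy samples stay heavy-tailed instead of concentrating like a Gaussian.

```python
import math, random

def cauchy():
    """Standard Cauchy via inverse-CDF: tan(pi (u - 0.5))."""
    return math.tan(math.pi * (random.random() - 0.5))

n, trials = 500, 4000
sums = [sum(cauchy() for _ in range(n)) / math.sqrt(n) for _ in range(trials)]

# For a Gaussian limit, |x| > 10 would be vanishingly rare; here it is common,
# consistent with x ~ C(sqrt(n)).
print(sum(1 for s in sums if abs(s) > 10.0) / trials)   # roughly 0.7
```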