Varun Balupuri - Thesis

Stochastic Control for Optimal Dynamic
Trading Strategies
Varun Balupuri
Department of Mathematics
King’s College London
The Strand, London WC2R 2LS
United Kingdom
Email: varun.nair-balupuri@kcl.ac.uk
Tel: +44 (0)583 248 930
19 September 2016
Report submitted in partial fulfillment of the
requirements for the degree of MSc in Financial
Mathematics in the University of London

Abstract
In this paper, we apply dynamic programming to solve Merton’s portfolio problem
in the classical Black-Scholes model under the familiar cases of power,
exponential and logarithmic utility, where we show that the optimal strategy is to
keep a constant proportion of wealth in the risky asset.
We also examine the problem in the presence of stochastic volatility.
The problem is found to be solved by a non-stochastic function of time and
we perform Monte Carlo simulations to numerically verify this. A focus of this
paper is on numerical estimates and analysis to back up theoretical results.
A brief overview of the problem in the presence of transaction costs and the
associated difficulties is presented in Chapter 4.

Contents
1 INTRODUCTION
1.1 Background
1.2 Market model and notation
1.3 Preliminaries
2 Portfolio Allocation
2.1 Optimising Risky Wealth Under Exponential Utility
2.2 Extension to non-zero interest rates
2.3 Numerical Implementation/Monte Carlo Estimates
2.3.1 Re-balancing the portfolio
2.4 Addition of Consumption and Infinite Horizons
3 Extension to Stochastic Volatility
3.1 Merton’s Problem in the Heston Model
4 Transaction Costs
4.1 Proportional Transaction Costs
5 CONCLUSION
A Simulation and Graphing Code

Chapter 1
INTRODUCTION
1.1 Background
Stochastic optimisation is concerned with controlling dynamic systems subject to
stochastic perturbations so as to maximise (or minimise) some criterion, usually a
function representing value or a stopping time at which said value is attained. Richard Bellman
pioneered the dynamic programming equation and approach to these types of
problems in a 1954 paper, ’The theory of dynamic programming’ [1].
In the 1970s the mathematics of dynamic programming and stochastic
control was applied in the context of economics and financial mathematics, first
receiving widespread recognition due to Merton’s 1973 paper [2].
From a mathematical point of view, the field of dynamic portfolio choice was
first approached by Merton in his much celebrated, seminal papers ([3], [4]).
Building upon the Black-Scholes framework, Merton employed techniques
from stochastic control to show how to create optimal portfolios consisting of
a risky and a riskless asset in a frictionless market with constant volatility and
constant interest rates. Since then, Merton’s ’portfolio problem’ has become a
highly studied topic and there have been many papers addressing improvements
and extensions to this framework.
In this paper we consider maximizing an investor’s value function using
dynamic programming to solve the associated Hamilton-Jacobi-Bellman equation
with various utility functions in the HARA¹ class, as Merton did, and we place
extra emphasis on numerical analysis and verification of these results through
Monte Carlo methods.
In Chapter 1 we look at the topic of stochastic control in the context of
portfolio optimisation, building the initial framework and presenting the relevant
mathematical theorems and results which are important for later chapters.
In Chapter 2 we look at the classical 2-asset Merton portfolio problem in
the Black-Scholes framework under exponential utility and power utility. In the
most basic sense, the investor wants to decide what proportion of their wealth
to keep in the risky asset to maximise their expected utility over a finite time
horizon. In addition to optimising the investor’s ’risky wealth’ we consider the
case when a consumption parameter is added, seeking to optimise the investor’s
rate of removal of money as well. We will also consider Merton’s problem over
an infinite horizon, which is useful when modelling how to invest or consume
one’s savings until retirement or death. We will show, via the use of dynamic
programming, that all of these problems have a closed-form solution and can be
explicitly calculated.
¹ Hyperbolic absolute risk aversion (HARA) functions are easy to model mathematically
and have properties which make them suited to modelling reality.
Since Merton’s problem is built upon many unrealistic assumptions, such
as the ability to trade continuously without fee and constant interest rates, the
addition of transaction costs and stochastic volatility is a very important step
in building a more realistic model.
Mathematical models have been introduced to add stochastic volatility
and stochastic interest rates to model stock prices. We examine Merton’s problem
with stochastic volatility in Chapter 3, focusing on Heston’s model. Using
Liu and Muhle-Karbe’s approach we show that even in this setting, Merton’s
problem has an explicit solution and verify the theoretical results by discretizing
and using Monte Carlo methods.
The addition of transaction costs is briefly considered in Chapter 4. Proportional
transaction costs were first incorporated by Davis and Norman in
1990 [6]. Since the investor cannot make a large number of trades without
accruing large fees, the optimal strategy changes from continuous re-balancing of
the portfolio to one which involves only making trades if the ratio falls outside
of a region, called the ’no trade’ region. Finding the boundaries of this region
is not a trivial task and computational approaches fall beyond the scope of this
thesis.
This paper focuses on models where an explicit solution to Merton’s problem
exists; however, this is not generally the case. We can only use the
Hamilton-Jacobi-Bellman equation when the value function is sufficiently smooth. For
many problems this is not the case, and alternative methods such as the viscosity
solution approach introduced by Crandall and Lions (1980) are an effective
way to attack such problems. A modern development in the literature is
backward stochastic differential equations, which provide a probabilistic
representation of non-linear PDEs. These topics are not discussed in this paper.
1.2 Market model and notation
Unless explicitly stated, consider a frictionless market with two assets, a risky
stock, denoted by S, modelled by a stochastic price process (St)t≥0, which
follows the standard Black-Scholes dynamics:
dSt = St(µdt + σdWt)

and a riskless ’bank account’ Bt, whose dynamics are given by the ODE:
dBt = rBtdt
Definition 1.2.1 (Self Financing Portfolio). In continuous time, consider a
portfolio consisting of ∆t units of the risky asset S and φt units of the riskless
asset B. This portfolio has a time t value of Vt = ∆tSt + φtBt. We say that
the portfolio is self-financing² if:
$$V_t = V_0 + \int_0^t \Delta_w \, dS_w + \int_0^t \varphi_w \, dB_w$$
We assume the investor’s actions have no effect on the market and the
investor’s strategy is self-financing.
Unless explicitly stated otherwise, let µ, σ, r ∈ R and Wt be a standard
one-dimensional Brownian motion on a complete probability space (Ω, F, P), with
(Ft)t≥0 being a filtration.
Definition 1.2.2 (Adapted Process). A process (Xt)t≥0 is adapted with respect
to a filtration F = (Ft)t≥0 if:
∀t ≥ 0, Xt is Ft-measurable.
Definition 1.2.3 (Progressively Measurable Process). A process (Xt)t≥0 is
progressively measurable with respect to F if ∀t ≥ 0:
the map on [0, t] × Ω defined by (s, ω) → Xs(ω) is B([0, t]) ⊗ Ft-measurable.
1.3 Preliminaries
Important theorems and results which are applied in later chapters are presented
here without formal proof. References to derivations are provided for
the interested reader.
Consider the stochastic differential equation:
dXt = b(Xt, αt)dt + σ(Xt, αt)dWt (1.1)
with X0 = x, where (αt)t≥0 is a progressively measurable process. Let the functions b, σ
satisfy the Lipschitz condition.
Finite Horizon Case
For proofs and extensions of the following theorems, see pp. 40-46 of Pham’s
textbook [5].
² Intuitively, this means that no money is exogenously added to or withdrawn from the portfolio.

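Fix T ∈ (0, ∞). Define
$$\mathcal{A} = \left\{ \alpha : \mathbb{E}\left[ \int_0^T \left( |b(0,\alpha_t)|^2 + |\sigma(0,\alpha_t)|^2 \right) dt \right] < \infty \right\}$$
For functions f, g define our gain function as
$$J(t,x,\alpha) = \mathbb{E}^{t,x}\left[ \int_t^T f(s, X_s, \alpha_s)\, ds + g(X_T) \right]$$
The value function linked to this is
$$v(t,x) = \sup_{\alpha \in \mathcal{A}} J(t,x,\alpha)$$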
Theorem 1.3.1 (Dynamic Programming Principle). For all t1 ∈ [t, T]
$$v(t,x) = \sup_{\alpha \in \mathcal{A}} \mathbb{E}^{t,x}\left[ \int_t^{t_1} f(s, X_s, \alpha_s)\, ds + v(t_1, X_{t_1}) \right]$$
If v is smooth, making use of Itô’s formula leads us to the HJB equation
in the finite-horizon case.
Theorem 1.3.2 (Finite Horizon Hamilton-Jacobi-Bellman equation).³ Let
α = a ∈ R with a arbitrary. The value function v(t, x) satisfies the following
partial differential equation (if the supremum is finite), known as the
Hamilton-Jacobi-Bellman equation:
$$\frac{\partial v}{\partial t} + \sup_{a} \left[ f(t,x,a) + b(x,a)\frac{\partial v}{\partial x} + \frac{1}{2}|\sigma(x,a)|^2 \frac{\partial^2 v}{\partial x^2} \right] = 0$$
with boundary condition v(T, x) = g(x).
Infinite Horizon Case
For the infinite horizon case, we consider problems where T = ∞. Xt is
time-homogeneous and we discount the gain function to maintain finiteness of J(x, α).
For β > 0, let A(x) be the set of admissible controls α satisfying
$$\mathbb{E}\left[ \int_0^\infty e^{-\beta s} |f(X_s, \alpha_s)|\, ds \right] < \infty$$
Define
$$J(x,\alpha) = \mathbb{E}\left[ \int_0^\infty e^{-\beta s} f(X^x_s, \alpha_s)\, ds \right]$$
and similarly to the finite horizon case, the value function is
$$v(x) = \sup_{\alpha \in \mathcal{A}(x)} J(x,\alpha)$$
Theorem 1.3.3 (Infinite Horizon Hamilton-Jacobi-Bellman equation). Assuming
our SDE follows (1.1), v(x) satisfies ∀x ∈ R:
$$\beta v(x) - \sup_{a} \left[ f(x,a) + b(x,a)\frac{\partial v}{\partial x} + \frac{1}{2}|\sigma(x,a)|^2 \frac{\partial^2 v}{\partial x^2} \right] = 0$$
³ In Pham’s proof, the infinitesimal generator $L^a$ is used. We do not use it in
this paper, for consistency and to highlight the explicit equations.

A vital result in dynamic programming is the verification theorem, which
ensures that for an optimal control problem, a candidate solution of a non-linear
PDE coincides with the value function.
We do not state the theorem here, but its consequences are important for
ensuring our solutions are indeed optimal controls [5].
Utility Classes
The aim of Merton’s portfolio problem is to optimise the investor’s (expected)
utility. We must take into account the investor’s risk aversion. A more
risk-averse investor would prefer to place more of their capital in the riskless asset to
ensure guaranteed returns rather than in a riskier stock.
Definition 1.3.4 (Hyperbolic Absolute Risk Aversion). A utility function U(x)
is said to be a HARA utility function if and only if it is of the form:
$$U(x) = \frac{1-\gamma}{\gamma}\left( \frac{ax}{1-\gamma} + b \right)^{\gamma}$$
with $a > 0$ and $\frac{ax}{1-\gamma} + b > 0$.
In this paper we consider three types of utility function:
• Exponential: $U(x) = -e^{-\alpha x}$
• Power: $U(x) = \frac{1}{1-\gamma} x^{1-\gamma}$
• Logarithmic: $U(x) = \log(x)$
To see that exponential utility belongs in the class, take the limit as γ tends to
∞ and b = 1 in Definition 1.3.4. Similarly, the logarithmic utility is obtained
by setting a = 1 and observing the limit as γ tends to 0.
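The exponential limit described above can be checked numerically. The following is a minimal sketch, assuming the HARA form of Definition 1.3.4 with b = 1; the function names and the grid of γ values are illustrative choices, not part of the original text.

```python
import math

def hara_utility(x, gamma, a=1.0, b=1.0):
    """HARA utility of Definition 1.3.4: (1-gamma)/gamma * (a*x/(1-gamma) + b)**gamma."""
    base = a * x / (1.0 - gamma) + b
    if base <= 0:
        raise ValueError("need a*x/(1-gamma) + b > 0")
    return (1.0 - gamma) / gamma * base ** gamma

def exponential_utility(x, alpha=1.0):
    """Exponential (CARA) utility U(x) = -exp(-alpha*x)."""
    return -math.exp(-alpha * x)

# Illustrative values only: x, a and the gamma grid are arbitrary choices.
x, a = 1.0, 1.0
for gamma in (1e2, 1e4, 1e6):
    # As gamma grows, the HARA value approaches -exp(-a*x).
    print(gamma, hara_utility(x, gamma, a=a, b=1.0), exponential_utility(x, alpha=a))
```

In this limit the HARA parameter a plays the role of the risk aversion α of the exponential utility.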
These utility functions all fall under the HARA class and for many of the
optimization problems studied in this paper, they lead to closed form solutions.
In the case of power utility, which we will consider in this section, we can
simplify calculations by using a result about its homotheticity.
Proposition 1.3.5 (Power Utility Homotheticity). Let $U(x) = \frac{1}{\gamma} x^{\gamma}$ for $\gamma \in (0, 1)$.
Then V (t, x, y) is increasing and concave in both x and y and ∀ρ > 0:
$$V(t, \rho x, \rho y) = \rho^{\gamma} V(t, x, y)$$
For a proof, see Davis and Norman 1990, Theorem 3.1 [6].

Chapter 2
Portfolio Allocation
2.1 Optimising Risky Wealth Under Exponential Utility
In this subsection, we assume interest rate, r = 0. Hence the risk-free account
follows Bt = 1. Let Xt denote the investor’s total wealth at time t. Let πt be
the proportion of total wealth that the investor has invested in the risky asset
S at time t.
Hence, πtXt is the amount of wealth in units of currency invested in the
stock at time t and $\pi_t X_t / S_t$ is the number of shares the investor owns at t.
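We wish to optimize over all possible dynamic strategies πt, the proportion
of the investor’s total wealth which is ’risky wealth’, in order to maximize their
expected utility. Since $dB_t = 0$ when $r = 0$, Xt follows the dynamics¹
$$\begin{aligned} dX_t &= \frac{\pi_t X_t}{S_t}\, dS_t + \frac{(1-\pi_t) X_t}{B_t}\, dB_t \\ &= \frac{\pi_t X_t}{S_t}\, S_t(\mu\, dt + \sigma\, dW_t) \\ &= \pi_t X_t(\mu\, dt + \sigma\, dW_t) \end{aligned} \tag{2.1}$$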
The sum of proportions invested over all assets in the market model must equal
1. In this two asset market this of course means the investor places (1 − πt)Xt
units of wealth into the riskless asset. If πt > 1, this implies the investor is short
in the riskless asset. Similarly if πt < 0 the investor is short in the risky asset.
Define the utility function U : R → R with risk aversion parameter α by
$$U(x) = -e^{-\alpha x}$$
This falls under the class of constant absolute risk aversion (CARA) utility
functions and is concave and increasing on R, implying ∀t ∈ [0, T], V (t, x) is also
increasing and concave (see [5], pg. 52 for a proof).
¹ We may also impose the additional restriction that Xt ≥ 0, depending on whether the
investor is allowed to continue trading after bankruptcy.
Define the set of all admissible trading strategies
$$\mathcal{A} = \{ \pi(\cdot) : \text{bounded, adapted stochastic process s.t. } X^{\pi}_x \ge 0 \ \ \mathbb{P}\text{-a.s.} \}$$
where $X^{\pi}_x(t)$ corresponds to the wealth of the investor following strategy π at
time t with initial wealth x. We want to maximise expected utility over all
π ∈ A.
Define our value function:
$$V(t,x) = \sup_{\pi \in \mathcal{A}} \mathbb{E}^{t,x}\left[ U(X_T) \right] \tag{2.2}$$
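The Hamilton-Jacobi-Bellman equation corresponding to (2.2) is given by
Theorem 1.3.2²
$$\frac{\partial V}{\partial t} + \sup_{\pi \in \mathbb{R}} \left[ \pi x \mu \frac{\partial V}{\partial x} + \frac{1}{2}\pi^2\sigma^2 x^2 \frac{\partial^2 V}{\partial x^2} \right] = 0 \tag{2.3}$$
with boundary condition V (T, x) = U(x). Denote the optimal π by $\pi^*$. Differentiating
the supremum part in (2.3) with respect to π and equating to zero to
find the optimal π yields:
$$\frac{d}{d\pi}\left[ \pi x \mu \frac{\partial V}{\partial x} + \frac{1}{2}\pi^2\sigma^2 x^2 \frac{\partial^2 V}{\partial x^2} \right] = 0$$
$$x\mu \frac{\partial V}{\partial x} + x^2\sigma^2\pi^* \frac{\partial^2 V}{\partial x^2} = 0$$
$$\implies \pi^* = -\frac{\mu\, \frac{\partial V}{\partial x}}{x\sigma^2\, \frac{\partial^2 V}{\partial x^2}} \tag{2.4}$$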
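Substituting this optimal value of $\pi^*$ into (2.3) gives rise to a non-linear PDE:
$$\frac{\partial V}{\partial t} - \frac{\mu^2}{2\sigma^2}\, \frac{\left( \frac{\partial V}{\partial x} \right)^2}{\frac{\partial^2 V}{\partial x^2}} = 0 \tag{2.5}$$
This is a separable partial differential equation, so we can find a solution using
an ansatz of the form $V(t,x) = -e^{-\alpha x - \alpha\beta(T-t)}$.
We now have:
$$\begin{aligned} \frac{\partial V}{\partial t} &= -\alpha\beta\, e^{-\alpha x - \alpha\beta(T-t)} \\ \frac{\partial V}{\partial x} &= \alpha\, e^{-\alpha x - \alpha\beta(T-t)} \\ \frac{\partial^2 V}{\partial x^2} &= -\alpha^2\, e^{-\alpha x - \alpha\beta(T-t)} \end{aligned} \tag{2.6}$$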
² In this case the corresponding quantities are f(t, x, π) = 0 and
b(x, π)dt + σ(x, π)dWt = πµxdt + σπxdWt.

Substituting (2.6) into (2.5) gives us a value for β:
$$\beta = \frac{\mu^2}{2\alpha\sigma^2} \tag{2.7}$$
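For readers who want the intermediate step, the substitution can be written out as follows. It uses only the derivatives in (2.6) and the PDE (2.5); the shorthand E for the common exponential factor is introduced here for brevity and does not appear in the original text.

```latex
% Shorthand (an assumption of this sketch): E = e^{-\alpha x - \alpha\beta(T-t)} > 0
\begin{align*}
0 &= \frac{\partial V}{\partial t}
     - \frac{\mu^2}{2\sigma^2}\,
       \frac{\left(\partial V/\partial x\right)^2}{\partial^2 V/\partial x^2} \\
  &= -\alpha\beta E - \frac{\mu^2}{2\sigma^2}\,\frac{\alpha^2 E^2}{-\alpha^2 E} \\
  &= \left(-\alpha\beta + \frac{\mu^2}{2\sigma^2}\right) E
  \quad\Longrightarrow\quad \beta = \frac{\mu^2}{2\alpha\sigma^2}.
\end{align*}
```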
Also, $\pi^*$ in (2.4) becomes
$$\pi^* = \frac{\mu}{x\sigma^2\alpha} \tag{2.8}$$
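It is important to note that $\pi^*$ depends on wealth only through the factor 1/x, so the
amount of wealth held in the risky asset, $\pi^* x = \mu/(\sigma^2\alpha)$, does not depend on x.
For fixed initial wealth X0 = x this gives a constant $\pi^*$, in agreement
with Merton’s discovery: the optimal strategy is for the investor to keep
this position in the risky asset S and to rebalance the
portfolio continuously to maintain it.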
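The analytic value function V (t, x) under exponential utility is
$$V(t,x) = -\exp\left( -\alpha x - \alpha\left( \frac{\mu^2}{2\alpha\sigma^2} \right)(T-t) \right) \tag{2.9}$$
The dynamics of Xt derived in (2.1) now reduce to arithmetic Brownian motion,
which by direct integration gives:
$$dX_t = \frac{\mu^2}{\sigma^2\alpha}\, dt + \frac{\mu}{\sigma\alpha}\, dW_t \implies X_t = X_0 + \frac{\mu^2}{\sigma^2\alpha}\, t + \frac{\mu}{\sigma\alpha}\, W_t \tag{2.10}$$
Hence $X_t \sim N\left( X_0 + \frac{\mu^2}{\sigma^2\alpha} t,\ \frac{\mu^2}{\sigma^2\alpha^2} t \right)$. We see that unless the
model is modified so that the investor stops trading upon hitting bankruptcy,
Xt can become negative.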
2.2 Extension to non-zero interest rates
We extend our market model from the previous section by now adding a constant
interest rate r to the riskless asset.
Let St follow:
dSt = St[(r + µ)dt + σdWt]
and let the riskless asset obey dBt = rBtdt as usual.
Consider a ’power utility’/constant relative risk aversion (CRRA) utility
function:
$$U(x) = \frac{1}{1-\gamma} x^{1-\gamma}, \quad \gamma \in (0, 1)$$
Consider as before all admissible dynamic trading strategies, πt. Again we want
to optimise V (t, x) as defined in (2.2) for our new utility function on a finite
time horizon.
The investor’s total wealth obeys

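$$\begin{aligned} dX_t &= \frac{\pi_t X_t}{S_t}\, dS_t + \frac{X_t(1-\pi_t)}{B_t}\, dB_t \\ &= \pi_t X_t[(r+\mu)\, dt + \sigma\, dW_t] + X_t(1-\pi_t) r\, dt \\ &= X_t[\pi_t\mu + r]\, dt + X_t\pi_t\sigma\, dW_t \end{aligned} \tag{2.11}$$
By applying Theorem 1.3.2, the corresponding HJB equation is:
$$\frac{\partial V}{\partial t} + \sup_{\pi \in \mathcal{A}} \left[ (\pi x \mu + r x)\frac{\partial V}{\partial x} + \frac{1}{2}\pi^2\sigma^2 x^2 \frac{\partial^2 V}{\partial x^2} \right] = 0 \tag{2.12}$$
with boundary condition V (T, x) = U(x).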
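Using the ansatz $V(t,x) = \frac{1}{1-\gamma} x^{1-\gamma} f(t)$ for some function f and following
the same procedure as in the exponential utility case to find $\pi^*$ and f, we have
$$\begin{aligned} \frac{\partial V}{\partial t} &= \frac{1}{1-\gamma} x^{1-\gamma} f'(t) \\ \frac{\partial V}{\partial x} &= x^{-\gamma} f(t) \\ \frac{\partial^2 V}{\partial x^2} &= -\gamma x^{-\gamma-1} f(t) \end{aligned} \tag{2.13}$$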
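Substituting (2.13) into (2.12) for optimal $\pi^*$ and unknown f(t) gives us
explicit solutions:
$$\pi^* = \frac{\mu}{\gamma\sigma^2} \tag{2.14}$$
$$\begin{aligned} \frac{1}{1-\gamma} x^{1-\gamma} f'(t) &= \left( -\frac{\mu^2}{2\gamma\sigma^2} - r \right) x^{1-\gamma} f(t) \\ \implies f'(t) &= -(1-\gamma)\left( \frac{\mu^2}{2\gamma\sigma^2} + r \right) f(t) \end{aligned} \tag{2.15}$$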
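The step from (2.12) to (2.14) and (2.15) can be spelled out as follows. This is a worked sketch using only the first-order condition for π and the derivatives in (2.13); nothing beyond the symbols already defined is assumed.

```latex
\begin{align*}
\pi^* &= -\frac{\mu\,\partial V/\partial x}{x\sigma^2\,\partial^2 V/\partial x^2}
       = -\frac{\mu\,x^{-\gamma} f(t)}{x\sigma^2\,(-\gamma)\,x^{-\gamma-1} f(t)}
       = \frac{\mu}{\gamma\sigma^2}, \\
(\pi^*\mu + r)\,x\,\frac{\partial V}{\partial x}
  + \frac12 (\pi^*)^2\sigma^2 x^2\,\frac{\partial^2 V}{\partial x^2}
  &= \left(\frac{\mu^2}{\gamma\sigma^2} + r\right) x^{1-\gamma} f(t)
     - \frac{\mu^2}{2\gamma\sigma^2}\,x^{1-\gamma} f(t) \\
  &= \left(\frac{\mu^2}{2\gamma\sigma^2} + r\right) x^{1-\gamma} f(t).
\end{align*}
```

Moving this term to the right-hand side of (2.12) gives the first line of (2.15).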
The solution of the ODE in (2.15) with the terminal condition f(T) = 1 is given
by³
$$f(t) = \exp\left( (1-\gamma)\left( \frac{\mu^2}{2\gamma\sigma^2} + r \right)(T-t) \right)$$
So the previously unknown function f(t) is:
$$f(t) = e^{(T-t)(1-\gamma)\beta}$$
where we have used:
$$\beta = \frac{1}{2}\frac{\mu^2}{\gamma\sigma^2} + r$$
We say β represents a ’fictitious safe rate’. If the investor were to place their
wealth in a safe asset with compound interest rate β, they would attain the
same utility as by following the trading strategy $\pi^*$.
³ The terminal condition follows because at time T we have $V(T,x) = U(x) = \frac{1}{1-\gamma} x^{1-\gamma}$.
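This gives us an explicit representation for V (t, x):
$$V(t,x) = \frac{1}{1-\gamma} x^{1-\gamma} \exp\left( (T-t)(1-\gamma)\left( \frac{1}{2}\frac{\mu^2}{\gamma\sigma^2} + r \right) \right) \tag{2.16}$$
The optimal $\pi^*$ is constant and is given by:
$$\pi^* = \frac{\mu}{\gamma\sigma^2}$$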
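In this case, the dynamics of Xt in (2.11) reduce to geometric Brownian
motion:
$$\begin{aligned} dX_t &= X_t[\pi_t\mu + r]\, dt + X_t\pi_t\sigma\, dW_t \\ &= \left( \frac{\mu^2}{\gamma\sigma^2} + r \right) X_t\, dt + \frac{\mu}{\gamma\sigma} X_t\, dW_t \\ \implies X_t &= X_0 \exp\left( \left[ \left( \frac{\mu^2}{\gamma\sigma^2} + r \right) - \frac{\mu^2}{2\gamma^2\sigma^2} \right] t + \frac{\mu}{\gamma\sigma} W_t \right) \end{aligned} \tag{2.17}$$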
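As a quick consistency check, the closed form (2.16) can be compared with a Monte Carlo average of the utility of terminal wealth sampled from (2.17). The sketch below is illustrative only: the function names, the random seed and the parameter values are assumptions made here, not values taken from the thesis.

```python
import numpy as np

def power_value(t, x, mu, sigma, r, gamma, T):
    """Closed-form value function (2.16) for power utility."""
    beta = 0.5 * mu**2 / (gamma * sigma**2) + r
    return x**(1.0 - gamma) / (1.0 - gamma) * np.exp((T - t) * (1.0 - gamma) * beta)

def terminal_wealth(x0, mu, sigma, r, gamma, T, n_paths, rng):
    """Sample X_T from the explicit solution (2.17) under pi* = mu / (gamma * sigma^2)."""
    pi_star = mu / (gamma * sigma**2)
    drift = pi_star * mu + r - 0.5 * (pi_star * sigma)**2   # drift of log-wealth
    w_T = np.sqrt(T) * rng.standard_normal(n_paths)          # W_T ~ sqrt(T) * N(0, 1)
    return x0 * np.exp(drift * T + pi_star * sigma * w_T)

# Hypothetical test parameters (not taken from the tables in the text).
rng = np.random.default_rng(0)
x0, mu, sigma, r, gamma, T = 1.0, 0.05, 0.2, 0.02, 0.5, 1.0
x_T = terminal_wealth(x0, mu, sigma, r, gamma, T, 200_000, rng)
mc_value = np.mean(x_T**(1.0 - gamma) / (1.0 - gamma))       # estimate of E[U(X_T)]
print(mc_value, power_value(0.0, x0, mu, sigma, r, gamma, T))
```

Sampling X_T directly from the explicit solution avoids any time-discretisation error, so the only discrepancy between the two printed numbers is Monte Carlo noise.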
2.3 Numerical Implementation/Monte Carlo Estimates
In this section we wish to verify the previous results. We must first discretize
the continuous-time setting of the previous section so that we can
perform numerical analysis.
Once we have a model to simulate stock prices and the dynamics of the
wealth processes, we can perform thousands of simulations and use these paths
to find an estimate for the value function in (2.2).
Simulation Model
The stock price process and wealth process with constant $\pi^*$ follow geometric
Brownian motion and can be discretized by various methods such as the
Euler-Maruyama method or the Milstein method, but since GBM has a closed-form
solution it is possible to simulate the log-price process and then exponentiate.
Assume dSt = St(µdt + σdWt). Let us divide the interval [0, T] into N
equal time steps such that T = Nδ. Then the following code simulates a price
path, making use of the property $W_t \sim \sqrt{t}\, N(0, 1)$.⁴
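#GenStockPrice.py
import numpy as np

def gen_BS_pricepath(mu, sigma, S0, N, T):
    """Simulate one Black-Scholes (GBM) price path on N equal steps over [0, T]."""
    delta = T / float(N)      # length of one time step
    s = np.zeros(N + 1)       # log-price path
    s[0] = np.log(S0)
    for i in range(1, N + 1):
        # exact log-price increment: drift plus a scaled standard normal shock
        s[i] = (s[i-1] + (mu - 0.5 * sigma**2) * delta
                + sigma * np.sqrt(delta) * np.random.normal())
    return np.exp(s)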
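Since the Monte Carlo estimates in this section need thousands of paths, it can be convenient to generate them all at once. The following is a vectorised sketch of the same log-price scheme; the function name `gen_bs_pricepaths`, the explicit random generator argument and the parameter values in the usage lines are assumptions made for illustration.

```python
import numpy as np

def gen_bs_pricepaths(mu, sigma, s0, n_steps, T, n_paths, rng):
    """Simulate n_paths GBM price paths at once; returns shape (n_paths, n_steps + 1)."""
    delta = T / n_steps
    # Independent log-price increments for every path and time step.
    increments = ((mu - 0.5 * sigma**2) * delta
                  + sigma * np.sqrt(delta) * rng.standard_normal((n_paths, n_steps)))
    # Prepend a zero column so that every path starts at log(s0).
    log_paths = np.log(s0) + np.concatenate(
        [np.zeros((n_paths, 1)), np.cumsum(increments, axis=1)], axis=1)
    return np.exp(log_paths)

rng = np.random.default_rng(42)
paths = gen_bs_pricepaths(mu=0.05, sigma=0.2, s0=1.0, n_steps=250, T=1.0,
                          n_paths=10_000, rng=rng)
# The sample mean of S_T should be close to S0 * exp(mu * T).
print(paths.shape, paths[:, -1].mean(), np.exp(0.05))
```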
The riskless asset is not influenced by any stochastic element, so to simulate
the riskless balance at each δ⁵ we simply loop over:
#Bank Balance logic
b[i] = b[i-1]*np.exp(r*delta)
Optimal pi
For exponential utility $U(x) = -e^{-\alpha x}$, using our derived $V(t,x) = -e^{-\alpha x - \alpha\beta(T-t)}$
in (2.9) with $\beta = \frac{\mu^2}{2\alpha\sigma^2}$ and test parameters t = 0, T = 1, x = 1, µ = 0.05, σ =
0.2, α = 1 gives $\pi^* = 1.25$.
Comparing the theoretically derived value function V (t, x) to a Monte
Carlo simulation of the expected utility of the wealth process with optimal $\pi^*$
and 10,000 scenarios gives us a result in very close agreement:
• Analytic V (t, x) = −0.35656
• Simulated V (t, x) = −0.35678
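A comparison of this kind can be reproduced in a few lines. The sketch below samples terminal wealth from the arithmetic Brownian motion (2.10) and averages the exponential utility; the random seed, the number of paths and the helper name `exp_value` are assumptions, so the simulated figure will differ slightly from the one reported above.

```python
import numpy as np

def exp_value(t, x, mu, sigma, alpha, T):
    """Closed-form value function (2.9) for exponential utility with r = 0."""
    return -np.exp(-alpha * x - 0.5 * mu**2 / sigma**2 * (T - t))

rng = np.random.default_rng(1)
x0, mu, sigma, alpha, T = 1.0, 0.05, 0.2, 1.0, 1.0   # test parameters quoted in the text
n_paths = 200_000                                     # assumed sample size for this sketch

# Terminal wealth from (2.10): arithmetic Brownian motion under the optimal strategy.
w_T = np.sqrt(T) * rng.standard_normal(n_paths)
x_T = x0 + mu**2 / (sigma**2 * alpha) * T + mu / (sigma * alpha) * w_T

mc_value = np.mean(-np.exp(-alpha * x_T))             # Monte Carlo estimate of E[U(X_T)]
print(mc_value, exp_value(0.0, x0, mu, sigma, alpha, T))
```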
To verify that this value of $\pi^*$ is indeed the optimal choice for these
parameters, we show numerically that the expected utility decreases when $\pi^*$ is
perturbed.
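The same perturbation experiment can also be done without simulation noise. The sketch below assumes the risky position is held as a constant cash amount θ (so that θ coincides with π for initial wealth x = 1) and r = 0; terminal wealth is then Gaussian and the expected exponential utility has a closed form. This is an illustrative simplification, not necessarily the procedure used to produce the figures in the thesis.

```python
import numpy as np

def expected_exp_utility(theta, x, mu, sigma, alpha, T):
    """E[-exp(-alpha * X_T)] for X_T = x + theta * (mu*T + sigma*W_T), with r = 0.

    theta is a constant cash amount held in the stock (an assumption of this sketch).
    """
    return -np.exp(-alpha * x - alpha * theta * mu * T
                   + 0.5 * (alpha * theta * sigma)**2 * T)

x, mu, sigma, alpha, T = 1.0, 0.05, 0.2, 1.0, 1.0
thetas = np.linspace(0.0, 2.5, 251)                  # grid with step 0.01
values = expected_exp_utility(thetas, x, mu, sigma, alpha, T)
best = thetas[np.argmax(values)]                     # grid point with highest expected utility
print(best, mu / (sigma**2 * alpha))
```

The maximiser on the grid coincides with µ/(σ²α), and expected utility falls away on either side of it.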
For the power utility case with γ = 1/2 and σ = 0.2, a Monte Carlo estimate
of V (t, x) with 10,000 simulations leads to results in very close agreement with the
theoretical values of V (t, x) shown in the table below.
⁴ N(µ, σ²) denotes the normal distribution with mean µ and variance σ².
⁵ Interest is continuously compounded in these simulations.

Figure 2.1: Proportion Risky Wealth vs. Expected Utility for π-values at
increments of 0.02 with 1,000 simulations each. Optimal $\pi^* = 1.25$ shown by red
line.
Figure 2.2: A sample stock price path with optimal $\pi^*$ and optimal number of
shares $\pi^* X_t / S_t$ for µ = 0.05, σ = 0.2, α = 1 and T = 1 year. (Exponential Utility)
Parameters              Monte Carlo V(t, x)    Theoretical V(t, x)
µ = 0.02, r = 0         6.383373               6.388118
µ = 0.05, r = 0         6.712548               6.732454
µ = 0, r = 0            6.324555               6.324555
µ = 0.02, r = 0.02      6.524122               6.517167
µ = 0.05, r = 0.02      6.884561               6.868459
µ = −0.02, r = 0.02     6.514090               6.517167

We turn our attention now to the investor’s risk aversion parameter.
Intuitively, increasing α (or γ in the power utility case) means the investor is
less prone to taking on risk via the risky asset. Consequently, their expected
wealth should be lower, but its standard deviation should also be smaller.
Conversely, a low risk aversion means the investor is willing to invest a greater
proportion of their wealth in the risky asset and (in the case of these parameters)
will result in a higher expected terminal wealth but with a higher standard
deviation. This can be seen in Figure 2.3 for exponential utility, with the final
wealth distribution of Xt tending to $N\left( X_0 + \frac{\mu^2}{\sigma^2\alpha} t,\ \frac{\mu^2}{\sigma^2\alpha^2} t \right)$ as expected due to the
ABM behaviour of Xt.
In the case of power utility function, the wealth dynamics are given by
the Geometric Brownian Motion in (2.17), and so in this case Xt will be log-
normally distributed and this behaviour can be clearly seen in Figure 2.4 with
the lack of symmetry and longer right tail.

Figure 2.3: Histogram of terminal wealth distribution for different risk aversions
for the exponential utility case. µ = 0.05, σ = 0.2, T = 1 year.

Figure 2.4: Histogram of terminal wealth distribution for different risk aversions
for the power utility case. µ = 0.05, σ = 0.2, T = 1 year.

2.3.1 Re-balancing the portfolio
The results we have derived so far rely on the investor being able to rebalance
their portfolio continuously in time. However, in practice it is impossible for
an investor to rebalance in such a manner. In this subsection, we look at rebalancing
our portfolio weights at various frequencies to see how this affects our terminal
wealth and utility.
We have derived a closed-form solution for the wealth process Xt and
shown that in the absence of transaction costs or fees, the optimal control πt
is constant. This gives us a theoretical value in a continuous time setting with
instant and continuous rebalancing given by (2.17) in the case of power utility
and (2.10) in the case of exponential utility.
We shall consider the power utility case, with Xt behaving as in (2.17).
Recall, the stock price obeys Geometric Brownian Motion and the riskless
account is continuously compounded at a constant interest rate of r.
Figure 2.5: Sample path of theoretical total wealth process, with amount of
wealth in stock and in riskless if continuous rebalancing is performed.
Figure 2.5 demonstrates how Xt, wealth in stock and wealth in bank evolve
in the case of continuous portfolio rebalancing.
As before, divide [0, T] into N equal time steps such that T = Nδ. Let
$X_{i\delta}$ be the investor’s total wealth at discrete time point iδ for i ∈ [0, N]. Let
$X^S_{i\delta}$ denote the amount of wealth invested in the stock S at iδ and $X^B_{i\delta}$ be the
amount of wealth invested in the riskless asset B.
At certain regular time-points kiδ for some k ∈ N the investor rebalances
their portfolio⁶, buying or selling the risky asset (stock) such that the proportion
of wealth they have invested in the stock is again $\pi^*$.
At each rebalance step kiδ, the investor wants to re-attain the proportion
$\pi_t = \pi^*$, so they must transfer
$$R_{ki\delta} = (1 - \pi^*) X^S_{ki\delta} - \pi^* X^B_{ki\delta}$$
from the stock to the riskless asset. After the transfer the stock holding equals
$\pi^*(X^S_{ki\delta} + X^B_{ki\delta})$, i.e. the fraction $\pi^*$ of total wealth.
At these rebalancing steps, the usual evolution process for the stock is
replaced by $X^S_{ki\delta} - R_{ki\delta}$. Similarly for the riskless asset, we replace $X^B_{ki\delta}$ with
$X^B_{ki\delta} + R_{ki\delta}$.
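The rebalancing rule above can be sketched in code as follows. This is a minimal illustration, assuming the stock dynamics of Section 2.2 and a continuously compounded bank account; the function name, the seed and the parameter values in the usage lines are hypothetical.

```python
import numpy as np

def simulate_rebalanced(pi_star, mu, sigma, r, x0, T, n_steps, k, rng):
    """Simulate wealth in stock and in bank, rebalancing to pi_star every k steps.

    The stock follows dS = S[(r + mu)dt + sigma dW]; the bank grows at rate r.
    Returns two arrays of length n_steps + 1: wealth in stock and wealth in bank.
    """
    delta = T / n_steps
    x_s = np.empty(n_steps + 1)
    x_b = np.empty(n_steps + 1)
    x_s[0], x_b[0] = pi_star * x0, (1.0 - pi_star) * x0
    for i in range(1, n_steps + 1):
        z = rng.standard_normal()
        growth = np.exp((r + mu - 0.5 * sigma**2) * delta + sigma * np.sqrt(delta) * z)
        x_s[i] = x_s[i - 1] * growth              # stock holding moves with the price
        x_b[i] = x_b[i - 1] * np.exp(r * delta)   # bank balance accrues interest
        if i % k == 0:
            # Amount R moved from the stock to the bank, as in the formula above.
            transfer = (1.0 - pi_star) * x_s[i] - pi_star * x_b[i]
            x_s[i] -= transfer
            x_b[i] += transfer
    return x_s, x_b

# Hypothetical parameters: 250 daily steps over one year, rebalancing every 50 steps.
rng = np.random.default_rng(2)
x_s, x_b = simulate_rebalanced(1.25, 0.05, 0.2, 0.02, 1.0, 1.0, 250, 50, rng)
print(x_s[50] / (x_s[50] + x_b[50]))   # proportion in stock right after a rebalance
```

The transfer leaves total wealth unchanged and only redistributes it between the two accounts, which is why the printed proportion equals the target immediately after each rebalance.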
Figure 2.6: Rebalancing πt with k = 50. πt evolves as usual according to the
evolution of St and Bt until every 50th day, where the investor buys or sells
shares to bring πt back to $\pi^*$, in this case $\pi^* = 1.25$.
Figure 2.6 demonstrates how the proportion of wealth held in the stock S
ie. πt varies with time. The same parameters are used as at the start of Section
2.3.
When rebalancing is performed less often than daily (k = 5), as in Figure 2.7,
the investor’s total actual wealth is very close to the theoretical wealth process
given in (2.17), but begins to deviate from optimal.
Figure 2.8 shows the discrepancy between the theoretical wealth process
and the investor’s actual wealth process with different rebalancing periods,
demonstrating how actual wealth deviates from the optimal wealth path obtained when the
ratio $\pi^*$ is maintained continuously.
⁶ For simplicity assume there are 250 trading days in a year, 20 trading days in a month
and 5 in a week (in reality there are approximately 252 trading days in a year).

Figure 2.7: Wealth in stock, riskless account and total wealth with continuous
time result over 5 years (T = 5, X0 = 1)

Figure 2.8: Optimal stock wealth (red) vs. actual stock wealth (green) for
various rebalancing frequencies.(a) = Daily, (b) = Weekly, (c) = Fortnightly,
(d) = Monthly, (e) = Bi-monthly, (f) = No rebalancing

2.4 Addition of Consumption and Infinite Horizons
In this section we add an additional variable to our model. Now, let us assume
the investor wants to optimize how they withdraw wealth from their portfolio
in order to spend on goods and services.
Let us assume the stock price evolves as dSt = St[(r + µ)dt + σdWt] as in
section 2.2. Let πt be the proportion of wealth invested in the stock at time t
and ct be the ’consumption rate’.
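The SDE for Xt is:
$$\begin{aligned} dX_t &= \frac{\pi_t X_t}{S_t}\, dS_t + \frac{X_t(1-\pi_t)}{B_t}\, dB_t - c_t\, dt \\ &= \pi_t X_t[(r+\mu)\, dt + \sigma\, dW_t] + X_t(1-\pi_t) r\, dt - c_t\, dt \\ &= (\pi_t\mu X_t + r X_t - c_t)\, dt + X_t\pi_t\sigma\, dW_t \end{aligned} \tag{2.18}$$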
Note in the case of zero-interest rate r = 0, which we will consider from now on
for simplicity, the SDE for Xt becomes
dXt = (µπtXt − ct)dt + σπtXtdWt (2.19)
We consider an infinite time horizon and we want to maximise over (π, c) ∈ A×C
where A is the set of admissible control processes π and C is the set of control
processes for consumption c.⁷
In this problem, the investor is trying to maximise the log-utility of their
consumption. Our value function is given by:
$$V(x) = \sup_{(\pi,c) \in \mathcal{A} \times \mathcal{C}} \mathbb{E}^{x}\left[ \int_0^\infty e^{-\delta t} \log(c_t)\, dt \right] \tag{2.20}$$
Since we are working in the infinite horizon case, the corresponding
Hamilton-Jacobi-Bellman equation is given by Theorem 1.3.3. The coefficients b(x, π) and
σ(x, π) in Theorem 1.3.3 correspond to (πµx − c) and σπx as in (2.19), the discount
rate β corresponds to δ, and the running gain is f = log(c):
$$-\delta V + \sup_{(\pi,c) \in \mathbb{R}^2} \left[ (\pi\mu x - c)\frac{\partial V}{\partial x} + \frac{1}{2}(\pi\sigma x)^2 \frac{\partial^2 V}{\partial x^2} + \log(c) \right] = 0 \tag{2.21}$$
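Let $\pi^*$, $c^*$ denote the optimal values of π, c respectively.
To find the maximising values of π and c, we set the partial derivatives of the
supremum part of (2.21) to zero, $\frac{\partial}{\partial c}(\cdots) = 0$ and $\frac{\partial}{\partial \pi}(\cdots) = 0$:
$$-\frac{\partial V}{\partial x} + \frac{1}{c^*} = 0 \implies c^* = 1 \Big/ \frac{\partial V}{\partial x}$$
Similarly,
$$\pi^* = -\frac{\mu\, \frac{\partial V}{\partial x}}{x\sigma^2\, \frac{\partial^2 V}{\partial x^2}} = \frac{\mu}{\sigma^2}$$
⁷ c has the restrictions that $c_t \ge 0$ for all t and $\int_0^\infty |c_t|\, dt < \infty$.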
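Hence we see that the optimal consumption is proportional to the investor’s
wealth at time t. Using the ansatz $V(x) = \frac{1}{\delta}\log(x) + C_1$ we calculate the derivatives
$$\frac{\partial V}{\partial x} = \frac{1}{\delta x}, \qquad \frac{\partial^2 V}{\partial x^2} = -\frac{1}{\delta x^2}$$
We see that $c^* = \delta x$⁸ and the HJB equation (2.21) becomes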
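$$\begin{aligned} 0 &= \mu\pi^* - \frac{c^*}{x} - \frac{1}{2}(\pi^*\sigma)^2 + \delta\log(c^*) - \delta\log(x) - \delta^2 C_1 \\ 0 &= \frac{\mu^2}{\sigma^2} - \delta - \frac{\mu^2}{2\sigma^2} + \delta\log\left( \frac{\delta x}{x} \right) - \delta^2 C_1 \end{aligned}$$
After re-arranging,
$$C_1 = \frac{\mu^2 - 2\sigma^2\delta + 2\sigma^2\delta\log(\delta)}{2\delta^2\sigma^2}$$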
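and for optimal ($\pi^*$, $c^*$), the SDE for Xt as in (2.19) becomes
$$dX_t = \left( \frac{\mu^2}{\sigma^2} - \delta \right) X_t\, dt + \frac{\mu}{\sigma} X_t\, dW_t$$
As in previous sections, the solution is in the form of geometric Brownian
motion:
$$X_t = X_0 \exp\left( \left( -\delta + \frac{\mu^2}{\sigma^2} - \frac{\mu^2}{2\sigma^2} \right) t + \frac{\mu}{\sigma} W_t \right) \tag{2.22}$$
V (x) is given by:
$$V(x) = \frac{1}{\delta}\log(x) + \frac{\mu^2 - 2\sigma^2\delta + 2\sigma^2\delta\log(\delta)}{2\delta^2\sigma^2} \tag{2.23}$$
⁸ $c^*_t$ is a constant fraction of Xt, meaning that the investor’s consumption rate is linearly
linked to their current wealth.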

Numerical Results
From the value function (2.23), we aim to find an approximation to V (x) by
simulating many wealth paths obeying (2.22). Since this problem is stated in
the context of an infinite horizon, we take a large value for T for approximation
purposes. Let T = 100 years.
For parameters µ = 0.1, σ = 0.4, X0 = S0 = 1, δ = 0.5, T = 100 years,
assuming 250 trading days per year and running 1000 simulations yields an
estimate $\hat{V}(x) = -3.28909282$. This is in close agreement with the analytic
expression for the value function, V (x) = −3.26129436.
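A procedure of this kind can be sketched as follows. The code evaluates (2.23) directly and approximates the discounted integral in (2.20) by a left-point Riemann sum along paths of (2.22) with c = δX. The horizon T = 40, the step size, the number of paths and the seed are assumptions chosen to keep the example light; they are coarser than the settings described above, so the estimate will not match the reported one digit for digit.

```python
import numpy as np

def log_consumption_value(x, mu, sigma, delta):
    """Closed-form value function (2.23) for log utility of consumption (r = 0)."""
    c1 = ((mu**2 - 2 * sigma**2 * delta + 2 * sigma**2 * delta * np.log(delta))
          / (2 * delta**2 * sigma**2))
    return np.log(x) / delta + c1

def mc_log_consumption_value(x0, mu, sigma, delta, T, n_steps, n_paths, rng):
    """Riemann-sum estimate of E[int_0^T exp(-delta*t) log(c_t) dt] with c_t = delta * X_t."""
    dt = T / n_steps
    t = np.arange(n_steps) * dt                       # left end of each time step
    dw = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
    w = np.cumsum(dw, axis=1) - dw                    # Brownian motion at the left ends
    # log X_t from (2.22): drift mu^2/(2 sigma^2) - delta, volatility mu/sigma
    log_x = np.log(x0) + (0.5 * mu**2 / sigma**2 - delta) * t + (mu / sigma) * w
    integrand = np.exp(-delta * t) * (np.log(delta) + log_x)
    return np.mean(np.sum(integrand, axis=1) * dt)

rng = np.random.default_rng(3)
analytic = log_consumption_value(1.0, 0.1, 0.4, 0.5)
estimate = mc_log_consumption_value(1.0, 0.1, 0.4, 0.5, T=40.0, n_steps=2000,
                                    n_paths=1000, rng=rng)
print(analytic, estimate)
```

Truncating the horizon is harmless here because the discount factor exp(−δt) makes the tail of the integral negligible well before T = 40.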
Figure 2.9: Optimal consumption, bank and stock wealth with stock price path
and total wealth shown. Note how consumption is a constant proportion of
total wealth. µ = 0.1, σ = 0.4, X0 = S0 = 1, δ = 0.5, T = 1 year.

Chapter 3
Extension to Stochastic Volatility
It is unrealistic over longer time horizons to assume that interest rates and
volatility will be constant. In this chapter we examine Merton’s problem in the
presence of stochastic volatility.
3.1 Merton’s Problem in the Heston Model
In this section we address solving Merton’s problem in the presence of stochastic
volatility. We examine the problem in the framework of the much celebrated
and studied Heston model, pioneered by Steven Heston in his 1993 paper [11].
Assume St follows
$$dS_t = (\mu Y_t + r) S_t\, dt + \sqrt{Y_t}\, S_t\, dW^S_t$$
µ is the rate of return of the stock, as in the Black-Scholes framework,
and we assume a fixed constant interest rate r. The main difference is that the
volatility in the Heston model is itself a stochastic process, following the dynamics
$$dY_t = \kappa(\theta - Y_t)\, dt + \xi\sqrt{Y_t}\, dW^Y_t$$
We say κ is the rate of reversion, i.e. the rate at which Yt returns to the
long-term mean given by θ. ξ is the volatility of the volatility.
In this model there are two driving Brownian motions, $W^S_t$ and $W^Y_t$.
These are correlated with correlation ρ ∈ [−1, 1].¹
¹ In order to keep the volatility positive, we enforce that 2κθ > ξ². This is known as the
Feller condition.
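To make the model concrete, the following sketch simulates one joint path of the stock and the volatility process with correlated shocks. The discretisation (an Euler step with negative values of Y truncated at zero, and a log-Euler step for the stock) is a common choice assumed here for illustration; it is not claimed to be the scheme used for the results in this thesis, and all parameter values are hypothetical.

```python
import numpy as np

def simulate_heston(s0, y0, mu, r, kappa, theta, xi, rho, T, n_steps, rng):
    """Simulate dS = (mu*Y + r) S dt + sqrt(Y) S dW^S and dY = kappa(theta - Y) dt + xi sqrt(Y) dW^Y.

    The two Brownian increments have correlation rho. Returns arrays (s, y).
    """
    dt = T / n_steps
    s = np.empty(n_steps + 1)
    y = np.empty(n_steps + 1)
    s[0], y[0] = s0, y0
    for i in range(n_steps):
        z1, z2 = rng.standard_normal(2)
        dw_y = np.sqrt(dt) * z1
        dw_s = np.sqrt(dt) * (rho * z1 + np.sqrt(1.0 - rho**2) * z2)  # correlated with dw_y
        y_pos = max(y[i], 0.0)        # truncation: a negative Y is treated as zero
        y[i + 1] = y[i] + kappa * (theta - y_pos) * dt + xi * np.sqrt(y_pos) * dw_y
        # log-Euler step for the stock, which keeps prices strictly positive
        s[i + 1] = s[i] * np.exp((mu * y_pos + r - 0.5 * y_pos) * dt
                                 + np.sqrt(y_pos) * dw_s)
    return s, y

# Hypothetical parameters satisfying 2*kappa*theta > xi^2.
rng = np.random.default_rng(4)
s, y = simulate_heston(s0=1.0, y0=0.04, mu=0.5, r=0.02, kappa=2.0, theta=0.04,
                       xi=0.3, rho=-0.5, T=1.0, n_steps=250, rng=rng)
print(s[-1], y[-1])
```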
Let us consider a finite horizon problem, where the investor seeks to optimise
the proportion of wealth πt invested in the risky asset. Let Xt denote the
investor’s wealth at time t. Xt follows
$$dX_t = \frac{X_t\pi_t}{S_t}\, dS_t + \frac{X_t(1-\pi_t)}{B_t}\, dB_t \tag{3.1}$$
$$\phantom{dX_t} = X_t[\pi_t\mu Y_t + r]\, dt + \pi_t X_t\sqrt{Y_t}\, dW^S_t \tag{3.2}$$
The investor’s optimisation problem is to maximise
$$V(t,x,y) = \sup_{\pi} \mathbb{E}^{t,x,y}\left[ U(X_T) \right]$$
where the utility function is a power utility of the form $U(x) = \frac{1}{1-\gamma} x^{1-\gamma}$.
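The HJB equation in the finite horizon case is:
$$\frac{\partial V}{\partial t} + \sup_{\pi} \left[ x(\pi\mu y + r)\frac{\partial V}{\partial x} + \kappa(\theta - y)\frac{\partial V}{\partial y} + \frac{1}{2}x^2\pi^2 y \frac{\partial^2 V}{\partial x^2} + x\pi\rho\xi y \frac{\partial^2 V}{\partial x \partial y} + \frac{1}{2}\xi^2 y \frac{\partial^2 V}{\partial y^2} \right] = 0$$
In a similar style to our constant volatility 1-dimensional derivation, to find
the optimal πt, we differentiate the supremum part and equate to zero:
$$\frac{d}{d\pi}\left[ x(\pi\mu y + r)\frac{\partial V}{\partial x} + \kappa(\theta - y)\frac{\partial V}{\partial y} + \frac{1}{2}x^2\pi^2 y \frac{\partial^2 V}{\partial x^2} + x\pi\rho\xi y \frac{\partial^2 V}{\partial x \partial y} + \frac{1}{2}\xi^2 y \frac{\partial^2 V}{\partial y^2} \right] = 0$$
$$\implies \mu x y \frac{\partial V}{\partial x} + x^2 y \pi^* \frac{\partial^2 V}{\partial x^2} + x\rho y\xi \frac{\partial^2 V}{\partial x \partial y} = 0$$
$$\implies \pi^* = \frac{-\mu x \frac{\partial V}{\partial x} - \rho\xi x \frac{\partial^2 V}{\partial x \partial y}}{x^2 \frac{\partial^2 V}{\partial x^2}}$$
$$\pi^* = \frac{-\mu \frac{\partial V}{\partial x} - \rho\xi \frac{\partial^2 V}{\partial x \partial y}}{x \frac{\partial^2 V}{\partial x^2}} \tag{3.3}$$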
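Substituting this into our HJB equation reduces it to:
$$\frac{\partial V}{\partial t} = \frac{\mu^2 y}{2}\, \frac{\left( \frac{\partial V}{\partial x} \right)^2}{\frac{\partial^2 V}{\partial x^2}} + \mu\rho\xi y\, \frac{\frac{\partial V}{\partial x}\, \frac{\partial^2 V}{\partial x \partial y}}{\frac{\partial^2 V}{\partial x^2}} + \frac{\rho^2\xi^2 y}{2}\, \frac{\left( \frac{\partial^2 V}{\partial x \partial y} \right)^2}{\frac{\partial^2 V}{\partial x^2}} - r x \frac{\partial V}{\partial x} - \kappa(\theta - y)\frac{\partial V}{\partial y} - \frac{\xi^2 y}{2}\frac{\partial^2 V}{\partial y^2}$$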
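For the purposes of our numerical analysis, we will consider only the power
utility case. We can exploit results on homotheticity as in [9], defining a reduced
value function² $v(t,y) = (1-\gamma)V(t,1,y)$, where $V(t,x,y) = x^{1-\gamma} V(t,1,y)$. The
partial derivatives are
$$\begin{aligned} \frac{\partial V}{\partial x} &= (1-\gamma) x^{-\gamma} V(t,1,y) \\ \frac{\partial^2 V}{\partial x^2} &= -\gamma(1-\gamma) x^{-\gamma-1} V(t,1,y) \\ \frac{\partial^2 V}{\partial x \partial y} &= (1-\gamma) x^{-\gamma} \frac{\partial V(t,1,y)}{\partial y} \\ \frac{\partial^2 V}{\partial y^2} &= x^{1-\gamma} \frac{\partial^2 V(t,1,y)}{\partial y^2} \end{aligned}$$
The problem is simplified to solving a corresponding ’reduced’ HJB equation:
$$\frac{\partial v}{\partial t} = \frac{1-\gamma}{\gamma}\left[ \left( -\frac{1}{2}\mu^2 y - \gamma r \right) v - \mu y\rho\xi \frac{\partial v}{\partial y} - \frac{1}{2v}\rho^2\xi^2 y \left( \frac{\partial v}{\partial y} \right)^2 \right] - \kappa(\theta - y)\frac{\partial v}{\partial y} - \frac{1}{2}\xi^2 y \frac{\partial^2 v}{\partial y^2} \tag{3.4}$$
with v(T, y) = 1.
² See Proposition 1.3.5.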
When these partial derivatives are substituted in (3.3) we find that the optimal
ratio $\pi^*_t$ becomes
$$\pi^*_t = \frac{\mu}{\gamma} + \frac{\rho\xi}{\gamma}\, \frac{\partial v / \partial y}{v} \tag{3.5}$$
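The substitution leading to (3.5) can be written out explicitly. The sketch below expresses the derivatives of V through the reduced value function, using V = x^{1−γ} v/(1−γ), which follows from the two definitions given earlier; no other assumptions are made.

```latex
% With V(t,x,y) = \frac{1}{1-\gamma} x^{1-\gamma} v(t,y):
%   V_x = x^{-\gamma} v,  V_{xx} = -\gamma x^{-\gamma-1} v,  V_{xy} = x^{-\gamma} v_y.
\begin{align*}
\pi^*_t
  &= \frac{-\mu\,\dfrac{\partial V}{\partial x}
           - \rho\xi\,\dfrac{\partial^2 V}{\partial x\,\partial y}}
          {x\,\dfrac{\partial^2 V}{\partial x^2}}
   = \frac{-x^{-\gamma}\left(\mu v + \rho\xi\,\dfrac{\partial v}{\partial y}\right)}
          {-\gamma\,x^{-\gamma}\,v} \\
  &= \frac{\mu}{\gamma} + \frac{\rho\xi}{\gamma}\,\frac{\partial v/\partial y}{v}.
\end{align*}
```

All powers of x cancel, which is why the optimal ratio does not depend on current wealth.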
One possible approach to attacking this problem under power utility and
exponential utility is to use Boguslavskaya and Muravey’s (2015) result, reducing
the optimal control to a linear parabolic boundary problem. Kallsen and
Muhle-Karbe showed that in a general stochastic volatility setting, the coefficients
in the reduced PDE are affine linear functions, meaning we can use the ansatz
$v = e^{A(t)+B(t)y}$ with some smooth functions A and B [13].
Calculating derivatives of v gives
$$\frac{\partial v}{\partial y} = B(t)\, e^{A(t)+B(t)y}, \qquad \frac{\partial^2 v}{\partial y^2} = B(t)^2\, e^{A(t)+B(t)y}$$
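Inserting these into (3.4) gives (after cancelling exponential terms throughout)
\[
\frac{dA(t)}{dt} + \frac{dB(t)}{dt}\,y = \frac{1-\gamma}{\gamma}\left[-\frac{1}{2}\mu^2 y - \gamma r - \mu y\rho\xi B(t) - \frac{1}{2}\rho^2\xi^2 y B(t)^2\right] - \kappa(\theta - y)B(t) - \frac{1}{2}\xi^2 y B(t)^2 \tag{3.6}
\]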
To solve this we use the following result.

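Result 3.1.1 (Liu & Muhle-Karbe's Representation of A(t) and B(t)). Let:
\[
a = \frac{(\gamma - 1)\mu^2}{2\gamma},
\qquad
b = \frac{\gamma - 1}{\gamma}\mu\rho\xi + \kappa,
\qquad
c = \frac{\xi^2}{2}\left(\frac{\gamma - 1}{\gamma}\rho^2 - 1\right),
\qquad
D = b^2 - 4ac
\]
By comparing coefficients of y in (3.6) we can separate terms to give a system of
ODEs for A(t) and B(t):
\begin{align*}
\frac{dB(t)}{dt} &= cB(t)^2 + bB(t) + a, & B(T) &= 0 \\
\frac{dA(t)}{dt} &= (\gamma - 1)r - \kappa\theta B(t), & A(T) &= 0
\end{align*}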
A(t) is a straightforward integral and B(t) satisfies a Riccati equation. These are
solved by³:
\[
B(t) = -2a\,\frac{e^{\sqrt{D}(T-t)} - 1}{e^{\sqrt{D}(T-t)}(b + \sqrt{D}) - b + \sqrt{D}}
\]
\[
A(t) = r(1 - \gamma)(T - t) + \kappa\theta\int_t^T B(s)\,ds
\]
Then the theoretical value function is:
\[
V(t, x, y) = \frac{1}{1-\gamma}\,x^{1-\gamma}\,e^{A(t)+B(t)y}
\]
For a more detailed explanation, see [9].
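As a sanity check on Result 3.1.1, the closed form for B(t) can be evaluated and compared against its Riccati equation numerically. This is a minimal sketch under stated assumptions: the function names are hypothetical, `xi` denotes the volatility-of-volatility parameter, D > 0 is assumed, and A(t) is obtained by trapezoidal quadrature of the integral rather than from a closed form.

```python
import numpy as np

def heston_merton_coefficients(gamma, mu, kappa, xi, rho):
    """Constants a, b, c of Result 3.1.1 (hypothetical helper name)."""
    a = (gamma - 1) * mu**2 / (2 * gamma)
    b = (gamma - 1) / gamma * mu * rho * xi + kappa
    c = 0.5 * xi**2 * ((gamma - 1) / gamma * rho**2 - 1)
    return a, b, c

def B_closed_form(t, T, a, b, c):
    """Closed-form B(t); assumes D = b^2 - 4ac > 0."""
    sqrtD = np.sqrt(b**2 - 4 * a * c)
    e = np.exp(sqrtD * (T - t))
    return -2 * a * (e - 1) / (e * (b + sqrtD) - b + sqrtD)

def A_by_quadrature(t, T, r, gamma, kappa, theta, a, b, c, n=2001):
    """A(t) = r(1-gamma)(T-t) + kappa*theta*int_t^T B(s) ds (trapezoid rule)."""
    s = np.linspace(t, T, n)
    Bs = B_closed_form(s, T, a, b, c)
    integral = np.sum(0.5 * (Bs[1:] + Bs[:-1]) * np.diff(s))
    return r * (1 - gamma) * (T - t) + kappa * theta * integral

# Parameters of Table 3.1 with gamma = 0.5
a, b, c = heston_merton_coefficients(0.5, 0.05, 5.0, 0.38, 0.3)
t0, h = 0.5, 1e-4
# Central difference approximation of dB/dt at t0
dB_dt = (B_closed_form(t0 + h, 1.0, a, b, c) - B_closed_form(t0 - h, 1.0, a, b, c)) / (2 * h)
B0 = B_closed_form(t0, 1.0, a, b, c)
riccati_residual = dB_dt - (c * B0**2 + b * B0 + a)
```

A residual close to zero, together with B(T) = 0, indicates that the closed form is consistent with the ODE and its terminal condition.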
Importantly, we now have a deterministic expression for $\pi^*_t$ by using Result
3.1.1 with (3.5):
\[
\pi^*_t = \frac{\mu}{\gamma} + \frac{\rho\xi B(t)}{\gamma} \tag{3.7}
\]
Perhaps quite surprisingly, even in the case of the Heston model, Merton's
problem has an explicit solution. However, unlike the constant volatility
Black-Scholes case, $\pi^*_t$ is not constant: it is a deterministic function of
the current time t and the time horizon T.
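To make the time dependence concrete, the optimal proportion can be evaluated on a time grid. The sketch below is illustrative only: the function name `optimal_fraction` is hypothetical, `xi` denotes the volatility-of-volatility parameter, D > 0 is assumed, and the numerical values are those of Table 3.1 with γ = 0.5.

```python
import numpy as np

def optimal_fraction(t, T, gamma, mu, kappa, xi, rho):
    """pi*_t = mu/gamma + rho*xi*B(t)/gamma, with B(t) from Result 3.1.1."""
    a = (gamma - 1) * mu**2 / (2 * gamma)
    b = (gamma - 1) / gamma * mu * rho * xi + kappa
    c = 0.5 * xi**2 * ((gamma - 1) / gamma * rho**2 - 1)
    sqrtD = np.sqrt(b**2 - 4 * a * c)  # assumes D > 0
    e = np.exp(sqrtD * (T - t))
    B = -2 * a * (e - 1) / (e * (b + sqrtD) - b + sqrtD)
    return mu / gamma + rho * xi * B / gamma

t_grid = np.linspace(0.0, 1.0, 5)  # five evaluation times on [0, T] with T = 1
pi_star = optimal_fraction(t_grid, 1.0, 0.5, 0.05, 5.0, 0.38, 0.3)
print(pi_star)
```

For these parameters the values decrease towards µ/γ as t approaches T, since B(T) = 0.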
Numerical Analysis
In this section we wish to explore the properties of the optimal portfolio $\pi^*_t$ and
confirm that V(t, x, y) explicitly derived in Result 3.1.1 is in line with simulated
estimates.
³ Provided D > 0.

For our simulation model, we can no longer model the wealth process as
Geometric Brownian Motion, because $\pi^*_t$ is a function of time t rather than a
constant as in the Black-Scholes case. Instead we simulate the behaviour of
$X_t$ using a finite difference (Euler) discretization.
With N discretization steps of equal size dt such that T = N·dt, and using (3.1)-(3.2)
as the dynamics of $X_t$, we first generate an array of $\pi^*_t$ values by calculating the
deterministic functions A(t) and B(t). We then draw pairs of Normal increments
with correlation ρ. We can then simulate $S_t$, $Y_t$ and $X_t$ by looping the following
code over N time steps and for M paths:
# Body of the loops over paths j and time steps i = 1, ..., N.
# cov must be [[1, rho], [rho, 1]]; sigma denotes the vol-of-vol parameter xi.
epsilon = np.random.multivariate_normal([0, 0], cov)
dW_S = epsilon[0]*np.sqrt(dt)
dW_Y = epsilon[1]*np.sqrt(dt)
pi[j,i] = mu/gamma + ((rho*sigma)/gamma)*B[j,i]
# Euler steps use the variance from the previous time step
S_values[j,i] = (S_values[j,i-1]
                 + (Y_values[j,i-1]*mu + r)*S_values[j,i-1]*dt
                 + np.sqrt(Y_values[j,i-1])*S_values[j,i-1]*dW_S)
X_values[j,i] = (X_values[j,i-1]
                 + (pi[j,i]*Y_values[j,i-1]*mu + r)*X_values[j,i-1]*dt
                 + pi[j,i]*np.sqrt(Y_values[j,i-1])*X_values[j,i-1]*dW_S)
Y_values[j,i] = (Y_values[j,i-1] + kappa*(theta - Y_values[j,i-1])*dt
                 + sigma*np.sqrt(Y_values[j,i-1])*dW_Y)
Y_values[j,i] = abs(Y_values[j,i])  # reflect to keep the variance non-negative
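The update rules above can also be written in vectorised form over all paths at once. The following is a minimal sketch, not the code used for the thesis figures: the function name `simulate_heston_wealth`, its argument names and the seeded generator are assumptions for illustration, `xi` plays the role of `sigma` in the listing, D > 0 is assumed, and the correlated increments are built by hand from two independent Normal draws rather than with `multivariate_normal`.

```python
import numpy as np

def simulate_heston_wealth(gamma, mu, r, kappa, theta, xi, rho,
                           X0, Y0, T, n_steps, n_paths, seed=0):
    """Euler scheme for terminal wealth X_T under the strategy (3.7)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    a = (gamma - 1) * mu**2 / (2 * gamma)
    b = (gamma - 1) / gamma * mu * rho * xi + kappa
    c = 0.5 * xi**2 * ((gamma - 1) / gamma * rho**2 - 1)
    sqrtD = np.sqrt(b**2 - 4 * a * c)  # assumes D > 0
    X = np.full(n_paths, float(X0))
    Y = np.full(n_paths, float(Y0))
    for i in range(n_steps):
        # Deterministic optimal proportion at the start of the step
        e = np.exp(sqrtD * (T - i * dt))
        B = -2 * a * (e - 1) / (e * (b + sqrtD) - b + sqrtD)
        pi = mu / gamma + rho * xi * B / gamma
        # Two Brownian increments with correlation rho
        z1 = rng.standard_normal(n_paths)
        z2 = rng.standard_normal(n_paths)
        dW_S = np.sqrt(dt) * z1
        dW_Y = np.sqrt(dt) * (rho * z1 + np.sqrt(1 - rho**2) * z2)
        # Wealth is updated first so that it uses the variance from the previous step
        X = X + (pi * mu * Y + r) * X * dt + pi * np.sqrt(Y) * X * dW_S
        Y = np.abs(Y + kappa * (theta - Y) * dt + xi * np.sqrt(Y) * dW_Y)
    return X

# Table 3.1 parameters, gamma = 0.5: Monte Carlo estimate of E[U(X_T)]
X_T = simulate_heston_wealth(0.5, 0.05, 0.0, 5.0, 0.024, 0.38, 0.3,
                             1.0, 0.1, 1.0, 250, 2000)
mc_value = np.mean(X_T**0.5 / 0.5)
```

The estimate `mc_value` can then be compared with the analytic value function at t = 0.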
To verify that the simulated V(t, x, y) agrees with the analytic result, we use the
example parameters in Table 3.1.

Table 3.1: Simulation Parameters
Parameter   Value
µ           0.05
r           0
T           1
X0          1
S0          1
Y0          0.1
θ           0.024
κ           5
ρ           0.3
ξ           0.38
As Figure 3.1 shows, the simulated V(t, x, y) is in very close agreement with
the analytic result given in Result 3.1.1.
Figure 3.1: Simulating V(t, x, y) with 100 scenarios for γ ∈ (0, 1) against
analytic V(t, x, y).
Behaviour of $\pi^*_t$
From (3.7) we can see that $\pi^*_t$ is a deterministic function involving B(t).
By the definition of B(t), this means that it depends on both the
time horizon T and the current time t. Since we have the terminal condition B(T) = 0,
we expect that as t → T the term $\rho\xi B(t)/\gamma$ will vanish, meaning $\pi^*_t$ approaches
$\mu/\gamma$.

Figure 3.2 demonstrates how $\pi^*_t$ evolves for different time horizons, using
the parameters in Table 3.1. It is clear that in a Heston-type model, the
investor's horizon T affects the optimal proportion of wealth to place in the
risky asset S. As expected, $\pi^*_t$ approaches $\mu/\gamma$, with equality at T.⁴
Figure 3.2: Behaviour of $\pi^*_t$ with γ = 0.5. (a) = 1 Year, (b) = 2 Years, (c) =
5 Years, (d) = 10 Years
⁴ For these parameters, $\mu/\gamma = 0.05/0.5 = 0.1$.

Chapter 4
Transaction Costs
In this chapter, we briefly touch on Merton’s problem when transaction costs
are present. This is an important step towards a more realistic model.
In the transaction-cost-free environment, an investor would ideally rebalance
his portfolio by trading as close to continuously as possible. When
transaction costs are added, however, this behaviour is irrational, as
the cost of rebalancing continuously is greater than the benefit in utility gained.
We will see that it is only beneficial for the investor to trade when the ratio
πt moves outside certain bounds.
We say the investor pays a fixed transaction cost ψ ∈ R+ every time they
buy or sell any amount of the stock.¹
4.1 Proportional Transaction Costs
Let us assume the investor trades as in Chapter 2, starting with a wealth of
$10,000 and rebalancing daily, but in the presence of a 1% proportional
transaction cost and a small fixed fee of $5.
From Figure 4.1, we see clearly that maintaining the optimal constant π∗
as in the frictionless case is far from the rational way to trade: due to the
cumulative fees paid, rebalancing regularly to keep πt at π∗ leads to the investor
losing his wealth. For this reason we must modify our strategy and framework.
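A rough back-of-the-envelope calculation shows why daily rebalancing is so costly here. The sketch below counts only the fixed fee and assumes 250 trading days in the year (the step count used in the Appendix simulations); the function name is hypothetical, and the proportional cost, which depends on the traded amounts along each path, is left out.

```python
def fixed_fee_drag(initial_wealth, fee_per_trade, trades):
    """Total fixed fees paid and their share of the initial wealth."""
    total = fee_per_trade * trades
    return total, total / initial_wealth

# $10,000 starting wealth, $5 per trade, one trade per day for 250 days (assumed)
total_fees, share = fixed_fee_drag(10000.0, 5.0, 250)
print(total_fees, share)  # 1250.0 0.125
```

Under these assumptions the fixed fees alone consume 12.5% of the starting wealth in one year, before any proportional costs are charged.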
Proportional transaction costs were first studied by Magill and
Constantinides [7] in 1976 and expanded upon by Davis and Norman in 1990, who
introduced the notion of the no-trade region and provided mathematical rigour.
Davis and Norman showed that the optimal strategy is to make the minimal
trade required (if necessary) to reach the closest point in the 'wedge' defined by the
no-trade region [6].
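The 'minimal trade' rule can be illustrated in one dimension. The sketch below is a deliberate simplification and not Davis and Norman's construction: it treats the no-trade region as an interval for the risky proportion, the bounds `pi_lower` and `pi_upper` are hypothetical inputs rather than computed boundaries, and the wealth lost to the transaction cost itself is ignored.

```python
def rebalance_to_band(pi_current, pi_lower, pi_upper):
    """Post-trade risky proportion under a no-trade band [pi_lower, pi_upper].

    Inside the band no trade is made; outside it, the smallest trade that
    returns the position to the nearest boundary is made.
    """
    if pi_current < pi_lower:
        return pi_lower      # buy just enough to reach the lower boundary
    if pi_current > pi_upper:
        return pi_upper      # sell just enough to reach the upper boundary
    return pi_current        # no trade

# Hypothetical band around a frictionless target of 1.25
for pi_now in (1.00, 1.20, 1.50):
    print(pi_now, rebalance_to_band(pi_now, 1.10, 1.40))
```

The point of the sketch is that the investor never trades back to the frictionless target itself, only to the edge of the band.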
¹ Many brokers charge a fee per trade. This can vary from as low as $5 to upwards
of $100.

Figure 4.1: Actual Wealth with daily rebalancing in presence of transaction
costs vs. theoretical transaction cost free wealth
Assume $S_t$ follows $dS_t = S_t(\mu\,dt + \sigma\,dW_t)$ and $dB_t = rB_t\,dt$ as usual. We
use notation similar to that of Davis and Norman [6].
To model a bid/ask spread, we now assume the investor can buy the stock
at the ask price $S^A_t$ and sell the stock at the bid price $S^B_t$, given by:
\[
\text{Investor buys at } S^A_t = (1 + \lambda)S_t,
\qquad
\text{Investor sells at } S^B_t = (1 - \epsilon)S_t,
\]
with $\epsilon, \lambda \in [0, 1)$.
Let $D_t$ and $L_t$ be the cumulative wealth from selling and buying stock
respectively. Let $X_t$ and $Y_t$ be the amounts invested in the riskless asset and stock
respectively ($X_0 = x$, $Y_0 = y$), so our total wealth at time t is now $Z_t = X_t + Y_t$
and starting wealth is x + y.
This gives rise to the wealth equations:
\begin{align}
dX_t &= (rX_t - c_t)\,dt - (1 + \lambda)\,dL_t + (1 - \epsilon)\,dD_t, & X_0 &= x \notag \\
dY_t &= \mu Y_t\,dt + \sigma Y_t\,dW_t + dL_t - dD_t, & Y_0 &= y \tag{4.1}
\end{align}
where $c_t$ is the investor's consumption rate.
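The trading terms in (4.1) can be read as simple bookkeeping: buying stock worth dL costs (1 + λ)dL in cash, while selling stock worth dD yields only the bid-side fraction in cash. The sketch below applies a single trade under that reading; the function name, the argument names and the numerical cost levels are assumptions for illustration, with `eps` standing for the sell-side cost parameter.

```python
def apply_trade(x, y, buy=0.0, sell=0.0, lam=0.01, eps=0.01):
    """Apply one instantaneous trade to cash x and stock holding y (in dollars).

    buy  : dollar value of stock purchased (dL in (4.1))
    sell : dollar value of stock sold      (dD in (4.1))
    """
    x_new = x - (1 + lam) * buy + (1 - eps) * sell
    y_new = y + buy - sell
    return x_new, y_new

# Buying $10 of stock with a 1% cost takes $10.10 out of the bank account
x1, y1 = apply_trade(100.0, 50.0, buy=10.0)
# Selling $10 of stock with a 1% cost adds only $9.90 to the bank account
x2, y2 = apply_trade(100.0, 50.0, sell=10.0)
```

In both cases total wealth x + y falls by the cost of the trade, which is why trading continuously is no longer optimal.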
In the infinite horizon case the value function is defined for utility function
U by:
\[
V(x, y) = \sup_{(c, L, D)} \mathbb{E}\int_0^{\infty} e^{-\delta t}\,U(c(t))\,dt \tag{4.2}
\]
Davis and Norman showed that the holdings at time t are within a closed
region, given by
\[
S_{\lambda,\epsilon} = \{(x, y) \in \mathbb{R}^2 : x + (1 - \epsilon)y \ge 0 \text{ and } x + (1 + \lambda)y \ge 0\}
\]
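The investor wants to find a triplet $(c, L, D) \in \mathcal{A}(x, y)$ which maximizes $V(x, y) =
\sup(\mathbb{E}[U(Z_T)])$. By using Proposition 1.3.5 we can factor $V(t, x, y) = x^{\gamma}V(t, 1, y)$.
Using the dynamics in (4.1), the HJB equation for this problem becomes:
\[
-\delta V + \sup_{c, l, d}\left[\frac{1}{2}\sigma^2 y^2\frac{\partial^2 V}{\partial y^2} + (rx - c)\frac{\partial V}{\partial x} + \mu y\frac{\partial V}{\partial y} + \frac{1}{\gamma}c^{\gamma} + \left(-(1 + \lambda)\frac{\partial V}{\partial x} + \frac{\partial V}{\partial y}\right)l + \left((1 - \epsilon)\frac{\partial V}{\partial x} - \frac{\partial V}{\partial y}\right)d\right] = 0
\]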
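We differentiate with respect to c and set to zero to find the maximum. This gives
the optimal consumption:
\[
(c^*)^{\gamma - 1} = \frac{\partial V}{\partial x}
\implies
c^* = \left(\frac{\partial V}{\partial x}\right)^{\frac{1}{\gamma - 1}}
\]
In the no-trade region, l and d are zero as the investor does not make any trades,
so the value function satisfies:
\[
-\delta V + \sup_{c}\left[\frac{1}{2}\sigma^2 y^2\frac{\partial^2 V}{\partial y^2} + (rx - c)\frac{\partial V}{\partial x} + \mu y\frac{\partial V}{\partial y} + \frac{1}{\gamma}c^{\gamma}\right] = 0
\]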
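In the buy region, dL attains its maximum and dD = 0 as the investor buys
stock to rebalance his portfolio, meaning the coefficient of l in the HJB equation vanishes:
\[
\frac{\partial V}{\partial y} = (1 + \lambda)\frac{\partial V}{\partial x} \tag{4.3}
\]
Similarly, in the sell region,
\[
\frac{\partial V}{\partial y} = (1 - \epsilon)\frac{\partial V}{\partial x} \tag{4.4}
\]
We know that the value function is concave and, by the homotheticity property,
we have reason to believe that the no-trade region is a cone in $\mathbb{R}^2$ [10].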

The main diﬃculty arises when trying to solve the HJB equation directly,
as unlike in the classical or Heston case, it cannot be solved analytically[?].
By reducing the dimensionality of the problem, exploiting the homoth-
eticity property and solving a free boundary problem, it can be shown that the
optimal strategy consists of a pair of ’local time’ processes. Davis and Norman
showed how to numerically calculate the buy and sell boundaries. This is still
very much an active area of research.

Chapter 5
CONCLUSION
In this paper, we have presented solutions to Merton’s portfolio problem in
various different settings using the dynamic programming approach. We have
demonstrated and numerically verified how an investor can optimise their portfolio
when their utility function is in the class of HARA functions and stock prices
are assumed to obey Geometric Brownian Motion.
In Chapter 2, we have shown that the optimal portfolio consists of holding
a constant proportion of wealth in the risky asset in the idealised model when
volatility is constant and there are no transaction costs. This is true both in
the finite and infinite time horizon case.
In the stochastic volatility setting, we showed that rather surprisingly, an
explicit solution exists and the optimal portfolio is characterized by a deter-
ministic function. In Chapter 3 we presented a simulation model to verify this
optimal strategy via Monte Carlo estimations.
The main difficulties arise when proportional transaction costs are present,
resulting in the HJB equation no longer having an explicit solution. Various
approaches to define the boundaries of the trading regions have been proposed,
such as those by Muthuraman and Zha (2008) and Budhiraja and Ross (2007).

Bibliography
[1] R. Bellman, "The theory of dynamic programming", Bull. Amer. Math.
Soc., vol. 60, no. 6, pp. 503-515, 1954.
[2] R. C. Merton, "An intertemporal capital asset pricing model",
Econometrica, vol. 41, no. 5, pp. 867-887, 1973.
[3] R. C. Merton, "Lifetime portfolio selection under uncertainty: The
continuous-time case", The Review of Economics and Statistics, vol. 51,
no. 3, pp. 247-257, 1969.
[4] R. C. Merton, "Optimal consumption and portfolio rules in a
continuous-time model", J. Econom. Theory, vol. 3, no. 4, pp. 373-413, 1971.
[5] H. Pham, "Continuous-time stochastic control and optimization with
financial applications", Springer-Verlag, 2009.
[6] M. H. A. Davis and A. R. Norman, "Portfolio selection with
transaction costs", Mathematics of Operations Research, vol. 15, no. 4, pp.
676-713, 1990.
[7] M. Magill and G. M. Constantinides, "Portfolio selection with
transactions costs", J. Econom. Theory, vol. 13, no. 2, pp. 245-263, 1976.
[8] H. Liu and M. Loewenstein, "Optimal portfolio selection with
transaction costs and finite horizons", The Review of Financial Studies, vol. 15,
no. 3, pp. 805-835, 2002.
[9] R. Liu and J. Muhle-Karbe, "Portfolio choice with stochastic
investment opportunities: a user's guide", 2013.
[10] K. Muthuraman and S. Kumar, "Solving free-boundary problems
with applications in finance", Now Publishers, 2008.
[11] S. L. Heston, "A closed-form solution for options with stochastic volatility
with applications to bond and currency options", Review of Financial
Studies, vol. 6, no. 2, pp. 327-343, 1993.
[12] E. Boguslavskaya and D. Muravey, "An explicit solution for optimal
investment in Heston model", Teor. Veroyatnost. i Primenen., vol. 60, no. 4,
pp. 811-819, 2015.
[13] J. Kallsen and J. Muhle-Karbe, "Utility maximization in affine
stochastic volatility models", 2008.
[14] M. Monoyios, "Finite horizon portfolio selection with transaction costs",
Journal of Economic Dynamics and Control, vol. 28, pp. 889-913, 2004.

Appendix A
Simulation and Graphing Code
# All code was run on Python 3.5.1
# Dependencies: numpy, matplotlib, seaborn.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# GBM WEALTH PROCESS
def generateWealthPricePaths(mu, sigma, pi, X0, nSteps, nPaths, T):
    # Simulates nPaths wealth paths for a constant risky proportion pi
    delta = T/float(nSteps)
    x_values = np.zeros((nPaths, nSteps+1))   # log-wealth
    x_values[:, 0] = np.log(X0)
    for j in range(nPaths):
        for i in range(1, nSteps+1):
            # diffusion coefficient is sigma*pi (not sigma**pi)
            x_values[j, i] = (x_values[j, i-1]
                              + (mu*pi - 0.5*(sigma*pi)**2)*delta
                              + sigma*pi*np.sqrt(delta)*np.random.normal())
    return np.exp(x_values)
print(generateWealthPricePaths(0.05, 0.2, 1.25, 1, 10, 10, 1))

mu = 0.05
sigma = 0.2
pi = 1.25
X0 = 1
nSteps = 100
nPaths = 10
T = 1
fig, ax = plt.subplots()
ax.ticklabel_format(useOffset=False)
ax.set_ylabel('Asset Price')
ax.set_xlabel('Time Steps')
for scenario in generateWealthPricePaths(mu, sigma, pi, X0, nSteps, nPaths, T):
    ax.plot(range(nSteps+1), scenario, alpha=0.2, color='green')
# highlight the last simulated path
ax.plot(range(nSteps+1), scenario, alpha=1., color='red')
plt.show()
def exp_utility(alpha, x):
    return -np.exp(-alpha*x)

#EXP UTILITY MONTE CARLO AND HISTOGRAM
#Calculates Monte Carlo estimate of value function with N scenarios
def Value_function(N, alpha, X0, pi_star, mu, sigma, t):
    value_functions_list = []
    for i in range(N):
        # Terminal wealth when a constant amount pi_star is held in the risky
        # asset (r = 0): X_t ~ N(X0 + pi_star*mu*t, (pi_star*sigma)^2 * t).
        # At pi_star = mu/(alpha*sigma**2) this reduces to
        # N(X0 + mu^2*t/(alpha*sigma^2), (mu/(alpha*sigma))^2 * t).
        X_t = np.random.normal(X0 + pi_star*mu*t,
                               np.sqrt(t)*abs(pi_star)*sigma)
        value_functions_list.append(exp_utility(alpha, X_t))
    return np.mean(value_functions_list)
alpha = 1.    # pre-defined
t = 1.        # pre-defined
X0 = 1        # pre-defined
mu = 0.05     # pre-defined
sigma = 0.2   # pre-defined
analytic_pi_star = mu/(X0*alpha*sigma**2)   # optimal risky wealth
beta = (mu**2)/(2*alpha*(sigma)**2)         # analytic calculation
def analyticValueFunction(alpha, beta, X0):
    return -np.exp((-alpha*X0) - alpha*beta*(t-0))
print(Value_function(10000, alpha, X0, analytic_pi_star, mu, sigma, t))
print(analyticValueFunction(alpha, beta, X0))
# Generates 2M+1 points, M values below pi_star, M values above, equally
# spaced with distance epsilon, with N Monte Carlo scenarios
def errors(M, N, epsilon):
    x_axis = [analytic_pi_star]
    for i in range(1, M+1):
        x_axis.append(analytic_pi_star - i*epsilon)
        x_axis.append(analytic_pi_star + i*epsilon)
    x_axis = sorted(x_axis)
    y_axis = []
    for pi_val in x_axis:
        y_axis.append(Value_function(N, alpha, X0, pi_val, mu, sigma, t))
    fig, ax = plt.subplots()
    ax.set_ylabel('Value (exponential utility)')
    ax.set_xlabel('pi')
    ax.axvline(analytic_pi_star, color='red', alpha=0.3)
    ax.axhline(analyticValueFunction(alpha, beta, X0), color='red', alpha=0.3)
    ax.set_xticks(np.arange(-10, 10, 0.5))
    ax.plot(x_axis, y_axis)
    # save before showing so that the written file contains the plot
    fig.savefig('MonteCarlo_value_function_optimality.png', format='png', dpi=800)
    plt.show()
errors(250, 1000, 0.05)
#EXPONENTIAL UTILITY HISTOGRAM
def final_wealth(N, alpha, X0, pi_star, mu, sigma, t):
    wealth_list = []
    for i in range(N):
        X_t = np.random.normal(X0 + ((mu**2)/(alpha*sigma**2))*t,
                               np.sqrt(t)*(mu/(alpha*sigma)))
        wealth_list.append(X_t)
    return wealth_list
alpha_1 = final_wealth(1000, 5, 1, 1.25, 0.05, 0.2, 1)
plt.hist(alpha_1, bins=100, alpha=0.5, label='alpha = 5.0', color='blue')
plt.xlabel('Final wealth')
plt.ylabel('Frequency')
plt.legend(loc='upper right', prop={'size': 14})
plt.show()
#OPTIMAL SHARES, S_t AND pi FOR EXPONENTIAL UTILITY
pi_t = 1.25
def optimals(mu, sigma, alpha, S0, X0, nSteps, T):
    delta = T/float(nSteps)
    s_values = np.zeros(nSteps+1)   # log stock price
    s_values[0] = np.log(S0)
    X = np.zeros(nSteps+1)
    X[0] = X0
    optimal_shares = np.zeros(nSteps+1)
    optimal_shares[0] = (pi_t*X0)/S0
    for i in range(1, nSteps+1):
        w = np.random.normal()   # one shock drives both wealth and stock
        X[i] = (X[i-1] + (mu**2)/(alpha*sigma**2)*delta
                + (mu/(sigma*alpha))*np.sqrt(delta)*w)
        s_values[i] = (s_values[i-1] + (mu - 0.5*sigma**2)*delta
                       + sigma*np.sqrt(delta)*w)
        optimal_shares[i] = (pi_t*X[i])/np.exp(s_values[i])
    return np.exp(s_values), X, optimal_shares
nSteps = 250
listo = optimals(0.05, 0.2, 1, 1, 1, nSteps, 1)
fig, ax = plt.subplots()
ax.plot(range(nSteps+1), listo[0], alpha=1, color='red', label='Stock Price')
ax.plot(range(nSteps+1), listo[2], alpha=1, color='green',
        label='Optimal number of shares')
ax.plot(range(nSteps+1), [1.25]*(nSteps+1), alpha=1, color='blue',
        label='Optimal pi')
ax.set_xlabel('Time Steps (days)')
for item in ([ax.title, ax.xaxis.label, ax.yaxis.label] +
             ax.get_xticklabels() + ax.get_yticklabels()):
    item.set_fontsize(14)
ax.legend(loc='upper right', prop={'size': 14})
plt.show()
#MONTE CARLO VERIFICATIONS WITH POWER UTIL AND NON-ZERO INTEREST RATES
def power_utility(gamma, x):
    if gamma >= 1 or gamma <= 0:
        raise ValueError('gamma must lie in (0, 1)')
    return 1/(1-float(gamma))*x**(1-float(gamma))

#Calculates Monte Carlo estimate of value function with N scenarios
def Value_function(N, gamma, X0, pi_star, mu, r, sigma, t):
    value_functions_list = []
    for i in range(N):
        X_t = X0*np.exp((pi_star*mu + r - 0.5*(sigma*pi_star)**2)*t
                        + pi_star*sigma*np.sqrt(t)*np.random.normal())
        value_functions_list.append(power_utility(gamma, X_t))
    return np.mean(value_functions_list)
gamma = 0.5   # pre-defined
t = 2.        # pre-defined
X0 = 10.      # pre-defined
mu = 0.02     # pre-defined
sigma = 0.2   # pre-defined
r = 1         # pre-defined
analytic_pi_star = mu/(gamma*sigma**2)    # optimal risky proportion
beta = 0.5*(mu**2)/(gamma*sigma**2) + r   # analytic calculation
def analyticValueFunction(gamma, beta, X0):
    return (1/(1-float(gamma))*X0**(1-float(gamma))
            * np.exp((t-0)*(1-gamma)*beta))
def printer(muu, rr):
    analytic_pi_star = muu/(gamma*sigma**2)    # optimal risky proportion
    beta = 0.5*(muu**2)/(gamma*sigma**2) + rr  # analytic calculation
    print('mu is ' + str(muu))
    print('r is ' + str(rr))
    print(Value_function(10000, gamma, X0, analytic_pi_star, muu, rr, sigma, t))
    print(analyticValueFunction(gamma, beta, X0))
    print('\n\n')
#CODE FOR VARYING REBALANCING FREQUENCIES
def generateWealthPricePaths(mu, sigma, r, pi, X0, S0, nSteps, nPaths, T, freq):
    delta = T/float(nSteps)
    s_values = np.zeros((nPaths, nSteps+1))   # log stock price
    s_values[:, 0] = np.log(S0)
    x_values = np.zeros((nPaths, nSteps+1))   # log of continuously rebalanced wealth
    x_values[:, 0] = np.log(X0)
    theo_money_in_stock = np.zeros((nPaths, nSteps+1))
    theo_money_in_bank = np.zeros((nPaths, nSteps+1))
    theo_money_in_stock[:, 0] = pi*X0
    theo_money_in_bank[:, 0] = (1-pi)*X0
    money_in_stock = np.zeros((nPaths, nSteps+1))
    money_in_bank = np.zeros((nPaths, nSteps+1))
    money_in_stock[:, 0] = pi*X0
    money_in_bank[:, 0] = (1-pi)*X0
    for j in range(nPaths):
        units_of_stock = (pi*X0)/S0   # reset the holding for every path
        for i in range(1, nSteps+1):
            w = np.random.normal()
            s_values[j,i] = (s_values[j,i-1] + (mu - 0.5*sigma**2)*delta
                             + sigma*np.sqrt(delta)*w)
            x_values[j,i] = (x_values[j,i-1]
                             + (mu*pi - 0.5*(sigma*pi)**2)*delta
                             + sigma*pi*np.sqrt(delta)*w)
            money_in_bank[j,i] = money_in_bank[j,i-1]*np.exp(r*delta)
            money_in_stock[j,i] = units_of_stock*np.exp(s_values[j,i])
            theo_money_in_stock[j,i] = pi*np.exp(x_values[j,i])
            theo_money_in_bank[j,i] = (1-pi)*np.exp(x_values[j,i])
            if i % freq == 0:
                # amount moved from stock to bank so the stock share is pi again
                CORRECTOR = ((1-pi)*money_in_stock[j,i]
                             - pi*money_in_bank[j,i])
                money_in_stock[j,i] = money_in_stock[j,i] - CORRECTOR
                money_in_bank[j,i] = money_in_bank[j,i] + CORRECTOR
                units_of_stock = units_of_stock - CORRECTOR/np.exp(s_values[j,i])
    x_values = np.exp(x_values)
    total_wealth = np.add(money_in_bank, money_in_stock)
    return [total_wealth, money_in_stock, money_in_bank, x_values,
            theo_money_in_stock, theo_money_in_bank]
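mu = 0.05
sigma = 0.2
r = 0.0
B0 = 1.
S0 = 100.
nSteps = 250
nPaths = 1
T = 1
pi = 1.25
X0 = 1.
freq = 1   # assumed: rebalance at every step (the value is not given in the listing)
fig, ax = plt.subplots()
ax.set_ylabel('Wealth')
ax.set_xlabel('Time Steps (Days)')
listo = generateWealthPricePaths(mu, sigma, r, pi, X0, S0, nSteps, nPaths, T, freq)
# listo = [total_wealth, money_in_stock, money_in_bank, x_values,
#          theo_money_in_stock, theo_money_in_bank]
# The pairing of series, line styles and labels below is an assumed
# reconstruction from the labels that survive in the listing.
series = [(5, '--', 'red', 'optimal bank'),
          (1, '-', 'green', 'Actual Stock Wealth'),
          (2, '-', 'red', 'Actual Bank Wealth'),
          (4, '--', 'red', 'optimal Stock wealth'),
          (0, '-', 'black', 'actual total wealth'),
          (3, '--', 'blue', 'optimal total Wealth')]
for idx, style, colour, label in series:
    for scenario in listo[idx]:
        ax.plot(range(nSteps+1), scenario, alpha=1, linestyle=style,
                color=colour, label=label)
ax.legend()
plt.show()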
#INFINITE HORIZON LOG UTILITY WITH CONSUMPTION MODEL
def utility_function(c):
    return np.log(c)
def Wealth_process(mu, sigma, r, X0, S0, delta, nSteps, nPaths, T):
    dt = T/float(nSteps)
    pi = mu/(sigma**2)
    b_values = np.zeros((nPaths, nSteps+1))
    b_values[:, 0] = 1
    s_values = np.zeros((nPaths, nSteps+1))   # log stock price
    s_values[:, 0] = np.log(S0)
    x_values = np.zeros((nPaths, nSteps+1))   # log wealth
    x_values[:, 0] = np.log(X0)
    money_in_stock = np.zeros((nPaths, nSteps+1))
    money_in_bank = np.zeros((nPaths, nSteps+1))
    money_in_stock[:, 0] = pi*X0
    money_in_bank[:, 0] = (1-pi)*X0
    for j in range(nPaths):
        for i in range(1, nSteps+1):
            w = np.random.normal()
            x_values[j,i] = (x_values[j,i-1] + (-delta + 0.5*(mu/sigma)**2)*dt
                             + (mu/sigma)*np.sqrt(dt)*w)
            s_values[j,i] = (s_values[j,i-1] + (mu - 0.5*sigma**2)*dt
                             + sigma*np.sqrt(dt)*w)
            b_values[j,i] = b_values[j,i-1]*np.exp(r*dt)
            money_in_stock[j,i] = pi*np.exp(x_values[j,i])
            money_in_bank[j,i] = (1-pi)*np.exp(x_values[j,i])
    S_values = np.exp(s_values)
    wealth = np.exp(x_values)
    optimal_consumption_process = delta*wealth   # c* = delta * X for log utility
    return [money_in_bank, money_in_stock, wealth, S_values,
            optimal_consumption_process]

#INFINITE HORIZON LOG UTILITY MONTE CARLO VERIFICATION
X0 = 1
T = 100
mu = 0.1
sigma = 0.4
delta = 0.5
nSteps = 250*100
nPaths = 1000
pi = mu/(sigma**2)
def analytic_value_function(mu, sigma, delta, X0):
    C_1 = ((2*delta*np.log(delta)*sigma**2 + mu**2 - 2*delta*sigma**2)
           / (2*(delta*sigma)**2))
    return (1/delta)*np.log(X0) + C_1
def monte_carlo_value_function(mu, sigma, X0, delta, nSteps, nPaths, T):
    dt = T/float(nSteps)
    x_values = np.zeros((nPaths, nSteps+1))   # log wealth
    x_values[:, 0] = np.log(X0)
    c = np.zeros((nPaths, nSteps+1))          # consumption rate
    integral = np.zeros((nPaths, nSteps+1))   # running discounted utility
    for j in range(nPaths):
        for i in range(1, nSteps+1):
            w = np.random.normal()
            x_values[j,i] = (x_values[j,i-1] + (-delta + 0.5*(mu/sigma)**2)*dt
                             + (mu/sigma)*np.sqrt(dt)*w)
            c[j,i] = delta*np.exp(x_values[j,i])
            integral[j,i] = (integral[j,i-1]
                             + np.exp(-delta*(i-1)*dt)*np.log(c[j,i])*dt)
    return np.mean(integral[:, -1])
def graph_analytic_value_vs_delta(mu, sigma, X0):
    analytic_list = []
    monte_carlo_list = []
    x_range = np.arange(0.2, 0.8, 0.05)
    for delta in x_range:
        analytic_list.append(analytic_value_function(mu, sigma, delta, X0))
        monte_carlo_list.append(
            monte_carlo_value_function(mu, sigma, X0, delta, nSteps, nPaths, T))
    plt.plot(x_range, analytic_list)
    plt.plot(x_range, monte_carlo_list)
    plt.show()
#STOCHASTIC VOLATILITY MODEL AND MONTE CARLO VERIFICATION
def analytic_utility(gamma, t, x, y):
    # Uses the module-level parameters mu, r, kappa, theta, rho, T and
    # sigma (the vol-of-vol parameter xi)
    a = ((gamma-1)/gamma)*0.5*(mu**2)
    b = ((gamma-1)/gamma)*mu*rho*sigma + kappa
    c = (((gamma-1)/gamma)*(rho**2) - 1)*0.5*(sigma**2)
    D = (b**2) - (4*a*c)
    sqrtD = np.sqrt(D)
    tau = T - t
    g = np.exp(sqrtD*tau)*(b + sqrtD) - b + sqrtD
    B = -(2*a)*(np.exp(sqrtD*tau) - 1)/g
    # A(t) = (1-gamma) r tau + kappa*theta * integral of B over [t, T];
    # note that g/(2*sqrtD) equals 1 at tau = 0, so A(T) = 0
    A = ((1-gamma)*r*tau
         - ((2*kappa*theta*a)/((b**2) - D))
         * ((b + sqrtD)*tau - 2*np.log(g/(2*sqrtD))))
    return np.exp(A + B*y)*(1/(1-gamma))*x**(1-gamma)
def simulateHestonWealthPaths(gamma, S0, Y0, X0, mu, r, kappa, theta,
                              sigma, rho, T, nPaths, nSteps):
    dt = T/float(nSteps)
    cov = [[1, rho], [rho, 1]]   # unit variances with correlation rho
    a = ((gamma-1)/gamma)*0.5*(mu**2)
    b = ((gamma-1)/gamma)*mu*rho*sigma + kappa
    c = (((gamma-1)/gamma)*(rho**2) - 1)*0.5*(sigma**2)
    D = b**2 - 4*a*c
    if D <= 0:
        raise ValueError('invalid a,b,c: D must be positive')
    sqrtD = np.sqrt(D)
    S_values = np.zeros((nPaths, nSteps+1))
    X_values = np.zeros((nPaths, nSteps+1))
    Y_values = np.zeros((nPaths, nSteps+1))
    S_values[:, 0] = S0
    X_values[:, 0] = X0
    Y_values[:, 0] = Y0
    # B depends only on time, so it is computed once on the grid
    tau = T - dt*np.arange(nSteps+1)
    B_row = (-(2*a)*(np.exp(sqrtD*tau) - 1)
             / (np.exp(sqrtD*tau)*(b + sqrtD) - b + sqrtD))
    B = np.tile(B_row, (nPaths, 1))
    pi = mu/gamma + ((rho*sigma)/gamma)*B
    for j in range(nPaths):
        for i in range(1, nSteps+1):
            epsilon = np.random.multivariate_normal([0, 0], cov)
            dW_S = epsilon[0]*np.sqrt(dt)
            dW_Y = epsilon[1]*np.sqrt(dt)
            S_values[j,i] = (S_values[j,i-1]
                             + (Y_values[j,i-1]*mu + r)*S_values[j,i-1]*dt
                             + np.sqrt(Y_values[j,i-1])*S_values[j,i-1]*dW_S)
            X_values[j,i] = (X_values[j,i-1]
                             + (pi[j,i]*Y_values[j,i-1]*mu + r)*X_values[j,i-1]*dt
                             + pi[j,i]*np.sqrt(Y_values[j,i-1])*X_values[j,i-1]*dW_S)
            Y_values[j,i] = (Y_values[j,i-1]
                             + kappa*(theta - Y_values[j,i-1])*dt
                             + sigma*np.sqrt(Y_values[j,i-1])*dW_Y)
            Y_values[j,i] = abs(Y_values[j,i])  # reflect to keep the variance non-negative
    return [pi, X_values]
nPaths = 100
nSteps = 250
mu = 0.05
r = 0
T = 1
X0 = 1
kappa = 5
theta = 0.024
Y0 = 0.1
sigma = 0.38
rho = 0.3
S0 = 1
gamma_list = np.arange(0.2, 0.8, 0.01)
MC_value = []
theo_value = []
for gamma in gamma_list:
    a_price_path = simulateHestonWealthPaths(gamma, S0, Y0, X0, mu, r,
                                             kappa, theta, sigma, rho, T,
                                             nPaths, nSteps)
    utils_list = []
    for wealth_path in a_price_path[1]:
        final_util = (1/(1-gamma))*wealth_path[-1]**(1-gamma)
        utils_list.append(final_util)
    MC_value.append(np.mean(utils_list))
    theo_value.append(analytic_utility(gamma, 0, X0, Y0))
fig, ax = plt.subplots()
ax.set_ylabel('Value')
ax.set_xlabel('Gamma')
ax.plot(gamma_list, theo_value, color='red',
        label='Theoretical Value Function')
ax.plot(gamma_list, MC_value, color='green',
        label='Simulated Value Function')
ax.legend(loc='upper right')
plt.show()
