The document discusses artificial neural networks and their applications. It covers the biological inspiration for ANNs, why they are used, learning strategies and techniques, network architectures such as the MLP, activation functions, and applications in pattern classification, time series forecasting, control, and optimization. Key applications mentioned include handwritten digit recognition, remote sensing, machine control, and prediction of hospital patient stay length and natural gas prices. References on the topic are also provided.
9. It is based on a labeled training set: the class of each piece of data in the training set is known. Class labels are pre-determined and provided in the training phase.
[Figure: scatter plot of training samples labeled as Class A (ε) and Class B (λ)]
10. Task performed:
• Classification / Pattern Recognition. NN model: Perceptron, feed-forward NN ("class of data is defined here").
• Clustering. NN model: Self-Organizing Maps ("class of data is not defined here").
14. Nonlinear generalization of the McCulloch-Pitts neuron: $y = f(x, w)$

Sigmoidal neuron: $y = \frac{1}{1 + e^{-w^T x - a}}$

Gaussian neuron: $y = e^{-\frac{\|x - w\|^2}{2a^2}}$
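As a rough illustration of the two neuron types, here is a minimal NumPy sketch; the function names are mine, and a is taken as a scalar offset for the sigmoidal neuron and as the width for the Gaussian neuron, per the formulas above:

```python
import numpy as np

def sigmoidal_neuron(x, w, a):
    """y = 1 / (1 + exp(-w^T x - a))"""
    return 1.0 / (1.0 + np.exp(-(w @ x + a)))

def gaussian_neuron(x, w, a):
    """y = exp(-||x - w||^2 / (2 a^2)); w acts as a centre, a as a width."""
    return np.exp(-np.sum((x - w) ** 2) / (2.0 * a ** 2))

x = np.array([0.5, -1.0])
w = np.array([1.0, 2.0])
print(sigmoidal_neuron(x, w, a=0.1), gaussian_neuron(x, w, a=1.0))
```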
16. MLP = multi-layer perceptron
Perceptron: $y_{out} = w^T x$

MLP neural network:

First hidden layer: $y_k^1 = \frac{1}{1 + e^{-w_k^{1T} x - a_k^1}}, \quad k = 1, 2, 3$, collected as $y^1 = (y_1^1, y_2^1, y_3^1)^T$

Second hidden layer: $y_k^2 = \frac{1}{1 + e^{-w_k^{2T} y^1 - a_k^2}}, \quad k = 1, 2$, collected as $y^2 = (y_1^2, y_2^2)^T$

Output: $y_{out} = \sum_{k=1}^{2} w_k^3 y_k^2 = w^{3T} y^2$
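To make the layer equations concrete, here is a minimal NumPy sketch of this forward pass; the names (mlp_forward, W1, a1, etc.) are mine, and the random weights are illustrative only:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W1, a1, W2, a2, w3):
    """Forward pass: two sigmoidal layers, linear output, as on slide 16."""
    y1 = sigmoid(W1 @ x + a1)   # y^1_k = 1/(1 + exp(-w^{1T}_k x - a^1_k)), k = 1..3
    y2 = sigmoid(W2 @ y1 + a2)  # y^2_k = 1/(1 + exp(-w^{2T}_k y^1 - a^2_k)), k = 1..2
    return w3 @ y2              # y_out = w^{3T} y^2

rng = np.random.default_rng(0)          # illustrative weights only
x = rng.normal(size=4)                  # a 4-dimensional input
print(mlp_forward(x, rng.normal(size=(3, 4)), rng.normal(size=3),
                  rng.normal(size=(2, 3)), rng.normal(size=2),
                  rng.normal(size=2)))
```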
17. • control
• classification
• prediction
• approximation
These can be reformulated in general as FUNCTION APPROXIMATION tasks.
Approximation: given a set of values of a function g(x), build a neural network that approximates the g(x) values for any input x.
20. Sigmoidal (logistic) function, common in MLPs:

$g(a_i(t)) = \frac{1}{1 + \exp(-k\, a_i(t))} = \frac{1}{1 + e^{-k a_i(t)}}$

where k is a positive constant. The sigmoidal function gives a value in the range 0 to 1. Alternatively, tanh(ka) can be used, which has the same shape but a range of -1 to 1. This is the input-output function of a neuron (rate-coding assumption).
Note: when net = 0, f = 0.5.
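A quick sketch of the two activations (the function name is mine):

```python
import numpy as np

def logistic(a, k=1.0):
    """g(a) = 1 / (1 + exp(-k a)): output in (0, 1), k a positive constant."""
    return 1.0 / (1.0 + np.exp(-k * a))

print(logistic(0.0))   # 0.5: when net = 0, f = 0.5, as noted above
print(np.tanh(0.0))    # tanh(ka) alternative: same shape, range (-1, 1)
```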
22. Algorithm (sequential)
1. Apply an input vector and calculate all activations, a and u.
2. Evaluate $\Delta_i$ for all output units via: $\Delta_i(t) = (d_i(t) - y_i(t))\, g'(a_i(t))$
(Note the similarity to the perceptron learning algorithm.)
3. Backpropagate the $\Delta_k$s to get error terms $\delta$ for the hidden layers using: $\delta_i(t) = g'(u_i(t)) \sum_k \Delta_k(t)\, w_{ki}$
4. Evaluate weight changes using:
$v_{ij}(t+1) = v_{ij}(t) + \eta\, \delta_i(t)\, x_j(t)$
$w_{ij}(t+1) = w_{ij}(t) + \eta\, \Delta_i(t)\, z_j(t)$
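In the notation of this algorithm (v for input-to-hidden weights, w for hidden-to-output weights, z for hidden outputs), one sequential step could look like the following minimal NumPy sketch; the function name, the shapes, and the omission of bias terms are my assumptions:

```python
import numpy as np

def backprop_step(x, d, v, w, g, g_prime, eta):
    """One sequential backprop step for a single-hidden-layer network."""
    # 1. forward pass, keeping net inputs u (hidden) and a (output)
    u = v @ x
    z = g(u)
    a = w @ z
    y = g(a)
    # 2. output error terms: Delta_i = (d_i - y_i) * g'(a_i)
    Delta = (d - y) * g_prime(a)
    # 3. backpropagate: delta_i = g'(u_i) * sum_k Delta_k * w_ki
    delta = g_prime(u) * (w.T @ Delta)
    # 4. weight changes
    v = v + eta * np.outer(delta, x)
    w = w + eta * np.outer(Delta, z)
    return v, w
```

For the identity activation used in the example that follows, g and g_prime would simply be `lambda a: a` and `lambda a: np.ones_like(a)`.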
23. Here I have used a simple identity activation function with an example to show how a neural network works.
24. Once weight changes are computed for all units, the weights are updated at the same time (biases are included as weights here). An example:

[Network diagram: inputs x1, x2; input-to-hidden weights v11 = -1, v12 = 0, v21 = 0, v22 = 1; hidden-to-output weights w11 = 1, w12 = 0, w21 = -1, w22 = 1; biases v10 = 1, v20 = 1; outputs y1, y2]

Have input [0 1] with target [1 0].
Use the identity activation function (i.e. g(a) = a).
25. All biases are set to 1 (not drawn, for clarity). Learning rate η = 0.1.

[Network diagram: inputs x1 = 0, x2 = 1; weights v11 = -1, v12 = 0, v21 = 0, v22 = 1; w11 = 1, w12 = 0, w21 = -1, w22 = 1; outputs y1, y2]

Have input [0 1] with target [1 0].
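As a quick check of the initial forward pass under these settings (the convention that v[i, j] is the weight from input j to hidden unit i is my assumption):

```python
import numpy as np

x = np.array([0.0, 1.0])                 # input [0 1]
v = np.array([[-1.0, 0.0], [0.0, 1.0]])  # v11 v12 / v21 v22
w = np.array([[1.0, 0.0], [-1.0, 1.0]])  # w11 w12 / w21 w22

z = v @ x + 1.0   # hidden outputs with bias 1 and identity g: [1. 2.]
y = w @ z + 1.0   # network outputs: [2. 2.], against target [1 0]
print(z, y)
```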
35. Finally, change the weights:

[Network diagram: inputs x1 = 0, x2 = 1; updated weights v11 = -1, v12 = 0.1, v21 = 0, v22 = 0.8; w11 = 0.9, w12 = -0.2, w21 = -1.2, w22 = 0.6]

Note that the weights multiplied by the zero input are unchanged, as they do not contribute to the error. We have also changed the biases (not shown).
36. Now go forward again (would normally use a new input vector):

[Network diagram: inputs x1 = 0, x2 = 1; the updated weights from slide 35; hidden outputs z1 = 1.2, z2 = 1.6]
37. Continuing the forward pass:

[Network diagram: same inputs and weights; outputs y1 = 1.66, y2 = 0.32]

The outputs are now closer to the target value [1, 0].
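The whole worked example (slides 24 to 37) can be reproduced with a short NumPy script. The indexing convention (row i, column j holds the weight from unit j to unit i) and the explicit bias updates are my assumptions; the biases are treated as weights on a constant input of 1, matching the remark on slide 24:

```python
import numpy as np

x = np.array([0.0, 1.0]);  d = np.array([1.0, 0.0]);  eta = 0.1

v = np.array([[-1.0, 0.0], [0.0, 1.0]]);  bv = np.array([1.0, 1.0])
w = np.array([[1.0, 0.0], [-1.0, 1.0]]);  bw = np.array([1.0, 1.0])

# forward pass (identity activation: g(a) = a, so g'(a) = 1)
z = v @ x + bv                     # hidden outputs: [1. 2.]
y = w @ z + bw                     # outputs: [2. 2.]

# error terms (g' = 1 everywhere)
Delta = d - y                      # output deltas: [-1. -2.]
delta = w.T @ Delta                # hidden deltas: [ 1. -2.]

# weight and bias updates
w += eta * np.outer(Delta, z);  bw += eta * Delta
v += eta * np.outer(delta, x);  bv += eta * delta

# forward pass again with the same input
z = v @ x + bv                     # [1.2 1.6], as on slide 36
y = w @ z + bw                     # [1.66 0.32], as on slide 37
print(z, y)
```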
38. Neural network applications

Pattern Classification (application examples):
• Remote sensing and image classification
• Handwritten character/digit recognition

Control, Time Series, Estimation:
• Machine control / robot manipulation
• Financial/scientific/engineering time series forecasting

Optimization:
• Traveling salesperson problem
• Multiprocessor scheduling and task assignment

Real-World Application Examples:
• Hospital patient stay length prediction
• Natural gas price prediction
39. • Artificial neural networks are inspired by the learning
processes that take place in biological systems.
• Learning can be perceived as an optimisation process.
• Biological neural learning happens by the modification
of the synaptic strength. Artificial neural networks learn
in the same way.
• The synapse strength modification rules for artificial
neural networks can be derived by applying
mathematical optimisation methods.
40. • Learning tasks of artificial neural networks = function
approximation tasks.
• The optimisation is done with respect to the approximation
error measure.
• In general it is enough to have a single hidden layer neural
network (MLP, RBF or other) to learn the approximation of
a nonlinear function. In such cases general optimisation can
be applied to find the change rules for the synaptic weights.
41. References
1. Haykin, Simon, Artificial Neural Networks.
2. Yegnanarayana, B., Artificial Neural Networks.
3. Zurada, J., Artificial Neural Networks.
4. Hornik, K., Stinchcombe, M. and White, H., "Multilayer feedforward networks are universal approximators", Neural Networks, vol. 2, no. 5, pp. 359–366, 1989.
5. Kumar, P. and Walia, E., "Cash Forecasting: An Application of Artificial Neural Networks in Finance", International Journal of Computer Science and Applications, vol. 3, no. 1, pp. 61–77, 2006.