The Perceptron and its Learning Rule
Carlo U. Nicola, SGI FH Aargau
With extracts from publications of: M. Minsky (MIT), Demuth (U. of Colorado), and D.J.C. MacKay (Cambridge University)
Perceptron
(i) Single layer ANN
(ii) It works with continuous or binary inputs
(iii) It stores pattern pairs (Ak, Ck), where Ak = (a1^k, …, an^k) and Ck = (c1^k, …, cn^k) are bipolar valued {-1, +1}.
(iv) It applies the perceptron error-correction procedure, which
always converges.
(v) A perceptron is a classifier.
Bias b is sometimes called θ
Perceptron convergence procedure (1)
Step 1: Initialize weights and thresholds: set the wij and the bj to small random values in the range [-1, +1].
Step 2: Present a new continuous-valued input p0, p1, …, pn along with the desired output vector T = (t0, …, tn).
Step 3: Calculate the actual output: aj = f(Σi wij pi + bj), where f(·) = hardlim(·) = sgn(·).
Step 4: When an error occurs, adapt the weights:
wij(new) = wij(old) + η[tj - aj]·pi, where 0 < η ≤ 1 (the learning rate),
and bj(new) = bj(old) + e, where e = η[tj - aj].
Step 5: Repeat from Step 2 until no errors occur (a code sketch of the whole procedure follows below).
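A minimal sketch of Steps 1-5 in Python; the function and variable names are illustrative, not from the slides, and the stopping logic assumes the problem is linearly separable (otherwise the epoch cap kicks in):

    import random

    def sgn(x):
        # Bipolar hard limit: f(.) = hardlim(.) = sgn(.) from Step 3.
        return 1 if x > 0 else -1

    def train_perceptron(patterns, eta=0.5, max_epochs=100):
        """patterns: list of (p, t) pairs, p an input vector, t in {-1, +1}."""
        n = len(patterns[0][0])
        # Step 1: small random weights and bias in [-1, +1].
        w = [random.uniform(-1, 1) for _ in range(n)]
        b = random.uniform(-1, 1)
        for _ in range(max_epochs):
            errors = 0
            for p, t in patterns:                      # Step 2: present pattern
                a = sgn(sum(wi * pi for wi, pi in zip(w, p)) + b)  # Step 3
                if a != t:                             # Step 4: adapt on error
                    w = [wi + eta * (t - a) * pi for wi, pi in zip(w, p)]
                    b = b + eta * (t - a)
                    errors += 1
            if errors == 0:                            # Step 5: stop at no error
                return w, b
        return w, b  # epoch cap reached (e.g. non-separable data)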
The important property of the perceptron rule
The algorithm outlined on the preceding slide always converges to weights that solve the desired classification problem, provided that such weights exist.
We will see shortly that such weights exist as long as the classification problem is linearly separable.
Example
The discrimination line between the two classes of a two-input perceptron is determined by: w1x1 + w2x2 + θ = 0.
We plot this line in the (x1, x2) plane as: x2 = -(w1/w2)x1 - θ/w2
[Figure: a straight line separating Class 1 from Class 2 in the (x1, x2) plane]
In this case the classification problem is neatly linearly separable and the perceptron rule works very well.
Example of perceptron learning rule
A perceptron is initialized with the following weights: w1 = 1; w2 = 2; θ = -2. The first point A: x = (0.5, 1.5) with desired output t(x) = +1 is presented to the network. From Step 3 it can be calculated that the network output is +1, so no weights are adjusted. The same holds for point B, with x = (-0.5, 0.5) and target value t(x) = -1. Point C: x = (0.5, 0.5) gives an output of -1, while the expected target is t(x) = +1. According to the learning rule we change the weights as follows: w1 = 1.5; w2 = 2.5; θ = -1 (η = 0.5; pC = 0.5; → ∆w1 = 0.5; ∆w2 = 0.5; ∆θ = 1). Now point C, too, is correctly classified.
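The same sequence can be replayed in a few lines of Python (variable names are illustrative; the numbers are exactly those above):

    def sgn(x):
        return 1 if x > 0 else -1

    # Initial weights of the worked example: w1 = 1, w2 = 2, theta = -2.
    w, theta, eta = [1.0, 2.0], -2.0, 0.5
    points = [((0.5, 1.5), +1),   # A: output +1, already correct
              ((-0.5, 0.5), -1),  # B: output -1, already correct
              ((0.5, 0.5), +1)]   # C: output -1, triggers an update
    for (x1, x2), t in points:
        a = sgn(w[0] * x1 + w[1] * x2 + theta)
        if a != t:
            w[0] += eta * (t - a) * x1   # delta w1 = 0.5
            w[1] += eta * (t - a) * x2   # delta w2 = 0.5
            theta += eta * (t - a)       # delta theta = 1
    print(w, theta)  # -> [1.5, 2.5] -1.0; point C is now classified +1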
Failure of the perceptron ANN to solve the XOR problem
[Figure: input units 1 and 2 feed output unit 3 through weights w13 and w23]
The XOR problem (truth table, bipolar coding):

    i1    i2    i1 xor i2
    -1    -1       -1
     1    -1        1
    -1     1        1
     1     1       -1

Output of unit 3: +1 if w13·i1 + w23·i2 > h3, and -1 if w13·i1 + w23·i2 ≤ h3.
The four rows of the table therefore require:
(i) i1 = 1, i2 = 1: w13 + w23 ≤ h3
(ii) i1 = -1, i2 = -1: -(w13 + w23) ≤ h3
(iii) i1 = 1, i2 = -1: w13 - w23 > h3
(iv) i1 = -1, i2 = 1: w23 - w13 > h3
(i), (ii), (iii) and (iv) lead to a contradiction: adding (i) and (ii) gives h3 ≥ 0, while adding (iii) and (iv) gives h3 < 0.
The XOR problem is not linearly separable.
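The contradiction can also be seen empirically: running the error-correction procedure on the four XOR patterns never reaches an error-free pass. A small check, reusing train_perceptron and sgn from the earlier sketch:

    # The rule keeps cycling on XOR; only the epoch cap stops it.
    xor_patterns = [((-1, -1), -1), ((1, -1), 1), ((-1, 1), 1), ((1, 1), -1)]
    w, b = train_perceptron(xor_patterns, eta=0.5, max_epochs=1000)
    errors = sum(1 for p, t in xor_patterns
                 if sgn(w[0] * p[0] + w[1] * p[1] + b) != t)
    print(errors)  # always >= 1: no single-layer solution exists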
Linearly separable and non-separable problems
XOR is not a linearly separable classification problem.
Perceptron net with a hidden layer
[Figure: feedforward net with input units 0 and 1, hidden units 2 and 3 reached through weights v02, v12, v03, v13, and output unit 4 reached through weights w24 and w34]
Now this perceptron network can solve the XOR problem! Show this.
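One way to show it (the weight values below are one hand-picked solution, an assumption rather than the slide's figures): let hidden unit 2 compute OR, hidden unit 3 compute NAND, and let output unit 4 AND them together, which is exactly XOR.

    def sgn(x):
        return 1 if x > 0 else -1

    def xor_net(i0, i1):
        # Hand-picked weights (an assumption; any equivalent choice works):
        h2 = sgn(i0 + i1 + 1)        # v02 = v12 = 1, bias +1  -> OR
        h3 = sgn(-i0 - i1 + 1)       # v03 = v13 = -1, bias +1 -> NAND
        return sgn(h2 + h3 - 1)      # w24 = w34 = 1, bias -1  -> AND

    for i0, i1 in [(-1, -1), (1, -1), (-1, 1), (1, 1)]:
        print(i0, i1, xor_net(i0, i1))  # -> -1, 1, 1, -1: the XOR table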
Transfer function for perceptron
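In code, the bipolar hard-limit transfer function f(·) = hardlim(·) = sgn(·) from Step 3 is a one-liner (the name hardlims is our own label for the symmetric variant):

    def hardlims(n):
        # +1 for positive net input, -1 otherwise (bipolar sgn).
        return 1 if n > 0 else -1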
A more complex example
Prototype vectors
We take three parameters to differentiate between a banana and an apple:
shape, texture, weight.
Shape: {1: round; -1: elliptical}
Texture: {1: smooth; -1: rough}
Weight: {1: > 100 g; -1: < 100 g}
We can now define a prototype banana and apple:
Apple/banana example
Training set:
Initial weights:
First iteration:
η = 1 in this example
Second iteration
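A sketch of the two training iterations in Python, under stated assumptions: the prototype vectors banana = [-1, 1, -1] and apple = [1, 1, -1] follow the spirit of the Demuth textbook example and are an assumption (the slide's actual vectors appear only in its figures), as are the targets (banana → -1, apple → +1) and the zero initial weights; η = 1 as stated above.

    def sgn(x):
        return 1 if x > 0 else -1

    # Encoding from the prototype-vector slide: [shape, texture, weight].
    banana, apple = [-1, 1, -1], [1, 1, -1]   # assumed prototypes
    training_set = [(banana, -1), (apple, +1)]
    w, b, eta = [0.0, 0.0, 0.0], 0.0, 1.0     # assumed zero initial weights

    for epoch in range(2):                    # first and second iteration
        for p, t in training_set:
            a = sgn(sum(wi * pi for wi, pi in zip(w, p)) + b)
            if a != t:
                w = [wi + eta * (t - a) * pi for wi, pi in zip(w, p)]
                b += eta * (t - a)
    print(w, b)  # -> [4.0, 0.0, 0.0] 0.0

After these two passes both prototypes are classified correctly, and all the weight has concentrated on the one component (shape) that distinguishes them.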
Checking the solution (test vectors)
Checking the solution (testing the network)
The net recovers the correct answer from noisy information:
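Continuing the previous sketch: a rough apple and a heavy banana (test vectors invented for illustration) are still classified correctly, because the trained net keys on shape:

    # Reuses w, b and sgn from the training sketch above.
    for noisy, label in [([1, -1, -1], "apple, but rough"),
                         ([-1, 1, 1], "banana, but heavy")]:
        a = sgn(sum(wi * pi for wi, pi in zip(w, noisy)) + b)
        print(label, "->", a)  # -> +1 for the apple, -1 for the banana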
Summary
A perceptron net:
– is a feedforward network,
– builds linear decision boundaries,
– and uses one artificial neuron for each decision.