2. Introduction to Artificial Neural Networks,
Artificial and human neurons (Biological Inspiration)
The learning process,
Supervised and unsupervised learning,
Reinforcement learning,
Applications Development and Portfolio
The McCulloch-Pitts Model of Neuron,
Simple network layers, Multilayer networks
Perceptron,
Back propagation algorithm,
Recurrent networks,
Associative memory,
Self Organizing maps,
Support Vector Machine and PCA,
Applications to speech, vision and control problems.
02/13/13
3. Main text books:
“Neural Networks: A Comprehensive Foundation”, S. Haykin (very good, theoretical)
“Neural Networks for Pattern Recognition”, C. Bishop (very good, more accessible)
“Neural Network Design”, Hagan, Demuth and Beale (introductory)
Books emphasizing the practical aspects:
“Neural Smithing”, Reed and Marks
“Practical Neural Network Recipes in C++”, T. Masters
Seminal Paper:
“Parallel Distributed Processing” Rumelhart and McClelland et al.
Other:
“Neural and Adaptive Systems”, J. Principe, N. Euliano, C. Lefebvre
4. Review Articles:
R. P. Lippmann, “An Introduction to Computing with Neural Nets”, IEEE ASSP Magazine, pp. 4-22, April 1987.
T. Kohonen, “An Introduction to Neural Computing”, Neural Networks, 1, pp. 3-16, 1988.
A. K. Jain, J. Mao, K. M. Mohiuddin, “Artificial Neural Networks: A Tutorial”, IEEE Computer, March 1996, pp. 31-44.
6. Introduction to Artificial Neural Networks
Part I:
1. Artificial Neural Networks
2. Artificial and human neurons (Biological Inspiration)
3. Tasks & Applications of ANNs
Part II:
1. Learning in Biological Systems
2. Learning with Artificial Neural Networks
7. Digital Computers vs. Artificial Neural Networks
Digital computers:
- The problem to be solved must be analyzed.
- Deductive reasoning: we apply known rules to input data to produce output.
- Computation is centralized, synchronous, and serial.
- Not fault tolerant: one transistor goes and it no longer works.
- Static connectivity.
- Applicable if the rules are well defined and the input data are precise.
Artificial neural networks:
- No explicit description of the problem is required.
- Inductive reasoning: given input and output data (training examples), we construct the rules.
- Computation is collective, asynchronous, and parallel.
- Fault tolerant, with sharing of responsibilities.
- Dynamic connectivity.
- Applicable if the rules are unknown or complicated, or if the data are noisy or partial.
9. Artificial Neural Networks (1)
ANNs are a branch of "Artificial Intelligence": systems modeled on the human brain. They go by many names, such as connectionism, parallel distributed processing, neuro-computing, machine learning algorithms, and finally, artificial neural networks.
The development of ANNs dates back to the early 1940s. They became widely popular in the late 1980s, as a result of the discovery of new techniques and developments in PCs.
Some ANNs are models of biological neural networks and some are not.
An ANN is a processing device (an algorithm or actual hardware) whose design was motivated by the design and functioning of the human brain.
Inside an ANN:
An ANN's design is what distinguishes neural networks from other mathematical techniques.
An ANN is a network of many simple processors ("units" or "neurons"), each with a small amount of local memory.
The units are connected by unidirectional communication channels ("connections"), which carry numeric (as opposed to symbolic) data.
The units operate only on their local data and on the inputs they receive via the connections.
10. Artificial Neural Networks (2)
ANNs Operation
ANNs normally have great potential for parallelism (multiprocessor-friendly architecture), since the computations of the units are independent of each other, as in biological neural networks.
Most neural networks have some kind of "training" rule whereby the weights of connections are adjusted on the basis of presented patterns.
In other words, neural networks "learn" from examples, just like children, and exhibit some structural capability for generalization.
11. Artificial Neural Networks (3)
ANNs are a powerful technique (Black Box) to solve many real world problems. They
have the ability to learn from experience in order to improve their performance
and to adapt themselves to changes in the environment.
In addition, they are able to deal with incomplete information or noisy data and can
be very effective especially in situations where it is not possible to define the
rules or steps that lead to the solution of a problem.
Once trained, the ANN is able to recognize similarities when presented with a new
input pattern, resulting in a predicted output pattern.
12. What can an ANN do?
Compute a known function
Approximate an unknown function
Pattern Recognition
Signal Processing
…….
Learn to do any of the above
14. Biological Neural Networks (BNNs) are much more complicated in their elementary structures than the mathematical models we use for ANNs.
Animals are able to react adaptively to changes in their external and internal environment, and they use their nervous system to perform these behaviours.
An appropriate model/simulation of the nervous system should be able to produce similar responses and behaviours in artificial systems.
The nervous system is built from relatively simple units, the neurons, so copying their behaviour and functionality should be the solution!
15. ANN as a model of a brain-like computer
Brain:
The human brain is still not well understood, and its behavior is very complex.
There are about 10-11 billion neurons in the human cortex, each connected, on average, to 10,000 others. In total there are about 60 trillion synaptic connections.
The brain is a highly complex, nonlinear, and parallel computer (information-processing system).
Artificial neural network:
An artificial neural network (ANN) is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. This means that:
- Knowledge is acquired by the network through a learning (training) process;
- The strength of the interconnections between neurons is implemented by means of the synaptic weights, which are used to store the knowledge.
The learning process is a procedure for adapting the weights with a learning algorithm.
16. How does our brain manipulate patterns?
The process of pattern recognition and pattern manipulation is based on:
Massive parallelism
The brain, as an information or signal processing system, is composed of a large number of simple processing elements called neurons. These neurons are interconnected by numerous direct links, called connections, and cooperate with each other to perform parallel distributed processing (PDP) in order to solve a desired computational task.
Connectionism
The brain is a highly interconnected system of neurons, such that the state of one neuron affects the potential of a large number of other neurons to which it is connected, according to the weights or strengths of the connections. The key idea of this principle is that the functional capacity of biological neural nets is determined mostly not by a single neuron but by its connections.
Associative distributed memory
Information in the brain is thought to be stored in the synaptic connections of the neural network, or more precisely, in the pattern of these connections and the strengths (weights) of the synaptic connections.
17. Biological Neuron - the simple “arithmetic computing” element
19. [Figure: biological neuron, with labels: dendrites, axon, synapses]
Information transmission happens at the synapses; i.e., synaptic connection strengths among neurons are used to store the acquired knowledge.
In a biological system, learning involves adjustments to the synaptic connections between neurons.
20. 1. Soma or cell body: a large, round central body in which almost all the logical functions of the neuron are realized (i.e., the processing unit).
2. The axon (output): a nerve fibre attached to the soma which serves as the final output channel of the neuron. An axon is usually highly branched.
3. The dendrites (inputs): a highly branching tree of fibres. These long, irregularly shaped nerve fibres (processes) are attached to the soma and carry electrical signals to the cell.
4. Synapses: the points of contact between the axon of one cell and the dendrite of another, regulating a chemical connection whose strength affects the input to the cell.
[Figure: the schematic model of a biological neuron, with labels: soma, dendrites, axon from other neuron, synapses]
21. Learning from examples
labeled or unlabeled
Adaptivity
changing the connection strengths to learn things
Non-linearity
the non-linear activation functions are essential
Fault tolerance
if one of the neurons or connections is damaged,
the whole network still works quite well
23. Classification
In marketing: consumer spending pattern classification
In defence: radar and sonar image classification
In agriculture & fishing: fruit, fish and catch grading
In medicine: ultrasound and electrocardiogram image classification, EEGs, medical
diagnosis
Recognition and Identification
In general computing and telecommunications: speech, vision and handwriting
recognition
In finance: signature verification and bank note verification
Assessment
In engineering: product inspection monitoring and control
In defence: target tracking
In security: motion detection, surveillance image analysis and fingerprint matching
Forecasting and Prediction
In finance: foreign exchange rate and stock market forecasting
In agriculture: crop yield forecasting , Deciding the category of potential food items
(e.g., edible or non-edible)
In marketing: sales forecasting
In meteorology: weather prediction
24. Computer scientists want to find out about the properties of non-symbolic
information processing with neural nets and about learning systems in
general.
Statisticians use neural nets as flexible, nonlinear regression and
classification models.
Engineers of many kinds exploit the capabilities of neural networks in many
areas, such as signal processing and automatic control.
Cognitive scientists view neural networks as a possible apparatus to describe
models of thinking and consciousness (High-level brain function).
Neuro-physiologists use neural networks to describe and explore medium-
level brain function (e.g. memory, sensory system, motorics).
Physicists use neural networks to model phenomena in statistical mechanics
and for a lot of other tasks.
Biologists use Neural Networks to interpret nucleotide sequences.
Philosophers and some other people may also be interested in Neural
Networks for various reasons
25. The spikes travelling along the axon of the pre-synaptic
neuron trigger the release of neurotransmitter
substances at the synapse.
The neurotransmitters cause excitation or inhibition in
the dendrite of the post-synaptic neuron.
The integration of the excitatory and inhibitory signals
may produce spikes in the post-synaptic neuron.
The contribution of the signals depends on the strength
of the synaptic connection.
• Excitation means positive product between the incoming
spike rate and the corresponding synaptic weight;
• Inhibition means negative product between the incoming
spike rate and the corresponding synaptic weight;
26. [Figure: a neural network mapping inputs to an output]
An artificial neural network is composed of many
artificial neurons that are linked together according
to a specific network architecture. The objective of
the neural network is to transform the inputs into
meaningful outputs.
27. Neurons are arranged in layers. Neurons work by processing information. They receive and provide information in the form of spikes.
The artificial neuron receives one or more inputs (representing the one or more dendrites).
At each neuron, every input has an associated weight which modifies the strength of that input; the weighted inputs are then summed together.
The sum at each neuron is passed through a function known as an activation function or transfer function in order to produce an output (representing a biological neuron's axon).
28. [Figure: a neuron with inputs x1, x2, ..., xn, weights w1, w2, ..., wn, and output y]
$$z = \sum_{i=1}^{n} w_i x_i, \qquad y = H(z)$$
Each neuron takes one or more inputs and produces an output. At each neuron, every input has an associated weight which modifies the strength of that input. The neuron simply adds together all the weighted inputs and calculates an output to be passed on.
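As a minimal sketch of the computation described on this slide, the following Python snippet forms the weighted sum z and applies a threshold H(z). The function names, the example weights, and the choice of H as a unit step (1 when z >= 0) are assumptions made for illustration; they are not taken from the slides.

```python
def heaviside(z):
    """Unit step H(z): 1 if z >= 0, else 0 (assumed convention at z == 0)."""
    return 1 if z >= 0 else 0


def neuron_output(weights, inputs):
    """Weighted sum z = sum_i w_i * x_i followed by y = H(z)."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs))
    return heaviside(z)


# Hypothetical example: three inputs, the third one inhibitory (negative weight).
weights = [0.5, 0.5, -1.0]
print(neuron_output(weights, [1, 1, 0]))  # z = 1.0, so the neuron outputs 1
print(neuron_output(weights, [0, 1, 1]))  # z = -0.5, so the neuron outputs 0
```

The same inputs can switch the neuron on or off depending only on the weights, which is why the weights are the quantities adjusted during learning.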
31. Three elements:
1. A set of synapses, or connecting links, each of which is characterized by a weight or strength of its own, $w_{kj}$. Specifically, a signal $x_j$ at the input of synapse j connected to neuron k is multiplied by the synaptic weight $w_{kj}$.
2. An adder for summing the input signals, weighted by the respective synaptic strengths of the neuron; this is a linear operation.
3. An activation function for limiting the amplitude of the output of the neuron to a finite range. The activation function is also referred to as a squashing (i.e., limiting) function; the output range is typically the interval [0,1] or, alternatively, [-1,1].
32. The bias has the effect of increasing or lowering the net input of the activation function, depending on whether it is positive or negative.
$$y_k = \varphi(v_k) = \varphi(u_k + b_k) = \varphi\Big(\sum_{j} w_{kj} x_j + b_k\Big)$$
An artificial neuron:
- computes the weighted sum of its inputs, $u_k$
- adds its bias $b_k$ to obtain the net input $v_k$ (an affine transformation of $u_k$)
- passes this value through an activation function
We say that the neuron “fires” (i.e., becomes active) if its output is above zero.
This extra free parameter (the bias) makes the neuron more powerful.
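To illustrate why the bias makes the neuron more powerful, here is a small sketch in which two neurons share the same weights and differ only in their bias. The weights, the bias values, and the use of a simple "fires if the net input is above zero" threshold are illustrative assumptions.

```python
def neuron(weights, inputs, bias):
    """Return 1 if the net input v = sum_j w_j * x_j + b is above zero, else 0."""
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if v > 0 else 0


patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
weights = [1.0, 1.0]  # same weights for both neurons (illustrative values)

# Only the bias differs: -0.5 gives logical OR, -1.5 gives logical AND.
or_outputs = [neuron(weights, p, bias=-0.5) for p in patterns]
and_outputs = [neuron(weights, p, bias=-1.5) for p in patterns]
print(or_outputs)   # [0, 1, 1, 1]
print(and_outputs)  # [0, 0, 0, 1]
```

Shifting the bias moves the firing threshold without touching the weights, so one set of connections can realize different decision rules.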
33. The activation function defines the output of the neuron given an input or set of inputs. A standard computer chip circuit can be seen as a digital network of activation functions that can be "ON" (1) or "OFF" (0), depending on the input.
Non-linear activation functions are generally preferred. Linear functions are limited because the output is simply proportional to the input.
Three basic types of activation function:
1. Threshold function,
2. Linear function,
3. Sigmoid function.
36. Activation functions (4)
- The sigmoid is a fairly simple non-linear function, such as the logistic function
$$\varphi(v) = \frac{1}{1 + e^{-av}}$$
where $a$ is the slope parameter of the sigmoid function.
- As the slope parameter approaches infinity, the sigmoid function becomes a threshold function.
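The three basic activation function types can be sketched as follows. This is an illustrative sketch only: the logistic form 1 / (1 + exp(-a v)) is assumed for the sigmoid, with a as the slope parameter, and the function names and the threshold convention at v = 0 are hypothetical choices.

```python
import math


def threshold(v):
    """Threshold (step) function: 1 for v >= 0, otherwise 0."""
    return 1.0 if v >= 0 else 0.0


def linear(v):
    """Linear (identity) function: the output is proportional to the input."""
    return v


def sigmoid(v, a=1.0):
    """Logistic sigmoid with slope parameter a: 1 / (1 + exp(-a * v))."""
    return 1.0 / (1.0 + math.exp(-a * v))


# As a grows, the sigmoid at a fixed non-zero v approaches the threshold function.
for a in (1.0, 10.0, 100.0):
    print(a, round(sigmoid(-0.5, a), 4), round(sigmoid(0.5, a), 4))
```

For small a the sigmoid changes gradually around v = 0; for large a its values on either side of zero are pushed toward 0 and 1, which is the threshold behaviour.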
37. Early ANN Models:
McCulloch-Pitts , Perceptron, ADALINE, Hopfield
Network,
Current Models:
Multilayer feed forward networks (Multilayer
perceptrons- Back propagation )
Radial Basis Function networks
Self Organizing Networks
...
38. Feedback exists in a dynamic system whenever the output of an element influences, in part, the input applied to that element. It occurs in almost every part of the nervous system.
Feedback gives rise to one or more closed paths for the transmission of signals around the system.
It plays an important role in the study of a special class of neural networks known as recurrent networks.
39. The system is assumed to be linear and has a forward path (A) and a feedback path (B).
The output of the forward channel determines, in part, its own output through the feedback channel.
40. E.g., consider the case where A is a fixed weight $w$ and B is a unit delay operator $z^{-1}$.
41. Then, we may express $y_k(n)$ as an infinite weighted summation of present and past samples of the input signal $x_j(n)$:
$$y_k(n) = \sum_{l=0}^{\infty} w^{l+1} x_j(n-l)$$
Therefore, the behavior of the feedback system is controlled by the weight $w$.
42. The behavior of the feedback system is controlled by the weight $w$.
1. For $|w| < 1$, the system is stable, i.e., the output y is convergent.
2. For $|w| \ge 1$, the system is unstable, i.e., the output y is divergent (linearly for $|w| = 1$, exponentially for $|w| > 1$).
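The effect of the weight on a single-loop feedback system can be checked numerically. The sketch below assumes a forward path that is a fixed weight w, a feedback path that is a unit delay, a constant input x(n) = x0, and zero initial output, so that y(n) = w * (x(n) + y(n-1)). The function name and the chosen weights are illustrative.

```python
def feedback_response(w, n_steps, x0=1.0):
    """Simulate y(n) = w * (x(n) + y(n-1)) with constant input x0 and y(-1) = 0."""
    y_prev = 0.0
    outputs = []
    for _ in range(n_steps):
        y = w * (x0 + y_prev)
        outputs.append(y)
        y_prev = y
    return outputs


print(feedback_response(0.5, 5))  # settles toward w / (1 - w) = 1.0
print(feedback_response(1.0, 5))  # grows linearly
print(feedback_response(2.0, 5))  # grows exponentially
```

Under these assumptions the output converges when the magnitude of w is below 1, diverges linearly when w = 1, and diverges exponentially when w is larger than 1.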
43. Three different classes of network architectures:
1. Single-layer feed forward networks,
2. Multilayer feed forward networks,
3. Recurrent networks.
44. - An input layer of source nodes that projects directly onto an output layer of neurons.
- “Single-layer” refers to the output layer of computation nodes (neurons).
45. It contains one or more hidden layers (hidden neurons).
“Hidden” refers to the fact that this part of the neural network is not seen directly from either the input or the output of the network.
The function of the hidden neurons is to intervene between the input and the output.
By adding one or more hidden layers, the network is able to extract higher-order statistics from its input.
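A small hand-wired example can make the role of a hidden layer concrete. The sketch below is a 2-2-1 feedforward network that computes XOR, a mapping that no single threshold neuron can realize. The weights and thresholds are picked by hand for illustration and are not taken from the slides; nothing is learned here.

```python
def step(v):
    """Threshold activation: 1 if v > 0, else 0."""
    return 1 if v > 0 else 0


def xor_network(x1, x2):
    """Forward pass of a 2-2-1 feedforward network with hand-picked weights."""
    h_or = step(x1 + x2 - 0.5)       # hidden neuron 1 behaves like OR
    h_and = step(x1 + x2 - 1.5)      # hidden neuron 2 behaves like AND
    return step(h_or - h_and - 0.5)  # output fires for "OR but not AND"


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xor_network(x1, x2))
```

The hidden neurons re-describe the input in terms of intermediate features (OR and AND), and the output neuron only needs a simple rule on top of them.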
46. It differs from a feedforward neural network in that it has at least one feedback loop.
A recurrent network may consist of a single layer of neurons, with each neuron feeding its output signal back to the inputs of all the other neurons. Note: in this structure there is no self-feedback.
Feedback loops have a profound impact on learning and overall performance.
47. What transfer function should be used?
How many inputs does the network need?
How many hidden layers does the network need?
How many hidden neurons per hidden layer?
How many outputs should the network have?
There is no standard methodology for determining these values. Even though there are some heuristic guidelines, the final values are determined by trial and error.
48. Knowledge refers to the stored information or models used by a person or machine to interpret, predict, and appropriately respond to the outside world.
A good solution depends on a good representation of knowledge.
The main characteristics of knowledge representation are twofold:
1) What information is actually made explicit?
2) How is the information physically encoded for subsequent use?
49. There are two kinds of knowledge:
1) The known world states, or facts (prior knowledge),
2) Observations (measurements) of the world, obtained by sensors to probe the environment.
These observations represent the pool of information from which examples are used to train the NN.
50. These examples can be labeled or unlabeled.
Labeled examples
Each example representing an input signal is paired with a corresponding desired response.
Labeled examples may be expensive to collect, as they require the availability of a “teacher” to provide a desired response for each labeled example.
Unlabeled examples
Unlabeled examples are usually abundant, as there is no need for supervision.
51. Design of a neural network may proceed as follows:
An appropriate architecture is selected for the neural network, with an input layer consisting of source nodes equal in number to the pixels of an input image.
A subset of the examples is then used to train the network. This phase of the network design is called learning.
The recognition performance of the trained network is tested with data not seen before (testing).
52. There are four rules for knowledge representation:
Rule 1:
Similar inputs (i.e., patterns) drawn from similar
classes should usually produce similar
representation inside the network, and should
therefore be classified as belonging to the same
class.
There is a plethora of measures for determining the similarity between inputs.
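53. A commonly used measure of similarity is the Euclidean distance.
Let $x_i$ denote an m-by-1 vector:
$$x_i = [x_{i1}, x_{i2}, \ldots, x_{im}]^T$$
The Euclidean distance between a pair of vectors $x_i$ and $x_j$ is
$$d(x_i, x_j) = \|x_i - x_j\| = \Big[\sum_{k=1}^{m} (x_{ik} - x_{jk})^2\Big]^{1/2} \qquad (1)$$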
54. Another measure is the dot product, or inner product.
Given a pair of vectors $x_i$ and $x_j$ of the same dimension, their inner product is
$$(x_i, x_j) = x_i^T x_j = \sum_{k=1}^{m} x_{ik} x_{jk}$$
(the projection of vector $x_i$ onto vector $x_j$).
Please note that:
55. The smaller the Euclidean distance $\|x_i - x_j\|$ (i.e., the more similar the vectors $x_i$ and $x_j$ are), the larger the inner product $x_i^T x_j$ will be.
To formalize this relationship, we normalize the vectors $x_i$ and $x_j$ to have unit length, i.e.:
$$\|x_i\| = \|x_j\| = 1$$
Using Eq. (1), we may write
$$d^2(x_i, x_j) = (x_i - x_j)^T (x_i - x_j) = 2 - 2 x_i^T x_j$$
The minimization of the Euclidean distance $d(x_i, x_j)$ corresponds to maximization of the inner product $(x_i, x_j)$ and, therefore, of the similarity between the vectors $x_i$ and $x_j$.
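The relationship between the two similarity measures can be verified numerically. The sketch below normalizes a few made-up 2-D vectors to unit length and compares the squared Euclidean distance with 2 - 2 * (inner product). The helper names and the example vectors are assumptions for illustration.

```python
import math


def dot(u, v):
    """Inner product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(u, v))


def euclidean_distance(u, v):
    """Euclidean distance between two vectors of the same dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def normalize(u):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]


xi = normalize([3.0, 4.0])
xj = normalize([4.0, 3.0])   # points in a direction close to xi
xk = normalize([-3.0, 4.0])  # points in a direction further from xi

for other in (xj, xk):
    d2 = euclidean_distance(xi, other) ** 2
    print(round(d2, 6), round(2 - 2 * dot(xi, other), 6))
```

For unit-length vectors the two printed values agree on each line, and the vector with the smaller distance to xi is also the one with the larger inner product.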
56. If the vectors $x_i$ and $x_j$ are stochastic (drawn from different populations of data), the distance $d$ between them is defined using $C^{-1}$, the inverse of the covariance matrix $C$. It is assumed that the covariance matrix is the same for both populations.
For a prescribed $C$, the smaller the distance $d$ is, the more similar the vectors $x_i$ and $x_j$ will be.
57. Rule 2:
Items to be categorized as separate classes should be given widely different representations in the network.
Rule 3:
If a particular feature is important, then there should be a large number of neurons involved in the representation of that item in the network.
Rule 4:
Prior information and invariances should be built into the design of a neural network whenever they are available, so as to simplify the network design by not having to learn them.
Rule 4 is particularly important and highly desirable.
58. Rule 4 is particularly important and highly desirable because it results in an NN with a Specialized Structure (SS):
1) Biological visual and auditory networks are very specialized,
2) An NN with SS has a smaller number of free parameters available for adjustment than other networks. It therefore needs a smaller training dataset, learns faster, and generalizes better,
3) The rate of information transmission through a specialized network is faster,
4) The cost of building a specialized network is reduced, due to its smaller size.
59. There are currently no well-defined rules for building prior information into the design of a neural network, but some procedures are known to yield useful results. In particular, we may use a combination of two techniques:
1. Restricting the network architecture (using local connections)
2. Constraining the choice of synaptic weights (using weight sharing)
The latter technique is especially important because it significantly reduces the number of free parameters.
60. Consider any of the following:
1) When an object rotates, the image perceived by an observer changes as well,
2) The utterance of a speaker may be soft or loud, slower or quicker,
3) …..
A classifier should be invariant to different transformations.
Or
A class estimate represented by an output of the classifier MUST not be affected by transformations of the observed signal applied to the classifier input.
There are three techniques for rendering classifier-type NNs invariant to transformations:
1. Invariance by structure.
2. Invariance by training.
3. Invariance by feature space.
62. Learning approach based on modeling adaptation in
biological neural systems
Learning = learning by adaptation
The young animal learns that the green fruits are sour,
while the yellowish/reddish ones are sweet. The
learning happens by adapting the fruit picking
behaviour
63. From experience: examples / training data
Learning happens by changing of the synaptic
strengths,
Synapses change size and strength with experience
(or examples or training data),
Strength of connection between the neurons is
stored as a weight-value for the specific connection,
Learning the solution to a problem = changing the
connection weights
64. Hebbian Learning
When two connected neurons are firing at the
same time, the strength of the synapse between
them increases,
“Neurons that fire together, wire together”
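The Hebbian idea can be written as a one-line weight update. The sketch below assumes the simplest form, in which the weight grows by a learning rate eta times the product of pre-synaptic and post-synaptic activity; eta, the function name, and the activity trace are hypothetical values chosen for illustration.

```python
def hebbian_update(weight, pre, post, eta=0.1):
    """Return the weight after one Hebbian step: w + eta * pre * post."""
    return weight + eta * pre * post


w = 0.0
# Hypothetical activity trace: (pre-synaptic, post-synaptic) activity per time step.
activity = [(1, 1), (1, 0), (0, 1), (1, 1), (1, 1)]
for pre, post in activity:
    w = hebbian_update(w, pre, post)
print(w)  # the weight grew only on the three steps where both neurons fired
```

In this plain form the weight can only grow; practical variants usually add some decay or normalization, which is outside the scope of this sketch.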
65. We may categorize the learning processes through which neural networks function as follows:
1. Learning with a teacher,
- Supervised Learning
2. Learning without a teacher,
- Unsupervised Learning
- Reinforcement Learning
66. Supervised Learning
In supervised learning, both the
inputs and the outputs are
provided. The network then
processes the inputs and
compares its resulting outputs
against the desired outputs
Errors are then calculated,
causing the system to adjust
the weights which control the
network. This process occurs
over and over as the weights
are continually improved.
The supervised learning process constitutes a closed-loop feedback system, but the unknown environment is outside the loop.
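The compare-and-adjust cycle described on this slide can be sketched with a single threshold neuron trained by an error-correction rule (the perceptron rule is used here as an assumed, illustrative choice; the slide does not name a specific algorithm). The learning rate, the number of epochs, and the logical-AND data set are also assumptions.

```python
def predict(weights, bias, x):
    """Threshold unit: 1 if the weighted sum plus bias is above zero, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train(examples, epochs=20, lr=1.0):
    """Error-correction training: change each weight by lr * (desired - actual) * input."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, desired in examples:
            error = desired - predict(weights, bias, x)  # compare with desired output
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Labeled examples for logical AND (illustrative data): (input, desired output).
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(examples)
print([predict(weights, bias, x) for x, _ in examples])  # [0, 0, 0, 1]
```

Each pass over the examples plays the role of the "teacher" loop: the network output is compared with the desired response and the weights move only when the two disagree.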
67. Supervised Learning (2)
It is based on a labeled training set.
The class of each piece of data in the training set is known.
Class labels are pre-determined and provided in the training phase.
[Figure: data points marked ε or λ, each labeled Class A or Class B]
70. Various steps have to be considered:
1. Determine the type of training examples,
2. Gather a training data set that satisfactorily describes the given problem,
3. After the training process, test the performance of the trained artificial neural network with the test (validation) data set,
4. The test data set consists of data that has not been presented to the artificial neural network during learning.
71. In reinforcement learning, the learning of an input-output mapping is performed through continued interaction with the environment in order to minimize a scalar index of performance.
Or
A machine learning technique that sets the parameters of an artificial neural network, where data is usually not given, but generated by interactions with the environment.
72. Reinforcement learning is built around a critic that converts a primary reinforcement signal received from the environment into a higher quality reinforcement signal.
73. No help from the outside,
No information available on the desired output,
Input: set of patterns P, from n-dimensional space S, but
little / no information about their classification,
evaluation, interesting features, etc.
It must learn these by itself!
Learning by doing
Tasks: Used to pick out structure in the input
Clustering - Group patterns based on similarity,
Vector Quantization - Fully divide up S into a small set
of regions (defined by codebook vectors) that also helps
cluster P,
Feature Extraction - Reduce dimensionality of S by
removing unimportant features (i.e. those that do not
help in clustering P)
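As one possible illustration of clustering and vector quantization, the sketch below uses a basic k-means procedure (a swapped-in, standard technique, not one named on the slide) to place two codebook vectors among unlabeled 2-D patterns. The data, the initial codebook, and the iteration count are made up for the example.

```python
def nearest(codebook, p):
    """Index of the codebook vector closest to pattern p (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(c, p)) for c in codebook]
    return dists.index(min(dists))


def k_means(patterns, codebook, iterations=10):
    """Alternate between assigning patterns to regions and re-centering each codebook vector."""
    for _ in range(iterations):
        groups = [[] for _ in codebook]
        for p in patterns:
            groups[nearest(codebook, p)].append(p)
        # Move each codebook vector to the mean of its group; keep it if the group is empty.
        codebook = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, codebook)
        ]
    return codebook


# Unlabeled 2-D patterns forming two loose groups (made-up data).
patterns = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
codebook = k_means(patterns, codebook=[[0.0, 0.0], [6.0, 5.0]])
print(codebook)
print([nearest(codebook, p) for p in patterns])  # [0, 0, 0, 1, 1, 1]
```

No desired outputs are supplied at any point: the grouping emerges only from the similarity between patterns, and the codebook vectors define the regions into which the input space is divided.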
74. Supervised learning
Task performed: Classification, Pattern Recognition
NN model: Perceptron, Feed-Forward NN
Unsupervised learning
Task performed: Clustering, Pattern Recognition, Feature Extraction, VQ
NN model: Self Organizing Maps, ART
Editor's notes
Propensity means inclination.
An axon is like a nerve axis, a dendrite is like a branch, and a synapse is like a clasp.
The brain basically learns from experience. Neural networks are sometimes called machine learning algorithms, because changing their connection weights (training) causes the network to learn the solution to a problem. The strength of the connection between neurons is stored as a weight value for the specific connection. The system learns new knowledge by adjusting these connection weights. The learning ability of a neural network is determined by its architecture and by the algorithmic method chosen for training.
Unsupervised learning: the hidden neurons must find a way to organize themselves without help from the outside. In this approach, no sample outputs are provided to the network against which it can measure its predictive performance for a given vector of inputs. This is learning by doing.