The Back Propagation Learning Algorithm

  For networks with hidden units.
  Error-correcting algorithm.
  Solves the credit (blame) assignment problem.

What is supervised learning?

Can we teach a network to associate a pattern of inputs with the
corresponding outputs? That is, given an initial set of weights, how can
they be adapted to produce the desired output? Use a training set:

[Figure: persons a, b, c, d and the unlabelled cases e? and f? plotted by workload (horizontal axis) against payment (vertical axis), beside a network with inputs w and p and output y]

             person      workload      pay   P(happy)
             a           0.1           0.9   0.95
             b           0.3           0.7   0.8
             c           0.07          0.2   0.2
             d           0.9           0.9   0.3
             e           0.7           0.5   ??
             f           0.4           0.8   ??

After training, how does the network generalise to patterns unseen during
learning?

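Learning by Error Correction

In the perceptron there was a binary-valued output $y$ and a target $t$.

[Figure: perceptron with inputs $x_1, x_2, \dots, x_N$, weights $w_1, w_2, \dots, w_N$, output $y$ and target $t$; plot of the step output $y$ (0 or 1) against $\sum_i w_i x_i$]

  $y = \mathrm{step}\left( \sum_{i=0}^{N} w_i x_i \right)$

Define this error measure, summed over the training patterns $p$:

  $E = \frac{1}{2} \sum_p (t^p - y^p)^2$

Because $y$ and $t$ are binary, each incorrect output contributes 1/2, so up to
that factor $E$ counts the number of incorrect outputs.

We want to design a weight-changing procedure that minimises $E$.
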
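As a rough illustration of the error measure, the sketch below evaluates $E$ for a threshold unit on a small training set. The OR patterns, the two weight vectors and the convention that `step` returns 1 only for a strictly positive sum are assumptions made for this example, not taken from the slides.

```python
def step(a):
    """Threshold activation (assumed convention: 1 for a > 0, else 0)."""
    return 1 if a > 0 else 0


def perceptron_output(weights, inputs):
    """weights[0] is the bias weight w_0; its input x_0 is fixed at 1."""
    xs = [1.0] + list(inputs)
    return step(sum(w * x for w, x in zip(weights, xs)))


def error(weights, patterns):
    """E = 1/2 * sum over patterns of (t - y)^2."""
    return 0.5 * sum((t - perceptron_output(weights, x)) ** 2
                     for x, t in patterns)


# Hypothetical training set: logical OR of two binary inputs.
or_patterns = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

good = [-0.5, 1.0, 1.0]   # classifies all four patterns correctly
bad = [0.5, -1.0, -1.0]   # misclassifies all four patterns

print(error(good, or_patterns))  # no mistakes
print(error(bad, or_patterns))   # four mistakes, each contributing 1/2
```

With binary outputs, every misclassified pattern adds exactly 1/2 to `error`, which is why the measure tracks the number of mistakes.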
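Learning by Error Correction

How do we change the weights $w_0, w_1, \dots, w_N$ so that the error $E$
decreases?

Suppose the error $E$ varies with a weight $w_i$ like this.

[Figure: $E$ plotted against $w_i$ as a curve with a single minimum; the slope is -ve on one side of the minimum and +ve on the other]

If we could measure the slope

  $\frac{\partial E}{\partial w_i}$

then changing the weights by the negative of the slope would move them towards
the minimum of $E$:

  slope +ve  =>  $\Delta w_i$ -ve
  slope -ve  =>  $\Delta w_i$ +ve
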
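More Perceptron Problems

For the perceptron, $E$ can't be differentiated with respect to the weights
$w_0, w_1, \dots, w_N$ because $E$ involves the output $y$, which is not
differentiable.

  $E = \frac{1}{2} \sum_p (t^p - y^p)^2, \qquad y = \mathrm{step}\left( \sum_{i=0}^{N} w_i x_i \right)$

Threshold Unit:

  $y = \begin{cases} 1 & \text{if } \sum_{i=0}^{N} w_i x_i > 0 \\ 0 & \text{if } \sum_{i=0}^{N} w_i x_i \le 0 \end{cases}$

  [Plot: $y$ against $\sum_i w_i x_i$, jumping from 0 to 1]

Sigmoid Unit:

  $y = \frac{1}{1 + \exp\left( -\sum_{i=0}^{N} w_i x_i \right)}$

  [Plot: $y$ against $\sum_i w_i x_i$, rising smoothly from 0 to 1]
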
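To make the contrast concrete, here is a minimal sketch that estimates the slope of $E$ numerically for a single-weight unit, once with a threshold activation and once with a sigmoid. The one-input unit, the values `w = -2.0`, `x = 1.0`, `t = 1.0` and the helper names are hypothetical choices for illustration.

```python
import math


def step(a):
    """Threshold activation (assumed convention: 1 for a > 0, else 0)."""
    return 1.0 if a > 0 else 0.0


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def error(activation, w, x=1.0, t=1.0):
    """E = 1/2 (t - y)^2 for a unit with one weight and one input."""
    y = activation(w * x)
    return 0.5 * (t - y) ** 2


def slope(activation, w, h=1e-4):
    """Central finite-difference estimate of dE/dw."""
    return (error(activation, w + h) - error(activation, w - h)) / (2 * h)


step_slope = slope(step, -2.0)        # flat: gives no direction to move in
sigmoid_slope = slope(sigmoid, -2.0)  # negative: increasing w reduces E

print(step_slope, sigmoid_slope)
```

Away from the jump, the threshold unit's error is locally constant, so its measured slope carries no information; the sigmoid unit's slope is non-zero and points the way.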
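Gradient Descent

The error $E$ is now a differentiable function of the weights.

[Figure: $E$ plotted against $w_i$ as a curve with a single minimum; the slope is -ve on one side of the minimum and +ve on the other]

Change weights using the negative slope ($\eta$ is the learning constant):

  $\Delta w_i = -\eta \frac{\partial E}{\partial w_i}$

  $\frac{\partial E}{\partial w_i}$ +ve  =>  $\Delta w_i$ -ve
  $\frac{\partial E}{\partial w_i}$ -ve  =>  $\Delta w_i$ +ve

Either way the weight moves towards the minimum of $E$.

This approach is called Gradient Descent.
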
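The update rule can be sketched for the simplest possible case: one sigmoid unit with a single weight, trained on a single pattern. The starting weight, input, target, learning constant and number of steps are hypothetical values chosen only to show the error shrinking.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def gradient_descent(w, x, t, eta=1.0, steps=50):
    """Repeatedly apply w <- w - eta * dE/dw for E = 1/2 (t - y)^2, y = sig(w x)."""
    errors = []
    for _ in range(steps):
        y = sigmoid(w * x)
        errors.append(0.5 * (t - y) ** 2)
        dE_dw = (y - t) * y * (1.0 - y) * x  # chain rule through the sigmoid
        w -= eta * dE_dw                     # move against the slope
    return w, errors


w_final, errors = gradient_descent(w=-2.0, x=1.0, t=1.0)
print(errors[0], errors[-1], w_final)
```

Here the slope is negative throughout, so every step increases the weight and lowers the error.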
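Derivation of Back Propagation

[Figure: three-layer network with inputs $x_1, \dots, x_N$, hidden units $v_1, \dots, v_N$ and outputs $y_1, \dots, y_N$; weight $u_{jk}$ connects input $x_k$ to hidden unit $v_j$, and weight $w_{ij}$ connects hidden unit $v_j$ to output $y_i$]

  output  $y_i = \mathrm{sig}\left( \sum_j w_{ij} v_j \right)$
  hidden  $v_j = \mathrm{sig}\left( \sum_k u_{jk} x_k \right)$
  error   $E = \frac{1}{2} \sum_p \sum_i (t_i^p - y_i^p)^2$

We need to find the derivatives of $E$ with respect to the weights $w_{ij}$
and $u_{jk}$.
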
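Preliminaries

  x_k --u_jk--> v_j --w_ij--> y_i

On a single pattern (drop $p$):

  $E = \frac{1}{2} \sum_i (t_i - y_i)^2$

and

  $y_i = \frac{1}{1 + \exp\left( -\sum_{j=0}^{N} w_{ij} v_j \right)}$

Note that:

  $\frac{\partial y_i}{\partial v_j} = y_i (1 - y_i)\, w_{ij}$

  $\frac{\partial y_i}{\partial w_{ij}} = y_i (1 - y_i)\, v_j$

since if $y = \frac{1}{1 + \exp(-x)}$ then $\frac{dy}{dx} = y (1 - y)$.
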
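The identity $dy/dx = y(1 - y)$ is easy to check numerically. The following is a small sketch that compares it with a central finite-difference estimate; the test points and step size `h` are arbitrary choices.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_prime(x):
    """Derivative written in terms of the output: y * (1 - y)."""
    y = sigmoid(x)
    return y * (1.0 - y)


def numeric_derivative(f, x, h=1e-6):
    """Central finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


points = [-3.0, -0.5, 0.0, 1.0, 4.0]
max_gap = max(abs(sigmoid_prime(x) - numeric_derivative(sigmoid, x))
              for x in points)
print(max_gap)  # tiny: the two estimates agree
```

Expressing the derivative through $y$ itself is what keeps the later weight updates cheap: no extra exponentials are needed once the forward pass has been done.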
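Between Hidden and Output: $\partial E / \partial w_{ij}$

  x_k --u_jk--> v_j --w_ij--> y_i

For weights between hidden units and output units:

  $E = \frac{1}{2} \sum_i (t_i - y_i)^2$

  $\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial y_i} \frac{\partial y_i}{\partial w_{ij}}$

  $\frac{\partial E}{\partial y_i} = (y_i - t_i)$

  $\frac{\partial y_i}{\partial w_{ij}} = y_i (1 - y_i)\, v_j$

  $\frac{\partial E}{\partial w_{ij}} = \underbrace{(y_i - t_i)\, y_i (1 - y_i)}_{\text{call this } \delta_i}\, v_j$
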
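One way to gain confidence in the expression $\partial E / \partial w_{ij} = \delta_i v_j$ is to compare it with a finite-difference estimate. The sketch below does this for a single output unit; the hidden activations `v`, weights `w` and target `t` are made-up numbers, and the function names are illustrative only.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def output_and_error(w, v, t):
    """Forward pass for one output unit: y = sig(sum_j w_j v_j), E = 1/2 (t - y)^2."""
    y = sigmoid(sum(wj * vj for wj, vj in zip(w, v)))
    return y, 0.5 * (t - y) ** 2


def grad_w(w, v, t):
    """Analytic gradient dE/dw_j = delta * v_j with delta = (y - t) y (1 - y)."""
    y, _ = output_and_error(w, v, t)
    delta = (y - t) * y * (1.0 - y)
    return [delta * vj for vj in v]


# Hypothetical values; v[0] = 1.0 plays the role of a bias input.
v = [1.0, 0.3, 0.8]
w = [0.1, -0.4, 0.7]
t = 1.0

analytic = grad_w(w, v, t)

h = 1e-6
numeric = []
for j in range(len(w)):
    w_plus = w[:]
    w_plus[j] += h
    w_minus = w[:]
    w_minus[j] -= h
    e_plus = output_and_error(w_plus, v, t)[1]
    e_minus = output_and_error(w_minus, v, t)[1]
    numeric.append((e_plus - e_minus) / (2 * h))

print(analytic)
print(numeric)
```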
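Between Input and Hidden: $\partial E / \partial u_{jk}$

  x_k --u_jk--> v_j --w_ij--> y_i

For weights between input units and hidden units:

  $E = \frac{1}{2} \sum_i (t_i - y_i)^2$

  $\frac{\partial E}{\partial u_{jk}} = \sum_i \frac{\partial E}{\partial y_i} \frac{\partial y_i}{\partial v_j} \frac{\partial v_j}{\partial u_{jk}}$

  $\frac{\partial E}{\partial y_i} = (y_i - t_i)$

  $\frac{\partial y_i}{\partial v_j} = y_i (1 - y_i)\, w_{ij}$

  $\frac{\partial v_j}{\partial u_{jk}} = v_j (1 - v_j)\, x_k$

  $\frac{\partial E}{\partial u_{jk}} = \sum_i (y_i - t_i)\, y_i (1 - y_i)\, w_{ij}\, v_j (1 - v_j)\, x_k$

  $\frac{\partial E}{\partial u_{jk}} = \sum_i \delta_i w_{ij}\, v_j (1 - v_j)\, x_k$
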
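The hidden-layer gradient can be checked in the same spirit. This sketch assumes a small network with two inputs, two hidden units and two outputs and no separate bias terms; all weights, inputs and targets are hypothetical, and the sum over outputs $i$ is written out explicitly.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def forward(u, w, x):
    """u[j][k]: input k -> hidden j;  w[i][j]: hidden j -> output i."""
    v = [sigmoid(sum(ujk * xk for ujk, xk in zip(uj, x))) for uj in u]
    y = [sigmoid(sum(wij * vj for wij, vj in zip(wi, v))) for wi in w]
    return v, y


def error(u, w, x, t):
    _, y = forward(u, w, x)
    return 0.5 * sum((ti - yi) ** 2 for ti, yi in zip(t, y))


def grad_u(u, w, x, t):
    """dE/du_jk = (sum_i delta_i w_ij) * v_j (1 - v_j) * x_k."""
    v, y = forward(u, w, x)
    delta = [(yi - ti) * yi * (1.0 - yi) for yi, ti in zip(y, t)]
    grads = []
    for j, vj in enumerate(v):
        back = sum(delta[i] * w[i][j] for i in range(len(y)))
        grads.append([back * vj * (1.0 - vj) * xk for xk in x])
    return grads


# Hypothetical network and pattern.
u = [[0.2, -0.5], [0.7, 0.1]]
w = [[0.3, -0.6], [-0.2, 0.9]]
x = [1.0, 0.5]
t = [1.0, 0.0]

analytic = grad_u(u, w, x, t)[0][1]

# Finite-difference estimate for the single weight u[0][1].
h = 1e-6
u[0][1] += h
e_plus = error(u, w, x, t)
u[0][1] -= 2 * h
e_minus = error(u, w, x, t)
u[0][1] += h  # restore the original weight
numeric = (e_plus - e_minus) / (2 * h)

print(analytic, numeric)
```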
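Between Hidden and Output: $\Delta w_{ij}$

  x_k --u_jk--> v_j --w_ij--> y_i

Modifying weights between hidden units and output units using gradient
descent:

  $\Delta w_{ij} = -\eta \frac{\partial E}{\partial w_{ij}} = -\eta\, \underbrace{(y_i - t_i)\, y_i (1 - y_i)}_{\delta_i}\, v_j$

  $\eta$:             learning constant
  $(y_i - t_i)$:      error
  $y_i (1 - y_i)$:    small for $y_i$ close to 0 or 1
  $v_j$:              "input"
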
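Between Input and Hidden: $\Delta u_{jk}$

  x_k --u_jk--> v_j --w_ij--> y_i

Modifying weights between input units and hidden units using gradient
descent:

  $\Delta u_{jk} = -\eta \frac{\partial E}{\partial u_{jk}} = -\eta\, \underbrace{\sum_i \delta_i w_{ij}}_{\text{back propagation of error}}\, v_j (1 - v_j)\, x_k$

The same procedure is applicable to a net with many hidden layers.
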
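A possible way to extend the procedure to several hidden layers is sketched below: the output deltas are computed first and then pushed back one layer at a time through the weights. The list-of-matrices representation, the layer sizes, the seeded random weights and the absence of bias terms are all assumptions made for this illustration.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def forward(layers, x):
    """layers[l][i][j] connects unit j of layer l to unit i of layer l + 1."""
    activations = [list(x)]
    for W in layers:
        prev = activations[-1]
        activations.append(
            [sigmoid(sum(wij * pj for wij, pj in zip(row, prev))) for row in W])
    return activations


def backward(layers, activations, t):
    """Return dE/dW for every layer, for E = 1/2 * sum_i (t_i - y_i)^2."""
    y = activations[-1]
    delta = [(yi - ti) * yi * (1.0 - yi) for yi, ti in zip(y, t)]
    grads = [None] * len(layers)
    for l in reversed(range(len(layers))):
        prev = activations[l]
        grads[l] = [[d * pj for pj in prev] for d in delta]
        if l > 0:
            # Back-propagate: each unit below collects the weighted deltas
            # from above, scaled by its own sigmoid derivative.
            delta = [prev[j] * (1.0 - prev[j])
                     * sum(delta[i] * layers[l][i][j] for i in range(len(delta)))
                     for j in range(len(prev))]
    return grads


def error(layers, x, t):
    y = forward(layers, x)[-1]
    return 0.5 * sum((ti - yi) ** 2 for ti, yi in zip(t, y))


# Hypothetical 2-3-2-1 network with seeded random weights.
rng = random.Random(0)
sizes = [2, 3, 2, 1]
layers = [[[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]
x, t = [0.5, -0.2], [1.0]

grads = backward(layers, forward(layers, x), t)

# Finite-difference check on one first-layer weight.
h = 1e-6
layers[0][1][0] += h
e_plus = error(layers, x, t)
layers[0][1][0] -= 2 * h
e_minus = error(layers, x, t)
layers[0][1][0] += h
numeric = (e_plus - e_minus) / (2 * h)

print(grads[0][1][0], numeric)
```

The loop body is the same for every layer, which is the sense in which one procedure covers any depth.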
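An Example

[Figure: network with inputs $x_1$, $x_2$, hidden units $v_1$, $v_2$ and output $y$; each hidden unit and the output unit also receive a constant input of 1 through the weights $u_{10}$, $u_{20}$ and $w_0$]

  Weights:
    u_11 = 2.0     u_12 = 2.0     u_10 = -1.0
    u_21 = 0.8     u_22 = 0.8     u_20 = -1.0
    w_1  = 2.0     w_2  = -1.0    w_0  = -1.0

  Training patterns:
    x_1   x_2   target t
    0     0     0
    0     1     1
    1     0     1
    1     1     0

The values below are those obtained for the pattern $x_1 = 1$, $x_2 = 1$, $t = 0$:

  hidden  $v_1 = \mathrm{sig}(u_{11} x_1 + u_{12} x_2 + u_{10}) = 0.9526$
          $v_2 = \mathrm{sig}(u_{21} x_1 + u_{22} x_2 + u_{20}) = 0.6457$
  output  $y = \mathrm{sig}(w_1 v_1 + w_2 v_2 + w_0) = 0.5645$
  error   $E = \frac{1}{2} (t - y)^2 = 0.1593$
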
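The forward pass of this example can be reproduced in a few lines. The sketch assumes the input pattern $x_1 = 1$, $x_2 = 1$ with target 0, which is the pattern consistent with the values on the slide; variable names simply mirror the slide's symbols.

```python
import math


def sig(a):
    return 1.0 / (1.0 + math.exp(-a))


# Pattern assumed from the slide's numbers: x1 = 1, x2 = 1, target 0.
x1, x2, t = 1.0, 1.0, 0.0

# Weights from the slide; the "0" weights act on a constant input of 1.
u10, u11, u12 = -1.0, 2.0, 2.0
u20, u21, u22 = -1.0, 0.8, 0.8
w0, w1, w2 = -1.0, 2.0, -1.0

v1 = sig(u11 * x1 + u12 * x2 + u10)
v2 = sig(u21 * x1 + u22 * x2 + u20)
y = sig(w1 * v1 + w2 * v2 + w0)
E = 0.5 * (t - y) ** 2

print(f"v1={v1:.4f} v2={v2:.4f} y={y:.4f} E={E:.4f}")
```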
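An Example: updating the weights

Learning constant $\eta = 1.0$

  output  $\delta = (y - t)\, y (1 - y) = 0.1388$
          $\Delta w_0 = -\eta\, \delta \cdot 1.0 = -0.1388$
          $\Delta w_1 = -\eta\, \delta\, v_1 = -0.1322$
          $\Delta w_2 = -\eta\, \delta\, v_2 = -0.0896$

  hidden (to $v_1$)
          $\Delta u_{10} = -\eta\, \delta\, w_1 v_1 (1 - v_1) \cdot 1.0 = -0.0125$
          $\Delta u_{11} = -\eta\, \delta\, w_1 v_1 (1 - v_1)\, x_1 = -0.0125$
          $\Delta u_{12} = -\eta\, \delta\, w_1 v_1 (1 - v_1)\, x_2 = -0.0125$

  hidden (to $v_2$)
          $\Delta u_{20} = -\eta\, \delta\, w_2 v_2 (1 - v_2) \cdot 1.0 = 0.0318$
          $\Delta u_{21} = -\eta\, \delta\, w_2 v_2 (1 - v_2)\, x_1 = 0.0318$
          $\Delta u_{22} = -\eta\, \delta\, w_2 v_2 (1 - v_2)\, x_2 = 0.0318$
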
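The weight changes can be recomputed directly from the update formulas. This sketch again assumes the pattern $x_1 = 1$, $x_2 = 1$, $t = 0$ and the initial weights of the example; it recomputes the forward pass so that it runs on its own.

```python
import math


def sig(a):
    return 1.0 / (1.0 + math.exp(-a))


eta = 1.0
x1, x2, t = 1.0, 1.0, 0.0  # pattern assumed from the slide's numbers
u10, u11, u12 = -1.0, 2.0, 2.0
u20, u21, u22 = -1.0, 0.8, 0.8
w0, w1, w2 = -1.0, 2.0, -1.0

# Forward pass.
v1 = sig(u11 * x1 + u12 * x2 + u10)
v2 = sig(u21 * x1 + u22 * x2 + u20)
y = sig(w1 * v1 + w2 * v2 + w0)

# Output layer: delta and the changes to w0, w1, w2.
delta = (y - t) * y * (1.0 - y)
dw0 = -eta * delta * 1.0
dw1 = -eta * delta * v1
dw2 = -eta * delta * v2

# Hidden layer: the output delta is propagated back through w1 and w2.
back1 = delta * w1 * v1 * (1.0 - v1)
back2 = delta * w2 * v2 * (1.0 - v2)
du10, du11, du12 = -eta * back1 * 1.0, -eta * back1 * x1, -eta * back1 * x2
du20, du21, du22 = -eta * back2 * 1.0, -eta * back2 * x1, -eta * back2 * x2

print(f"delta={delta:.4f} dw0={dw0:.4f} dw1={dw1:.4f} dw2={dw2:.4f}")
print(f"du1k={du10:.4f} du2k={du20:.4f}")
```

Because both inputs and the bias input equal 1 for this pattern, the three changes into each hidden unit coincide.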
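An Example: a New Error

[Figure: the same network with inputs $x_1$, $x_2$, hidden units $v_1$, $v_2$ and output $y$, now showing the updated weights]

  Weights after the update:
    u_11 = 1.98    u_12 = 1.98    u_10 = -1.01
    u_21 = 0.83    u_22 = 0.83    u_20 = -0.96
    w_1  = 1.86    w_2  = -1.08   w_0  = -1.13

  Training patterns:
    x_1   x_2   target t
    0     0     0
    0     1     1
    1     0     1
    1     1     0

  hidden  $v_1 = \mathrm{sig}(u_{11} x_1 + u_{12} x_2 + u_{10}) = 0.9509$
          $v_2 = \mathrm{sig}(u_{21} x_1 + u_{22} x_2 + u_{20}) = 0.6672$
  output  $y = \mathrm{sig}(w_1 v_1 + w_2 v_2 + w_0) = 0.4776$
  error   $E = \frac{1}{2} (t - y)^2 = 0.1140$

The error has reduced for this pattern.
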
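A compact way to see the error fall is to apply one full back-propagation step and rerun the forward pass. The sketch below uses a hypothetical dictionary of weights keyed by the slide's names and assumes the pattern $x_1 = 1$, $x_2 = 1$, $t = 0$ with $\eta = 1.0$. It keeps full floating-point precision, so intermediate values may differ from the slide's in the last printed digit if the slide used rounded weight changes.

```python
import math


def sig(a):
    return 1.0 / (1.0 + math.exp(-a))


def forward(p, x1, x2):
    v1 = sig(p["u11"] * x1 + p["u12"] * x2 + p["u10"])
    v2 = sig(p["u21"] * x1 + p["u22"] * x2 + p["u20"])
    y = sig(p["w1"] * v1 + p["w2"] * v2 + p["w0"])
    return v1, v2, y


def backprop_step(p, x1, x2, t, eta):
    """Return a new weight dictionary after one gradient-descent update."""
    v1, v2, y = forward(p, x1, x2)
    delta = (y - t) * y * (1.0 - y)
    h1 = delta * p["w1"] * v1 * (1.0 - v1)  # error propagated back to v1
    h2 = delta * p["w2"] * v2 * (1.0 - v2)  # error propagated back to v2
    return {
        "w0": p["w0"] - eta * delta,
        "w1": p["w1"] - eta * delta * v1,
        "w2": p["w2"] - eta * delta * v2,
        "u10": p["u10"] - eta * h1,
        "u11": p["u11"] - eta * h1 * x1,
        "u12": p["u12"] - eta * h1 * x2,
        "u20": p["u20"] - eta * h2,
        "u21": p["u21"] - eta * h2 * x1,
        "u22": p["u22"] - eta * h2 * x2,
    }


old = {"u10": -1.0, "u11": 2.0, "u12": 2.0,
       "u20": -1.0, "u21": 0.8, "u22": 0.8,
       "w0": -1.0, "w1": 2.0, "w2": -1.0}
x1, x2, t = 1.0, 1.0, 0.0

new = backprop_step(old, x1, x2, t, eta=1.0)

y_old = forward(old, x1, x2)[2]
y_new = forward(new, x1, x2)[2]
E_old = 0.5 * (t - y_old) ** 2
E_new = 0.5 * (t - y_new) ** 2

print(f"E before: {E_old:.4f}  E after: {E_new:.4f}")
```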
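Summary

  Credit-assignment problem solved for hidden units:

  [Figure: a hidden unit with error $\delta_j$ on the input side, connected through weights $w_1$, $w_2$, $w_3$ to output-side units with errors $\delta_1$, $\delta_2$, $\delta_3$; the errors flow back from output to input]

    $\delta_j = f'(a_j) \sum_i w_i \delta_i$

  $a_j$: total input to unit $j$; $f'$: 1st derivative of activation
  function (sigmoid)

  Outstanding issues:
  1. Number of layers; number and type of units in
     layer
  2. Learning rates
  3. Local or distributed representations
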
