Imperial College London

Department of Electrical and Electronic Engineering

Final Year Project Report 2007




Project Title:            The Synaptic Processing Unit

Student:              Anthony Hsiao

Course:               4T

Project Supervisor:   Dr. George Constantinides

Second Marker:        Professor Alessandro Astolfi
Abstract
A small but growing community of engineers and scientists around the world is
breaking new ground in the field of Neuromorphic Engineering, succeeding in
designing ever more complex brain-inspired artificial neural systems and
implementing them in low-power analogue VLSI silicon chips.
A recently proposed synapse model, the binary cascade synapse, has memory
properties that are superior to those of other comparable models, and it is
suitable for implementation in digital hardware. Recent efforts have succeeded in
designing FPGA implementations of these binary cascade synapses, but failed to
implement a usefully large number of them on a single chip.
This project develops the FPGA implementation of binary cascade synapses
further by radically changing the digital architecture, essentially designing a
microprocessor that processes cascade synapses. This processor is called the
Synaptic Processing Unit (SPU), and the prototype implementation can currently
host up to 8192 cascade synapses.
This report describes the development of the SPU, which necessitated the
development of a novel learning rule alongside it, called Spike Timing and Activity
Dependent Plasticity (STADP), and presents a characterisation of this learning rule.
Both the SPU and the learning rule are implemented on an FPGA and evaluated
in-circuit.
Then, to put the SPU to an ultimate test, it was used together with an aVLSI neuron
chip to form a neural system with binary cascade synapses, and was given a real
classification task, whereby it was taught to classify two greyscale images. Indeed,
the system does successfully classify the two images, which is a very
encouraging result.
To the best of the author's knowledge, the SPU presented here is the first
hardware implementation of its kind with such a large number of synapses in the
world.


Acknowledgements
       Thank you to all those people who have helped me get this far, both
       academically and otherwise, and to those that accompanied me along the way.
       In particular, I would like to thank Dylan Muir at the Institute of
       Neuroinformatics for supervising my project, and being there whenever I
       needed help, especially during the crazy hours before the FPGA decided to
       take a holiday in the US.
       I would also like to thank Dr. George Constantinides at Imperial College
       London for supervising my project and Prof. Alessandro Astolfi for second
       marking it.
       More words of thanks go to Prof. Alessandro Astolfi for coordinating my
       exchange to ETH Zurich, and for being patient when necessary and laidback
       whenever possible.
       Thank you Stefano Fusi, one of the most impressive characters I met at the
       Institute, for giving me initial feedback and coming up with the basis for what
       later became STADP.
       Special thanks to Sungdo Choi and Daniel Fasnacht for all the help and
       support with the hardware and infrastructure; my computer was not struck by
       a particle from space, it turned out.
       Special thanks to Johanna von Lindeiner for good nights on the bench, and
the many inspiring exchanges. I actually mean it!
       A very special thank you goes out to Pantha Roy, who is just amazing. Thanks
       for the good times, and for attempting to save me from becoming a social
       recluse during the final few weeks of this project.
       An equally special thank you goes out to Siddharta Jha, another amazing
       character. Thank you for all those discussions and creative breaks, which
       really enriched my time at the institute.
       A massive thank you to a fellow brother in work, Christopher Maltby, for
       enduring all those long days and longer nights of work with me. As you know,
       without your company, I would not have been able to get any work done, let
       alone finish.
I would like to thank my parents, Wendy and Tien-Wen, for their unconditional
support and for opening so many doors for me. Without your efforts and
       sacrifices, I would not be where I am today, and would probably not get
       wherever I will get in five, ten years!
       Finally, I would like to thank Dylan Muir again, because I am actually very
grateful for all the help! Without your razor-sharp brain lobes and your
patience and support, I would not have been able to achieve half of what I
       managed to do!



Table of contents

1 INTRODUCTION  1-9
1.1 WHAT IS NEUROMORPHIC ENGINEERING?  1-10
1.2 THE TOPIC OF THIS PROJECT  1-11
1.3 AIMS  1-12
1.4 FURTHER REPORT STRUCTURE  1-12

2 BACKGROUND  2-15
2.1 OF BRAINS, NEURONS AND SYNAPSES  2-15
2.2 SYNAPTIC PLASTICITY AT THE HEART OF LEARNING IN NEURAL SYSTEMS  2-20
2.3 THE CASCADE SYNAPSE MODEL  2-21
2.4 PREVIOUS WORK  2-24
2.5 OVERVIEW OF THE HARDWARE ENVIRONMENT  2-25
2.5.1 SILICON NEURONS  2-26
2.5.2 SILICON SYNAPSES  2-27
2.5.3 COMMUNICATION USING AER  2-27
2.5.4 THE FPGA BOARD  2-28
2.5.5 SOFTWARE  2-30

3 STADP – A NOVEL HEBBIAN LEARNING RULE  3-31
3.1 STADP – YET ANOTHER LEARNING RULE?  3-31
3.1.1 FROM SPIKE TIME TO SPIKE RATE  3-33
3.2 CHARACTERISTICS OF STADP  3-35

4 DESIGN  4-38
4.1 SUMMARY OF FEATURES OF THE SYNAPTIC PROCESSING UNIT  4-38
4.2 SYSTEM LEVEL DESIGN  4-38
4.2.1 THE SPU IN A NEURAL SYSTEM  4-39
4.2.2 INPUT AND OUTPUT PORTS  4-39
4.3 VIRTUALISING THE CASCADE SYNAPSE  4-40
4.4 SPU INTERNAL ADDRESSING  4-42
4.5 MODULAR DESIGN OF THE SPU  4-43
4.6 MODULE SPECIFICATIONS  4-44
4.6.1 FORWARDING  4-45
4.6.2 LEARNING RULE (STADP)  4-45
4.6.3 CASCADE PROCESS  4-46
4.6.4 CASCADE MEMORY  4-46
4.6.5 GLOBAL SIGNALS  4-47

5 IMPLEMENTATION  5-48
5.1 PSEUDO-RANDOM NUMBER GENERATORS  5-48
5.2 DESCRIPTION OF GENERICS  5-49
5.3 MODULE LEVEL DESIGN  5-51
5.3.1 SPIKE FORWARDING  5-51
5.3.2 LEARNING RULE (STADP)  5-52
5.3.3 CASCADE SYNAPSE  5-56
5.3.4 CASCADE MEMORY  5-58
5.3.5 SIGNAL SELECTOR  5-60
5.4 SYSTEM INTEGRATION  5-60
5.5 INTEGRATION INTO THE FPGA BOARD  5-62
5.5.1 ON CLOCKS  5-64

6 VERIFICATION  6-65

7 EVALUATION & EXPERIMENTATION  7-67
7.1 IN-HARDWARE CHARACTERISATION OF STADP  7-67
7.2 MODIFICATIONS FOR THE EXPERIMENTAL SETUP  7-71
7.3 CIRCUIT CALIBRATION  7-73
7.4 IN-CIRCUIT VERIFICATION  7-75
7.4.1 FORWARDING  7-75
7.4.2 POTENTIATION  7-77
7.4.3 DEPRESSION  7-78
7.5 A REAL CLASSIFICATION TASK  7-80
7.5.1 FROM IMAGE TO PRE-SYNAPTIC STIMULI  7-80
7.5.2 TEACHING METHODS  7-83
7.5.3 RESULTS – NORMAL TEACHING  7-86
7.5.4 RESULTS – BOTTOM-UP TEACHING  7-91
7.5.5 REMARKS ON THE CLASSIFICATION EXPERIMENTS  7-95

8 DISCUSSION  8-97
8.1 THE HARDWARE  8-97
8.2 STADP  8-98
8.3 THE CLASSIFICATION TASK  8-99
8.4 CALIBRATION OF THE NEURAL SYSTEM  8-103

9 CONCLUSION  9-105
9.1 REFINEMENTS  9-106

10 REFERENCES  10-108
10.1.1 WEB REFERENCES  10-109
10.1.2 DATASHEETS AND REFERENCE BOOKS  10-110

11 APPENDIX I – SUPPLEMENTARY FILES  11-111

12 APPENDIX II – VERIFICATION CHECKLISTS  12-112
12.1 MODULE LEVEL VERIFICATION  12-112
12.2 SYSTEM LEVEL VERIFICATION  12-114

13 APPENDIX III – A JOURNEY THROUGH THE SPU  13-117
13.1 PRE-SYNAPTIC SPIKE  13-117
13.2 POST-SYNAPTIC SPIKE  13-119

14 APPENDIX IV – DESIGN HIERARCHY OF SOURCE FILES  14-120
List of figures

FIGURE 1: IMAGE OUTPUT OF A SILICON RETINA  1-11
FIGURE 2: NEURONS OF THE WORLD  2-16
FIGURE 3: ACTION POTENTIALS (SPIKES) ARE COMMONLY DESCRIBED BY THREE PROPERTIES  2-17
FIGURE 4: ACTION POTENTIALS OF THE WORLD  2-18
FIGURE 5: CGI OF A SYNAPSE WITH PRE- AND POST-SYNAPTIC NEURONS  2-19
FIGURE 6: MICROGRAPH OF A SYNAPSE TAKEN AT THE UNIVERSITY OF ST. LOUIS  2-19
FIGURE 7: DIFFERENT FORMS OF SYNAPTIC PLASTICITY  2-21
FIGURE 8: SCHEMATIC OF A CASCADE MODEL OF SYNAPTIC PLASTICITY  2-22
FIGURE 9: INITIAL SIGNAL-TO-NOISE RATIO AS A FUNCTION OF MEMORY LIFETIME, FROM [1]  2-24
FIGURE 10: CIRCUIT DIAGRAM OF AN ULTRA LOW POWER INTEGRATE & FIRE NEURON  2-26
FIGURE 11: CIRCUIT DIAGRAM OF THE SO CALLED DIFF-PAIR INTEGRATOR (DPI) SYNAPSE  2-27
FIGURE 12: PROTOTYPE FPGA BOARD DEVELOPED BY DANIEL FASNACHT  2-29
FIGURE 13: EXPERIMENTAL HARDWARE SETUP  2-30
FIGURE 14: STADP  3-33
FIGURE 15: THE STADP MECHANISM  3-34
FIGURE 16: SIMULATED BEHAVIOUR OF STADP  3-36
FIGURE 17: SYSTEM LEVEL INTERACTION OF SPU AND AVLSI NEURON CHIP  4-39
FIGURE 18: BIT REPRESENTATION OF CASCADE SYNAPSES  4-40
FIGURE 19: SPU INTERNAL ADDRESSING FORMAT  4-42
FIGURE 20: CONCEPTUAL ARCHITECTURE OF THE SPU  4-43
FIGURE 21: A HYBRID CELLULAR AUTOMATA LINEAR ARRAY  5-49
FIGURE 22: CONVENTIONS ON THE ARROWS USED IN BLOCK DIAGRAMS  5-51
FIGURE 23: SPIKE FORWARDING MODULE BLOCK DIAGRAM  5-52
FIGURE 24: STADP LEARNING RULE BLOCK DIAGRAM  5-54
FIGURE 25: INITIALISATION OF DELTA_T LOOK-UP TABLE  5-55
FIGURE 26: FLOW DIAGRAM OF THE CASCADE SYNAPSE'S STATE UPDATE RULE  5-56
FIGURE 27: CASCADE MODULE BLOCK DIAGRAM  5-58
FIGURE 28: CASCADE MEMORY BLOCK DIAGRAM  5-59
FIGURE 29: INPUT SOURCE SELECTOR BLOCK DIAGRAM  5-60
FIGURE 30: PIPELINED SPU BLOCK DIAGRAM  5-61
FIGURE 31: PIPELINED DATAFLOW THROUGH THE SPU  5-62
FIGURE 32: BLOCK DIAGRAM OF THE INTEGRATION OF THE SPU WITHIN THE FPGA BOARD  5-63
FIGURE 33: COMPARISON OF DELTA_T_LUT CONTENT FOR 5KHZ AND 90MHZ  7-69
FIGURE 34: SIMULATED HARDWARE BEHAVIOUR OF STADP AT 5KHZ SIMULATION CLOCK FREQUENCY  7-71
FIGURE 35: FREQUENCY RESPONSE OF THE NEURAL SYSTEM  7-74
FIGURE 36: OSCILLOSCOPE SCREENSHOT OF POST-SYNAPTIC MEMBRANE POTENTIAL  7-74
FIGURE 37: EXAMPLE OF A COHERENT 30HZ POISSON SPIKE TRAIN TO ALL 256 SYNAPSES  7-76
FIGURE 38: OSCILLOSCOPE SCREENSHOT OF POST-SYNAPTIC MEMBRANE POTENTIAL  7-77
FIGURE 39: IN-CIRCUIT VERIFICATION OF POTENTIATION  7-78
FIGURE 40: IN-CIRCUIT VERIFICATION OF DEPRESSION  7-79
FIGURE 41: OSCILLOSCOPE SCREENSHOT OF DECREASING POST-SYNAPTIC FIRING RATE  7-80
FIGURE 42: USING PICTURES AS PRE-SYNAPTIC STIMULI  7-82
FIGURE 43: SPIKE TRAINS DERIVED FROM 16X16 PIXEL GREYSCALE IMAGES OF ANTHONY AND DYLAN  7-82
FIGURE 44: CONCEPTUAL PROCEDURE OF A REAL CLASSIFICATION TASK  7-85
FIGURE 45: CLASSIFICATION TASK: TEACH DYLAN, SHOW DYLAN FIRST, AT 22HZ  7-87
FIGURE 46: CLASSIFICATION TASK: TEACH DYLAN, SHOW ANTHONY FIRST, AT 22HZ  7-87
FIGURE 47: CLASSIFICATION TASK: TEACH DYLAN, SHOW DYLAN FIRST, AT 25HZ  7-88
FIGURE 48: CLASSIFICATION TASK: TEACH DYLAN, SHOW ANTHONY FIRST, AT 25HZ  7-88
FIGURE 49: CLASSIFICATION TASK: TEACH ANTHONY, SHOW ANTHONY FIRST, AT 22HZ  7-89
FIGURE 50: CLASSIFICATION TASK: TEACH ANTHONY, SHOW DYLAN FIRST, AT 22HZ  7-89
FIGURE 51: CLASSIFICATION TASK: TEACH ANTHONY, SHOW ANTHONY FIRST, AT 25HZ  7-90
FIGURE 52: CLASSIFICATION TASK: TEACH ANTHONY, SHOW DYLAN FIRST, AT 25HZ  7-90
FIGURE 53: CLASSIFICATION TASK: BOTTOM-UP TEACHING DYLAN, AT 50HZ  7-92
FIGURE 54: CLASSIFICATION TASK: BOTTOM-UP TEACHING DYLAN, AT 70HZ  7-92
FIGURE 55: CLASSIFICATION TASK: BOTTOM-UP TEACHING DYLAN, FOR 2S AT 50HZ  7-93
FIGURE 56: CLASSIFICATION TASK: BOTTOM-UP TEACHING ANTHONY, AT 50HZ  7-93
FIGURE 57: CLASSIFICATION TASK: BOTTOM-UP TEACHING ANTHONY, AT 70HZ  7-94
FIGURE 58: CLASSIFICATION TASK: BOTTOM-UP TEACHING ANTHONY, FOR 2S AT 50HZ  7-94
FIGURE 59: EXPECTED EFFECTS ON A SYNAPSE  8-101
FIGURE 60: PRE-SYNAPTIC SPIKE ARRIVES AT SPU  13-117
FIGURE 61: VALID PRE-SYNAPTIC SPIKE GETS FORWARDED, AFTER TWO CLOCK DELAYS  13-117
FIGURE 62: VALID PRE-SYNAPTIC SPIKE GENERATES A PLASTICITY EVENT  13-117
FIGURE 63: CASCADE SYNAPSE CHANGES IN OPERATION  13-118
FIGURE 64: PLASTICITY EVENTS  13-118
FIGURE 65: VALID POST-SYNAPTIC SPIKE ARRIVES AT SPU  13-119
FIGURE 66: POST-SYNAPTIC SPIKE DOES NOT GET FORWARDED  13-119
FIGURE 67: POST-SYNAPTIC SPIKE SETS POST-SYNAPTIC EXPIRY TIME  13-119



1 Introduction
          ‘The brain – that’s my second most favourite organ!’ – Woody Allen

Solving the mystery of how the human brain works and computes will be one of
the most significant discoveries in the history of science. A profound understanding
of our most important organ (bar Woody Allen…) will have significant implications
for healthcare, psychology and ethics, as well as for computing, robotics and artificial
intelligence. Visionaries such as Ray Kurzweil go as far as predicting that, before the
middle of the 21st century, humans and machines will be able to merge in a way
never seen before, as brain interfaces enable users to bridge the gap between the real
and virtual worlds to a level where the distinction between 'real' and 'not real' might
lose its importance. Artificial systems would reach computational powers matching
those of the human brain, only to surpass them a few years later.
Most people find it difficult to imagine such scenarios, especially since even the most
powerful computers to date, which can perform billions of operations per second,
cannot reproduce some of the computational magic that human brains perform on a
day-to-day basis, such as pattern recognition or visual processing. 'Intelligent' and
'interactive' systems are neither intelligent nor interactive; the most advanced robots
in the world are no match for a young child when it comes to performing motor tasks
or recognition; and the thought of ever meeting a machine with intelligence, humour or
an opinion goes far beyond what most people think their computers will ever be able to
do.
Such future scenarios have been the topic of several books and films, and are
portrayed as horror scenarios more often than not, ignoring many of the potential
opportunities that such a future could bear. Without attempting to make any
qualifying judgments, it should be noted that change happens, whether it is welcome
or not.
This change could well be initiated by a small but growing community of engineers
and scientists, driven by impressive advances in neuroscience, who are making




significant progress in copying neuronal organisation and function into artificial
systems. The secret to the human brain's superior abilities appears to reside in how
the brain organises its slow-acting electrical and chemical components, namely
neurons, the basic computational units of the brain, and synapses, the interfaces
between neurons, whose rich dynamics allow neurons to form interconnected
neural circuits. Researchers sometimes speak of 'morphing' these structures of
neural connections into silicon circuits, creating neuromorphic microchips. If
successful, this work could lead to implantable silicon retinas for the blind, or sound
processors for the deaf that last for 30 years on a single nine-volt battery, or to low-
cost, highly effective visual, audio or olfactory recognition chips for robots and other
smart machines. The long-term goal is to engineer ever more complex artificial
systems with ever richer behaviour, and ultimately, the construction of an artificial
brain.

1.1 What is neuromorphic engineering?
The term neuromorphic was coined by Carver Mead in the late 1980s to describe
Very Large Scale Integration (VLSI) systems containing analogue electronic circuits
that mimic neuro-biological architectures present in the nervous system.
Neuromorphic Engineering is a new interdisciplinary field that takes inspiration from
biology, physics, mathematics and engineering to design analog, digital or mixed-
mode analog/digital VLSI artificial neural systems. These include vision systems,
head-eye systems, auditory processors and autonomous robots, whose physical
architecture and design principles are based on those of biological nervous systems.
Although the field of neuromorphic engineering is still relatively new, impressive and
encouraging results have already been achieved: systems ranging from 'simple' chips
with silicon neurons or synapses [13] to more complex ones such as silicon retinas or
cochleas [13] have been demonstrated in the past.








                           Figure 1: Image output of a silicon retina
         Showing the head of a person at the Brains in Silicon Lab at Stanford University.

1.2 The topic of this project
This project focuses on one aspect of neuromorphic systems which is at the heart of
some of the dynamics of neural networks, namely synapses. Fusi et al. have
demonstrated that ordinary bounded synapse models can have devastating
effects on memory in scenarios with ongoing modifications, and have proposed a new
synapse model, the binary Cascade Synapse [1], which outperforms ordinary (binary)
synapse models in several respects [9].
The nature of the Cascade Synapse makes it convenient to implement in digital
hardware rather than analogue VLSI, and it would be useful to augment existing
neuromorphic neuron chips with Cascade Synapse functionality. Such a neural
system could then act as one single entity in a larger multi chip environment.
Previous efforts have successfully designed individual cascade synapses and
implemented a small number – eight, to be precise – of them on an FPGA; however,
in order to perform useful computation in a reasonably sized neural system, a massive
up-scaling of the number of synapses on one chip is necessary. In order to augment a
typical aVLSI neuron chip with cascade synapse functionality, any number upwards
of 4000 synapses would be desirable, or rather, necessary.
One way of doing this is to fundamentally change the way cascade synapses are
implemented on the FPGA, referred to as virtualisation: rather than having a number
of fixed hardware cascade synapses, which is inefficient in terms of logic real estate,
an abstraction of each synapse can be stored in memory, and only retrieved, processed
and stored back on demand. Since memory, unlike logic, is generally cheap and
abundant in digital circuits, this Synaptic Processing Unit (SPU) can potentially allow
for a very large scale implementation of cascade synapses on one single FPGA. A
minimal software sketch of this scheme is given below.
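To make the idea concrete, the following sketch (in Python, purely for illustration)
models such a fetch-update-writeback synapse store in software; the byte-level
packing, the constants and the function names are assumptions made for this
example, not the SPU's actual VHDL design (which is described in chapters 4 and 5).

```python
# Illustrative software model of synapse virtualisation, not the SPU's VHDL.
# Each synapse lives as one byte in a dense memory array and is fetched,
# updated and written back only when a spike addresses it.

NUM_SYNAPSES = 8192   # prototype SPU capacity quoted in the abstract

# Assumed packing: bit 0 = efficacy (0 weak, 1 strong),
# remaining bits = cascade state index.
memory = bytearray(NUM_SYNAPSES)

def unpack(word):
    return word & 1, word >> 1            # (efficacy, cascade state)

def pack(efficacy, state):
    return (state << 1) | efficacy

def on_plasticity_event(address, signal, update_rule):
    """Fetch one synapse, apply an update rule, write the result back."""
    efficacy, state = unpack(memory[address])
    efficacy, state = update_rule(efficacy, state, signal)
    memory[address] = pack(efficacy, state)
```

The point of the scheme is that the cost per synapse becomes a few bits of RAM
plus shared update logic, instead of dedicated logic for every synapse.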

1.3 Aims
   1. To develop a Synaptic Processing Unit based on an FPGA that implements a
       large number of cascade synapses
   2. To integrate the SPU with an aVLSI neuron chip to form a working neural
       system
   3. To demonstrate the capabilities of the neural system by performing a real
       classification task

1.4 Further report structure
This report is written for the scientifically and technically minded reader, with
background knowledge of electronic engineering, and is further structured as
follows:
   2. Background
   This chapter attempts to brief the reader on all the necessary interdisciplinary
   background knowledge required for this project. In particular, it outlines some of
   the relevant biology and neuroscience, explains in more detail the binary cascade
   model used, and describes the hardware and infrastructure environment the SPU
   will be working in.
   3. STADP – a novel Hebbian learning rule
   This chapter will argue the case for developing a new learning rule called STADP,
   and describe how it works. It will also present an initial characterisation of the
   learning rule derived from simulation.
   4. Design
   This chapter starts by providing a summary of the features of the SPU, to allow
   the reader to get a first impression. Then, it outlines the high level design and
   argues for the system architecture used. It finishes by giving a set of specifications
   for a modular implementation of the design.




   5. Implementation
   This chapter starts by going off on a tangent, diving into the realm of random
   number generators. Then, it describes how the specifications given in the previous
   chapter were implemented in each module, and how the SPU integrates within
   the FPGA and its environment.
   6. Verification
   This chapter is a very short one, which only outlines the efforts undertaken in
   order to verify the design and implementation. It will not reproduce the
   verification efforts themselves.
   7. Evaluation & experimentation
   This is one of the key chapters and describes all the in-circuit verification and
   experimentation that has been carried out. Furthermore, it explains the real
   classification task given to the neural system, and presents the results.
   8. Discussion
   This chapter discusses the evaluation and experimentation results, and tries to
   make general statements about the operation of the SPU, and conclusions about
   the success of the classification task itself.
   9. Conclusion
   This chapter wraps up the report, and includes the conclusions derived from the
   work presented here. It objectively assesses advantages and disadvantages of the
   SPU, and suggests further improvements or changes to the system that might be
   worthwhile.
   10. References
   This chapter lists the sources that have been referred to while writing the report
   as well as sources that have been used throughout the design and implementation
   of the SPU.
   11. Appendices
   There are four appendices: Appendix I with a list of supplementary Matlab files
   used throughout the project, Appendix II with a copy of the checklist used for
   verification, Appendix III with screenshots of waveforms showing the journey of a



   pre- and a post-synaptic spike through the SPU and finally Appendix IV, listing
   the design hierarchy of the VHDL source files used.







2 Background
   ‘If the human brain were so simple that we could understand it, we would be so
                     simple that we couldn't’ – Emerson M. Pugh

2.1 Of brains, neurons and synapses
When IBM's Deep Blue supercomputer beat the then world chess champion Garry
Kasparov during their rematch in 1997, it did so by means of sheer brute force and
computational power. The machine evaluated some 200 million potential board
moves a second, whereas Kasparov considered only three each second, at most
[10.1.1]. But despite Deep Blue's victory (in fact, Kasparov won the first match against
Deep Blue the year earlier, and IBM refused to agree to a third 'deciding' match [21]),
computers are no real competition for the human brain in areas such as vision,
hearing, pattern recognition and learning, not to mention their inability to display
creativity, humour or emotions. And when it comes to operational efficiency, there is
no contest at all. A typical room-size supercomputer weighs roughly 1,000 times
more, occupies 10,000 times more space and consumes a millionfold more power
than the neural tissue that makes up the brain [22].
Clearly, computers and brains are fundamentally different, both in terms of
architecture and performance. Table 1 summarises important key differences of
brains and (conventional) computers.
         Processing           Element    Energy      Speed       Style of          Fault
         elements             size       use                     computation       tolerant
Brain    ~10^11 neurons,      10^-6 m    30W         100Hz       Parallel,         Yes
         ~10^14 synapses                                         distributed,
                                                                 memory at
                                                                 computation
PC       10^9 transistors     10^-6 m    30W (CPU)   10^9 Hz+    Serial,           No
                                                                 centralized,
                                                                 memory distant
                                                                 to computation

                  Table 1: A comparison between computers and brains





At the most basic cellular level, brains consist of a vast number of brain cells, an
estimated 100 billion of them, called neurons. These are also believed to constitute
the basic building blocks of computation within the central nervous system, and are
in many ways analogous to logic gates in digital electronics. The brain's network of
neurons forms a massively parallel information processing system.
While there are a large number of different types of neurons, each with different
functions and morphologies, most neurons are typically composed of a soma, or cell
body, a dendritic tree and an axon, as shown in Figure 2.




                                   Figure 2: Neurons of the world.
There are many different types of neurons, each with different morphologies and functions, which are
                   found in different parts of brains. Image courtesy of G. Indiveri


One of the most important properties of a neuron is its membrane potential, the
potential difference across the cell membrane, which is used to communicate
between neurons. A complicated molecular mechanism that stems from the cell's
highly complex membrane can give rise to so-called action potentials or spikes, which
are a sharp increase followed by an equally sharp drop in the membrane potential,
within a few ms. A neuron receives inputs, i.e. spikes, from other neurons, typically
many thousands, on its dendritic tree, and integrates them (approximately) on its
membrane potential. Once the membrane potential exceeds a certain threshold, the
neuron generates a spike which travels from the body down the axon, commonly



described as the output of a neuron, to the next neuron(s) (or other receptors). This
spiking event is also called depolarization, and is followed by a refractory period,
during which the neuron is unable to fire. The membrane potential of a spiking
neuron is shown in Figure 3, conceptually, while Figure 4 shows some measurements
of real action potentials of the world. Typically, neurons fire at rates between 0Hz
and about 100Hz, and both the precise timing of individual spikes and the firing rates
of neurons are believed to play an important role in neural communication and
computation.




           Figure 3: Action potentials (spikes) are commonly described by three properties:
   Pulse width, firing rate or inter-spike-interval, and refractory period. Courtesy of Giacomo Indiveri.








                            Figure 4: Action potentials of the world.
                   Courtesy of Giacomo Indiveri, modified by Anthony Hsiao

The axon endings of neurons almost touch the dendrites or cell body of the next
neuron. The gap between two neurons is bridged by a specialised structure called a
synapse, which is the point of transmission of spikes from the pre-synaptic neuron to
the post-synaptic neuron, as shown in Figure 5 and Figure 6. This transmission is
effected by neurotransmitters, chemicals which are released from the pre-synaptic
neuron upon depolarisation and bind to receptors on the post-synaptic neuron,
thereby advancing its depolarisation. Most synapses are excitatory, i.e. they increase
the depolarisation of the post-synaptic neuron, although there are so-called inhibitory
synapses (with inhibitory neurotransmitters), which render a post-synaptic neuron less
excitable. The human brain is estimated to have a vast 10^14 synapses.
The extent to which a spike from one neuron is transmitted on to the next, the
synaptic efficacy or weight, depends on many factors, such as the amount of
neurotransmitter available or the number and arrangement of receptors, and is not
constant, but changes over time. This property is called synaptic plasticity, and it is
this variable synaptic strength that is believed to give rise to both memory and
learning capabilities, which makes synapses particularly interesting to study!








                   Figure 5: CGI of a synapse with pre- and post-synaptic neurons.
 Excerpt of the 2005 Winner of the Science and Engineering Visualisation Challenge. By G. Johnson,
                                    Medical Media, Boulder, CO




                  Figure 6: Micrograph of a synapse taken at the University of St. Louis.
  In the center of the image is the Synaptic Cleft, which separates the pre- (top) and post-synaptic
neuron (bottom). The pre-synaptic neuron has clearly visible vesicles which contain neurotransmitters.
Upon pre-synaptic depolarisation, these neurotransmitters are released and diffuse across the synaptic
     cleft, to be received by receptors on the post-synaptic neuron, advancing its depolarisation.


Scientists have developed various models of the underlying molecular mechanisms of
synaptic plasticity, describing it to good levels of accuracy; however, it is important to
appreciate that there are details of synaptic plasticity which are still the subject of
ongoing research.





2.2 Synaptic plasticity at the heart of learning in neural systems
There are several underlying mechanisms that cooperate to achieve synaptic plasticity,
including changes in the quantity of neurotransmitter released into a synapse and
changes in how effectively cells respond to those neurotransmitters [7]. As memories
are believed to be represented by vastly interconnected networks of synapses in the
brain, synaptic plasticity is one of the important neuro-chemical foundations of
learning and memory. The strengthening of a synapse, Long-Term Potentiation (LTP),
and its weakening, Long-Term Depression (LTD), are widely considered to be
the major mechanisms by which learning happens and memories are stored in the
brain.
Many models of learning assume some kind of activity based plasticity, whereby an
increase in synaptic efficacy arises from the pre-synaptic cell's repeated and persistent
stimulation of the post-synaptic cell. These kinds of learning rules are commonly
referred to as Hebbian learning rules, popularly summarised as ‘What fires together,
wires together’.
Another particularly prominent, experimentally observed form of long-term plasticity
is called Spike-Timing Dependent Plasticity (STDP), and depends on the relative
timing of pre- and post-synaptic action potentials. If a pre-synaptic spike is quickly
followed by a post-synaptic spike, then there appears to be some kind of causality,
since the pre-synaptic neuron has contributed to the depolarisation of the post-
synaptic neuron, and the two should be connected more strongly, by potentiating the
synapse. Conversely, if a pre-synaptic spike is directly preceded by a post-synaptic
spike, their connection should be weakened, and the synapse gets depressed.
Different forms of observed plasticity that can be described by STDP are shown in
Figure 7.
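For concreteness, the canonical quantitative form of STDP is a pair of exponential
windows, as in the models reviewed by Abbott & Nelson. The sketch below is a
generic illustration with arbitrary amplitudes and time constants; it is not a rule
used in this project (the SPU uses STADP, described in chapter 3).

```python
import math

A_PLUS, A_MINUS = 0.05, 0.055      # illustrative amplitudes
TAU_PLUS, TAU_MINUS = 20.0, 20.0   # illustrative time constants (ms)

def stdp_weight_change(t_pre, t_post):
    """Weight change for a single pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:    # pre before post: causal pairing, potentiate
        return A_PLUS * math.exp(-dt / TAU_PLUS)
    else:         # post before pre: anti-causal pairing, depress
        return -A_MINUS * math.exp(dt / TAU_MINUS)
```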








                             Figure 7: Different forms of synaptic plasticity
 The amount (qualitatively) and type of synaptic modification evoked by repeated pairing of pre- and
post-synaptic action potentials in different preparations. The horizontal axis is the difference
t_pre - t_post of these spike times. Results are shown for slice recordings of different neurons.
Without going into unnecessary detail, the important point to note is that different forms of
plasticity exist. Figure from Abbott & Nelson 2000.

Several other models of synaptic plasticity exist, ranging over several levels of
complexity and biological plausibility. Each has its advantages and disadvantages,
proposing different mechanisms of synaptic plasticity, trying to explain different
types of experimentally observed plasticity. Other global regulatory processes of
learning, such as synaptic scaling or synaptic redistribution are thought to be
necessary alongside activity based learning rules [5].
While learning rules and models of synaptic plasticity attempt to describe the
mechanism by which synaptic plasticity is generated, different models of synapses
themselves exist, which can vary greatly in the way they respond to ‘plasticity signals’.

2.3 The cascade synapse model
Storing memories of ongoing, everyday experiences requires a high degree of
synaptic plasticity, while retaining these memories demands protection against
changes induced by further activity and experiences. Models in which memories are
stored through switch-like transitions in synaptic efficacy are good at storing but bad
at retaining memories if these transitions are likely, and they are poor at storage but
good at retention if they are unlikely [1]. In order to address this dilemma, Fusi et al.
developed the model of binary cascade synapses, which combines high levels of
memory storage with long retention times and significantly outperforms conventional
models [9].




They consider the case of binary synapses, i.e. synapses with only two efficacies (for
example potentiated and depressed, or weak and strong), which is not implausible, since
biological synapses have been reported to display binary states of efficacy as well [2].
The structure of a binary cascade model is shown in Figure 8, specifying two
independent dimensions for each synapse. Just like ordinary models of binary
synapses, a binary cascade synapse can be in one of two states of efficacy, weak or
strong, but while ordinary models only allow one fixed value of plasticity, cascade
synapses possess a cascade of n states with varying degree of plasticity,
implementing metaplasticity (i.e. the plasticity of plasticity). Ongoing plasticity then
corresponds to transitions of a synapse between states characterized by different
degrees of plasticity, rather than (only) different synaptic strengths.




                       Figure 8: Schematic of a Cascade Model of Synaptic Plasticity.
 Courtesy of Stefano Fusi. There are two levels of synaptic strength, weak (yellow) and strong (blue),
   denoted by + and -. Associated with these strengths is a cascade of n states (n = 5 in this case).
 Transitions between state i of the cascade of either strength and state 1 of the opposite strength take
    place with probability q_i, corresponding to conventional synaptic plasticity. Transitions with
 probabilities p_i± link the states within the respective cascade (downward arrows), corresponding to
                                                metaplasticity.

Binary cascade synapses can respond to any learning rule with binary plasticity
signals, i.e. signals that are either 'potentiate' or 'depress', and respond to them
stochastically; plasticity signals are only responded to with a given probability which



is determined by the state along the cascade the synapse is in. So it is the varying
probability of responding to plasticity signals that implements the different degrees of
plasticity described above.
In the highest state (state 1 of the cascade in Figure 8), the probability of responding
to a plasticity event is 1, and it decreases for states further down the cascades, where
the synapse becomes less plastic. In the model analysed by Fusi, the plasticity actually
halves for every state down the cascade, i.e. there is a 50% chance of responding to a
plasticity signal in the second state, 25% in the third, and so forth.
A cascade synapse can respond to plasticity events in two ways, depending on
whether it already has the ‘right’ efficacy, referred to as switching and chaining. If it
switches, then it is changing efficacy, i.e. from weak to strong, or vice versa. If a
synapse switches, it will always make a transition to state 1, i.e. the most plastic state,
of the opposite cascade, regardless of what state it was in before. In Figure 8, these
transitions are represented by the arrows between the two cascades, with plasticity
probabilities given by qi. If the synapse chains, i.e. it already has the right efficacy,
then it is moving down one state in the cascade, thereby reducing (halving) its
plasticity probability, becoming less plastic. In Figure 8, this is represented by the
downward arrows connecting consecutive states within each cascade, with plasticity
probabilities given by pi+/-.
Thus, cascade synapses can respond to ongoing modifications by reducing their
plasticity, thereby ‘reassuring’ their state of efficacy. Another way of looking at it is
that synaptic efficacies and their degree of plasticity are dependent on the history of
the synapses and the plasticity signals they received.
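The switching and chaining mechanics above reduce to a compact stochastic update.
The following Python sketch models it under the halving assumption for the response
probabilities; it is an illustration only, separate from the VHDL cascade process
described in chapter 5.

```python
import random

N_STATES = 5  # cascade depth, n = 5 as in Figure 8

def cascade_update(efficacy, state, signal):
    """Apply one binary plasticity signal to a cascade synapse.

    efficacy: 0 (weak) or 1 (strong); signal: 0 (depress) or 1 (potentiate);
    state: 0..N_STATES-1, with 0 the most plastic state; the probability of
    responding halves with every step down the cascade (1, 1/2, 1/4, ...).
    """
    if random.random() < 2.0 ** (-state):
        if efficacy != signal:
            return signal, 0            # switch: opposite cascade, most
                                        # plastic state, regardless of state
        if state < N_STATES - 1:
            return efficacy, state + 1  # chain: one state down, less plastic
    return efficacy, state              # event ignored
```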
Fusi et al. compare the performance of cascade synapses with that of ordinary binary
synapses using the strength of an initial memory trace, the initial signal-to-
noise ratio, as well as the average memory lifetime, the point at which this signal-to-
noise ratio becomes equal to 1, for both synapse models (it is worthwhile to reiterate
that it was this trade-off, the ability to store memories easily vs. retaining them for a
long time, that originally led them to develop the cascade synapse model in the first
place). They find that cascade models arrive at a better compromise, storing new memories


more easily and faithfully, yet retaining them for a longer period of time, as shown in
Figure 9. Without going into unnecessary detail (the interested reader is advised to
consult [1] for more information), they find that the better performance of cascade
synapses stems from the fact that they experience power-law forgetting, unlike
ordinary binary synapses, whose memories decay exponentially fast.
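Schematically, and deferring to [1] for the exact expressions, prefactors and cutoffs,
the two forgetting behaviours can be contrasted as follows, with S/N the signal-to-noise
ratio of a memory stored at t = 0, r the rate of candidate plasticity events and q the
transition probability of the binary model:

```latex
% Schematic scaling only; see [1] for the precise derivation.
\left.\frac{S}{N}\right|_{\text{binary}}(t) \;\propto\; e^{-qrt},
\qquad
\left.\frac{S}{N}\right|_{\text{cascade}}(t) \;\propto\; \frac{1}{rt}.
```

The exponential trace vanishes quickly once qrt grows large, while the power-law
trace decays far more slowly, which is what Figure 9 illustrates.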




              Figure 9: Initial signal-to-noise ratio as a function of memory lifetime, from [1].
    The initial signal-to-noise ratio of a memory trace stored using 10^5 synapses plotted against the
 memory lifetime (in units of 1 over the rate of candidate plasticity events). The blue (lower) curve is
for a binary model with synaptic modification occurring with probability q that varies along the curve.
 The red (upper) line applies to the cascade model described by Fusi et al. The two curves have been
  normalised so that the binary model with q = 1 gives the same result as the n = 1 cascade model, to
which it is identical. Clearly, the cascade model performs better than the 'normal' binary model both in
                         terms of initial signal-to-noise ratio and memory lifetime.


In summary, binary cascade synapses outperform their 'ordinary' counterparts in terms
of memory storage and retention, which derives from their more complex structure
allowing the synapse to respond to ongoing modifications along two dimensions –
efficacy and metaplasticity. It is desirable to implement these properties in real
hardware, and previous attempts have already laid good groundwork for that.

2.4 Previous work
This project mainly builds on two previous projects. The first one, titled 'A
stochastic synapse for reconfigurable hardware', a short project during the Telluride
workshop for Neuromorphic Engineering by Dylan Muir [15], laid the groundwork




for both the following and this project. In particular, it succeeded in creating a first
VHDL implementation of the cascade synapse and verified its operation in
simulations. One of the biggest contributions of this project is the design of one
particular type of pseudo-random number generator, the Hybrid Cellular Automata
array pseudo-random number generator, which also found extensive use in this
current project. However, no actual hardware was synthesised from the digital design.
The second project, ‘A VHDL implementation of the Cascade Synapse Model’, a
diploma project by Tobias Kringe [16], succeeded in designing and implementing a
small array of cascade synapses onto an FPGA. The operation of the digital cascade
synapses was verified both in simulation and in hardware, and encouraging results
were achieved in confirming the complex behaviour of the cascade synapse (which is
why this current project will not focus on reproducing and re-verifying the properties
of hardware implemented cascade synapses). However, the VHDL implementation
was rather large, and only a small number of synapses could be implemented onto
the FPGA. It was Tobias Kringe who proposed to virtualise the cascade synapses
(which is one of the aims of this current project) in order to realise a useful number of
synapses onto one FPGA. Due to the radically different architecture of the virtualised
synapses to the static hardware synapses, next to none of his VHDL implementation
was reused.
To the best of the author's knowledge, there has been no other working
hardware implementation of a large number of cascade synapses (in fact, of any
number of synapses) to date.

2.5 Overview of the hardware environment
Neuromorphic aVLSI hardware commonly comprises low-power analogue CMOS
circuits operating in the subthreshold regime that mimic (morph) the properties of
real neural systems and elements. In particular, a neuromorphic aVLSI neuron chip
was used, which comprised an array of leaky Integrate & Fire (I&F) silicon neurons
with Diff-Pair Integrator (DPI) synapses. Communication with the outside world was
done using the asynchronous Address Event Representation (AER) protocol. The




FPGA sits on an FPGA board developed at the Institute of Neuroinformatics in
Zurich.

2.5.1 Silicon neurons
There are different types of silicon neurons, such as conductance based models which
aim to map molecular conductance mechanisms underlying neuron behaviour in
detail into analogue electronic circuits, or more qualitative models such as the I&F
neuron model, which merely implements the observed characteristics of neuron
behaviour, such as integration, firing or the refractory period, in silicon.
The aVLSI chip used in this project contained 128 I&F neurons similar to the circuit
depicted in Figure 10. Qualitatively, this I&F circuit works by integrating input
current from on-chip synapses on its membrane, and elicits a (voltage) spike if the
membrane voltage crosses a firing threshold.
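As an illustration of the behaviour just described (not of the actual circuit in
Figure 10), a generic leaky integrate & fire neuron can be simulated in a few lines;
all parameter values below are made up for the example.

```python
def simulate_lif(input_current, dt=1e-4, tau=0.02, r_m=1e8,
                 v_thresh=0.3, v_reset=0.0, t_refr=0.002):
    """Generic leaky integrate & fire model; returns the spike times."""
    v, refr, spikes = v_reset, 0.0, []
    for step, i_in in enumerate(input_current):
        if refr > 0:                        # refractory: cannot fire
            refr -= dt
            continue
        v += dt / tau * (-v + r_m * i_in)   # leaky integration on 'membrane'
        if v > v_thresh:                    # threshold crossed: emit a spike
            spikes.append(step * dt)
            v, refr = v_reset, t_refr
    return spikes

# Example: a constant 5nA input makes the neuron fire regularly.
print(simulate_lif([5e-9] * 2000)[:5])
```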




              Figure 10: Circuit diagram of an ultra low power Integrate & Fire Neuron.
 Labelled functional circuit elements mimic the behaviour of real neurons. Transistors operate in the
sub-threshold regime to exploit their desirable exponential characteristics. A capacitor Cmem integrates
incoming post-synaptic current into a membrane voltage Vmem. If the membrane potential crosses the
        spiking threshold, it will ‘spike’ just like a real neuron. Courtesy of Giacomo Indiveri.






2.5.2 Silicon synapses
Each I&F neuron has 32 silicon synapses with different properties and behaviour
connected to it, but only one type of synapse was used in this project, namely the
static DPI synapse. The circuit of such a synapse is depicted in Figure 11.
Qualitatively, the DPI synapse works by receiving a (voltage) spike from a pre-
synaptic neuron (or from the outside world) and injecting, in response, a given amount
of current onto the membrane of the post-synaptic neuron it is connected to.
The amount of current produced by every incoming spike depends on the static
synaptic weight and the time constant of the synapse, both of which can be adjusted
to achieve the desired effective synaptic weight.
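In software terms, this behaviour is essentially a first-order linear filter: every
pre-synaptic spike deposits a lump of charge that decays with the synaptic time
constant. The following is a behavioural analogue with assumed parameter names and
values, not the circuit of Figure 11.

```python
import math

def synaptic_current(spike_times, t, weight=1e-9, tau=0.01):
    """Post-synaptic current at time t: each past spike contributes an
    exponentially decaying pulse scaled by the static synaptic weight."""
    return sum(weight * math.exp(-(t - ts) / tau)
               for ts in spike_times if ts <= t)
```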




             Figure 11: Circuit diagram of the so called Diff-Pair Integrator (DPI) synapse.
 For every pre-synaptic spike it receives, it dumps a post-synaptic current onto the membrane of the
  post-synaptic neuron connected to it. The amount of current, and other dynamics, can be set by
       parameters such as the synaptic weight, the time constant tau or the threshold voltage.



2.5.3 Communication using AER
The Address Event Representation (AER) protocol is used to allow for
communication in multi-chip environments. It is a serial asynchronous four-phase
handshaking protocol (using request-acknowledge signals) which encodes events (i.e.
spikes) of individual neurons by assigning each neuron a unique address (up to



16 bits). Every time a neuron fires, it generates an address event, which is then
transmitted over the AER bus to the receiving hardware. Unlike conventional electronic
systems with arrays of information sources, such as digital cameras, neuromorphic
systems using the AER protocol do not scan through every one of their elements to
transmit one frame after another; rather, information is transmitted on demand:
only if a neuron spikes will an address event be transmitted. Therein lies one of the
most important points about the AER protocol, its asynchrony, whereby the precise
timing of the address event implicitly encodes the time of the spike itself – no
need to communicate timestamps for individual spikes.
Conveniently, since electronic circuits implementing neuromorphic hardware are very
fast, while neural activity is rather slow (<100Hz), a large number of neurons can
share the same AER bus without problem. Typically, an AER bus would have a
bandwidth of about 1Mevent/second.
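In essence, an address event is just a neuron address whose time of transmission is
the spike time. The toy model below (an illustration only, not the actual four-phase
request/acknowledge handshake) makes the timestamps explicit to show how many
neurons share one time-ordered bus.

```python
def merge_event_streams(*streams):
    """Merge per-neuron spike-time lists into one time-ordered AER stream
    of (time, neuron_address) tuples sharing a single 'bus'."""
    events = [(t, addr) for addr, times in enumerate(streams) for t in times]
    return sorted(events)

# Example: neuron 0 fires at 10ms and 30ms, neuron 1 at 12ms.
bus = merge_event_streams([0.010, 0.030], [0.012])
# -> [(0.010, 0), (0.012, 1), (0.030, 0)]
```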

2.5.4 The FPGA board
The FPGA used in this project is a Xilinx Spartan 3 (xc3s400pq208) that sits on a
prototype FPGA board developed by Daniel Fasnacht during his diploma project at
the Institute of Neuroinformatics in Zurich, depicted in Figure 12. Features used in
this project are the USB interface and the two AER ports (one input, one output). It
has an external clock of 106.125MHz, and is programmed using JTAG.
Apart from developing the board itself, Daniel Fasnacht further developed a Linux
driver to allow communication with the USB board. A program developed by
Giacomo Indiveri is used to send data to the FPGA board. In particular, pre-synaptic
spikes are sent through the USB bus to the SPU by specifying a synapse address and
an inter-spike interval to the previous spike, data which is easily generated using the
spiking neuron toolbox1 in Matlab. The aVLSI neuron chip is configured using Matlab2.



1
    Developed by Dylan Muir at the Institute of Neuroinformatics
2
    To set up the environment variable for the aVLSI chip in Matlab: chipinit.m. To load the required
calibration settings to the chip: bias_050607.m
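To illustrate the stimulus format described above (one synapse address plus the
inter-spike interval to the previous spike per event), here is a hedged Python sketch
of a Poisson stimulus generator; the actual Matlab toolbox routines may of course
differ:

    import random

    def poisson_isi_events(synapse_address, rate_hz, n_spikes, seed=0):
        # A Poisson spike train has exponentially distributed inter-spike
        # intervals with mean 1/rate; each event is an (address, ISI) pair
        # of the kind sent over the USB bus to the SPU.
        rng = random.Random(seed)
        return [(synapse_address, rng.expovariate(rate_hz))
                for _ in range(n_spikes)]

    events = poisson_isi_events(synapse_address=42, rate_hz=50.0, n_spikes=5)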




It should be noted that this is a prototype board, and with experimental or prototype
hardware, extra care should be taken, since not all functions necessarily work as
expected. However, seeing experimental hardware work and become ‘alive’ is one of
the most gratifying moments of hardware development.
In the experimental setup used for the classification task (as described in section 7.5,
A real classification task), the FPGA board interfaces with an aVLSI ‘IFSLTWA’ neuron
chip, using the AER connections to send address events to, and receive feedback from,
the neurons. Figure 13 illustrates this experimental setup.




                  Figure 12: Prototype FPGA board developed by Daniel Fasnacht.
         1. Xilinx Spartan 3 (xc3s400pq208) 2. USB port 3. AER-out port 4. AER-in port








                              Figure 13: Experimental hardware setup.
 1. FPGA SPU 2. Forward AER connection 3. aVLSI chip with array of I&F neurons 4. Oscilloscope
measuring the post-synaptic membrane potential 5. post-synaptic feedback AER connection (with logic
                       analyzer) 6. pre-synaptic stimuli input USB connection.

2.5.5 Software
Throughout this project, three software packages were used: Xilinx ISE 9.1i
Webpack to write the VHDL design, Modelsim PE Student Edition to simulate VHDL
code, and Matlab for various tasks, including plotting, initialisation file generation,
analysis and spike train generation.
A project diary was kept on Google Documents.







3 STADP – a novel Hebbian learning rule
  ‘The illiterate of the 21st century will not be those who cannot read and write, but
             those who cannot learn, unlearn, and relearn’ – Alvin Toffler

In the previous section, the general concept of synaptic plasticity was introduced.
While different learning rules have been proposed, for the task at hand, keeping in
mind that the Synaptic Processing Unit is to be tested on a real classification task, it is
necessary to implement a learning rule that is both suitable for learning in a
general environment and easily implemented in digital hardware. Several learning
rules would be interesting to implement, most prominently STDP amongst others
[18], [3], [20], but none fully meets the needs of this project.
From [19] and [20], it was concluded that ordinary STDP would not be sufficient as a
general learning rule. Instead, the system would either have to be taught with
specifically crafted and highly correlated temporal patterns (not a general
environment), or a more elaborate version of STDP would have to be constructed,
which is impractical for the implementation, both in terms of hardware real estate
(memory in particular, but also logic) and circuit complexity. Prototype designs for
STDP were rejected on the basis that they required excessive memory and
overcomplicated the digital circuit.
Instead, a novel but very simple, easily implemented learning rule was developed
together with [20], called Spike-Timing and Activity Dependent Plasticity (STADP),
which produces simple binary plasticity events, depress and potentiate, as required by
the binary cascade synapse model.

3.1 STADP – Yet another learning rule?
At the heart of STADP is the same Hebbian learning paradigm, that ‘what fires
together, wires together’. Unlike STDP, which derives the causality for ‘firing
together’ from the difference in spike times, STADP uses a mixture of firing time and



                                           3-31
The Synaptic Processing Unit                                                   Anthony Hsiao


firing rate based measures to determine, whether pre- and post-synaptic neuron ‘fire
together’.
As the name suggests, STADP produces plasticity signals depending on spike timing
as well as activity. In particular, it is dependent on the state of activity of the post-
synaptic neuron, and the timing of pre-synaptic spikes.
STADP states that the post-synaptic neuron can be in one of two states at any point in
time: active or inactive. This state is determined by a threshold function of the post-
synaptic firing frequency: if it is above a mean firing rate fm, it is said to be active,
otherwise it is inactive. For example, a setup of aVLSI I&F neurons could have a
mean firing rate fm = 50Hz, which is biologically plausible, and be said to be active
for firing rates above 50Hz, and inactive for firing rates below 50Hz.
Then, two neurons are said to ‘fire together’ if a pre-synaptic spike arrives while the
post-synaptic neuron is active, and the synapse should be potentiated (LTP). The
reverse is also true, i.e. when a pre-synaptic spike arrives at the synapse while the
post-synaptic neuron is inactive, then the synapse should be depressed (LTD).
However, this scheme would result in one plasticity signal for every pre-synaptic
spike, so in order to moderate the number of plasticity signals produced, STADP is
stochastic, and only produces potentiation or depression signals with a certain
probability, called the probability of plasticity, p(plasticity). Figure 14 below
summarises how STADP produces plasticity events.








                                            Figure 14: STADP.
Plasticity events are elicited with a probability p(plasticity), and depend on the spike time of the
                 pre-synaptic neuron and the activity of the post-synaptic neuron.
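In Python pseudo-form, the decision of Figure 14 reduces to a few lines (an
illustrative sketch only; the hardware implementation is described in section 5.3.2):

    import random

    def stadp_event(post_is_active, p_plasticity, rng=random):
        # Returns 'LTP', 'LTD' or None for one incoming pre-synaptic spike.
        if rng.random() >= p_plasticity:
            return None          # most pre-synaptic spikes cause no plasticity
        return 'LTP' if post_is_active else 'LTD'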


3.1.1 From spike time to spike rate
The two state abstraction of the post-synaptic neuron’s activity essentially requires an
integration of its spike-times to produce spike rates. However, integration of spikes
arriving at irregular intervals into spike rates can be a non-trivial task in real time
processing in digital hardware (it would be very easy in analogue electronics
actually!). In STADP, this is elegantly performed using a stochastic process, inspired
by quantum physics [20]. The main idea behind this is that the post-synaptic neuron
is in an unknown state of activity until it gets ‘measured’, in this case by an incoming
pre-synaptic spike.
Every time the post-synaptic neuron spikes, its state of activity is set to active,
independent of the current state. A neuron in the active state can then make a transition
to the inactive state with a probability p(deactivate) (this can also be regarded as a
two-state hidden Markov process), as depicted in Figure 15.
Without yet specifying p(deactivate) at any point in time, it can be appreciated
that a post-synaptic neuron firing at the mean firing rate fm should have a
probability of being in the active state, p(active), of 0.5; a more active neuron should
have a higher p(active) and a less active neuron a lower p(active).








                                   Figure 15: The STADP mechanism.
   A post-synaptic neuron can be in one of two states: active and inactive. The STADP mechanism
 determines the state of the post-synaptic neuron by integrating the post-synaptic firing times. A post-
synaptic spike sets the neuron to the active state, which then stochastically resets to the inactive state
 after an exponentially distributed time whose mean equals the mean post-synaptic inter-spike interval.
 Clearly, the probability that the post-synaptic neuron is in the active state at any given time increases
               as its firing rate increases, and is 0.5 if it is firing at the mean firing rate.


In order to implement this in real hardware (it would be rather challenging to actually
instantiate some kind of quantum process), the STADP mechanism proposed here is
using an abstraction of the stochastic deactivation of the post-synaptic neuron. This
abstraction is based on the assumption that the neuron fires as a Poisson process with
mean firing rate fm, which has exponentially distributed inter-spike intervals (the
time interval between two consecutive spikes) ~ exp(1/fm). Then, upon every
incoming post-synaptic spike (which sets the neuron’s state to active), an
exponentially distributed ‘expiry time’ is drawn, after which the neuron is said to
reset to the inactive state.
This way, the desired properties can be achieved: if the post-synaptic neuron is firing
at the mean firing rate fm, it will have an equal chance of being in active or inactive
state, on average, at any point in time. Similarly, if it is firing at a higher rate, it has a
higher chance of being active since it is being set to active faster than it is expiring to
inactive, while if it is firing at a lower rate, it has a lower chance of being active at
any point in time.
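The following Python sketch captures this mechanism (illustrative only; the FPGA
implementation in section 5.3.2 uses a timer and an expiry-time memory instead):

    import random

    class ActivityState:
        def __init__(self, f_mean_hz, seed=0):
            self.f_mean_hz = f_mean_hz       # the mean firing rate fm
            self.expiry = float('-inf')      # inactive until the first spike
            self.rng = random.Random(seed)

        def on_post_spike(self, t):
            # Set active and draw a fresh expiry time ~ exp(1/fm).
            self.expiry = t + self.rng.expovariate(self.f_mean_hz)

        def is_active(self, t):
            return t <= self.expiry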






One question remains: whether a plasticity event is a depression or a potentiation
event is determined by the post-synaptic neuron’s activity, as explained above – but
then, how does STADP behave for different pre-synaptic frequencies? As the name
suggests, plasticity is dependent on spike timing, since the state of activity of the
post-synaptic neuron is only ever evaluated on an incoming pre-synaptic spike; but in
fact, the pre-synaptic rate plays a role too.
In general, the higher the pre-synaptic frequency, the more plasticity events will be
produced. However, since potentiation and depression are only elicited with
probability p(plasticity), the dependence on the pre-synaptic rate is slightly more
complex. While high pre-synaptic frequencies are likely to lead to a high rate of
plasticity, low, but non-zero, pre-synaptic frequencies are likely not to result in any
plasticity event at all, as only few of the already rare pre-synaptic spikes would ever
lead to a plasticity event.
In summary, the pre-synaptic firing rate can be said to determine the rate (probability)
of plasticity events, while the post-synaptic frequency is best described as setting the
type of the plasticity events. Synapses with high pre-synaptic firing rates are more
likely to be receiving plasticity signals, while synapses with low pre-synaptic firing
rates are likely to remain static, as they receive none or only few plasticity events.

    Characteristics
3.2 Characteristics of STADP
The previous section explained how STADP works conceptually, including the actual
STADP mechanism, which draws an exponentially distributed expiry time after which
the post-synaptic neuron resets to the inactive state. The following paragraphs
describe some of its characteristics as well as the expected plasticity signals that
STADP would produce.
When characterising the behaviour or the results of STADP, the two important points
to be noted are firstly whether the expiry time mechanism works at all, and secondly
what plasticity profile it produces over a range of pre- and post-synaptic frequencies.
By observing p(active), the correct operation of the mechanism can be verified; by
observing the plasticity rates, i.e. how many potentiation or depression events are
elicited per second, insights into the plasticity profile can be gained.
The following plots were obtained from a simple Matlab simulation3 done by Dylan
Muir, and show the rate of potentiation (LTP rate), rate of depression (LTD rate), the
net effect of plasticity (LTP rate – LTD rate) as well as p(active), over pre- and post-
synaptic frequency ranges of 0-100Hz.




                                  Figure 16: Simulated behaviour of STADP.
     Left column: rate of potentiation and depression events per second, over a range of pre- and post-
      synaptic frequencies [1:100Hz] (ignore the axis labels). Right column: Net effect of STADP and
                  probability of the postsynaptic neuron being in active state per unit time.

These simulation results suggest that STADP indeed works as a Hebbian learning rule,
and has the desired characteristics. p(active) is approximately 0.5 at a post-
synaptic frequency of 50Hz, increases for higher frequencies, and decreases for
lower frequencies. Furthermore, the plasticity rate increases with pre-synaptic

3
    p(active) curve: make_prob_active_vs_freq_plot.m other plots: make_freq_sim_plot.m




frequency for both potentiation and depression, which also have a qualitatively
correct behaviour, best summarized by the net effect of LTP and LTD: with increasing
pre-synaptic frequencies, there are more plasticity events, with potentiation
dominating for high post-synaptic frequencies, and depression dominating for low
post-synaptic frequencies.
One important characteristic to note, however, is that potentiation and depression are
not symmetric within the regime of operation, and that the net effect of plasticity has
a bias towards depression, or equivalently, reluctance towards potentiation. This is
due to the p(active) curve, which is not linear or symmetric about the (50Hz, 0.5)
point. As will be described later in the experimental section, this will have an
observable effect.
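This bias can also be sketched analytically. Under the Poisson assumptions of
section 3.1.1 (a back-of-the-envelope estimate added here for illustration, not part
of the original simulation), the time A since the last post-synaptic spike is
exponentially distributed with rate f, the expiry time E with rate fm, and the neuron
is active iff E > A, so that

    p(active) = P(E > A) = \int_0^\infty e^{-f_m a} \, f e^{-f a} \, \mathrm{d}a = \frac{f}{f + f_m}

which is 0.5 at f = fm = 50Hz, as required, but only 2/3 at f = 100Hz; p(active)
approaches 1 only asymptotically, consistent with the observed reluctance towards
potentiation.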
Possible remedies for this bias could include pre-biasing or distorting the p(active)
curve so that it saturates at 100Hz, or setting a minimum expiry time of 10ms
(1/100Hz) in order to ensure that p(active) is 1 at 100Hz. The remedy used
would have to be matched to the particular implementation of STADP.
While a more detailed and formal analysis of STADP would be desirable, it would go
beyond the scope of this report. These initial simulation results are satisficing (i.e.
good enough for the purpose at hand), and confidence in the learning rule further
derives from [20].







4 Design
 ‘I am enough of an artist to draw freely upon my imagination. Imagination is more
important than knowledge. Knowledge is limited. Imagination encircles the world’ –
                                    Albert Einstein

4.1 Summary of features of the Synaptic Processing Unit
The Synaptic Processing Unit designed here has the following features:
   •   Speed of operation: Clocked at 90MHz internally
   •   System architecture:
           o Fully pipelined design – the SPU can theoretically process a new
               address event every clock cycle, although this never happens in
               practice
           o Modular design – allows for easy plug-in of a new learning rule
   •   On-chip learning rule: STADP with 11.1ns time resolution
   •   I/O ports: 1x USB input, 1x AER input, 1x AER output
   •   Cascade representation: 6bit, reconfigurable, allowing for synapses with up to
       32 cascades
   •   Cascade memory address width: 13bit, reconfigurable, allowing for up to
       8192 binary cascade synapses
   •   Addressing: Configurable number of neurons (up to 256)
   •   One teacher synapse per neuron

4.2 System level design
Although this project builds upon previous work as mentioned earlier, most parts of
the Synaptic Processing Unit were designed from scratch, since the pipelined and
virtualised cascade synapse requires a very different architecture.






4.2.1 The SPU in a neural system
From a high level point of view, the SPU is supposed to integrate with one aVLSI
neuron chip, forming one coherent neural system containing an array of neurons with
cascade synapse functionality. This system could, for example, be used as one layer
of a larger network of spiking neurons, as depicted in Figure 17.




                  Figure 17: System level interaction of SPU and aVLSI neuron chip.
 Together, these form one freely reconfigurable integrated array of N Integrate and Fire neurons with
                                      binary cascade synapses.


4.2.2 Input and output ports
In order to act as one coherent system, the SPU has to be able to communicate both
with the neuron chip, as well as with the outside world. Here, this is done using the
USB port of the FPGA board as pre-synaptic input, and the two AER ports to connect
the SPU to the neuron chip.
Clearly, a forward connection, whereby pre-synaptic spikes are routed towards the
right post-synaptic neuron, is necessary. However, in order to perform learning
using STADP, and indeed most other learning rules, an additional feedback
connection from the neuron chip back to the SPU is needed to obtain information
about the post-synaptic neurons, which in this case means estimating their state of
activity.





4.3 Virtualising the cascade synapse
The binary cascade model is quite a nice model to be implemented in digital
hardware. It has essentially only two important properties, namely its binary efficacy
and its current state, which at the same time encodes the plasticity, which in turn is
represented by a plasticity probability, which halves for every higher cascade. This
has ‘digital’ written all over it.
In order to virtualise the cascade synapses, some conceptual ‘cascade mechanism’ by
which to process them has to be devised. The basic idea is to trade hardware real
estate on the FPGA for memory, and to process synapses on demand. This has two
immediate design deliverables:
    •   In order to virtualise the cascade synapses, an abstraction or memory
        representation of them has to be defined,
    •   A mechanism by which they are processed, i.e. how individual synapses
        respond to plasticity signals, has to be developed
Conveniently, the cascade synapse can be represented by a bit vector very intuitively.
One bit encodes the synaptic efficacy, while the remaining bits encode the state
of the synapse, i.e. the synaptic plasticity, i.e. the plasticity probability, depending on
the number of cascades. Then, halving the plasticity probability is just a matter of a
bit-shifting operation. As depicted in Figure 18, an N-bit representation is used, where
the MSB represents the efficacy and the word [N-2...0] represents the plasticity
probability, as an unsigned binary number.




                         Figure 18: Bit representation of cascade synapses






Using this representation, the plasticity probability ranges from 0 to 2^(N-1)-1 rather
than from 0 to 1, but this is not a problem, since it can be regarded as the numerator of
a rational number with denominator 2^(N-1)-1. Such a representation can easily be stored
in and retrieved from memory, and provides the functionality required to implement
the virtualisation.
Here, N = 6 was fixed as a reasonable maximum cascade representation width,
allowing for synapses with up to 32 cascades. This is more than sufficient, and in fact,
too large a number of cascades can actually decrease the memory performance of the
synapses [1].
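A minimal Python sketch of this representation (for the maximum width N = 6;
illustrative, not the VHDL) shows how cheap the operations are:

    N = 6
    PROB_MASK = (1 << (N - 1)) - 1       # the low N-1 plasticity bits

    def efficacy(state):
        # The MSB holds the binary synaptic efficacy.
        return (state >> (N - 1)) & 1

    def plasticity(state):
        # Numerator of the plasticity probability over 2**(N-1) - 1.
        return state & PROB_MASK

    def step_down(state):
        # Moving one cascade deeper halves the plasticity probability:
        # a single right shift of the probability field.
        return (efficacy(state) << (N - 1)) | (plasticity(state) >> 1)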
The processing on the cascade synapse can be expected to be relatively simple, since
there is only a small number of things the synapse ‘can do’: switch or chain, with a
probability given by its state. The exact mechanism implemented is described in
detail in the Implementation section, but from a high level point of view, it has to:
      •    Obtain the right cascade from memory
      •    Perform the necessary operations on its state representation (i.e. switch, chain
           or do nothing)
      •    Produce a new cascade state representation, and pass it back to the cascade
           memory

4.4 SPU internal addressing
Since incoming and outgoing events are following the AER protocol, whereby
neurons are identified by addresses, the SPU internal representation is also using
addresses as identifiers of synapses.




                                 Figure 19: SPU internal addressing format


At the heart of the addressing scheme are the synapses, which can be identified
uniquely by an Nbit synapse address, as shown in Figure 19. For historical reasons4,
this synapse address is set to 13bits, allowing it to uniquely identify up to 8192
synapses. The top few bits of the synapse address represent the neuron address,
which uniquely identify the post-synaptic neuron which the cascade synapse is
connecting to. The aspect ratio of the neural system, i.e. how many neurons there are
and how many synapses each has, can be changed freely within the SPU by changing


4
    The SPU was originally designed to interact with an aVLSI chip with 256 neurons and 8192 synapses,
the largest of its kind at that time




this neuron address width; the configuration does not have to correspond to the
actual number of neurons (or synapses) on the aVLSI chip.
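Using the default generics described in section 5.2 (13bit synapse addresses, of
which the top 8 bits identify the neuron), the split can be sketched as follows:

    SYNAPSE_ADDRESS_WIDTH = 13           # up to 8192 synapses
    NEURON_ADDRESS_WIDTH = 8             # up to 256 neurons
    LOCAL_BITS = SYNAPSE_ADDRESS_WIDTH - NEURON_ADDRESS_WIDTH  # 32 synapses each

    def split(synapse_address):
        # The top bits select the post-synaptic neuron, the rest the synapse.
        neuron = synapse_address >> LOCAL_BITS
        local_synapse = synapse_address & ((1 << LOCAL_BITS) - 1)
        return neuron, local_synapse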

4.5 Modular design of the SPU
Apart from implementing cascade synapse behaviour in a virtualised fashion, the SPU
has to perform two other important tasks: spike forwarding and learning.
Overall, the core of the SPU, i.e. ignoring data I/O and FPGA board particulars, will
have the following four modules:
      •   Forwarding module
      •   Learning module
      •   Cascade module
      •   Cascade memory
The conceptual architecture that stems from these four modules is depicted in Figure
20.




                        Figure 20: Conceptual Architecture of the SPU


The principle of operation of the SPU is as follows:






   1. The signal selector (not one of the core functions of the SPU) performs
       arbitration between pre- and post-synaptic inputs, and forwards this address
       into the SPU, to the forwarding module, the cascade memory and the learning
       module.
   2. The cascade memory retrieves the cascade synapse representation
       corresponding to the synapse address, and, at the same time, writes new
       cascade states to (another location in) memory.
   3. The learning rule (stochastically) produces plasticity signals as required by
       STADP and the pre- and post-synaptic spikes the SPU receives.
   4. The forwarding module forwards pre-synaptic addresses on to the output of
       the SPU if, and only if, the efficacy of the synapse is high (or the teacher
       synapse is addressed).
   5. The cascade module (stochastically) processes the cascade representation
       according to the plasticity signals it receives from the learning module and
       passes on a new cascade state to be written by the cascade memory
This architecture can be fully pipelined, so that the SPU can process one ‘instruction’,
i.e. one address event, per clock cycle. This is particularly important in order to ensure
that the SPU is operating fast enough, since in a multi-chip environment, it should not
be the processing bottleneck, but rather, it should be able to process whatever is
being thrown its way by the pre-synaptic input (USB). Since the AER bus can
typically transmit about 1Mevent/second, the SPU should be able to process a
multiple of that; a fully pipelined architecture allows this, since at 90MHz one
address event can in principle be accepted every 11.1ns.
In order to ensure that only the ‘right’ signals are processed and that no wrong
data is written to memory, the SPU uses an extra level of control signals that indicate
the validity of the data, as shown in Figure 20.

4.6 Module specifications
The high level relationship between the individual modules described above
translates into precise input/output and functional specifications, described below.






4.6.1 Forwarding
Function:
   •    To forward valid pre-synaptic spikes to the post-synaptic neuron address over
        the AER output of the SPU, if the ‘target’ synapse has high efficacy or a
        teacher signal was sent.
Input signals:
   •   neuron_address: address of the synapse the current pre-synaptic spike is
       addressed to. Up to 13bits
   •   target_synapse_efficacy: MSB of the cascade representation of the
      addressed synapse. 1bit.
   • address_pre_post: control signal issued by the signal selector which
      indicates whether current data comes from the pre-synaptic (‘0’) or the post-
      synaptic (‘1’) feedback input. 1bit.
   • address_valid: control signal that indicates whether the current data is valid.
       1bit.
Outputs:
   •   target_neuron_address: address of the post-synaptic neuron that is to be
       sent out through the AER output. Up to 8bits.
   •   target_address_valid: control signal that indicates whether the target
       neuron address is valid. 1bit.

4.6.2 Learning Rule (STADP)
Function:
   • To implement STADP
   • To correctly produce plasticity events (dep./pot.)
Inputs:
   •   synapse_address: address of the incoming pre- or post-synaptic spike. Up to
       13bits.
   •   address_pre_post: control signal issued by the signal selector which
       indicates whether current data comes from the pre-synaptic (‘0’) or the post-
       synaptic (‘1’) feedback input. 1bit.
   •   address_valid: control signal that indicates whether the current data is valid.
       1bit.
Outputs:
   •   cascade_synapse_address: address of the cascade synapse that the plasticity
       signals are valid for. Up to 13bits.
   •   plasticity_dep_pot: plasticity signal, indicating whether the cascade
       synapse should be depressed (‘0’) or potentiated (‘1’). 1bit.
   •   plasticity_valid: control signal that indicates whether the plasticity signal
       and the cascade synapse address are valid. 1bit.




4.6.3 Cascade Process
Function:
   •      To process cascade states according to plasticity signals from the learning
          module
Inputs:
   •      cascade_synapse_state: cascade state representation of the cascade synapse
      that is to be processed. Up to 6bits.
   • cascade_synapse_address: address of the current cascade synapse that the
      plasticity signals are valid for. Up to 13bits.
   • plasticity_dep_pot: plasticity signal, indicating whether the cascade
      synapse should be depressed (‘0’) or potentiated (‘1’). 1bit.
   • plasticity_valid: control signal that indicates whether the plasticity signal
      and the cascade synapse address are valid. 1bit.
Outputs:
   •      cascade_address_out: address of the synapse whose new cascade state
          representation is valid. Up to 13bits.
   •      new_state: new processed cascade state representation ready to be written
          back to memory. Up to 6bits.
   •      new_state_valid: control signal that indicates whether the new state and the
          cascade out address is valid. 1bit.

4.6.4 Cascade memory
Function:
   •    To retrieve cascade representations of synapses addressed at its read port
   •    To store valid and new cascade representations of synapses addressed at its
        write port
Input signals:
   •      synapse_address: address of the cascade the current pre-synaptic spike is
      addressed to. Up to 13bits.
   • new_state_address: address of the new state that has undergone plasticity.
      Up to 13bits.
   • new_state: new state of cascade synapse after processing. Up to 6bits.
   • new_state_valid: control signal that indicates whether the new state for the
      new state address is a valid. 1bit.
Outputs:
   •      current_state: the current cascade state representation of the synapse
          addressed at the read port. Up to 6bits.






4.6.5 Global signals
In addition to the inputs specified above, all modules share clock, clock enable and
asynchronous reset inputs to reset all internal registers and FIFOs. Note that the
content of memory is not reset to the initial state by this reset signal, but only the
output registers of the memory are cleared. All signals internal to the SPU are active
high.







5 Implementation
     ‘It's not good enough that we do our best; sometimes we have to do what's
                             required’ – Winston Churchill

5.1 Pseudo-random number generators
The performance of stochastic learning processes, indeed of any stochastic process, is
heavily dependent on the ‘quality’ of the underlying randomness. Since the SPU has
random processes in two of its major functional components, the cascade synapse
module and the learning rule, implementing a good pseudo-random number
generator (pRNG) is even more important.
A good pRNG generates highly uncorrelated sequences of pRNs with a very long
maximum-length, before the sequence repeats. A good review on ‘classical’ pRNGs
can be found in [8]; the pRNG used here, however, is more unconventional. Most
classical pRNGs generate pRNs by mathematical manipulation, including
multiplication by prime numbers and modulo division, which is rather resource
intensive in a digital logic implementation. Instead, a so-called hybrid cellular
automaton (HCA) array pRNG is employed, which, on the contrary, is a very
efficient choice for FPGA implementation.
Cellular automata consist of grids of ‘cells’, where each cell can be in one of a finite
number of states. Time is discrete, and each cell has a local update rule to determine
the state of it in the next unit of time. One of the most popular cellular automata is
Conway’s 2D ‘Game of Life’.
Here, we consider a one dimensional binary HCA, i.e. an array of bits, where each
cell (bit) has one of two local update rules, namely Rule 90 or Rule 150, as shown in
Figure 21, as classified by Wolfram [16]. Rule 90 takes the XOR of a cell’s two
neighbours to determine its next state, while Rule 150 additionally XORs in the
current value of the cell itself. Cells beyond the boundaries of the array are
considered to be '1' at all times, which ensures that the automaton does not freeze in
case of all cells being '0'. These choices, and the right configuration of the rules used,
ensure that the pRNG produces maximum-length sequences of uniform pRNs. In [8],
there is a detailed description of which rules to use for what bit position to generate
maximum length sequences for HCA arrays of a given size.




                            Figure 21: A Hybrid Cellular Automata linear array
The HCA pRNG makes use of two different nearest neighbour update rules, namely Rule 90 and Rule
    150. It is very suitable for implementation on an FPGA, and further produces maximal-length
               sequences of highly uncorrelated patterns. Figure courtesy of Dylan Muir.


If used as described above, HCA pRNGs would introduce high correlation for
adjacent cells, which can be avoided by only using a subset of non-neighbouring bits
from a larger array to generate random numbers. One possible choice for creating a
32bit random number is to use a 128bit HCA, tapping off every fourth bit to form the
pRN, for example.
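A Python sketch of one HCA update step and the tapping scheme is given below (the
per-cell rule assignment passed in is illustrative; the maximal-length assignments
for a given array width are tabulated in [8]):

    def hca_step(cells, rules):
        # cells: list of 0/1; rules: 90 or 150 per cell. Cells beyond the
        # boundaries read as constant '1', so the array cannot freeze at all-zero.
        n = len(cells)
        nxt = []
        for i in range(n):
            left = cells[i - 1] if i > 0 else 1
            right = cells[i + 1] if i < n - 1 else 1
            bit = left ^ right               # Rule 90: XOR of both neighbours
            if rules[i] == 150:
                bit ^= cells[i]              # Rule 150 also XORs the cell itself
            nxt.append(bit)
        return nxt

    def tap(cells, stride=4):
        # Tap non-neighbouring cells (e.g. every fourth of a 128bit array)
        # to avoid the correlation between adjacent cells.
        return cells[::stride]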
By using this method to generate pRNs as required by the different modules, the
stochastic processes in the SPU can be trusted to be as random as is possible, to the
best of the knowledge of the author.

5.2 Description of generics
Before explaining the architecture of the individual SPU internal modules, it is helpful
to understand the parameterisation of the VHDL code that was carried out in order to
keep the SPU reconfigurable. The following is a brief description of the generics used
within the implementation that allow a customisation of the SPU.
   •   SYNAPSE_ADDRESS_WIDTH : natural := 13: The synapse address width is

       the width of most of the addresses within the SPU, and sets the maximum
       number of synapses that can be addressed. By default, it is set to 13bits,
       allowing for up to 8192 cascade synapses to be addressed. The fixed depth of
       the cascade memory (the memory itself is not parameterisable) also limits the
       maximum number of synapses to be implemented to 8192, although fewer
       synapses may be used (manual reconfiguration of the memory would be
       required to increase the depth of the cascade memory; this is not difficult).
   •   NEURON_ADDRESS_WIDTH : natural := 8: The neuron address width is the

       width of the neuron address, and tells the SPU how many of the synapse
       addresses’ MSBs are attributed to identifying the neuron. By default, it is set
       to 8bits allowing for up to 256 neurons to be addressed, and a smaller number
       of neurons can be specified without problems.
   •   CASCADE_WIDTH : natural := 5: The cascade width is the number of bits

       that the cascade representation uses. It can be up to 6 bits wide, as limited by
       the width of the cascade memory, but fewer bits, such as the default value of
       5 bits may be specified. The cascade width includes both the efficacy bit and
       the plasticity probability width. At the same time, the cascade width specifies
       the width of the pRN generated in the cascade synapse module, which is
       always one bit less than the cascade width (since the plasticity probability in
       the cascade representation, which will be compared to the pRN, is one bit
       smaller than the cascade width).
   •   PRE_THRESHOLD : natural := 230: The pre threshold sets the p(plasticity)

       with which STADP elicits plasticity events; the higher the threshold, the
       smaller the p(plasticity). It may range from 0 to 255, where p(plasticity)
       would be 1 and 0 respectively.
Using these four parameters, the SPU can be configured, at compile time, to have the
desired characteristics.
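As a hedged illustration of the last parameter: assuming a uniform 8bit pRN and a
strict ‘greater than’ comparison (the exact comparison and endpoints in the VHDL
may differ), the resulting probability is approximately:

    def p_plasticity(pre_threshold):
        # An 8bit pRN uniform on 0..255 must exceed the threshold for a
        # plasticity event to be marked valid; ~0.10 for the default of 230.
        return (255 - pre_threshold) / 256.0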



5.3 Module level design
The following sections will individually describe the implementations of the SPU’s
modules on a functional level. In order to save paper and time, no VHDL code is
reproduced here. The interested reader is advised to consult the supplementary CD
for the VHDL code.
In all of the diagrams shown in the following sections, the convention shown in
Figure 22 for arrows is used. In particular, dotted arrows are used to represent the
flow of control signals, dashed arrows for addresses and solid lined arrows are used
to represent the flow of data.




                  Figure 22: Conventions on the arrows used in block diagrams


Furthermore, light blue vertical bars are used to indicate register levels or clocked
processes.

5.3.1 Spike forwarding
The forwarding module is the simplest out of all the four major functional modules.
As specified in the previous chapter, it ‘only’ has to forward valid pre-synaptic spikes
if the synapse they are addressed to has high efficacy, or if they are sent to the
teacher synapse. The basic structure of the forwarding module is shown in Figure 23.
The outputs are generated in a very simple way. The target neuron address is simply
forwarded directly from the incoming neuron address, while the target address valid
signal is a simple chain of logic operations. Note that the target address valid signal is
dependent on the negation of the address_pre_post signal, since a pre-synaptic
input spike is represented by a ‘0’.








                         Figure 23: Spike forwarding module block diagram


The teacher synapse is defined to be the 0th synapse of every neuron, i.e. if the
bottom bits of the synapse address (how many depends on the neuron address width)
are all zero, then the spike is sent to the teacher synapse and should be forwarded
regardless of the synaptic efficacy.
Due to its simplicity, the forwarding module only requires one clock cycle to perform
the processing.
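The whole decision reduces to a single logic expression, sketched here in Python
for illustration (signal names as in section 4.6.1; a sketch of the logic described
above, not the VHDL itself):

    def forward(address_valid, address_pre_post, efficacy, local_synapse_index):
        # Forward a spike iff it is valid and pre-synaptic ('0'), and the
        # target synapse is efficacious or is the teacher synapse (index 0).
        is_pre = (address_pre_post == 0)
        is_teacher = (local_synapse_index == 0)
        return address_valid and is_pre and (bool(efficacy) or is_teacher)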

5.3.2 Learning rule (STADP)
The learning rule module is much more complex, as shown in Figure 24. It contains
some logic, several registers, a look-up table implemented by a 256x36bit single port
ROM, a 256x36bit single port memory block RAM, a 36bit timer with 11.1ns
resolution and an 8bit pRNG. In order to understand it, it is best to work from the
outputs backwards, and to consider separately what happens on a pre- and on a post-
synaptic synapse address (spike).
There are three output signals: the cascade synapse address, the plasticity signal and
the plasticity valid signal, which need to be considered first.



The cascade synapse address is simply a forwarded version of the input synapse
address.
The plasticity signal, i.e. whether a synapse should be depressed or potentiated,
depends on the activity of the post-synaptic neuron. As mentioned earlier, this is
implemented by drawing pseudo-random, exponentially distributed expiry times for
the post-synaptic neuron, at which it becomes inactive; comparing this expiry time
to the current time is all that is needed to elicit the right plasticity signal. So, if the
current time, i.e. the output of the timer, is greater than the post-synaptic neuron’s
expiry time, which is given by the output of the expiry time memory, i.e. it has already
expired, then a depression signal is produced (plasticity_dep_pot is reset to ‘0’).
If the current time is less than or equal to the expiry time, then the neuron has not yet
expired but is still active, and a potentiation signal is produced (plasticity_dep_pot
is set to ‘1’).
The plasticity valid signal is only asserted if the incoming spike is valid and pre-
synaptic. Furthermore, since plasticity signals are only elicited with a probability
p(plasticity), the plasticity valid signal additionally requires an 8bit pRN to be above
the plasticity threshold pre_threshold.
That is really all there is to the generation of plasticity signals, i.e. that is all that
happens on arrival of a pre-synaptic spike. The rest of the STADP learning rule
module is concerned with handling post-synaptic spikes and setting pseudo-random
exponentially distributed expiry times.
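Putting the conditions above together, the per-spike output logic can be sketched as
follows (Python for illustration; widths and signal names follow the text, while the
VHDL itself is on the supplementary CD):

    def stadp_outputs(timer_now, expiry_time, address_valid, address_pre_post,
                      prn8, pre_threshold=230):
        # Potentiate ('1') while the post-synaptic neuron has not yet expired,
        # depress ('0') otherwise; the event is only valid for a valid
        # pre-synaptic spike whose 8bit pRN clears the threshold.
        plasticity_dep_pot = 1 if timer_now <= expiry_time else 0
        plasticity_valid = (address_valid and address_pre_post == 0
                            and prn8 > pre_threshold)
        return plasticity_dep_pot, plasticity_valid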








                       Figure 24: STADP learning rule block diagram


INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
 
Integumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptIntegumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.ppt
 

Synaptic processing unit final year project - anthony hsiao

Table of contents

1 Introduction
  1.1 What is neuromorphic engineering?
  1.2 The topic of this project
  1.3 Aims
  1.4 Further report structure
2 Background
  2.1 Of brains, neurons and synapses
  2.2 Synaptic plasticity at the heart of learning in neural systems
  2.3 The cascade synapse model
  2.4 Previous work
  2.5 Overview of the hardware environment
    2.5.1 Silicon neurons
    2.5.2 Silicon synapses
    2.5.3 Communication using AER
    2.5.4 The FPGA board
    2.5.5 Software
3 STADP – a novel Hebbian learning rule
  3.1 STADP – yet another learning rule?
    3.1.1 From spike time to spike rate
  3.2 Characteristics of STADP
4 Design
  4.1 Summary of features of the Synaptic Processing Unit
  4.2 System level design
    4.2.1 The SPU in a neural system
    4.2.2 Input and output ports
  4.3 Virtualising the cascade synapse
  4.4 SPU internal addressing
  4.5 Modular design of the SPU
  4.6 Module specifications
    4.6.1 Forwarding
    4.6.2 Learning rule (STADP)
    4.6.3 Cascade process
    4.6.4 Cascade memory
    4.6.5 Global signals
5 Implementation
  5.1 Pseudo-random number generators
  5.2 Description of generics
  5.3 Module level design
    5.3.1 Spike forwarding
    5.3.2 Learning rule (STADP)
    5.3.3 Cascade synapse
    5.3.4 Cascade memory
    5.3.5 Signal selector
  5.4 System integration
  5.5 Integration into the FPGA board
    5.5.1 On clocks
6 Verification
7 Evaluation & experimentation
  7.1 In-hardware characterisation of STADP
  7.2 Modifications for the experimental setup
  7.3 Circuit calibration
  7.4 In-circuit verification
    7.4.1 Forwarding
    7.4.2 Potentiation
    7.4.3 Depression
  7.5 A real classification task
    7.5.1 From image to pre-synaptic stimuli
    7.5.2 Teaching methods
    7.5.3 Results – normal teaching
    7.5.4 Results – bottom up teaching
    7.5.5 Remarks on the classification experiments
8 Discussion
  8.1 The hardware
  8.2 STADP
  8.3 The classification task
  8.4 Calibration of the neural system
9 Conclusion
  9.1 Refinements
10 References
  10.1.1 Web references
  10.1.2 Datasheets and reference books
11 Appendix I – Supplementary files
12 Appendix II – Verification checklists
  12.1 Module level verification
  12.2 System level verification
13 Appendix III – A journey through the SPU
  13.1 Pre-synaptic spike
  13.2 Post-synaptic spike
14 Appendix IV – Design hierarchy of source files
List of figures

Figure 1: Image output of a silicon retina
Figure 2: Neurons of the world
Figure 3: Action potentials (spikes) are commonly described by three properties
Figure 4: Action potentials of the world
Figure 5: CGI of a synapse with pre- and post-synaptic neurons
Figure 6: Micrograph of a synapse taken at the University of St. Louis
Figure 7: Different forms of synaptic plasticity
Figure 8: Schematic of a cascade model of synaptic plasticity
Figure 9: Initial signal-to-noise ratio as a function of memory lifetime, from [1]
Figure 10: Circuit diagram of an ultra low power Integrate & Fire neuron
Figure 11: Circuit diagram of the so-called Diff-Pair Integrator (DPI) synapse
Figure 12: Prototype FPGA board developed by Daniel Fasnacht
Figure 13: Experimental hardware setup
Figure 14: STADP
Figure 15: The STADP mechanism
Figure 16: Simulated behaviour of STADP
Figure 17: System level interaction of SPU and aVLSI neuron chip
Figure 18: Bit representation of cascade synapses
Figure 19: SPU internal addressing format
Figure 20: Conceptual architecture of the SPU
Figure 21: A hybrid cellular automata linear array
Figure 22: Conventions on the arrows used in block diagrams
Figure 23: Spike forwarding module block diagram
Figure 24: STADP learning rule block diagram
Figure 25: Initialisation of delta_t look-up table
Figure 26: Flow diagram of the cascade synapse's state update rule
Figure 27: Cascade module block diagram
Figure 28: Cascade memory block diagram
Figure 29: Input source selector block diagram
Figure 30: Pipelined SPU block diagram
Figure 31: Pipelined dataflow through the SPU
Figure 32: Block diagram of the integration of the SPU within the FPGA board
Figure 33: Comparison of delta_t_LUT content for 5kHz and 90MHz
Figure 34: Simulated hardware behaviour of STADP at 5kHz simulation clock frequency
Figure 35: Frequency response of the neural system
Figure 36: Oscilloscope screenshot of post-synaptic membrane potential
Figure 37: Example of a coherent 30Hz Poisson spike train to all 256 synapses
Figure 38: Oscilloscope screenshot of post-synaptic membrane potential
Figure 39: In-circuit verification of potentiation
Figure 40: In-circuit verification of depression
Figure 41: Oscilloscope screenshot of decreasing post-synaptic firing rate
Figure 42: Using pictures as pre-synaptic stimuli
Figure 43: Spike trains derived from 16x16 pixel greyscale images of Anthony and Dylan
Figure 44: Conceptual procedure of a real classification task
Figure 45: Classification task: teach Dylan, show Dylan first, at 22Hz
Figure 46: Classification task: teach Dylan, show Anthony first, at 22Hz
Figure 47: Classification task: teach Dylan, show Dylan first, at 25Hz
Figure 48: Classification task: teach Dylan, show Anthony first, at 25Hz
Figure 49: Classification task: teach Anthony, show Anthony first, at 22Hz
Figure 50: Classification task: teach Anthony, show Dylan first, at 22Hz
Figure 51: Classification task: teach Anthony, show Anthony first, at 25Hz
Figure 52: Classification task: teach Anthony, show Dylan first, at 25Hz
Figure 53: Classification task: bottom-up teaching Dylan, at 50Hz
Figure 54: Classification task: bottom-up teaching Dylan, at 70Hz
Figure 55: Classification task: bottom-up teaching Dylan, for 2s at 50Hz
Figure 56: Classification task: bottom-up teaching Anthony, at 50Hz
Figure 57: Classification task: bottom-up teaching Anthony, at 70Hz
Figure 58: Classification task: bottom-up teaching Anthony, for 2s at 50Hz
Figure 59: Expected effects on a synapse
Figure 60: Pre-synaptic spike arrives at SPU
Figure 61: Valid pre-synaptic spike gets forwarded, after two clock delays
Figure 62: Valid pre-synaptic spike generates a plasticity event
Figure 63: Cascade synapse changes in operation
Figure 64: Plasticity events
Figure 65: Valid post-synaptic spike arrives at SPU
Figure 66: Post-synaptic spike does not get forwarded
Figure 67: Post-synaptic spike sets post-synaptic expiry time
1 Introduction

'The brain – that's my second most favourite organ!' – Woody Allen

Solving the mystery behind how the human brain works and computes will be one of the most significant discoveries in the history of science. A profound understanding of our most important organ (bar Woody Allen…) will have significant implications for healthcare, psychology and ethics, as well as for computing, robotics and artificial intelligence.
Visionaries such as Ray Kurzweil go as far as predicting that before the middle of the 21st century, humans and machines will be able to merge in a way never seen before, as brain interfaces enable users to bridge the gap between the real and virtual worlds to a level where the distinction between 'real' and 'not real' might lose its importance. Artificial systems would reach computational powers that matched those of the human brain, only to surpass them a few years later.
Most people find it difficult to imagine such scenarios, especially since even the most powerful computers to date, which can perform billions of operations per second, cannot reproduce some of the computational magic that human brains perform on a day-to-day basis, such as pattern recognition or visual processing. 'Intelligent' and 'interactive' systems are neither intelligent nor interactive; the most advanced robots in the world are no match for a young child when it comes to performing motor tasks or recognition. The thought of ever meeting a machine with intelligence, humour or an opinion goes far beyond what most people think their computers will ever be able to do.
Such future scenarios have been the topic of several books and films, and are portrayed as horror scenarios more often than not, ignoring many of the potential opportunities that such a future could bear. Without attempting to make any qualifying judgements, it should be noted that change happens, whether it is welcome or not.
This change could well be initiated by a small but growing community of engineers and scientists, driven by impressive advances in neuroscience, who are making significant progress in copying neuronal organisation and function into artificial systems. The secret to the human brain's superior abilities appears to reside in how the brain organises its slow-acting electrical and chemical components: neurons, the basic computational units of the brain, and synapses, the interfaces between neurons, which possess rich dynamics that allow neurons to form interconnected neural circuits. Researchers sometimes speak of 'morphing' these structures of neural connections into silicon circuits, creating neuromorphic microchips.
If successful, this work could lead to implantable silicon retinas for the blind, or sound processors for the deaf that last for 30 years on a single nine-volt battery, or to low-cost, highly effective visual, audio or olfactory recognition chips for robots and other smart machines. The long term goal is to engineer ever more complex artificial systems with ever richer behaviour, and ultimately, the construction of an artificial brain.

1.1 What is neuromorphic engineering?

The term neuromorphic was coined by Carver Mead in the late 1980s to describe Very Large Scale Integration (VLSI) systems containing analogue electronic circuits that mimic neuro-biological architectures present in the nervous system. Neuromorphic engineering is a new interdisciplinary field that takes inspiration from biology, physics, mathematics and engineering to design analogue, digital or mixed-mode analogue/digital VLSI artificial neural systems. These include vision systems, head-eye systems, auditory processors and autonomous robots, whose physical architecture and design principles are based on those of biological nervous systems.
Although the field of neuromorphic engineering is still relatively new, impressive and encouraging results have already been achieved, ranging from 'simple' chips with silicon neurons or synapses [13] to more complex systems such as a silicon retina or cochlea [13].

Figure 1: Image output of a silicon retina, showing the head of a person. From the Brains in Silicon Lab at Stanford University.

1.2 The topic of this project

This project focuses on one aspect of neuromorphic systems which is at the heart of some of the dynamics of neural networks, namely on synapses. Fusi et al. have demonstrated how using ordinary bounded synapse models can have devastating effects on memory in scenarios with ongoing modifications, and proposed a new synapse model, the binary cascade synapse [1], which outperforms ordinary (binary) synapse models on several aspects [9].
The nature of the cascade synapse makes it convenient to implement in digital hardware rather than analogue VLSI, and it would be useful to augment existing neuromorphic neuron chips with cascade synapse functionality. Such a neural system could then act as one single entity in a larger multi-chip environment.
Previous efforts have successfully designed individual cascade synapses and implemented a small number – eight, to be precise – of them on an FPGA; however, in order to perform useful computation in a reasonably sized neural system, a massive up-scaling of the number of synapses on one chip is necessary. In order to augment a typical aVLSI neuron chip with cascade synapse functionality, any number upwards of 4000 synapses would be desirable, or rather, necessary.
One way of doing this is to fundamentally change the way cascade synapses are implemented on the FPGA, referred to as virtualisation: rather than having a number of fixed hardware cascade synapses, which is inefficient in terms of logic real estate, an abstraction of each synapse can be stored in memory, and only retrieved, processed and written back on demand. Since memory, unlike logic, is generally cheap and abundant in digital circuits, this Synaptic Processing Unit (SPU) can potentially allow for a very large scale implementation of cascade synapses on one single FPGA.
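To make the virtualisation idea concrete, the following Python sketch illustrates the fetch-update-write-back cycle it implies. This is purely illustrative: the SPU itself is written in VHDL, and every name in this sketch (SynapseMemory, process_event, and so on) is invented for the example.

```python
# Illustrative sketch of synapse virtualisation: each synapse is a small
# state record in RAM rather than a dedicated circuit, so the synapse count
# is bounded by memory, not by logic resources. All names are invented.

NUM_SYNAPSES = 8192  # the prototype SPU hosts up to 8192 cascade synapses

class SynapseMemory:
    def __init__(self, n=NUM_SYNAPSES):
        # one packed record per synapse: (efficacy bit, cascade state)
        self.ram = [(0, 1)] * n

    def process_event(self, address, update_rule, plasticity_signal):
        """Fetch-update-write-back cycle for one addressed synapse."""
        state = self.ram[address]                      # fetch on demand
        state = update_rule(state, plasticity_signal)  # shared update logic
        self.ram[address] = state                      # write back
        return state

if __name__ == "__main__":
    # toy update rule standing in for the cascade update of section 2.3
    toggle = lambda s, sig: (1 - s[0], s[1]) if sig == 'switch' else s
    mem = SynapseMemory()
    print(mem.process_event(1234, toggle, 'switch'))   # -> (1, 1)
```

Because only one copy of the update logic exists and is time-shared across all addresses, adding synapses costs a few bits of RAM each rather than replicated circuitry.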
1.3 Aims

1. To develop a Synaptic Processing Unit based on an FPGA that implements a large number of cascade synapses
2. To integrate the SPU with an aVLSI neuron chip to form a working neural system
3. To demonstrate the capabilities of the neural system by performing a real classification task

1.4 Further report structure

This report is written for the scientifically and technically minded reader with background knowledge of the concepts of electronic engineering, and is further structured as follows:

2. Background
This chapter briefs the reader on the necessary interdisciplinary background knowledge required for this project. In particular, it outlines some of the relevant biology and neuroscience, explains the binary cascade model in more detail and describes the hardware and infrastructure environment the SPU will be working in.

3. STADP – a novel Hebbian learning rule
This chapter argues the case for developing a new learning rule called STADP, and describes how it works. It also presents an initial characterisation of the learning rule derived from simulation.

4. Design
This chapter starts by providing a summary of the features of the SPU, to allow the reader to get a first impression. Then, it outlines the high level design and argues for the system architecture used. It finishes by giving a set of specifications for a modular implementation of the design.

5. Implementation
This chapter starts by going off on a tangent, diving into the realm of random number generators. Then, it describes how the specifications given in the previous chapter were implemented in each module, and how the SPU integrates within the FPGA and its environment.

6. Verification
This chapter is a very short one, which only portrays the efforts undertaken to verify the design and implementation. It does not reproduce the verification efforts themselves.

7. Evaluation & experimentation
This is one of the key chapters and describes all the in-circuit verification and experimentation that has been carried out. Furthermore, it explains the real classification task given to the neural system, and presents the results.

8. Discussion
This chapter discusses the evaluation and experimentation results, and tries to make general statements about the operation of the SPU, and conclusions about the success of the classification task itself.

9. Conclusion
This chapter wraps up the report, and includes the conclusions derived from the work presented here. It objectively assesses advantages and disadvantages of the SPU, and suggests further improvements or changes to the system that might be worthwhile.

10. References
This chapter lists the sources that have been referred to while writing the report, as well as sources that have been used throughout the design and implementation of the SPU.

11. Appendices
There are four appendices: Appendix I with a list of supplementary Matlab files used throughout the project, Appendix II with a copy of the checklists used for verification, Appendix III with screenshots of waveforms showing the journey of a pre- and a post-synaptic spike through the SPU, and finally Appendix IV, listing the design hierarchy of the VHDL source files used.
2 Background

'If the human brain were so simple that we could understand it, we would be so simple that we couldn't' – Emerson M. Pugh

2.1 Of brains, neurons and synapses

When IBM's Deep Blue supercomputer beat then world chess champion Garry Kasparov during their rematch in 1997, it did so by means of sheer brute force and computational power. The machine evaluated some 200 million potential board moves a second, whereas Kasparov considered only three each second, at most [10.1.1]. But despite Deep Blue's victory (in fact, Kasparov won the first match against Deep Blue the year earlier, and IBM refused to agree to a third 'deciding' match [21]), computers are no real competition for the human brain in areas such as vision, hearing, pattern recognition and learning, not to mention their inability to display creativity, humour or emotions. And when it comes to operational efficiency, there is no contest at all. A typical room-size supercomputer weighs roughly 1,000 times more, occupies 10,000 times more space and consumes a millionfold more power than does the neural tissue that makes up the brain [22].
Clearly, computers and brains are fundamentally different, both in terms of architecture and performance. Table 1 summarises important key differences between brains and (conventional) computers.

          | Processing elements             | Element size | Energy use | Speed    | Style of computation                               | Fault tolerant
Brain     | ~10^11 neurons, ~10^14 synapses | 10^-6 m      | 30 W       | 100 Hz   | Parallel, distributed, memory at computation       | Yes
PC (CPU)  | 10^9 transistors                | 10^-6 m      | 30 W       | 10^9+ Hz | Serial, centralised, memory distant to computation | No

Table 1: A comparison between computers and brains

At the most basic cellular level, brains consist of a vast number of brain cells, an estimated 100 billion of them, called neurons. These are also believed to constitute the basic building blocks of computation within the central nervous system, and are in many ways analogous to logic gates in digital electronics. The brain's network of neurons forms a massively parallel information processing system. While there are a large number of different types of neurons, each with different functions and morphologies, most neurons are typically composed of a soma, or cell body, a dendritic tree and an axon, as shown in Figure 2.

Figure 2: Neurons of the world. There are many different types of neurons, each with different morphologies and functions, which are found in different parts of brains. Image courtesy of G. Indiveri.

One of the most important properties of a neuron is its membrane potential, the potential difference across the cell membrane, which is used to communicate between neurons. A complicated molecular mechanism that stems from the cell's highly complex membrane can give rise to so-called action potentials or spikes, which are a sharp increase followed by an equally sharp drop in the membrane potential within a few ms. A neuron receives inputs, i.e. spikes, from other neurons, typically many thousands, on its dendritic tree, and integrates them (approximately) on its membrane potential. Once the membrane potential exceeds a certain threshold, the neuron generates a spike which travels from the body down the axon, commonly described as the output of a neuron, to the next neuron(s) (or other receptors). This spiking event is also called depolarisation, and is followed by a refractory period, during which the neuron is unable to fire. The membrane potential of a spiking neuron is shown conceptually in Figure 3, while Figure 4 shows some measurements of real action potentials. Typically, neurons fire at rates between 0Hz and about 100Hz, and both the precise timing of individual spikes and the firing rates of neurons are believed to play an important role in neural communication and computation.

Figure 3: Action potentials (spikes) are commonly described by three properties: pulse width, firing rate or inter-spike interval, and refractory period. Courtesy of Giacomo Indiveri.

Figure 4: Action potentials of the world. Courtesy of Giacomo Indiveri, modified by Anthony Hsiao.

The axon endings of neurons almost touch the dendrites or cell body of the next neuron. The gap between two neurons is a specialised structure called a synapse, which is the point of transmission of spikes from the pre-synaptic neuron to the post-synaptic neuron, as shown in Figure 5 and Figure 6. This transmission is effected by neurotransmitters, chemicals which are released from the pre-synaptic neuron upon depolarisation and bind to receptors in the post-synaptic neuron, thereby advancing its depolarisation. Most synapses are excitatory, i.e. they increase the depolarisation of the post-synaptic neuron, although there are so-called inhibitory synapses (with inhibitory neurotransmitters), which render a post-synaptic neuron less excitable. The human brain is estimated to have a vast 10^14 synapses.
The extent to which a spike from one neuron is transmitted on to the next, the synaptic efficacy or weight, depends on many factors, such as the amount of neurotransmitter available or the number and arrangement of receptors, and is not constant, but changes over time. This property is called synaptic plasticity, and it is this variable synaptic strength that is believed to give rise to both memory and learning capabilities, which makes it particularly interesting to study synapses!

Figure 5: CGI of a synapse with pre- and post-synaptic neurons. Excerpt of the 2005 winner of the Science and Engineering Visualisation Challenge. By G. Johnson, Medical Media, Boulder, CO.

Figure 6: Micrograph of a synapse taken at the University of St. Louis. In the centre of the image is the synaptic cleft, which separates the pre- (top) and post-synaptic neuron (bottom). The pre-synaptic neuron has clearly visible vesicles which contain neurotransmitters. Upon pre-synaptic depolarisation, these neurotransmitters are released and diffuse across the synaptic cleft, to be received by receptors on the post-synaptic neuron, advancing its depolarisation.

Scientists have developed various models of the underlying molecular mechanisms of synaptic plasticity, describing it to good levels of accuracy; however, it is important to appreciate that there are details to synaptic plasticity which are still the subject of ongoing research.

2.2 Synaptic plasticity at the heart of learning in neural systems

There are several underlying mechanisms that cooperate to achieve synaptic plasticity, including changes in the quantity of neurotransmitter released into a synapse and changes in how effectively cells respond to those neurotransmitters [7]. As memories are believed to be represented by vastly interconnected networks of synapses in the brain, synaptic plasticity is one of the important neuro-chemical foundations of learning and memory. Thereby, the strengthening, Long-Term Potentiation (LTP), and weakening, Long-Term Depression (LTD), of a synapse are widely considered to be the major mechanisms by which learning happens and memories are stored in the brain.
Many models of learning assume some kind of activity based plasticity, whereby an increase in synaptic efficacy arises from the pre-synaptic cell's repeated and persistent stimulation of the post-synaptic cell. These kinds of learning rules are commonly referred to as Hebbian learning rules, popularly summarised as 'what fires together, wires together'.
Another particularly prominent experimentally observed form of long term plasticity is called Spike-Timing Dependent Plasticity (STDP), and depends on the relative timing of pre- and post-synaptic action potentials. If a pre-synaptic spike is succeeded quickly by a post-synaptic spike, then there appears to exist some kind of causality, since the pre-synaptic neuron has contributed to the depolarisation of the post-synaptic neuron, and they should be connected more strongly, by potentiating the synapse. Conversely, if a pre-synaptic spike is directly preceded by a post-synaptic spike, their connection should be weakened, and the synapse gets depressed. Different forms of observed plasticity that can be described by STDP are shown in Figure 7.

Figure 7: Different forms of synaptic plasticity. The amount (qualitatively) and type of synaptic modification evoked by repeated pairing of pre- and post-synaptic action potentials in different preparations. The horizontal axis is the difference t_pre - t_post of these spike times. Results are shown for slice recordings of different neurons. Without going into unnecessary detail, the important point to note is that different forms of plasticity exist. Figure from Abbott & Nelson 2000.

Several other models of synaptic plasticity exist, ranging over several levels of complexity and biological plausibility. Each has its advantages and disadvantages, proposing different mechanisms of synaptic plasticity and trying to explain different types of experimentally observed plasticity. Other global regulatory processes of learning, such as synaptic scaling or synaptic redistribution, are thought to be necessary alongside activity based learning rules [5].
While learning rules and models of synaptic plasticity attempt to describe the mechanism by which synaptic plasticity is generated, different models of synapses themselves exist, which can vary greatly in the way they respond to 'plasticity signals'.
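For concreteness, the following is a minimal sketch of the classic pair-based STDP window, in the spirit of the Abbott & Nelson figure above. The exponential window shape is the standard textbook formulation, the parameter values are invented, and this is not the learning rule used in this project (see chapter 3).

```python
import math

# Classic pair-based STDP sketch; parameter values are illustrative only.
A_PLUS, A_MINUS = 0.05, 0.055     # maximum weight change per spike pair
TAU_PLUS, TAU_MINUS = 20.0, 20.0  # time constants of the learning window (ms)

def stdp_dw(dt_ms):
    """Weight change for one spike pair; dt_ms = t_post - t_pre."""
    if dt_ms > 0:    # pre before post: causal pairing, potentiate
        return A_PLUS * math.exp(-dt_ms / TAU_PLUS)
    elif dt_ms < 0:  # post before pre: anti-causal pairing, depress
        return -A_MINUS * math.exp(dt_ms / TAU_MINUS)
    return 0.0

if __name__ == "__main__":
    for dt in (-40, -10, -1, 1, 10, 40):
        print(f"dt = {dt:+3d} ms -> dw = {stdp_dw(dt):+.4f}")
```

Note that this rule produces graded weight changes, whereas the cascade synapse described next expects binary 'potentiate'/'depress' signals; this mismatch is part of the motivation for STADP in chapter 3.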
2.3 The cascade synapse model

Storing memories of ongoing, everyday experiences requires a high degree of synaptic plasticity, while retaining these memories demands protection against changes induced by further activity and experiences. Models in which memories are stored through switch-like transitions in synaptic efficacy are good at storing but bad at retaining memories if these transitions are likely, and they are poor at storage but good at retention if they are unlikely [1]. In order to address this dilemma, Fusi et al. developed the model of binary cascade synapses, which combines high levels of memory storage with long retention times and significantly outperforms conventional models [9].
They consider the case of binary synapses, i.e. synapses with only two efficacies (for example potentiated and depressed, weak or strong), which is not implausible, since biological synapses have been reported to display binary states of efficacy as well [2]. The structure of a binary cascade model is shown in Figure 8, specifying two independent dimensions for each synapse. Just like ordinary models of binary synapses, a binary cascade synapse can be in one of two states of efficacy, weak or strong, but while ordinary models only allow one fixed value of plasticity, cascade synapses possess a cascade of n states with varying degrees of plasticity, implementing metaplasticity (i.e. the plasticity of plasticity). Ongoing plasticity then corresponds to transitions of a synapse between states characterised by different degrees of plasticity, rather than (only) different synaptic strengths.

Figure 8: Schematic of a cascade model of synaptic plasticity. Courtesy of Stefano Fusi. There are two levels of synaptic strength, weak (yellow) and strong (blue), denoted by + and -. Associated with these strengths is a cascade of n states (n = 5 in this case). Transitions between state i of the cascade of either strength and state 1 of the opposite strength take place with probability q_i, corresponding to conventional synaptic plasticity. Transitions with probabilities p_i± link the states within the respective cascade (downward arrows), corresponding to metaplasticity.

Binary cascade synapses can respond to any learning rule with binary plasticity signals, i.e. signals that are either 'potentiate' or 'depress', and respond to them stochastically; plasticity signals are only responded to with a given probability, which is determined by the state along the cascade the synapse is in. So it is the varying probability of responding to plasticity signals that implements the different degrees of plasticity described above. In the highest state (state 1 of the cascade in Figure 8), the probability of responding to a plasticity event is 1, and it decreases for states further down the cascades, where the synapse becomes less plastic. In the model analysed by Fusi, the plasticity actually halves for every state down the cascade, i.e. a 50% chance of responding to a plasticity signal in the second cascade state, 25% in the third, and so forth.
A cascade synapse can respond to plasticity events in two ways, depending on whether it already has the 'right' efficacy, referred to as switching and chaining. If it switches, then it is changing efficacy, i.e. from weak to strong, or vice versa. If a synapse switches, it will always make a transition to state 1, i.e. the most plastic state, of the opposite cascade, regardless of what state it was in before. In Figure 8, these transitions are represented by the arrows between the two cascades, with plasticity probabilities given by q_i. If the synapse chains, i.e. it already has the right efficacy, then it moves down one state in the cascade, thereby reducing (halving) its plasticity probability and becoming less plastic. In Figure 8, this is represented by the downward arrows connecting consecutive states within each cascade, with plasticity probabilities given by p_i±.
Thus, cascade synapses can respond to ongoing modifications by reducing their plasticity, thereby 'reassuring' their state of efficacy. Another way of looking at it is that synaptic efficacies and their degree of plasticity are dependent on the history of the synapses and the plasticity signals they received.
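The update rule just described is compact enough to capture in a few lines. The following Python sketch is an illustrative rendering of it; the project's actual implementation is in VHDL (section 5.3.3), and assigning response probability 2^-(i-1) to state i is this sketch's reading of the halving scheme described above.

```python
import random

N_STATES = 5  # cascade depth n (Figure 8 uses n = 5)

def plasticity_prob(state):
    # state 1 responds with probability 1, state 2 with 1/2, state 3 with
    # 1/4, ... (the halving scheme described in the text)
    return 2.0 ** -(state - 1)

def cascade_update(efficacy, state, signal):
    """One plasticity event for a binary cascade synapse.

    efficacy: 'weak' or 'strong'; state: 1 (most plastic) .. N_STATES;
    signal: 'potentiate' or 'depress'. Returns the new (efficacy, state).
    """
    wants_strong = (signal == 'potentiate')
    has_right_efficacy = (efficacy == 'strong') == wants_strong
    if random.random() >= plasticity_prob(state):
        return efficacy, state                     # event ignored, stay put
    if has_right_efficacy:
        return efficacy, min(state + 1, N_STATES)  # chain: move down cascade
    return ('strong' if wants_strong else 'weak'), 1  # switch to opposite state 1

if __name__ == "__main__":
    eff, st = 'weak', 1
    for sig in ['potentiate'] * 4 + ['depress'] * 2:
        eff, st = cascade_update(eff, st, sig)
        print(f"{sig:10s} -> {eff:6s} state {st}")
```

Running the example shows the two behaviours directly: repeated 'potentiate' signals first switch the synapse to strong and then chain it down the strong cascade, making subsequent 'depress' signals progressively less likely to take effect.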
Fusi et al. assess the performance of cascade synapses against that of ordinary binary synapses by comparing the strength of an initial memory trace, the initial signal-to-noise ratio, as well as the average memory lifetime, the point at which this signal-to-noise ratio becomes equal to 1, for both synapse models (it is worthwhile to reiterate that it was this trade-off, the ability to store memories easily vs. retaining them for a long time, that originally led them to develop the cascade synapse model in the first place). They find that cascade models arrive at a better compromise, storing new memories more easily and faithfully, yet retaining them for a longer period of time, as shown in Figure 9. Without going into unnecessary detail (the interested reader is advised to consult [1] for more information), they find that the better performance of cascade synapses stems from the fact that they experience power-law forgetting, unlike ordinary binary synapses, which experience exponentially fast decay of their memories.

Figure 9: Initial signal-to-noise ratio as a function of memory lifetime, from [1]. The initial signal-to-noise ratio of a memory trace stored using 10^5 synapses, plotted against the memory lifetime (in units of 1 over the rate of candidate plasticity events). The blue (lower) curve is for a binary model with synaptic modification occurring with probability q that varies along the curve. The red (upper) line applies to the cascade model described by Fusi et al. The two curves have been normalised so that the binary model with q = 1 gives the same result as the n = 1 cascade model, to which it is identical. Clearly, the cascade model performs better than the 'normal' binary model both in terms of initial signal-to-noise ratio and memory lifetime.

In summary, binary cascade synapses outperform their ordinary counterparts in terms of memory storage and retention, which derives from their more complex structure allowing the synapse to respond to ongoing modifications along two dimensions – efficacy and metaplasticity. It is desirable to implement these nice properties in real hardware, and previous attempts have already laid good groundwork for that.

2.4 Previous work

This project mainly builds on two previous projects. The first one, titled 'A stochastic synapse for reconfigurable hardware', a short project during the Telluride workshop for Neuromorphic Engineering by Dylan Muir [15], laid the groundwork for both the following and this project. In particular, it succeeded in creating a first VHDL implementation of the cascade synapse and verified its operation in simulations. One of the biggest contributions of this project is the design of one particular type of pseudo-random number generator, the hybrid cellular automata array pseudo-random number generator, which also found extensive use in this current project (a brief illustrative sketch of such a generator follows at the end of this section). However, no actual hardware was synthesised from the digital design.
The second project, 'A VHDL implementation of the Cascade Synapse Model', a diploma project by Tobias Kringe [16], succeeded in designing and implementing a small array of cascade synapses on an FPGA. The operation of the digital cascade synapses was verified both in simulation and in hardware, and encouraging results were achieved in confirming the complex behaviour of the cascade synapse (which is why this current project will not focus on reproducing and re-verifying the properties of hardware implemented cascade synapses). However, the VHDL implementation was rather large, and only a small number of synapses could be implemented on the FPGA. It was Tobias Kringe who proposed to virtualise the cascade synapses (which is one of the aims of this current project) in order to realise a useful number of synapses on one FPGA. Due to the radically different architecture of the virtualised synapses compared to the static hardware synapses, next to none of his VHDL implementation was reused.
To the best of the knowledge of the author, there has been no other working hardware implementation of a large number of cascade synapses (in fact, of any number of synapses) to date.
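Since the hybrid cellular automata (HCA) array generator reappears in chapter 5, a brief sketch may help. In an HCA array, each one-bit cell is updated by either cellular automaton rule 90 (the XOR of its two neighbours) or rule 150 (the XOR of both neighbours and the cell itself), with null boundaries; suitable per-cell rule assignments, tabulated in the literature, yield maximal-length sequences. The rule vector and seed below are placeholders, not those used in the SPU.

```python
# Illustrative sketch of a hybrid cellular automata (rule 90/150) PRNG.
# Rule 90:  s_i' = s_{i-1} XOR s_{i+1}
# Rule 150: s_i' = s_{i-1} XOR s_i XOR s_{i+1}
# Null boundaries: cells beyond the array edges read as 0.

def hca_step(state, rules):
    """Advance the cellular automaton one step; state and rules are lists."""
    n = len(state)
    nxt = []
    for i in range(n):
        left = state[i - 1] if i > 0 else 0
        right = state[i + 1] if i < n - 1 else 0
        centre = state[i] if rules[i] == 150 else 0  # rule 150 includes itself
        nxt.append(left ^ centre ^ right)
    return nxt

if __name__ == "__main__":
    rules = [90, 150, 90, 90, 150, 90, 150, 90]  # placeholder rule assignment
    state = [0, 0, 0, 1, 0, 0, 0, 0]             # any non-zero seed
    for _ in range(5):
        state = hca_step(state, rules)
        print(''.join(map(str, state)))
```

In hardware, each cell is a flip-flop plus two or three XOR gates, which is why this style of generator is attractive on an FPGA.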
2.5 Overview of the hardware environment

Neuromorphic aVLSI hardware commonly comprises low power analogue CMOS circuits operating in the subthreshold regime that mimic (morph) the properties of real neural systems and elements. In particular, a neuromorphic aVLSI neuron chip was used, which comprised an array of leaky Integrate & Fire (I&F) silicon neurons with Diff-Pair Integrator (DPI) synapses. Communication with the outside world is done using the asynchronous Address Event Representation (AER) protocol. The FPGA sits on an FPGA board developed at the Institute of Neuroinformatics in Zurich.

2.5.1 Silicon neurons

There are different types of silicon neurons, such as conductance based models, which aim to map the molecular conductance mechanisms underlying neuron behaviour in detail into analogue electronic circuits, or more qualitative models such as the I&F neuron model, which merely implements the observed characteristics of neuron behaviour in silicon, such as integration, firing or the refractory period. The aVLSI chip used in this project contained 128 I&F neurons similar to the circuit depicted in Figure 10. Qualitatively, this I&F circuit works by integrating input current from on-chip synapses on its membrane, and elicits a (voltage) spike if the membrane voltage crosses a firing threshold.

Figure 10: Circuit diagram of an ultra low power Integrate & Fire neuron. Labelled functional circuit elements mimic the behaviour of real neurons. Transistors operate in the sub-threshold regime to exploit their desirable exponential characteristics. A capacitor C_mem integrates incoming post-synaptic current into a membrane voltage V_mem. If the membrane potential crosses the spiking threshold, it will 'spike' just like a real neuron. Courtesy of Giacomo Indiveri.
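For readers unfamiliar with I&F dynamics, the following minimal sketch simulates a leaky integrate-and-fire neuron in discrete time. All parameter values are illustrative and are not those of the aVLSI chip.

```python
# Minimal leaky integrate-and-fire neuron, simulated in discrete time.
# Parameter values are illustrative only (not those of the aVLSI chip).

DT = 0.1          # time step (ms)
TAU_MEM = 20.0    # membrane leak time constant (ms)
V_THRESH = 1.0    # firing threshold (arbitrary units)
V_RESET = 0.0     # reset potential after a spike
T_REFRACT = 2.0   # refractory period (ms)

def simulate(input_current, t_end=100.0):
    """Integrate a constant input current; return the spike times (ms)."""
    v, t, refractory_until, spikes = V_RESET, 0.0, -1.0, []
    while t < t_end:
        if t >= refractory_until:
            # leaky integration: dv/dt = (-v + I) / tau
            v += DT * (-v + input_current) / TAU_MEM
            if v >= V_THRESH:            # threshold crossing -> spike
                spikes.append(t)
                v = V_RESET              # reset the membrane
                refractory_until = t + T_REFRACT
        t += DT
    return spikes

if __name__ == "__main__":
    spikes = simulate(input_current=1.5)
    print(f"{len(spikes)} spikes in 100 ms -> ~{10 * len(spikes)} Hz")
```

The same three ingredients, leaky integration, a threshold, and a refractory period, are exactly what the silicon circuit of Figure 10 realises with a capacitor and subthreshold transistors.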
2.5.2 Silicon synapses

Each I&F neuron has 32 silicon synapses with different properties and behaviour connected to it, but only one type of synapse was used in this project, namely the static DPI synapse. The circuit of such a synapse is depicted in Figure 11. Qualitatively, the DPI synapse works by receiving a (voltage) spike from a pre-synaptic neuron (or from the outside world), and in response injecting a given amount of current onto the membrane of the post-synaptic neuron it is connected to. The amount of current produced by every incoming spike depends on the static synaptic weight and the time constant of the synapse, which can be adjusted to achieve the desired static synaptic weight.

Figure 11: Circuit diagram of the so-called Diff-Pair Integrator (DPI) synapse. For every pre-synaptic spike it receives, it dumps a post-synaptic current onto the membrane of the post-synaptic neuron connected to it. The amount of current, and other dynamics, can be set by parameters such as the synaptic weight, the time constant tau or the threshold voltage.
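Behaviourally, such a static synapse can be roughly approximated as a first-order low-pass filter of the pre-synaptic spike train. The sketch below shows only that approximation, with invented parameter values; the actual differential-pair circuit dynamics are richer than this.

```python
import math

# First-order behavioural approximation of a static synapse: each pre-synaptic
# spike adds a weight-sized jump to the synaptic current, which then decays
# exponentially with time constant tau. Parameter values are invented.

DT = 0.1        # time step (ms)
TAU_SYN = 10.0  # synaptic time constant (ms)
WEIGHT = 0.8    # static synaptic weight (arbitrary units)

def synaptic_current(spike_times, t_end=60.0):
    """Return (time, current) samples for a list of pre-synaptic spike times."""
    decay = math.exp(-DT / TAU_SYN)
    i_syn, t, trace, spikes = 0.0, 0.0, [], sorted(spike_times)
    while t < t_end:
        while spikes and spikes[0] <= t:
            spikes.pop(0)
            i_syn += WEIGHT      # spike arrives: inject charge
        trace.append((t, i_syn))
        i_syn *= decay           # exponential decay between spikes
        t += DT
    return trace

if __name__ == "__main__":
    trace = synaptic_current([5.0, 10.0, 12.0, 30.0])
    t_peak, i_peak = max(trace, key=lambda p: p[1])
    print(f"peak current {i_peak:.2f} at t = {t_peak:.1f} ms")
```

The closely spaced spikes at 10 ms and 12 ms in the example sum on the decaying current, which is the temporal-integration behaviour the post-synaptic membrane then sees.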
2.5.3 Communication using AER

The Address Event Representation (AER) protocol is used to allow for communication in multi-chip environments. It is a serial asynchronous four-phase handshaking protocol (using request-acknowledge signals) which encodes events (i.e. spikes) of individual neurons by assigning each neuron a unique address (up to 16 bits). Every time a neuron fires, it generates an address event, which is then transmitted over the AER bus to the receiving hardware.
Unlike conventional electronic systems with arrays of information sources, such as digital cameras, neuromorphic systems using the AER protocol do not scan through every one of their elements to transmit one frame after another; rather, information is transmitted on demand. Only if a neuron spikes will an address event be transmitted. Therein lies one of the most important points about the AER protocol: its asynchrony, whereby the precise timing of the address event implicitly encodes the time of the spike itself – there is no need to communicate timestamps for individual spikes. Conveniently, since the electronic circuits implementing neuromorphic hardware are very fast, while neural activity is rather slow (<100Hz), a large number of neurons can share the same AER bus without problem. Typically, an AER bus has a bandwidth of about 1 Mevent/second.

2.5.4 The FPGA board

The FPGA used in this project is a Xilinx Spartan 3 (xc3s400pq208) that sits on a prototype FPGA board developed by Daniel Fasnacht during his diploma project at the Institute of Neuroinformatics in Zurich, depicted in Figure 12. The features used in this project are the USB interface and the two AER ports (one input, one output). The board has an external clock of 106.125MHz, and is programmed using JTAG.
Apart from developing the board itself, Daniel Fasnacht also developed a Linux driver to allow communication with the USB board. A program developed by Giacomo Indiveri is used to send data to the FPGA board. In particular, pre-synaptic spikes are sent through the USB bus to the SPU by specifying a synapse address and an inter-spike interval to the previous spike, data which is easily generated using the spiking neuron toolbox (developed by Dylan Muir at the Institute of Neuroinformatics) in Matlab. The aVLSI neuron chip is configured using Matlab (chipinit.m sets up the environment variables for the aVLSI chip; bias_050607.m loads the required calibration settings onto the chip).
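As a concrete illustration of this (synapse address, inter-spike interval) stimulus format, the sketch below generates Poisson pre-synaptic spike trains and serialises them into that form. It is plain Python standing in for the Matlab toolbox actually used, and it does not reproduce the byte-level packing of the real USB protocol.

```python
import random

# Generate Poisson spike trains for a few synapse addresses and serialise
# them as (address, inter-spike interval) pairs, the format in which
# pre-synaptic stimuli are sent to the SPU over USB.

def poisson_train(rate_hz, duration_s):
    """Spike times (s) of a Poisson process: exponential inter-spike intervals."""
    t, times = 0.0, []
    while True:
        t += random.expovariate(rate_hz)
        if t >= duration_s:
            return times
        times.append(t)

def to_address_isi_stream(trains):
    """Merge {address: [spike times]} into a time-sorted (address, ISI) list."""
    events = sorted((t, addr) for addr, ts in trains.items() for t in ts)
    stream, last_t = [], 0.0
    for t, addr in events:
        stream.append((addr, t - last_t))  # ISI relative to the previous event
        last_t = t
    return stream

if __name__ == "__main__":
    trains = {addr: poisson_train(30.0, 0.2) for addr in range(4)}
    for addr, isi in to_address_isi_stream(trains)[:10]:
        print(f"synapse {addr:3d}  isi = {isi * 1000:6.2f} ms")
```

Note how the inter-spike intervals are taken relative to the previous event on the whole bus, not per synapse, mirroring the serialised, on-demand nature of the AER-style event stream described above.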
It should be noted that this is a prototype board, and with experimental or prototype hardware, extra consideration should be taken, since not all functions necessarily work as expected. However, seeing experimental hardware work and become 'alive' is one of the most gratifying moments of hardware development.
In the experimental setup used for the classification task (as described in section 7.5, A real classification task), the FPGA board interfaces with an aVLSI 'IFSLTWA' neuron chip, using the AER connections to send address events to, and receive feedback from, the neurons. Figure 13 illustrates this experimental setup.

Figure 12: Prototype FPGA board developed by Daniel Fasnacht. 1. Xilinx Spartan 3 (xc3s400pq208) 2. USB port 3. AER-out port 4. AER-in port

Figure 13: Experimental hardware setup. 1. FPGA SPU 2. Forward AER connection 3. aVLSI chip with array of I&F neurons 4. Oscilloscope measuring the post-synaptic membrane potential 5. Post-synaptic feedback AER connection (with logic analyser) 6. Pre-synaptic stimuli input USB connection.

2.5.5 Software

Throughout this project, three software packages were used, namely Xilinx ISE 9.1i Webpack to code the VHDL design, Modelsim PE Student Edition to simulate VHDL code, and Matlab for various tasks, including plotting, initialisation file generation, analysis and spike train generation. A project diary was kept on Google Documents.
3 STADP – a novel Hebbian learning rule

'The illiterate of the 21st century will not be those who cannot read and write, but those who cannot learn, unlearn, and relearn' – Alvin Toffler

In the previous section, the general concept of synaptic plasticity was introduced. While different learning rules have been proposed, for the task at hand, keeping in mind that the Synaptic Processing Unit is to be tested on a real classification task, it is necessary to implement a learning rule that is both suitable for the learning task in a general environment and easily implemented in digital hardware.
There are several learning rules that would be interesting to implement, most prominently STDP, amongst others [18], [3], [20], but none really meets the needs of this project. From [19] and [20], it was concluded that ordinary STDP would not be sufficient as a general learning rule. Instead, the system would either have to be taught with specifically crafted and highly correlated temporal patterns (not a general environment), or a more elaborate version of STDP would have to be constructed, which is impractical for the implementation, both in terms of hardware real estate (memory in particular, but also logic) and circuit complexity. Prototype designs for STDP were rejected on the basis of requiring excessive memory and overcomplicating the digital circuit.
Instead, a novel but very simple, easily implemented learning rule was developed together with [20], called Spike-Timing and Activity Dependent Plasticity (STADP), which produces simple binary plasticity events, depress and potentiate, as required by the binary cascade synapse model.

3.1 STADP – Yet another learning rule?

At the heart of STADP is the same Hebbian learning paradigm, that 'what fires together, wires together'. Unlike STDP, which derives the causality for 'firing together' from the difference in spike times, STADP uses a mixture of firing-time and firing-rate based measures to determine whether the pre- and post-synaptic neurons 'fire together'. As the name suggests, STADP produces plasticity signals depending on spike timing as well as activity. In particular, it is dependent on the state of activity of the post-synaptic neuron, and the timing of pre-synaptic spikes.
STADP says that the post-synaptic neuron can be in one of two states at any point in time: active and inactive. This state is determined by a threshold function of the post-synaptic firing frequency: if it is above a mean firing rate fm, the neuron is said to be active, otherwise it is inactive. For example, a setup of aVLSI I&F neurons could have a mean firing rate fm = 50Hz, which is biologically plausible, and be said to be active for firing rates above 50Hz, and inactive for firing rates below 50Hz.
Then, two neurons are said to 'fire together' if a pre-synaptic spike arrives while the post-synaptic neuron is active, and the synapse should be potentiated (LTP). The reverse is also true, i.e. when a pre-synaptic spike arrives at the synapse while the post-synaptic neuron is inactive, the synapse should be depressed (LTD). However, this scheme would result in one plasticity signal for every pre-synaptic spike, so in order to condition the number of plasticity signals produced, STADP is stochastic, and only produces potentiation or depression signals with a certain probability, called the probability of plasticity, p(plasticity).
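The decision logic itself is tiny; the following illustrative Python mirrors the rule exactly as just described (the SPU implements it in VHDL, section 5.3.2). Treating p(plasticity) as a single free parameter, and the value 0.25 used here, are assumptions of this sketch.

```python
import random

P_PLASTICITY = 0.25  # probability of plasticity; a free parameter here

def stadp_event(post_neuron_active):
    """STADP decision on arrival of one pre-synaptic spike.

    Returns 'potentiate' if the post-synaptic neuron is currently active,
    'depress' if it is inactive, or None (no plasticity event): either
    outcome is only elicited with probability p(plasticity).
    """
    if random.random() >= P_PLASTICITY:
        return None  # most pre-synaptic spikes trigger no plasticity event
    return 'potentiate' if post_neuron_active else 'depress'

if __name__ == "__main__":
    # 1000 pre-synaptic spikes onto an active post-synaptic neuron:
    events = [stadp_event(post_neuron_active=True) for _ in range(1000)]
    print(events.count('potentiate'), "potentiation events out of 1000 spikes")
```

The binary 'potentiate'/'depress' output is exactly the plasticity-signal format the cascade synapse of section 2.3 consumes, which is what makes the two fit together so naturally.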
Figure 14: STADP. Plasticity events are elicited with a probability p(plasticity), and depend on the spike time of the pre-synaptic neuron and the activity of the post-synaptic neuron.

3.1.1 From spike time to spike rate

The two-state abstraction of the post-synaptic neuron’s activity essentially requires an integration of its spike times to produce spike rates. However, integrating spikes arriving at irregular intervals into spike rates can be a non-trivial task in real-time processing in digital hardware (it would be very easy in analogue electronics, actually!). In STADP, this is elegantly performed using a stochastic process, inspired by quantum physics [20]. The main idea behind this is that the post-synaptic neuron is in an unknown state of activity until it gets ‘measured’, in this case by an incoming pre-synaptic spike.

Every time the post-synaptic neuron spikes, its state of activity is set to active, independent of the current state. A neuron in the active state can then make a transition to the inactive state with a probability p(deactivate) (this can also be regarded as a two-state hidden Markov process), as depicted in Figure 15. Without specifying what p(deactivate) is at any point in time, it can be appreciated how a post-synaptic neuron firing at the mean firing rate fm should have a probability of being in the active state, p(active), of 0.5; a more active neuron should have a higher p(active), and a less active neuron a lower p(active).
Figure 15: The STADP mechanism. A post-synaptic neuron can be in one of two states: active and inactive. The STADP mechanism determines the state of the post-synaptic neuron by integrating the post-synaptic firing times. A post-synaptic spike sets the neuron to the active state, which then stochastically resets to the inactive state after an amount of time equal, on average, to the mean post-synaptic inter-spike interval. Clearly, the probability that the post-synaptic neuron is in the active state at any given time increases as its firing rate increases, and is 0.5 if it is firing at the mean firing rate.

In order to implement this in real hardware (it would be rather challenging to actually instantiate some kind of quantum process), the STADP mechanism proposed here uses an abstraction of the stochastic deactivation of the post-synaptic neuron. This abstraction is based on the assumption that the neuron fires as a Poisson process with mean firing rate fm, which has exponentially distributed inter-spike intervals (the time interval between two consecutive spikes) ~ exp(1/fm). Then, upon every incoming post-synaptic spike (which sets the neuron’s state to active), an exponentially distributed ‘expiry time’ is drawn, after which the neuron is said to reset to the inactive state. This way, the desired properties can be achieved: if the post-synaptic neuron is firing at the mean firing rate fm, it has an equal chance of being in the active or inactive state, on average, at any point in time. Similarly, if it is firing at a higher rate, it has a higher chance of being active, since it is being set to active faster than it is expiring to inactive, while if it is firing at a lower rate, it has a lower chance of being active at any point in time.
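To make the drawing of expiry times concrete: the report does not spell out how the exponential variates are generated, but a standard method (inverse transform sampling, assumed here purely for illustration) maps a uniform variate U to an exponential interval with mean 1/fm:

    \Delta t = -\frac{1}{f_m} \ln U, \qquad t_{expiry} = t_{spike} + \Delta t, \qquad U \sim \mathrm{Uniform}(0, 1]

so that \Delta t \sim \exp(1/f_m), and the neuron is considered active at time t whenever t \le t_{expiry}.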
One question remains. Whether a plasticity event is a depression or a potentiation event depends on the post-synaptic neuron’s activity, as explained above – but then, how does STADP behave for different pre-synaptic frequencies? As the name suggests, the plasticity is dependent on spike timing, since the state of activity of the post-synaptic neuron is only ever evaluated on an incoming pre-synaptic spike, but in fact its rate plays a role too. In general, the higher the pre-synaptic frequency, the more plasticity events will be produced. However, since potentiation and depression are only elicited with probability p(plasticity), the dependence on the pre-synaptic rate is slightly more complex. While high pre-synaptic frequencies are likely to lead to a high rate of plasticity, low, but non-zero, pre-synaptic frequencies are likely not to result in any plasticity event at all, as only a few of the already rare pre-synaptic spikes would ever lead to a plasticity event.

In summary, the pre-synaptic firing rate can be said to determine the rate (probability) of plasticity events, while the post-synaptic frequency is best described as setting the type of the plasticity events. Synapses with high pre-synaptic firing rates are more likely to receive plasticity signals, while synapses with low pre-synaptic firing rates are likely to remain static, as they receive no or only few plasticity events.

3.2 Characteristics of STADP

The previous section explained how, conceptually, STADP works, and how the actual STADP mechanism, which draws an exponentially distributed expiry time for the post-synaptic neuron to reset to the inactive state, works. The following paragraphs describe some of its characteristics as well as the plasticity signals that STADP would be expected to produce. When characterising the behaviour or the results of STADP, the two important points to note are firstly whether the expiry time mechanism works at all, and secondly what plasticity profile it produces over a range of pre- and post-synaptic frequencies. By observing p(active), the correct operation of the mechanism can be verified; by
observing the plasticity rates, i.e. how many potentiation or depression events are elicited per second, insights into the plasticity profile can be gained. The following plots were obtained from a simple Matlab simulation³ done by Dylan Muir, and show the rate of potentiation (LTP rate), the rate of depression (LTD rate), the net effect of plasticity (LTP rate – LTD rate) as well as p(active), over pre- and post-synaptic frequency ranges of 0-100Hz.

³ p(active) curve: make_prob_active_vs_freq_plot.m; other plots: make_freq_sim_plot.m

Figure 16: Simulated behaviour of STADP. Left column: rate of potentiation and depression events per second, over a range of pre- and post-synaptic frequencies [1:100Hz] (ignore the axis labels). Right column: net effect of STADP, and probability of the post-synaptic neuron being in the active state per unit time.

These simulation results suggest that STADP indeed works as a Hebbian learning rule, and has the desired characteristics. The p(active) is approximately 0.5 at a post-synaptic frequency of 50Hz, increases for higher frequencies, and decreases for lower frequencies. Furthermore, the plasticity rate increases with pre-synaptic
frequency for both potentiation and depression, which also show qualitatively correct behaviour, best summarised by the net effect of LTP and LTD: with increasing pre-synaptic frequencies, there are more plasticity events, with potentiation dominating for high post-synaptic frequencies and depression dominating for low post-synaptic frequencies.

One important characteristic to note, however, is that potentiation and depression are not symmetric within the regime of operation, and that the net effect of plasticity has a bias towards depression, or equivalently, a reluctance towards potentiation. This is due to the p(active) curve, which is neither linear nor symmetric about the (50Hz, 0.5) point. As will be described later in the experimental section, this has an observable effect. Possible remedies include pre-biasing or distorting the p(active) curve so that it saturates at 100Hz, or setting a minimum expiry time of 10ms (1/100Hz) in order to ensure that p(active) is 1 at 100Hz. The remedy used would have to be matched to the particular implementation of STADP.

While a more detailed and formal analysis of STADP would be desirable, it would go beyond the scope of this report. These initial simulation results are satisficing (= satisfying enough), and confidence in the learning rule further derives from [20].
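A back-of-the-envelope sketch of why the p(active) curve has this shape (assuming, as above, Poisson post-synaptic firing at rate f_post and independent expiry intervals drawn ~ exp(1/fm); this derivation is an illustration, not part of the report's formal analysis): at a random observation time, the time A since the last post-synaptic spike is itself distributed ~ exp(1/f_post), and the neuron is active precisely when A is smaller than the expiry interval \Delta t drawn at that spike. Hence

    p(\mathrm{active}) = P(A < \Delta t) = \frac{f_{post}}{f_{post} + f_m}

which equals 0.5 at f_post = fm = 50Hz, but only 2/3 at 100Hz, saturating towards 1 only as f_post grows without bound. This is consistent with the asymmetry observed in the simulations and with the proposed remedy of forcing saturation at 100Hz.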
4 Design

‘I am enough of an artist to draw freely upon my imagination. Imagination is more important than knowledge. Knowledge is limited. Imagination encircles the world’ – Albert Einstein

4.1 Summary of features of the Synaptic Processing Unit

The Synaptic Processing Unit designed here has the following features:
• Speed of operation: clocked at 90MHz internally
• System architecture:
  o Fully pipelined design – the SPU can theoretically process a new address event every clock cycle, although this never happens in practice
  o Modular design – allows for easy plug-in of a new learning rule
• On-chip learning rule: STADP with 11.1ns time resolution
• I/O ports: 1x USB input, 1x AER input, 1x AER output
• Cascade representation: 6bit, reconfigurable, allowing for synapses with up to 32 cascades
• Cascade memory address width: 13bit, reconfigurable, allowing for up to 8192 binary cascade synapses
• Addressing: configurable number of neurons (up to 256)
• One teacher synapse per neuron

4.2 System level design

Although this project builds upon previous work as mentioned earlier, most parts of the Synaptic Processing Unit were designed from scratch, since the pipelined and virtualised cascade synapse requires a very different architecture.
4.2.1 The SPU in a neural system

From a high level point of view, the SPU is supposed to integrate with one aVLSI neuron chip, forming one coherent neural system containing an array of neurons with cascade synapse functionality. This system could, for example, be used as one layer of a larger network of spiking neurons, as depicted in Figure 17.

Figure 17: System level interaction of SPU and aVLSI neuron chip. Together, these form one freely reconfigurable integrated array of N Integrate and Fire neurons with binary cascade synapses.

4.2.2 Input and output ports

In order to act as one coherent system, the SPU has to be able to communicate both with the neuron chip and with the outside world. Here, this is done using the USB port of the FPGA board as pre-synaptic input, and the two AER ports to connect the SPU to the neuron chip. Clearly, a forward connection, whereby pre-synaptic spikes are routed towards the right post-synaptic neuron, is necessary. However, in order to be able to perform learning using STADP, and indeed most other learning rules, an additional feedback connection from the neuron chip back to the SPU is necessary, in order to obtain information about the post-synaptic neurons, which in this case means estimating their state of activity.
4.3 Virtualising the cascade synapse

The binary cascade model lends itself nicely to implementation in digital hardware. It has essentially only two important properties, namely its binary efficacy and its current state, which at the same time encodes the plasticity, which in turn is represented by a plasticity probability that halves for every higher cascade. This has ‘digital’ written all over it.

In order to virtualise the cascade synapses, some conceptual ‘cascade mechanism’ by which to process them has to be devised. The basic idea is to trade hardware real estate on the FPGA for memory, and to process synapses on demand. This has two immediate design deliverables:
• In order to virtualise the cascade synapses, an abstraction or memory representation of them has to be defined
• A mechanism by which they are processed, i.e. how individual synapses respond to plasticity signals, has to be developed

Conveniently, the cascade synapse can be represented by a bit vector very intuitively. One bit encodes the synaptic efficacy, while a number of other bits encode the state of the synapse, i.e. the synaptic plasticity, i.e. the plasticity probability, depending on the number of cascades. Then, halving the plasticity probability is just a matter of a bit shift operation. As depicted in Figure 18, an N-bit representation is used, where the MSB represents the efficacy, and the word [N-2...0] represents the plasticity probability, as an unsigned binary number.

Figure 18: Bit representation of cascade synapses
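To illustrate this representation, consider the following VHDL fragment (a sketch written for illustration, with hypothetical signal names; it is not the code of the actual implementation):

    -- N = 6: the MSB is the efficacy, the lower 5 bits are the plasticity
    -- probability, interpreted as a numerator over the denominator 2^5 - 1.
    signal cascade_state : std_logic_vector(5 downto 0);
    signal next_prob     : std_logic_vector(4 downto 0);

    alias efficacy   : std_logic is cascade_state(5);
    alias plast_prob : std_logic_vector(4 downto 0) is cascade_state(4 downto 0);

    -- Halving the plasticity probability (moving one cascade deeper)
    -- is a single right shift of the probability field:
    next_prob <= '0' & plast_prob(4 downto 1);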
Using this representation, the plasticity probability ranges from 0 to 2^(N-1)-1 rather than from 0 to 1, but this is not a problem, since it can be regarded as the numerator of a rational number with denominator 2^(N-1)-1. Such a representation can easily be stored in and retrieved from memory, and provides the functionality required to implement the virtualisation. Here, N = 6 was fixed as a reasonable maximum cascade representation width, allowing for synapses with up to 32 cascades. This is more than sufficient; in fact, too large a number of cascades can actually decrease the memory performance of the synapses [1].

The processing on the cascade synapse can be expected to be relatively simple, since there is only a small number of things the synapse ‘can do’: switch or chain, with a probability given by its state. The exact mechanism implemented is described in detail in the Implementation section, but from a high level description point of view, it has to:
• Obtain the right cascade from memory
• Perform the necessary operations on its state representation (i.e. switch, chain or do nothing)
• Produce a new cascade state representation, and pass it back to the cascade memory

4.4 SPU internal addressing

Since incoming and outgoing events follow the AER protocol, whereby neurons are identified by addresses, the SPU internal representation also uses addresses as identifiers of synapses.

Figure 19: SPU internal addressing format

At the heart of the addressing scheme are the synapses, which can be identified uniquely by an N-bit synapse address, as shown in Figure 19. For historical reasons⁴, this synapse address is set to 13bits, allowing it to uniquely identify up to 8192 synapses. The top few bits of the synapse address represent the neuron address, which uniquely identifies the post-synaptic neuron which the cascade synapse connects to. The aspect ratio of the neural system, i.e. how many neurons there are and how many synapses each has, can be changed freely within the SPU by changing this neuron address width, and does not have to correspond to the actual number of neurons (or synapses) on the aVLSI chip.

⁴ The SPU was originally designed to interact with an aVLSI chip with 256 neurons and 8192 synapses, the largest of its kind at that time.
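As a sketch of this address split (hypothetical signal names; the widths follow the defaults described above):

    -- The top NEURON_ADDRESS_WIDTH bits of the synapse address name the
    -- post-synaptic neuron; the remaining low bits select the synapse
    -- within that neuron.
    neuron_address <= synapse_address(SYNAPSE_ADDRESS_WIDTH - 1 downto
                                      SYNAPSE_ADDRESS_WIDTH - NEURON_ADDRESS_WIDTH);
    synapse_index  <= synapse_address(SYNAPSE_ADDRESS_WIDTH - NEURON_ADDRESS_WIDTH - 1
                                      downto 0);

With the default widths (13bit synapse address, 8bit neuron address), this yields 256 neurons with 32 synapses each, i.e. 8192 synapses in total.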
4.5 Modular design of the SPU

Apart from implementing cascade synapse behaviour in a virtualised fashion, the SPU has to perform two other important tasks: spike forwarding and learning. Overall, the core of the SPU, i.e. ignoring data I/O and FPGA board particulars, has the following four modules:
• Forwarding module
• Learning module
• Cascade module
• Cascade memory

The conceptual architecture that stems from these four modules is depicted in Figure 20.

Figure 20: Conceptual architecture of the SPU

The principle of operation of the SPU is as follows:
1. The signal selector (not one of the core functions of the SPU) performs arbitration between pre- and post-synaptic inputs, and forwards the selected address into the SPU, to the forwarding module, the cascade memory and the learning module.
2. The cascade memory retrieves the cascade synapse representation corresponding to the synapse address, and, at the same time, writes new cascade states to (another location in) memory.
3. The learning rule (stochastically) produces plasticity signals as required by STADP and the pre- and post-synaptic spikes the SPU receives.
4. The forwarding module forwards pre-synaptic addresses on to the output of the SPU if, and only if, the efficacy of the synapse is high.
5. The cascade module (stochastically) processes the cascade representation according to the plasticity signals it receives from the learning module, and passes on a new cascade state to be written by the cascade memory.

This architecture can be fully pipelined, so that the SPU can process one ‘instruction’, i.e. one address event, per clock cycle. This is particularly important in order to ensure that the SPU operates fast enough: in a multi-chip environment, it should not be the processing bottleneck, but rather should be able to process whatever is thrown its way by the pre-synaptic input (USB). Since the AER bus can typically transmit about 1Mevent/second, the SPU should be able to process a multiple of that, which a fully pipelined architecture allows (at 90MHz and one event per clock cycle, the theoretical ceiling is 90Mevents/second, well above the AER rate). In order to ensure that only the ‘right’ signals are processed and that no wrong data is written to memory, the SPU uses an extra level of control signals that indicate the validity of the data, shown in Figure 20.

4.6 Module specifications

The high level relationships between the individual modules described above translate into the precise input/output and functional specifications described below.
4.6.1 Forwarding

Function:
• To forward valid pre-synaptic spikes to the post-synaptic neuron address over the AER output of the SPU, if the ‘target’ synapse has high efficacy or a teacher signal was sent.

Input signals:
• neuron_address: address of the synapse the current pre-synaptic spike is addressed to. Up to 13bits.
• target_synapse_efficacy: MSB of the cascade representation of the addressed synapse. 1bit.
• address_pre_post: control signal issued by the signal selector which indicates whether the current data comes from the pre-synaptic (‘0’) or the post-synaptic (‘1’) feedback input. 1bit.
• address_valid: control signal that indicates whether the current data is valid. 1bit.

Outputs:
• target_neuron_address: address of the post-synaptic neuron that is to be sent out through the AER output. Up to 8bits.
• target_address_valid: control signal that indicates whether the target neuron address is valid. 1bit.

4.6.2 Learning Rule (STADP)

Function:
• To implement STADP
• To correctly produce plasticity events (dep./pot.)

Inputs:
• synapse_address: address of the incoming pre- or post-synaptic spike. Up to 13bits.
• address_pre_post: control signal issued by the signal selector which indicates whether the current data comes from the pre-synaptic (‘0’) or the post-synaptic (‘1’) feedback input. 1bit.
• address_valid: control signal that indicates whether the current data is valid. 1bit.

Outputs:
• cascade_synapse_address: address of the cascade synapse that the plasticity signals are valid for. Up to 13bits.
• plasticity_dep_pot: plasticity signal, indicating whether the cascade synapse should be depressed (‘0’) or potentiated (‘1’). 1bit.
• plasticity_valid: control signal that indicates whether the plasticity signal and the cascade synapse address are valid. 1bit.
4.6.3 Cascade Process

Function:
• To process cascade states according to plasticity signals from the learning module

Inputs:
• cascade_synapse_state: cascade state representation of the cascade synapse that is to be processed. Up to 6bits.
• cascade_synapse_address: address of the current cascade synapse that the plasticity signals are valid for. Up to 13bits.
• plasticity_dep_pot: plasticity signal, indicating whether the cascade synapse should be depressed (‘0’) or potentiated (‘1’). 1bit.
• plasticity_valid: control signal that indicates whether the plasticity signal and the cascade synapse address are valid. 1bit.

Outputs:
• cascade_address_out: address of the synapse whose new cascade state representation is valid. Up to 13bits.
• new_state: new processed cascade state representation, ready to be written back to memory. Up to 6bits.
• new_state_valid: control signal that indicates whether the new state and the cascade out address are valid. 1bit.

4.6.4 Cascade memory

Function:
• To retrieve cascade representations of synapses addressed at its read port
• To store valid new cascade representations of synapses addressed at its write port

Input signals:
• synapse_address: address of the cascade the current pre-synaptic spike is addressed to. Up to 13bits.
• new_state_address: address of the new state that has undergone plasticity. Up to 13bits.
• new_state: new state of the cascade synapse after processing. Up to 6bits.
• new_state_valid: control signal that indicates whether the new state for the new state address is valid. 1bit.

Outputs:
• current_state: cascade state representation of the synapse addressed at the read port. Up to 6bits.
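To make the cascade process specification concrete, the following sketch shows the kind of per-event update it implies: switch or chain, with a probability given by the stored plasticity probability. This is a reconstruction for illustration only, using the hypothetical names introduced earlier; the mechanism actually implemented is the one described in the Implementation section.

    -- Inside a clocked process; prn is a pseudo-random number of width
    -- CASCADE_WIDTH-1, compared against the stored plasticity probability.
    -- In the declarative part:
    --   constant MOST_PLASTIC : std_logic_vector(plast_prob'range) := (others => '1');
    if plasticity_valid = '1' and unsigned(prn) < unsigned(plast_prob) then
      if plasticity_dep_pot = efficacy then
        -- Plasticity event agrees with the current efficacy: chain one
        -- level deeper, halving the plasticity probability.
        new_state <= efficacy & ('0' & plast_prob(plast_prob'high downto 1));
      else
        -- Plasticity event opposes the current efficacy: switch, i.e. flip
        -- the efficacy and return to the most plastic state.
        new_state <= (not efficacy) & MOST_PLASTIC;
      end if;
      new_state_valid <= '1';
    else
      new_state_valid <= '0';  -- do nothing; no write-back occurs
    end if;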
4.6.5 Global signals

In addition to the inputs specified above, all modules share clock, clock enable and asynchronous reset inputs, the latter resetting all internal registers and FIFOs. Note that the contents of memory are not reset to their initial state by this reset signal; only the output registers of the memory are cleared. All signals internal to the SPU are active high.
5 Implementation

‘It's not good enough that we do our best; sometimes we have to do what's required’ – Winston Churchill

5.1 Pseudo-random number generators

The performance of stochastic learning processes, indeed of any stochastic process, is heavily dependent on the ‘quality’ of the underlying randomness. Since the SPU has random processes in two of its major functional components, the cascade synapse module and the learning rule, implementing a good pseudo-random number generator (pRNG) is all the more important. A good pRNG generates highly uncorrelated sequences of pRNs with a very long maximum length before the sequence repeats. A good review of ‘classical’ pRNGs can be found in [8]; however, the pRNG used here is more unconventional. Instead of performing mathematical manipulation, including multiplication by prime numbers and modulo division, to generate pRNs, which is what most classical pRNGs do and which is rather resource intensive for a digital logic implementation, a so-called hybrid cellular automata (HCA) array pRNG is employed, which, by contrast, is a very efficient choice for FPGA implementation.

Cellular automata consist of grids of ‘cells’, where each cell can be in one of a finite number of states. Time is discrete, and each cell has a local update rule that determines its state in the next unit of time. One of the most popular cellular automata is Conway’s 2D ‘Game of Life’. Here, we consider a one-dimensional binary HCA, i.e. an array of bits, where each cell (bit) has one of two local update rules, namely Rule 90 or Rule 150 as classified by Wolfram [16], shown in Figure 21. Rule 90 takes the XOR of both of a cell’s neighbours to determine its next state, while Rule 150 additionally XORs in the current value of the cell itself. Cells beyond the boundaries of the array are considered to be '1' at all times, which ensures that the automaton does not freeze in case of all cells being '0'. These choices and the right configuration of the rules used
ensure that the pRNG produces maximum-length sequences of uniform pRNs. In [8], there is a detailed description of which rule to use at which bit position to generate maximum-length sequences for HCA arrays of a given size.

Figure 21: A hybrid cellular automata linear array. The HCA pRNG makes use of two different nearest-neighbour update rules, namely Rule 90 and Rule 150. It is very suitable for implementation on an FPGA, and produces maximal-length sequences of highly uncorrelated patterns. Figure courtesy of Dylan Muir.

If used as described above, an HCA pRNG would exhibit high correlation between adjacent cells, which can be avoided by only using a subset of non-neighbouring bits from a larger array to generate random numbers. One possible choice for creating a 32bit random number is to use a 128bit HCA, tapping off every fourth bit to form the pRN, for example. By using this method to generate the pRNs required by the different modules, the stochastic processes in the SPU can be trusted to be as random as possible, to the best of the knowledge of the author.
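A sketch of one HCA update step follows (hypothetical names; the per-cell rule selection, here the constant RULE150, must be configured per [8] to obtain a maximum-length sequence):

    -- WIDTH-cell hybrid cellular automaton with fixed '1' boundary cells.
    -- RULE150(i) = '1' selects Rule 150 for cell i, '0' selects Rule 90.
    signal s : std_logic_vector(WIDTH + 1 downto 0);

    s(0)         <= '1';   -- boundary cells held at '1'
    s(WIDTH + 1) <= '1';

    process (clk)
    begin
      if rising_edge(clk) then
        for i in 1 to WIDTH loop
          if RULE150(i) = '1' then
            s(i) <= s(i - 1) xor s(i) xor s(i + 1);   -- Rule 150
          else
            s(i) <= s(i - 1) xor s(i + 1);            -- Rule 90
          end if;
        end loop;
      end if;
    end process;

    -- Tap non-adjacent cells to avoid neighbour correlation, e.g. every
    -- fourth cell of a 128-cell array (WIDTH = 128) to form a 32bit pRN:
    taps : for j in 0 to 31 generate
      prn32(j) <= s(4 * j + 1);
    end generate;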
5.2 Description of generics

Before explaining the architecture of the individual SPU internal modules, it is helpful to understand the parameterisation of the VHDL code that was carried out in order to keep the SPU reconfigurable. The following is a brief description of the generics used within the implementation that allow a customisation of the SPU.

• SYNAPSE_ADDRESS_WIDTH : natural := 13: The synapse address width is the width of most of the addresses within the SPU, and sets the maximum number of synapses that can be addressed. By default, it is set to 13bits, allowing for up to 8192 cascade synapses to be addressed. The fixed depth of the cascade memory (the memory itself is not parameterisable) also limits the maximum number of synapses to 8192, although fewer synapses may be used (manual reconfiguration of the memory would be required to increase the depth of the cascade memory; this is not difficult).
• NEURON_ADDRESS_WIDTH : natural := 8: The neuron address width is the width of the neuron address, and tells the SPU how many of the synapse address’ MSBs are attributed to identifying the neuron. By default, it is set to 8bits, allowing for up to 256 neurons to be addressed; a smaller number of neurons can be specified without problems.
• CASCADE_WIDTH : natural := 5: The cascade width is the number of bits that the cascade representation uses. It can be up to 6bits wide, as limited by the width of the cascade memory, but fewer bits, such as the default value of 5bits, may be specified. The cascade width includes both the efficacy bit and the plasticity probability width. At the same time, the cascade width specifies the width of the pRN generated in the cascade synapse module, which is always one bit less than the cascade width (since the plasticity probability in the cascade representation, which is compared to the pRN, is one bit smaller than the cascade width).
• PRE_THRESHOLD : natural := 230: The pre threshold sets the p(plasticity) with which STADP elicits plasticity events; the higher the threshold, the smaller p(plasticity). It may range from 0 to 255, where p(plasticity) would be 1 and 0 respectively.

Using these four parameters, the SPU can be configured, at compile time, to have the desired characteristics.
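Collected in one place, the corresponding generic clause of the SPU entity might look as follows (a sketch consistent with the defaults above; the port list is abbreviated, and the complete entity declaration is on the supplementary CD):

    library ieee;
    use ieee.std_logic_1164.all;

    entity spu is
      generic (
        SYNAPSE_ADDRESS_WIDTH : natural := 13;   -- up to 8192 synapses
        NEURON_ADDRESS_WIDTH  : natural := 8;    -- up to 256 neurons
        CASCADE_WIDTH         : natural := 5;    -- efficacy bit + plasticity bits
        PRE_THRESHOLD         : natural := 230   -- sets p(plasticity)
      );
      port (
        clk   : in std_logic;
        reset : in std_logic
        -- AER and USB I/O ports omitted in this sketch
      );
    end entity spu;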
5.3 Module level design

The following sections describe the implementations of the SPU’s modules individually, on a functional level. In order to save paper and time, the full VHDL code is not reproduced here, only short illustrative sketches where helpful; the interested reader is advised to consult the supplementary CD for the complete VHDL code.

In all of the diagrams shown in the following sections, the convention shown in Figure 22 for arrows is used. In particular, dotted arrows are used to represent the flow of control signals, dashed arrows the flow of addresses, and solid arrows the flow of data.

Figure 22: Conventions on the arrows used in block diagrams

Furthermore, light blue vertical bars are used to indicate register levels or clocked processes.

5.3.1 Spike forwarding

The forwarding module is the simplest of the four major functional modules. As specified in the previous chapter, it ‘only’ has to forward valid pre-synaptic spikes if the synapse they are addressed to has high efficacy, or if they are sent to the teacher synapse. The basic structure of the forwarding module is shown in Figure 23. The outputs are generated in a very simple way: the target neuron address is simply forwarded directly from the incoming neuron address, while the target address valid signal is produced by a simple chain of logic operations. Note that the target address valid signal depends on the negation of the address_pre_post signal, since a pre-synaptic input spike is represented by a ‘0’.
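A sketch of this output logic (hypothetical signal names; it combines the efficacy and teacher-synapse conditions of section 4.6.1, with the teacher synapse test explained below):

    -- The teacher synapse is synapse 0 of each neuron, i.e. the low
    -- (SYNAPSE_ADDRESS_WIDTH - NEURON_ADDRESS_WIDTH) address bits are all zero.
    is_teacher <= '1' when unsigned(synapse_index) = 0 else '0';

    -- Forward only valid pre-synaptic spikes (address_pre_post = '0'), and
    -- only if the synapse is efficacious or the teacher synapse is addressed.
    target_neuron_address <= neuron_address;
    target_address_valid  <= address_valid and (not address_pre_post)
                             and (target_synapse_efficacy or is_teacher);

In the actual module these outputs are registered, giving the one clock cycle of latency noted in the text.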
Figure 23: Spike forwarding module block diagram

The teacher synapse is defined to be the 0th synapse of every neuron, i.e. if the bottom bits of the synapse address (how many depends on how wide the neuron address width is) are zero, then the spike is sent to the teacher synapse, and should be forwarded regardless of the synaptic efficacy. Due to its simplicity, the forwarding module only requires one clock cycle to perform its processing.

5.3.2 Learning rule (STADP)

The learning rule module is much more complex, as shown in Figure 24. It contains some logic, several registers, a look-up table implemented by a 256x36bit single-port ROM, a 256x36bit single-port block RAM, a 36bit timer with 11.1ns resolution and an 8bit pRNG. In order to understand it, it is best to work backwards from the outputs, considering separately what happens on a pre- and on a post-synaptic synapse address (spike). There are three output signals: the cascade synapse address, the plasticity signal and the plasticity valid signal, which need to be considered first.
The cascade synapse address is simply a forwarded version of the input synapse address. The plasticity signal, i.e. whether a synapse should be depressed or potentiated, depends on the activity of the post-synaptic neuron. As mentioned earlier, this is implemented by drawing pseudo-random, exponentially distributed expiry times for the post-synaptic neuron, at which it becomes inactive; comparing this expiry time to the current time is all that is needed to elicit the right plasticity signal. So, if the current time, i.e. the output of the timer, is greater than the post-synaptic neuron’s expiry time, which is given by the output of the expiry time memory, i.e. the neuron has already expired, then a depression signal is produced (plasticity_dep_pot is reset to ‘0’). If the current time is less than or equal to the expiry time, then the neuron has not yet expired but is still active, and a potentiation signal is produced (plasticity_dep_pot is set to ‘1’).

The plasticity valid signal is only asserted if the incoming spike is valid and pre-synaptic. Furthermore, since plasticity signals are only elicited with a probability p(plasticity), the plasticity valid signal is additionally only asserted if an 8bit pRN is above the plasticity threshold pre_threshold. That is really all there is to the generation of plasticity signals, i.e. that is all that happens on the arrival of a pre-synaptic spike. The rest of the STADP learning rule module is concerned with handling post-synaptic spikes and setting the pseudo-random, exponentially distributed expiry times.
Figure 24: STADP learning rule block diagram
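In summary, the pre-synaptic output path just described amounts to the following sketch (hypothetical signal names; the post-synaptic path, which draws and stores new expiry times, is treated separately):

    -- Forward the synapse address alongside the plasticity decision.
    cascade_synapse_address <= synapse_address;

    -- Not yet expired => still active => potentiate; expired => depress.
    plasticity_dep_pot <= '1' when unsigned(current_time) <= unsigned(expiry_time)
                          else '0';

    -- Plasticity is only elicited with p(plasticity): gate the valid signal
    -- on a valid pre-synaptic spike and an 8bit pRN above PRE_THRESHOLD.
    plasticity_valid <= '1' when address_valid = '1'
                                 and address_pre_post = '0'
                                 and unsigned(prn8) > to_unsigned(PRE_THRESHOLD, 8)
                        else '0';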