Machine learning and deep learning excel at pattern recognition and provide state-of-the-art models for computer vision, NLP, predictive applications, and many other artificial intelligence tasks. Neural networks are a mathematical model of how scientists think the brain performs certain computations, but for decades they lacked the memory and context-recall characteristics of the brain. Although recurrent neural networks (like LSTMs) handle memory and context, their memory size does not scale, which prevents them from storing rich, meaningful information over time and across contexts.
To solve this, researchers at Google's DeepMind recently proposed a model that combines a neural network with an external memory store, designed to simulate the way neuroscientists think the brain stores and retrieves memories. The system is fully differentiable, which means that, through calculus-based optimization, it can learn to manage its own inner workings and memory I/O from scratch using training data.
The authors demonstrated use cases where the model was trained on graph data, such as family trees or the London Underground transport network, and the system learned to use its memory to answer questions about navigation routes and family relations hidden in the data. The model shows great potential for new AI applications that learn on their own how to work with complex data structures (like graphs) without explicit programming. However, the scarce information available is neuroscience-oriented, which makes it difficult to grasp for software engineers, developers, and data scientists.
We'll demystify this fascinating architecture using analogies and examples familiar to software engineers, developers, and data scientists, and provide intuitions that make the model easier to understand and adopt, unlocking a completely new type of AI application.
4. DNC basic idea
•Memory-augmented neural network
•Neural network with I/O access to external memory
•I/O operations are learned instead of programmed
5. DNC basic idea
•Von Neumann computer architecture:
•CPU: in the DNC, the CPU is a neural network (the "controller")
•Memory: a separate external memory bank accessed by the CPU via read/write operations (a minimal sketch follows below)
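A minimal NumPy sketch of this split, assuming illustrative sizes and the erase-then-add write rule described in the Nature paper (the class and parameter names are my own, not the paper's):

```python
import numpy as np

class ExternalMemory:
    """Sketch of the DNC's memory bank: the controller never indexes rows
    directly; it reads and writes through soft weightings over locations."""

    def __init__(self, n_locations=16, word_size=8):
        self.M = np.zeros((n_locations, word_size))

    def read(self, w):
        # w: soft weighting over locations (sums to 1); returns a blend of rows
        return w @ self.M

    def write(self, w, erase, add):
        # differentiable erase-then-add update of the weighted locations
        self.M = self.M * (1 - np.outer(w, erase)) + np.outer(w, add)
```

Because both operations are plain matrix arithmetic, gradients can flow through them, which is what makes the whole system trainable end to end.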
6. Neuroscience meets AI and CS
•The basic architecture and memory allocation (assign and release) are based on computer science.
•Memory access (read) and retrieval are based on neuroscience (the hippocampus).
7. High level architecture
•A neural network called the controller (the CPU) performs computation on input data
•Read/write heads perform I/O from and to memory
•The controller interacts with the read/write heads to use "memories" for computation
8. DNC vs Neural Network
•Neural networks excel at pattern recognition, perception tasks, sensory recognition and reactive decision making (mapping inputs X to outputs Y), but they can't be used for:
•Planning and reasoning tasks
•Using "memories" and facts from previous events
•Storing useful information for future use
•Generalizing knowledge to new tasks (AGI)
•Working with complex data structures, like associative ones (graphs or trees)
9. DNC vs Neural Network
•The DNC tries to solve this by mixing the best of both worlds (a memory-based architecture and machine learning):
•Perception and pattern recognition capabilities from machine learning
•Planning and reasoning based on previous memories and knowledge
•Usage of complex associative data structures
•Like a computer, it can organize knowledge, data and facts, as well as links between them, but like a neural network it needs no explicit programming, because it can learn to do so from examples (data).
10. Knowledge retrieval
•The DNC decides which "memories" to retrieve based on "attention mechanisms", which can be described from both computational and neuroscience perspectives, especially hippocampal synapses.
•Foundations of Human Memory, by Michael Kahana, provides key human-memory concepts with which the DNC has analogies.
Computational: which external memory locations to read and write?
Neuroscience: how does the brain retrieve and relate stored "memories"?
11. Memory(attribute vectors)
•The external memory is a real-number matrix (N×W).
•Attribute theory: every human memory is represented by a list of attributes which describe the memory itself and its context.
Computational: RAM with N positions and word size W.
Neuroscience: human memories are represented as a list of W attributes.
13. Content-based (similarity) access
•The controller (CPU) can emit a key vector and read from (or write to) the memory locations that best match the key (a sketch follows the comparison below).
•Neuroscience proposes a model where we remember events when exposed to a similar experience.
Computational: retrieve a weighted sum of memory values, weighted by similarity to a specific key; the similarity can be cosine similarity.
Neuroscience: we recall (or reinforce) past experiences when exposed to similar ones.
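A minimal NumPy sketch of content-based addressing, following the softmax-over-cosine-similarity scheme from the paper (the sizes and the key-strength value are illustrative):

```python
import numpy as np

def content_addressing(memory, key, beta):
    """Soft content-based read: weight each memory row by its cosine
    similarity to the key, sharpened by the key strength beta."""
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    similarity = memory @ key / norms        # cosine similarity per row
    weights = np.exp(beta * similarity)      # softmax keeps it differentiable
    return weights / weights.sum()

# usage: retrieve a blend of the memories most similar to the key
memory = np.random.randn(8, 4)               # (N, W) memory matrix
key = memory[3] + 0.1 * np.random.randn(4)   # noisy copy of row 3
w = content_addressing(memory, key, beta=5.0)
read_vector = w @ memory                      # weighted sum of memory rows
```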
15. Time-ordered access (temporal links)
•The system records the order in which memory locations are written (a sketch follows the comparison below).
•Temporal Context Model: it is easier for us to remember and recall events in the order they occurred (try saying the alphabet in random order vs. in order).
Computational: a linked list of the memory positions written, ordered by time.
Neuroscience: recall/retrieve memories in the order they occurred.
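A simplified sketch of the temporal link update (the variable names are mine; the update mirrors the link-matrix equations in the Nature paper):

```python
import numpy as np

def update_temporal_link(link, precedence, write_w):
    """link[i, j] ~ 'location i was written right after location j'."""
    # decay links involving freshly written locations, then add new ones
    link = (1 - write_w[:, None] - write_w[None, :]) * link \
           + np.outer(write_w, precedence)
    np.fill_diagonal(link, 0.0)              # no self-links
    # precedence: soft pointer to the most recently written location(s)
    precedence = (1 - write_w.sum()) * precedence + write_w
    return link, precedence

# reading "forward" or "backward" in time then follows the write order:
#   forward_w  = link   @ last_read_w
#   backward_w = link.T @ last_read_w
```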
17. Short-term and Long-term Memory
•Although not mandatory, the controller can be an LSTM (long short-term memory) neural network, which provides short-term memory.
•Search of Associative Memory (SAM): the SAM model proposes that our memory is a dual store: a short-term store and a long-term store.
Computational: short-term memory provided by the LSTM neural network controller.
Neuroscience: the SAM model of dual memory stores.
19. Dynamic Memory Allocation
•In addition to writing by content, the DNC can allocate and release memory as a computer does, based on memory usage and read orderings (a sketch follows the comparison below).
•The DNC can choose to write to new locations, update existing ones (reinforcing memories), or not write at all.
Computational: dynamic memory administration.
Neuroscience: add new memories or reinforce existing ones.
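A sketch of the allocation weighting: the least-used locations receive the most allocation weight, computed so the whole operation stays differentiable (the usage values below are illustrative):

```python
import numpy as np

def allocation_weighting(usage):
    """Prefer writing to the least-used memory locations (DNC-style)."""
    order = np.argsort(usage)            # least-used locations first
    alloc = np.zeros_like(usage)
    cumprod = 1.0
    for idx in order:
        alloc[idx] = (1 - usage[idx]) * cumprod
        cumprod *= usage[idx]
    return alloc

# usage: one value per location in [0, 1]; freed locations get low usage
usage = np.array([0.9, 0.1, 0.5, 1.0])
print(allocation_weighting(usage))       # highest weight at the least-used slot
```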
23. Complete architecture
At each time step (clock cycle) the DNC:
•Receives an input and computes an output that is a weighted sum of its input and the "memories" retrieved from memory.
•Decides how to interact with the memory (where and what to read and write) via an "interface vector".
•Passes the "memories" it read on to the next time step (see the sketch below).
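A toy, single-read-head version of one time step, with random stand-ins for the learned weight matrices; writing is omitted for brevity, and all names and sizes here are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
N, W, X = 16, 8, 4                      # memory slots, word size, input size

Wc = rng.normal(size=(X + W, 4))        # controller weights (toy stand-in)
Wk = rng.normal(size=(X + W, W))        # produces the read key ("interface")
Wr = rng.normal(size=(W, 4))            # mixes the read vector into the output

def dnc_step(memory, prev_read, x):
    state = np.concatenate([x, prev_read])       # input + last step's read
    key = state @ Wk                             # part of the interface vector
    sim = memory @ key / (np.linalg.norm(memory, axis=1)
                          * np.linalg.norm(key) + 1e-8)
    w = np.exp(sim); w /= w.sum()                # soft read weighting
    read = w @ memory                            # retrieved "memory"
    y = state @ Wc + read @ Wr                   # output mixes both paths
    return y, read                               # read feeds the next step

y, read = dnc_step(rng.normal(size=(N, W)), np.zeros(W), rng.normal(size=X))
```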
25. How the DNC decides I/O
How does the DNC learn and decide how to interact with memory?
•This is the differentiable part of the DNC.
•Every component of the system uses weights similar to those of a neural network.
•Thus it can be trained via gradient descent and multivariate calculus optimization.
•Using samples (data), the system learns how to behave optimally (see the toy example below).
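A toy end-to-end illustration of that differentiability: fitting a read key by plain gradient descent so the soft read returns a chosen memory row. Finite differences stand in here for the autodiff a real framework provides; everything else is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
memory = rng.normal(size=(8, 4))
target = memory[5]                        # we want the read to return row 5
key = rng.normal(size=4)                  # the parameter being learned

def read(key):
    sim = memory @ key / (np.linalg.norm(memory, axis=1)
                          * np.linalg.norm(key) + 1e-8)
    w = np.exp(5.0 * sim); w /= w.sum()   # soft, differentiable addressing
    return w @ memory

def loss(key):
    return np.sum((read(key) - target) ** 2)

for step in range(300):                   # plain gradient descent on the key
    grad = np.array([(loss(key + 1e-5 * e) - loss(key - 1e-5 * e)) / 2e-5
                     for e in np.eye(4)])
    key -= 0.1 * grad

print(loss(key))                          # typically far smaller than at start
```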
28. Potential applications
•Graph reasoning problems.
•DeepMind trained the DNC on many random graphs:
•It learned to use its memory to navigate through the graphs.
•Then two specific graphs were fed to it:
- The London Underground graph
- A family tree
•Surprisingly, it was able to generalize without retraining (AGI?)
29. Potential applications
•Reinforcement learning
•It was tested on a grid game where:
•The player (agent) is given a set of goals and constraints per goal.
•It is then asked to satisfy a single goal.
•It has to plan and reason about how to achieve the goal.
•It stored the goals and constraints in memory.
30. Thanks for your attention
•My contact:
-LinkedIn: https://www.linkedin.com/in/luis-fernando-leal-hernandez-9a736276/
-Email: wichofer89@gmail.com
-Github: https://github.com/llealgt/DNC/
•References and illustrations thanks to:
•"Hybrid computing using a neural network with dynamic external memory", Nature 538, 471–476 (October 2016), doi:10.1038/nature20101.
•"Implementation and Optimization of Differentiable Neural Computers", Carol Hsin, Stanford University.
•"Differentiable memory and the brain", Sam Greydanus, https://greydanus.github.io/2017/02/27/differentiable-memory-and-the-brain/