4. Hierarchical memories and Sparse Code 1. Associative memories
[Figure: network diagram. $w_{ji}$ is the weight, i.e. the strength of the link from component $i$ of the input vector $x$ to component $j$ of the output vector $y$.]
5. Hierarchical memories and Sparse Code 1. Associative memories
Example pattern pair $\mu$:

input $x^\mu = (x_1^\mu, x_2^\mu, x_3^\mu, x_4^\mu, x_5^\mu) = (0, 1, 1, 0, 0)$
output $y^\mu = (y_1^\mu, \dots, y_6^\mu) = (0, 0, 0, 0, 1, 0)$

[Figure: network diagram with links $w_{11}, w_{12}, w_{13}, w_{14}, w_{15}$; each $w_{ji}$ links component $i$ of the input vector to component $j$ of the output vector.]
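Retrieval is only shown graphically in the deck; as a minimal sketch in Julia, assuming the standard Lernmatrix threshold rule (function and variable names are mine):

    # Binary associative-memory retrieval, assuming the standard threshold
    # rule: output neuron j fires iff sum_i w_ji * x_i >= theta.
    function retrieve(W::AbstractMatrix{Int}, x::AbstractVector{Int}, theta::Int)
        s = W * x                 # dendritic sums, one per output neuron
        return Int.(s .>= theta)  # thresholded binary output y
    end

With theta = sum(x) (the activity K of the input), a stored input retrieves its associated output as long as the memory load is low.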
9. Behavior of matrix associative memories
• The memory load increases with the number of patterns learned
[Plot: memory load $p_1$ as a function of $p$, the number of patterns learned (stored); $p$ ranges over 0–1800, $p_1$ over 0.0%–0.7%. $p_1$ is the probability that a synapse is active after storing $p$ patterns.]
$$p_1 = 1 - \left(1 - \frac{KL}{MN}\right)^p$$

which, for equal activity levels $K = L$ and equal dimensions $M = N$, reduces to

$$p_1 = 1 - \left(1 - \frac{K^2}{N^2}\right)^p$$
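As a quick sanity check of the second formula in Julia (the parameter values are illustrative, not taken from the plot):

    # Memory load after storing p patterns (symmetric case K = L, M = N).
    p1(p, K, N) = 1 - (1 - K^2 / N^2)^p
    p1(1600, 4, 2000)   # ≈ 0.0064, i.e. ≈ 0.6% of the synapses are active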
10. Hierarchical memories and Sparse Code 1. Associative memories
• The quality of the retrieved outputs deteriorates with increasing memory load
[Plot: number of add-errors in the output as a function of memory load in layer r = 3.]
11. Research question
• How can we increase network performance, i.e., increase the number of stored patterns for the same number of computations, without compromising the quality of retrieval?
↔
• How can we reduce the number of computational steps for the same number of stored patterns?
12. Research question
Solution
• Reorganizing matrices
20. 2. c) Optimal capacity
Maximum number $C_{stor}$ of associations stored is $O(\ln N^2)$:

$$C_{stor} = \frac{P}{M} = \ln(2)\,\frac{N^2}{K^2}$$

with:
P – number of associations
M – number of neurons (dimension of the output)
N – dimension of the input
K – activity level (number of 1s per vector)

for an optimal $O(\ln N)$ activity level $K$:

$$K = \log_2\frac{N}{4} \qquad (\textit{sparse code})$$
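The two formulas combine into a short computation; a sketch in Julia with an illustrative N:

    # Optimal sparse code and storage capacity for an illustrative N.
    N = 1024
    K = round(Int, log2(N / 4))   # sparse code: K = log2(N/4) = 8
    Cstor = log(2) * N^2 / K^2    # ≈ 1.1e4 associations per output neuron (P/M)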
37. 2. a) Learning with weights
• Hebb’s learning rule
Storing two pattern pairs with Hebb's rule, $W = \sum_\mu y^\mu (x^\mu)^T$:

$x^1 = (1, 0, 0, 0, 0, 0)$, $y^1 = (0, 1, 1, 0, 0)$
$x^2 = (1, 0, 0, 0, 0, 1)$, $y^2 = (1, 1, 0, 0, 0)$

$$W = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 1 \\ 2 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$

This is a codification of the positions of the correlations!
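The same storage step as a Julia sketch (variable names are mine; the numbers reproduce the example above):

    # Hebbian storage: W accumulates the outer products y * x'.
    x1 = [1, 0, 0, 0, 0, 0];  y1 = [0, 1, 1, 0, 0]
    x2 = [1, 0, 0, 0, 0, 1];  y2 = [1, 1, 0, 0, 0]
    W = y1 * x1' + y2 * x2'   # 5x6 matrix; w_ji counts correlations of x_i with y_j
    # W reproduces the matrix above; e.g. w_21 = 2 after the second pair.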
38. 4. a) Motivation
4. b) Idea
4. c) Solution: Ordered Indexes Hierarchical Associative Memory
41. 4.c) Solution: Ordered Indexes Hierarchical Associative Memory
Auxiliary structures:
[Figure: AllSequenceList_initial = (1 2 3 4 5 6 7 8 9 10 11 12), the sequence of all the column indexes; it is partitioned into subsequences, each held by a node.]
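The deck only names these structures, so the following Julia sketch is one hypothetical realization (all type and field names are guesses):

    # Hypothetical realization of the auxiliary structures.
    mutable struct Node
        indexes::Vector{Int}    # subsequence of column indexes held by this node
        children::Vector{Node}  # filled in when the subsequence is split
    end
    all_sequence_list = Node(collect(1:12), Node[])  # initially the whole sequence 1..12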
42. Hierarchical memories and Sparse Code 4. Ordered indexes memory
Base of the algorithm:

    for each line L of W(r=3):
        for each subsequence SS in L:
            if OnesAndZerosUnclustered?(SS):
                SplitAndOrder(SS)
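A Julia sketch of this loop, reusing the hypothetical Node structure from above; the unclustered predicate and the split are my reading of the slide, not the authors' exact code:

    # Base algorithm over the r=3 weight matrix W3, splitting the column-index
    # subsequences until each is homogeneous for the rows seen so far.
    function order_indexes!(root::Node, W3::AbstractMatrix{Int})
        for row in eachrow(W3)                    # for each line L of W(r=3)
            for node in leaves(root)              # for each subsequence SS in L
                one_cols  = [i for i in node.indexes if row[i] == 1]
                zero_cols = [i for i in node.indexes if row[i] == 0]
                if !isempty(one_cols) && !isempty(zero_cols)  # 1s and 0s mixed
                    # SplitAndOrder: separate the 1-columns from the 0-columns
                    node.children = [Node(one_cols, Node[]), Node(zero_cols, Node[])]
                end
            end
        end
    end
    leaves(n::Node) = isempty(n.children) ? [n] : reduce(vcat, leaves.(n.children))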
43. Hierarchical memories and Sparse Code 4. Ordered indexes memory
Note: variants of the model use more intelligent strategies to select the next line to be tested.
66. Hierarchical memories and Sparse Code 4. Ordered indexes memory
[Figure: weight matrices at levels r = 3, r = 2, r = 1; shading shows the value of $w_{ij}$ (1 vs. 0).]
68. 4. a) Motivation
4. b) Idea
4. c) Solution: Ordered Indexes Hierarchical Associative Memory
4. d) Empirical experiments
69. 4. d) Empirical experiments
• Method: for each retrieval, measure:
  • quality (number of add-errors), the same as before
  • performance (number of computations) of retrieval:
• additions
• multiplications
• threshold-comparisons
• fire-shots
70. 4. d) Empirical experiments
• (default) Experiment – in the Julia programming language
• Database of 1600 patterns
• 120 tests
• For each test:
  • performance and quality as a function of memory load,
    i.e., the probability that $w_{ij} = 1$; the memory load is varied by varying $p$ (the number of patterns learned)
  • retrieval of 20 patterns
• Fixed N (number of neurons)
• Fixed K (activity level, i.e. number of 1s per vector), with a Gaussian distribution
• Each test is run on 5 models…
71. 4. d) Empirical experiments
(...)
• Each test is run on 5 models:
  • Lernmatrix
  • Hierarchical Associative Memory
  • 3 models of the Ordered Indexes Hierarchical Ass. Memory:
    • lines for iteration chosen naively
    • lines with more 1s chosen first
    • lines with more 0s chosen first + right null-columns discarded
72. Hierarchical memories and Sparse Code 4. Ordered indexes memory
[Plot: total number of steps as a function of memory load in layer r = 3, for each test and each model: Lernmatrix; Hierarchical Ass. Mem.; (naively) Ordered H.A.M.; (1s first) Ordered H.A.M.; Ord. H.A.M. with right null-columns discarded.]
74. Hierarchical memories and Sparse Code 4. Ordered indexes memory
[Plot legend: Lernmatrix; Hierarchical Ass. Mem.; (naively) Ordered H.A.M.; (1s first) Ordered H.A.M.; Ord. H.A.M. with right null-columns discarded.]
Finding: the curves of the hierarchical models lie well below the Lernmatrix curve (≈ 80% fewer steps).
75. Hierarchical memories and Sparse Code 4. Ordered indexes memory
Why? Pruning of ≈ 80% of the columns.
76. Hierarchical memories and Sparse Code 4. Ordered indexes memory
[Plot legend: Hierarchical Ass. Mem.; (naively) Ordered H.A.M.; (1s first) Ordered H.A.M.; Ord. H.A.M. with right null-columns discarded. Annotations: "Original" vs. "Ordered models".]
Finding: the performance of the ordered hierarchical models >> the original hierarchical model.
Why? 1. Reordering improves the aggregations for pruning.
77. Hierarchical memories and Sparse Code 4. Ordered indexes memory
[Plot: total steps (y) as a function of memory load in r = 1 (x); legend: Hierarchical Ass. Mem. (H.A.M.); (naively) Ordered H.A.M.; (1s first) Ordered H.A.M.; Ord. H.A.M. with right null-columns discarded. Annotation: a "Shift" between the curves.]
Why? 1. Reordering optimizes the aggregations for pruning. 2. Reordering frees space.
78. Hierarchical memories and Sparse Code 4. Ordered indexes memory
Finding: performance of the model that discards columns > the other ordered column-indexes models.
Why? The right side of the matrix is not even visited.
[Plot legend: Hierarchical Ass. Mem.; (naively) Ordered H.A.M.; (1s first) Ordered H.A.M.; Ord. H.A.M. with right null-columns discarded.]
80. 5.1) Achievements
• Considerable savings
  • only 2–20% (worst case) of the total number of steps
    (ordered column-indexes hierarchical model with right null-columns discarded, relative to the Lernmatrix)
• Worthy trade-offs
  • more resources are spent on infrastructure and computation in the learning phase, which is done only once
  • this cost is outweighed by the benefits at retrieval time
81. 5.2) Future work
• Variable aggregation factor
  • adapt the window to the density of different zones of the matrices
• Different distributions for the correlations
  • check how non-uniform activity patterns affect the models
• Check the cost of hierarchical models of neural networks in biology