EchoBay: optimization of Echo State Networks under memory and time constraints

1
EchoBay:
Design of Echo State Networks
under Memory and Time constraints
17th - 31st May
NGCX@San Francisco
Luca Cerina {luca.cerina@polimi.it}
Giuseppe Franco {g.franco4@studenti.unipi.it}
Marco D. Santambrogio {marco.santambrogio@polimi.it}

2
Biological inspiration
cdn.aarp.net

3
Visual Cortex
GoogleNet
In Convolutional Neural Networks and similar architectures, information
traverses the network directly:
● Information and recognition are defined by neural weights
● Easier to learn (differentiable functions)
● Don’t require temporal relations to function properly

4
Memory in the Brain
human-memory.net
Complexity explodes if we want to
mimic memory functions.
Memory is spread across different
brain cortexes.
Following the biologically-inspired
road is technologically challenging.

5
Memorize everything
digitaltrends.com
Although information density keeps growing, data storage without
semantics is not an efficient paradigm for memory (e.g. 1 million photos of cats
do not explicitly represent the concept of a cat).

6
Model everything
Graphs and equation models add semantics to the data to improve knowledge,
but either require human supervision (e.g. annotating graphs) or they are
extremely difficult to identify from data (e.g. NARMAX models of non-linear
dynamic systems).

7
Learn everything
Early Hopfield networks provided associative memory, but with a low storage
capacity (about 0.13 - 0.14 patterns per neuron).
Modern LSTM (long short-term memory) networks can learn complex temporal
relations at different time-scales.

8
Power demands power
Novel RNN architectures are more efficient than
LSTM, but they still require long learning time
and high computational power.
These limitations confine RNNs to large cloud
setups (38M parameters for DeepSpeech 2 [1])
or really shallow models (4 layers at most) on
mobile systems [2][3].
Latency-critical applications require smarter
models.
[1] Dario Amodei et al. 2016. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin.
[2] Qingqing Cao et al. 2017. MobiRNN: Efficient Recurrent Neural Network Execution on Mobile GPU.
[3] J. Chauhan et al. 2018. Breathing-Based Authentication on Resource-Constrained IoT Devices using Recurrent Neural Networks.

10
Echo State Network
● W_in and W are Random → Untrained
● W_out is Trained with least-squares regression
● Fewer Weights
● Less Data Required
● Efficient and Fast Training
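The bullets above can be sketched in a few lines of NumPy. This is a minimal illustration, not the EchoBay implementation: the function names, the tanh activation, the uniform weight initialization, and the small ridge term added to the least-squares solve are all assumptions made for the example.

```python
import numpy as np

def make_reservoir(n_in, n_res, density=0.1, spectral_radius=0.9, seed=0):
    """Draw the untrained matrices W_in (dense) and W (sparse, rescaled)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-1.0, 1.0, size=(n_res, n_in))
    w = rng.uniform(-1.0, 1.0, size=(n_res, n_res))
    w *= rng.random((n_res, n_res)) < density           # keep ~density of the weights
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, u):
    """Collect reservoir states for an input sequence u of shape (T, n_in)."""
    x = np.zeros(w.shape[0])
    states = np.zeros((len(u), w.shape[0]))
    for t, u_t in enumerate(u):
        x = np.tanh(w_in @ u_t + w @ x)                  # assumed activation
        states[t] = x
    return states

def train_readout(states, targets, ridge=1e-6):
    """W_out is the only trained part: a (ridge-regularized) least-squares fit."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)

def sine_demo(n_steps=400, washout=50):
    """Toy one-step-ahead prediction of a sine wave; returns W_out and training MSE."""
    u = np.sin(0.2 * np.arange(n_steps))[:, None]
    w_in, w = make_reservoir(n_in=1, n_res=50)
    states = run_reservoir(w_in, w, u)
    x, y = states[washout:-1], u[washout + 1:]
    w_out = train_readout(x, y)
    return w_out, float(np.mean((x @ w_out - y) ** 2))
```

Only `train_readout` involves any fitting, which is why training is cheap compared with backpropagation through time.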

11
ESN embedded
Echo State Networks can exploit reservoirs with different topologies: random,
cyclic, and mixed (small world).
Simpler topologies allow special implementations such as photonic devices
[4], memristors [5], and FPGAs [6].
Regular random ESNs instead require fine-tuning of hyper-parameters to
reach a compromise between performance, memory usage, and
computational time.
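To make the difference between topologies concrete, here is a hedged sketch of how a random and a cyclic reservoir matrix could be built. The function names and the single shared ring weight are assumptions for illustration; the mixed (small world) case is not shown.

```python
import numpy as np

def random_reservoir(n_res, density, seed=0):
    """Sparse random topology: each connection is kept with probability `density`."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1.0, 1.0, size=(n_res, n_res))
    return w * (rng.random((n_res, n_res)) < density)

def cyclic_reservoir(n_res, weight=0.9):
    """Ring topology: unit i receives input only from unit i-1 (wrapping around)."""
    return weight * np.roll(np.eye(n_res), 1, axis=0)
```

The cyclic matrix has exactly `n_res` non-zero weights and its spectral radius equals `weight`, so it needs no eigenvalue computation and very little storage, which is part of what makes such topologies attractive for special hardware.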
[4] Laurent Larger et al. 2017. High-speed photonic reservoir computing using a time-delay-based architecture: Million words per second classification.
[5] Shiping Wen, Rui Hu et al. 2018. Memristor-based echo state network with online least mean square.
[6] Miquel L. Alomar et al. 2016. FPGA-based stochastic echo state networks for time-series forecasting.

12
Bayesian Optimization
Bayesian Optimization efficiently searches
in the regions that are expected to improve
the performance (i.e. the fitness function).
How does it apply to embedded systems?
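As a rough illustration of the search strategy, the sketch below runs Bayesian optimization over a single hyper-parameter using a Gaussian-process surrogate and the expected-improvement criterion. Everything here is an assumption for the example (RBF kernel, fixed length scale, candidate grid, function names); it is not the optimizer used by EchoBay, and `fitness` stands in for a validation error to be minimized.

```python
import math
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_s = rbf(x_obs, x_new)
    mean = k_s.T @ np.linalg.solve(k, y_obs)
    var = 1.0 - np.sum(k_s * np.linalg.solve(k, k_s), axis=0)
    return mean, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mean, std, best):
    """Expected improvement over the best (lowest) fitness seen so far."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def bayes_opt(fitness, lo, hi, n_init=3, n_iter=12, seed=0):
    """Minimize `fitness` on [lo, hi]; returns the best point and its value."""
    rng = np.random.default_rng(seed)
    x_obs = rng.uniform(lo, hi, n_init)
    y_obs = np.array([fitness(x) for x in x_obs])
    grid = np.linspace(lo, hi, 200)
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        x_next = grid[np.argmax(expected_improvement(mean, std, y_obs.min()))]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, fitness(x_next))
    best = int(np.argmin(y_obs))
    return float(x_obs[best]), float(y_obs[best])
```

Each iteration evaluates the expensive fitness only once, at the point the surrogate considers most promising, which is what makes the approach attractive when every evaluation means training and testing a network.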

13
Memory Constraints
Since ESNs are extremely sparse, memory occupation depends mostly on
the number of active units.
On embedded devices we can optimize the network both by setting hard constraints
on density and reservoir size, and by introducing a penalty factor on Runits.
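One possible way to combine a hard memory constraint with a soft penalty is sketched below. The memory estimate (one 4-byte value per non-zero reservoir weight, index overhead ignored), the linear penalty, and all names are assumptions for illustration; the exact penalty used in EchoBay is not given on the slide.

```python
def memory_bytes(n_res, density, bytes_per_weight=4):
    """Rough footprint of a sparse reservoir: one value per non-zero weight."""
    return round(density * n_res * n_res) * bytes_per_weight

def penalized_fitness(error, n_res, density, budget_bytes, penalty=0.5):
    """Prediction error plus a penalty that grows with the fraction of memory used."""
    used = memory_bytes(n_res, density) / budget_bytes
    if used > 1.0:
        return float("inf")        # hard constraint: configuration does not fit
    return error + penalty * used  # soft constraint: prefer smaller networks
```

With this form, two configurations with the same prediction error are ranked by footprint, so the optimizer is nudged toward the smaller one.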

14
Time Constraints
Since ESNs are extremely sparse, computational time also depends mostly on
the number of active units.
A target-dependent benchmark map gives us a contour of reservoir size and
density that respects a given time constraint.
Other options include: precision reduction, quantization, and multithreading.
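A benchmark map of this kind could be built roughly as follows. This sketch times a sparse state update on the host machine as a stand-in for the embedded target, using a NumPy-only coordinate-format product; the function names, the grid, and the timing method are assumptions, and a real map would be measured on the device itself.

```python
import time
import numpy as np

def sparse_step_time(n_res, density, repeats=50, seed=0):
    """Average time of one reservoir update whose cost scales with the non-zero weights."""
    rng = np.random.default_rng(seed)
    nnz = max(1, round(density * n_res * n_res))
    rows = rng.integers(0, n_res, nnz)
    cols = rng.integers(0, n_res, nnz)
    vals = rng.uniform(-1.0, 1.0, nnz)
    x = rng.uniform(-1.0, 1.0, n_res)
    start = time.perf_counter()
    for _ in range(repeats):
        # Coordinate-format sparse product W @ x followed by the activation
        x = np.tanh(np.bincount(rows, weights=vals * x[cols], minlength=n_res))
    return (time.perf_counter() - start) / repeats

def feasible_region(sizes, densities, budget_s):
    """All (size, density) pairs whose measured step time fits the time budget."""
    return [(n, d) for n in sizes for d in densities
            if sparse_step_time(n, d) <= budget_s]
```

The boundary of the returned set plays the role of the contour: the optimizer is then restricted to configurations inside it.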

15
Experimental analysis
The network was tested on state-of-the-art non-linear prediction tasks:
● SantaFe Laser
● NARMA10 task
● Lorenz equations
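For reference, the NARMA10 series can be generated with the commonly used recurrence y(t+1) = 0.3 y(t) + 0.05 y(t) Σ_{i=0..9} y(t-i) + 1.5 u(t-9) u(t) + 0.1, with u drawn uniformly from [0, 0.5]. The sketch below assumes this standard formulation; the slides do not state which variant was used.

```python
import numpy as np

def narma10(n_steps, seed=0):
    """Generate an input sequence u and the corresponding NARMA10 output y."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 0.5, n_steps)
    y = np.zeros(n_steps)
    for t in range(9, n_steps - 1):
        y[t + 1] = (0.3 * y[t]
                    + 0.05 * y[t] * np.sum(y[t - 9:t + 1])  # last 10 outputs
                    + 1.5 * u[t - 9] * u[t]
                    + 0.1)
    return u, y
```

The task is hard for small reservoirs because each output depends on inputs and outputs up to ten steps back, so the network must retain that much memory.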

16
Results: memory
Hard-constraint optimization can be handled by choosing high Nr, high 𝜌, or
balanced solutions, following the benchmark contours.
1-step Laser prediction task on ESP32 target (240MHz).
Other hyper-parameters: 𝜔in = 𝛼 = 1, λ = 0

17
Results: memory
If the problem becomes too complex for small random topologies, we can couple
the constraint with memory penalization.
5-step NARMA10 prediction task on ARM target.
Other hyper-parameters: 𝜔in = 0.4, 𝛼 = 1, λ = 0
ESP32 reached only 45% accuracy

18
Results: time
Proper boundaries on Bayesian optimization maintain good, near-optimal
performance under decreasing time constraints.
5-step Lorenz prediction task on ARM target.

19
Results: time
Proper boundaries on Bayesian optimization maintain good, near-optimal
performance under decreasing time constraints.

20
Conclusions
Bayesian Optimization and Echo State Networks provide
competitive performance on temporal learning tasks.
Target dependent constraints allow performance tuning and
smarter optimization.
The EchoBay library simplifies the design and testing process
without writing a single line of code.
QUESTIONS?
Luca Cerina {luca.cerina@polimi.it}
Giuseppe Franco {g.franco4@studenti.unipi.it}
Marco D. Santambrogio {marco.santambrogio@polimi.it}
