Deep learning on smartphones, smartwatches, and IoT devices is possible, but often slow and power hungry. At the University of Ljubljana we believe this is partly due to unrealistic demands for high computational accuracy. We therefore develop techniques for imprecise, yet "good enough" deep learning that runs faster and consumes less energy than the standard approach. In this presentation, aimed primarily at mobile computing practitioners, we show how, using our tools, a deep learning model can be dynamically approximated to run on a smartphone with 15% less energy and no loss of accuracy.
[DSC Adria 23] Veljko Pejovic Lightweight Deep Learning on Edge Devices.pptx
1. Lightweight Deep Learning
on Edge Devices
Veljko Pejović (veljko.pejovic@fri.uni-lj.si)
Faculty of Computer and Information Science
University of Ljubljana, Slovenia
Computer Science Department,
Lancaster University, UK
2. AI Should Live on the Edge
Privacy and availability
“4 in 10 consumers opt not to use the [AI-powered
voice assistant] services because they are worried
about their data”
The Voice Consumer Index (VCI)
Vixen Labs, 2021
“AI requires a high-bandwidth, low-latency network.
It is important to ensure the service wrap and
technology stack are consistent for all regions”
What are the infrastructure requirements for artificial intelligence?
Terry Storrar, Leaseweb, 2021
3. AI Struggles on the Edge
Latency, memory, energy
• Limited resources vs
increasing model requirements
Canziani, A., Paszke, A., & Culurciello, E. (2016). An analysis
of deep neural network models for practical applications.
arXiv preprint arXiv:1605.07678.
• Heterogeneous devices and
latency/energy burden
Wang, H., Kim, B., Xie, J., & Han, Z.
How is energy consumed in smartphone deep learning apps?
Executing locally vs. remotely. In IEEE GLOBECOM 2019
4. Next Generation Hardware Won’t Help
Mobiles will lag
• Breakdown of Dennard scaling
• Packing more transistors in the
same area will dissipate more power
• Multicore needs space
• More energy for computation and cooling
[Hennessy & Patterson, Turing Award Lecture 2019]
6. Opportunities for Approximate Mobile Computing (AMC)
• Computed result quality exceeds the limits of human perception or attention
• Computed result quality exceeds a user’s interest/need
• Preserving resources is more important than high result quality
• Inputs and/or the computation are inherently noisy
• Inputs are inherently “easy” to process
7. Bringing AMC to the Masses
Programming support for context-aware approximation
• All developers should be able to approximate
• Mobile developers are not data scientists
• Approximation should be dynamic
8. Mobiprox
Supporting approximate deep learning on mobiles
• Implement support for approximate
tensor operations on Android
M. Fabjančič, O. Machidon, H. Sharif, Y. Zhao, S. Misailović, V. Pejović
Mobiprox: Supporting Dynamic Approximate Computing on Mobiles
arXiv:2303.11291 (2023)
9. Mobiprox
Supporting approximate deep learning on mobiles
[Plot: speedup vs. QoS loss for candidate approximation configurations]
• Implement support for approximate tensor operations on Android
• Uncover the Pareto front of configurations (layer-wise approximations) that yield the optimal speedup vs. inference accuracy trade-off
• Devise dynamic adaptation algorithms for navigating the Pareto front
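To make the Pareto-front idea concrete, here is a minimal sketch of how a set of non-dominated configurations could be selected from measured (speedup, QoS loss) pairs. The configuration names, dictionary fields, and numbers are illustrative placeholders, not Mobiprox's actual data structures or measurements.

```python
def pareto_front(configs):
    """Keep only configurations not dominated by any other.

    Config a dominates b when a is at least as fast AND loses at most
    as much QoS, and is strictly better in at least one dimension.
    """
    front = []
    for a in configs:
        dominated = any(
            b["speedup"] >= a["speedup"] and b["qos_loss"] <= a["qos_loss"]
            and (b["speedup"] > a["speedup"] or b["qos_loss"] < a["qos_loss"])
            for b in configs
        )
        if not dominated:
            front.append(a)
    return sorted(front, key=lambda c: c["speedup"])


# Hypothetical measurements for four layer-wise approximation configs:
configs = [
    {"name": "baseline", "speedup": 1.0, "qos_loss": 0.0},
    {"name": "perf-2",   "speedup": 1.4, "qos_loss": 0.5},
    {"name": "perf-3",   "speedup": 1.6, "qos_loss": 2.1},
    {"name": "samp-2",   "speedup": 1.3, "qos_loss": 1.9},  # dominated by perf-2
]
front = pareto_front(configs)  # baseline, perf-2, perf-3
```

The runtime adaptation logic then only ever has to choose among the surviving front configurations, since every other configuration is strictly worse in both dimensions.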
10. Mobiprox
Supporting approximate deep learning on mobiles
• Approximations:
• Filter sampling, perforated convolutions, quantization
• Implementation: extended CLBlast library
• Tuning:
• On a GPU-enabled cluster * **
• On an Android device
[Figures: row perforation and column perforation; filter sampling]
* Sharif et al., ApproxTuner: A Compiler and Runtime System for Adaptive Approximations. PPoPP, 2021
** Sharif et al. ApproxHPVM: a portable compiler IR for accuracy-aware optimizations. OOPSLA, 2019
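As an illustration of one of the listed approximations, the following is a hedged, pure-Python sketch of row perforation: the convolution output is computed only at every `stride`-th row, and skipped rows are filled from the nearest computed row. This is a toy model of the semantics; the actual Mobiprox implementation lives inside extended CLBlast GPU kernels.

```python
def conv_row(image, kernel, i):
    """One output row of a 'valid' 2D correlation at row i."""
    kh, kw = len(kernel), len(kernel[0])
    ow = len(image[0]) - kw + 1
    return [
        sum(image[i + u][j + v] * kernel[u][v]
            for u in range(kh) for v in range(kw))
        for j in range(ow)
    ]

def conv2d_row_perforated(image, kernel, stride=2):
    """Compute rows 0, stride, 2*stride, ...; copy the rest."""
    kh = len(kernel)
    oh = len(image) - kh + 1
    out = [None] * oh
    for i in range(0, oh, stride):
        out[i] = conv_row(image, kernel, i)       # computed rows
    for i in range(oh):
        if out[i] is None:                        # skipped rows: reuse
            out[i] = list(out[i - (i % stride)])  # nearest computed row
    return out

# 4x4 all-ones image, 2x2 all-ones kernel: every computed output is 4.0,
# and the copied rows happen to match the exact result here.
image = [[1.0] * 4 for _ in range(4)]
kernel = [[1.0] * 2 for _ in range(2)]
result = conv2d_row_perforated(image, kernel, stride=2)
```

With `stride=2`, roughly half of the multiply-accumulate work is skipped, which is the source of the speedup; on real feature maps the copied rows introduce the (usually small) QoS loss that tuning must keep in check.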
12. Dynamic Approximation Adaptation
Context-aware, need-driven, business-oriented adaptation
• Arbitrary adaptation strategies can be implemented
• “More accurate human activity recognition model when a user is exercising”
• “Higher approximation level when battery falls under 15%”
• Our pick: “Minimize energy usage without sacrificing the inference accuracy”
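The chosen strategy ("minimize energy without sacrificing accuracy") can be sketched as confidence-based fallback: try the most aggressively approximated Pareto configuration first, and only re-run with a less approximate one when the network's confidence in its prediction is low. The function names, config ordering, and threshold below are illustrative assumptions, not the actual Mobiprox API.

```python
def classify_adaptive(sample, configs, run_inference, threshold=0.8):
    """Confidence-based adaptation sketch.

    configs: Pareto configurations ordered most- to least-approximate.
    run_inference(sample, config) -> list of class probabilities.
    """
    for config in configs:
        probs = run_inference(sample, config)
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:   # confident enough: stop early,
            return best, config        # saving the costlier runs
    return best, config                # least-approximate fallback


# Toy usage with a stand-in inference function:
def fake_infer(sample, config):
    return {"aggressive": [0.5, 0.5], "exact": [0.1, 0.9]}[config]

label, used = classify_adaptive(None, ["aggressive", "exact"], fake_infer)
```

Because most inputs are "easy" and handled confidently by the cheap configuration, the expensive fallback runs rarely, which is where the energy savings come from.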
14. Evaluation
Human activity recognition
• 21 volunteers, on-body UDOO boards,
six prescribed activities
• Slight accuracy drop:
from 65% to 63% (−2 pp)
• Significant energy savings:
from 245 mAh to 209 mAh (−15%)
• Certain classes are more robust
to approximation than others
Average accuracy vs. average energy consumption across all users: non-approximated network vs. confidence-based adaptation
15. Evaluation
Spoken keyword recognition
• HONK model built on the Google Speech Commands (SC) dataset
• Mix 160 unheard utterances from Google SC with noise levels from realistic environments
• Confidence-based adaptation
• 15% less energy, 0% accuracy loss
16. Acknowledgements
The Team Resources
• Octavian Machidon
• Alina Machidon
• Davor Sluga
• Matevž Fabjančič
• Timotej Knez
• Janez Božič
• Tine Fajfar
• Jani Asprov
“Bringing Resource Efficiency to Smartphones with Approximate
Computing”
(ARRS project No.: N2-0136)
“Context-Aware On-Device Approximate Computing”
(ARRS project No.: J2-3047)
“Computer Structures and Systems”
(ARRS core funding No. P2-0098.
M. Fabjančič et al. Mobiprox: Supporting Dynamic Approximate Computing on Mobiles. arXiv:2303.11291, 2023.
A. Machidon and V. Pejović. Enabling Resource-Efficient Edge Intelligence with Compressive Sensing-Based Deep Learning. ACM Computing Frontiers, May 2022.
A. Machidon and V. Pejović. Deep Learning Techniques for Compressive Sensing-Based Reconstruction and Inference: A Ubiquitous Systems Perspective. Artificial Intelligence Review, 2022.
T. Knez, O. Machidon, and V. Pejović. Self-Adaptive Approximate Mobile Deep Learning. Electronics, 2021.
V. Pejović. Towards Approximate Mobile Computing. ACM GetMobile Magazine, Vol. 22(5), December 2018.
17. Thank you!
Veljko Pejović (veljko.pejovic@fri.uni-lj.si)
University of Ljubljana, Slovenia
Lancaster University, UK
Code available at https://gitlab.fri.uni-lj.si/lrk