What I Know about
TensorFlow Lite
Koan-Sin Tan

freedom@computer.org

Hsinchu Coding Serfs Meeting

Dec 7th, 2017
• We heard about Android NN and TensorFlow Lite back at Google I/O 2017

• My COSCUP 2017 slide deck “TensorFlow on Android”

• https://www.slideshare.net/kstan2/tensorflow-on-android

• People knew a bit about Android NN API before it was
announced and released

• No information about TensorFlow Lite was available, at least to me,
before it was released in Nov 2017
tf-lite and android NN in
Google I/O
• New TensorFlow runtime
• Optimized for mobile and
embedded apps

• Runs TensorFlow models on
device

• Leverage Android NN API

• Soon to be open sourced
from Google I/O 2017 video
Actual Android NN API
• Announced with Android 8.1 Preview 1

• Available to developers in the NDK

• yes, NDK

• The Android Neural Networks API (NNAPI)
is an Android C API designed for running
computationally intensive operations for
machine learning on mobile devices

• NNAPI is designed to provide a base layer
of functionality for higher-level machine
learning frameworks (such as TensorFlow
Lite, Caffe2, or others) that build and train
neural networks

• The API is available on all devices running
Android 8.1 (API level 27) or higher.
https://developer.android.com/ndk/images/nnapi/nnapi_architecture.png
Android NN on Pixel 2
• Only the CPU fallback is available

• Actually, Android NN API related code is already visible in AOSP
as of the Oreo MR1 (8.1) release

• user level code, see https://android.googlesource.com/platform/frameworks/ml/+/oreo-mr1-release

• HAL, see https://android.googlesource.com/platform/hardware/interfaces/+/oreo-mr1-release/neuralnetworks/
TensorFlow Lite
• TensorFlow Lite is TensorFlow’s lightweight solution for
mobile and embedded devices

• It enables on-device machine learning inference with low
latency and a small binary size

• Low latency techniques: optimizing the kernels for mobile
apps, pre-fused activations, and quantized kernels that
allow smaller and faster (fixed-point math) models

• TensorFlow Lite also supports hardware acceleration with
the Android Neural Networks API
https://www.tensorflow.org/mobile/tflite/
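To make the "quantized kernels ... (fixed-point math)" bullet concrete, here is a minimal sketch of 8-bit affine quantization. It assumes the usual mapping real = scale * (q - zero_point) used by gemmlowp-style quantized kernels. `QuantParams`, `ChooseQuantParams`, `Quantize`, and `Dequantize` are hypothetical names for illustration, not part of the TensorFlow Lite API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper type: parameters of an 8-bit affine quantization,
// where real_value = scale * (quantized_value - zero_point).
struct QuantParams {
  float scale;
  int32_t zero_point;
};

// Pick scale and zero_point so that the float range [min, max] maps to [0, 255].
QuantParams ChooseQuantParams(float min, float max) {
  min = std::min(min, 0.0f);  // widen the range so that 0.0 is representable
  max = std::max(max, 0.0f);
  QuantParams p;
  p.scale = (max - min) / 255.0f;
  if (p.scale == 0.0f) p.scale = 1.0f;  // degenerate range [0, 0]
  p.zero_point = static_cast<int32_t>(std::lround(-min / p.scale));
  return p;
}

// Float -> uint8: round to the nearest level and clamp to [0, 255].
uint8_t Quantize(float x, const QuantParams& p) {
  const int32_t q =
      p.zero_point + static_cast<int32_t>(std::lround(x / p.scale));
  return static_cast<uint8_t>(std::min<int32_t>(255, std::max<int32_t>(0, q)));
}

// uint8 -> float: the inverse mapping, exact up to the rounding done above.
float Dequantize(uint8_t q, const QuantParams& p) {
  return p.scale * static_cast<float>(static_cast<int32_t>(q) - p.zero_point);
}
```

A value inside the range comes back from `Dequantize(Quantize(x, p), p)` with an error of at most about half a quantization step, which is the accuracy cost traded for a 4x smaller model and integer arithmetic.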
What does TensorFlow Lite
contain?
• a set of core operators, both quantized and float, which have been tuned for mobile platforms

• pre-fused activations and biases to further enhance performance and quantized accuracy

• custom operations in models are also supported

• a new model file format, based on FlatBuffers

• the primary difference is that FlatBuffers does not need a parsing/unpacking step to a secondary
representation before you can access data

• the code footprint of FlatBuffers is an order of magnitude smaller than protocol buffers

• a new mobile-optimized interpreter

• key goals: keeping apps lean and fast

• a static graph ordering and a custom (less-dynamic) memory allocator to ensure minimal load,
initialization, and execution latency

• an interface to Android NN API if available
https://www.tensorflow.org/mobile/tflite/
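The "no parsing/unpacking step" point can be illustrated with a toy buffer whose fields are read in place. This is only a sketch of the idea: the layout below is made up for illustration and is NOT the real FlatBuffers wire format, and `BuildToyBuffer` and `ToyTensorView` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy layout (an assumption for this sketch, not FlatBuffers itself):
//   bytes [0, 4)          uint32 element count n
//   bytes [4, 4 + 4 * n)  n float32 values
std::vector<uint8_t> BuildToyBuffer(const std::vector<float>& values) {
  const uint32_t n = static_cast<uint32_t>(values.size());
  std::vector<uint8_t> buf(sizeof(n) + n * sizeof(float));
  std::memcpy(buf.data(), &n, sizeof(n));
  if (n > 0) {
    std::memcpy(buf.data() + sizeof(n), values.data(), n * sizeof(float));
  }
  return buf;
}

// A view that reads fields straight out of the serialized bytes on demand.
// Nothing is copied into a secondary in-memory representation up front.
class ToyTensorView {
 public:
  explicit ToyTensorView(const uint8_t* data) : data_(data) {}

  uint32_t size() const {
    uint32_t n;
    std::memcpy(&n, data_, sizeof(n));
    return n;
  }

  float at(uint32_t i) const {
    float v;
    std::memcpy(&v, data_ + sizeof(uint32_t) + i * sizeof(float), sizeof(v));
    return v;
  }

 private:
  const uint8_t* data_;  // not owned; the buffer must outlive the view
};
```

With a format like this, a model file can be memory-mapped and used immediately, which is one plausible reason load latency stays low.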
why a new mobile-specific
library?
• Innovation at the silicon layer is enabling new possibilities for hardware
acceleration, and frameworks such as the Android Neural Networks API
make it easy to leverage these

• Recent advances in real-time computer-vision and spoken language
understanding have led to mobile-optimized benchmark models being open
sourced (e.g. MobileNets, SqueezeNet)

• Widely-available smart appliances create new possibilities for on-device
intelligence

• Interest in stronger user data privacy paradigms where user data does not
need to leave the mobile device

• Ability to serve ‘offline’ use cases, where the device does not need to be
connected to a network
https://www.tensorflow.org/mobile/tflite/
• A set of core operators, both quantized and float, many of which have been tuned for mobile platforms. These
can be used to create and run custom models. Developers can also write their own custom operators and
use them in models

• A new FlatBuffers-based model file format

• On-device interpreter with kernels optimized for faster execution on mobile

• TensorFlow converter to convert TensorFlow-trained models to the TensorFlow Lite format.

• Smaller in size: TensorFlow Lite is smaller than 300KB when all supported operators are linked and less than
200KB when using only the operators needed for supporting InceptionV3 and Mobilenet

• Pre-tested models

• Inception V3, MobileNet, On Device Smart Reply

• Quantized versions of the MobileNet model, which run faster than the non-quantized (float) version on CPU.

• New Android demo app to illustrate the use of TensorFlow Lite with a quantized MobileNet model for object
classification

• Java and C++ API support
https://www.tensorflow.org/mobile/tflite/
• Java API: A convenience
wrapper around the C++ API
on Android

• C++ API: Loads the
TensorFlow Lite Model File
and invokes the Interpreter.
The same library is available
on both Android and iOS
https://www.tensorflow.org/mobile/tflite/
• Let $TF_ROOT be root of tensorflow

• source of tf-lite: ${TF_ROOT}/tensorflow/contrib/lite/

• https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/README.md

• examples

• two for Android, two for iOS

• APIs: ${TF_ROOT}/tensorflow/contrib/lite/g3doc/apis.md, https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/apis.md

• no benchmark_model: well, there is one, https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/tools/benchmark_model.cc

• it’s incomplete

• no command line label_image (https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image)
• model: .tflite model

• resolver: if no custom ops, the builtin
op resolver is enough

• interpreter: we need it to compute
the graph

• interpreter->AllocateTensors():
allocate stuff for you, e.g., input
tensor(s)

• fill the input

• interpreter->Invoke(): run the graph

• process the output
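#include <memory>
#include "tensorflow/contrib/lite/interpreter.h"
#include "tensorflow/contrib/lite/kernels/register.h"
#include "tensorflow/contrib/lite/model.h"

// Load the .tflite file; BuildFromFile() returns nullptr on failure.
std::unique_ptr<tflite::FlatBufferModel> model =
    tflite::FlatBufferModel::BuildFromFile(path_to_model);
tflite::ops::builtin::BuiltinOpResolver resolver;
std::unique_ptr<tflite::Interpreter> interpreter;
// Build an interpreter for the model using the builtin kernels.
tflite::InterpreterBuilder(*model, resolver)(&interpreter);
// Resize input tensors, if desired.
interpreter->AllocateTensors();
float* input = interpreter->typed_input_tensor<float>(0);
// Fill `input`.
interpreter->Invoke();
float* output = interpreter->typed_output_tensor<float>(0);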
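For the "process the output" step, a classifier such as MobileNet produces one score per class, so the usual post-processing is to pick the best few. A minimal sketch, assuming a float output tensor of known length; `TopK` is a hypothetical helper, not a TensorFlow Lite function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Return the k highest-scoring (class index, score) pairs, best first.
// `scores` would be the pointer returned for the output tensor and `n` the
// number of classes; both are supplied by the caller in this sketch.
std::vector<std::pair<int, float>> TopK(const float* scores, std::size_t n,
                                        std::size_t k) {
  std::vector<std::pair<int, float>> indexed;
  indexed.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    indexed.emplace_back(static_cast<int>(i), scores[i]);
  }
  k = std::min(k, n);
  // Only the first k entries need to be sorted, so partial_sort is enough.
  std::partial_sort(
      indexed.begin(), indexed.begin() + k, indexed.end(),
      [](const std::pair<int, float>& a, const std::pair<int, float>& b) {
        return a.second > b.second;
      });
  indexed.resize(k);
  return indexed;
}
```

The returned indices can then be looked up in the model's label file to print human-readable class names.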
beyond basic stuff
• https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/interpreter.h

• const char* GetInputName(int index): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/interpreter.h#L157

• const char* GetOutputName(int index): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/interpreter.h#L166

• int tensors_size(): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/interpreter.h#L171

• TfLiteTensor* tensor(int tensor_index)

• int nodes_size(): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/interpreter.h#L174

• const std::pair<TfLiteNode, TfLiteRegistration>* node_and_registration(int node_index)

• Yes, we should be able to enumerate/traverse tensors and nodes
beyond basic stuff
• void UseNNAPI(bool enable)

• void SetNumThreads(int num_threads)

• TfLiteTensor: https://github.com/freedomtan/tensorflow/blob/label_image_tflite_pr/tensorflow/contrib/lite/context.h#L163

• my label_image for tflite

• https://github.com/freedomtan/tensorflow/blob/label_image_tflite_pr/tensorflow/contrib/lite/examples/label_image/label_image.cc#L103
• logs of running label_image

• https://drive.google.com/file/d/11LAI_b1fVOM2GxOT_gOpOMpASaqe5m_U/view?usp=sharing

• builtin state dump function

• void PrintInterpreterState(Interpreter* interpreter): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/optional_debug_tools.h#L25

• https://github.com/freedomtan/tensorflow/blob/label_image_tflite_pr/tensorflow/contrib/lite/examples/label_image/label_image.cc#L147

• TF operations --> TF Lite operations is not trivial

• https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/tf_ops_compatibility.md

• https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/nnapi_delegate.cc
speed of quantized one
• It seems to be much better than the naive quantization we saw before

• On Nexus 9 (MobileNet 1.0/224)

• Quantized

• ./label_image -t 2: ~ 160 ms

• ./label_image -t 2 -c 100: ~ 60 ms

• Floating point

• ./label_image -t 2 -f 1 -m ./mobilenet_v1_1.0_224.tflite: ~ 300 ms

• ./label_image -t 2 -c 100 -f 1 -m ./mobilenet_v1_1.0_224.tflite: ~ 82 ms

• TFLiteCameraDemo: 130 - 180 ms

• Pixel 2

• TFLiteCameraDemo: ~ 100 ms

• didn’t see significant difference, w/ or w/o Android NN runtime
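The gap between the single-run and `-c 100` numbers above is consistent with one-time warm-up costs being amortized over many runs. A minimal sketch of that kind of measurement, assuming `-c` is a loop count and the reported figure is the average per run; `AverageMillis` is a hypothetical helper, not code taken from label_image.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Run `fn` `loop_count` times and return the average wall-clock time per
// run in milliseconds. In a real benchmark `fn` would wrap
// interpreter->Invoke(); here it is any callable supplied by the caller.
double AverageMillis(const std::function<void()>& fn, int loop_count) {
  if (loop_count <= 0) return 0.0;
  using Clock = std::chrono::steady_clock;  // monotonic, safe for intervals
  const Clock::time_point start = Clock::now();
  for (int i = 0; i < loop_count; ++i) {
    fn();
  }
  const std::chrono::duration<double, std::milli> elapsed =
      Clock::now() - start;
  return elapsed.count() / loop_count;
}
```

Comparing `AverageMillis(run, 1)` with `AverageMillis(run, 100)` on the same device gives a rough idea of how much of the first-run latency is warm-up rather than steady-state compute.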
Custom Operators
• https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/custom_operators.md

• OpInit(), OpFree(), OpPrepare(), and OpInvoke() in interpreter.cc
typedef struct {
  void* (*init)(TfLiteContext* context, const char* buffer, size_t length);
  void (*free)(TfLiteContext* context, void* buffer);
  TfLiteStatus (*prepare)(TfLiteContext* context, TfLiteNode* node);
  TfLiteStatus (*invoke)(TfLiteContext* context, TfLiteNode* node);
} TfLiteRegistration;
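To show how the four callbacks fit together, here is a self-contained sketch of a trivial custom op that counts its invocations. The `TfLiteStatus`, `TfLiteContext`, and `TfLiteNode` declarations are simplified stand-ins so the sketch compiles on its own; in a real build they come from tensorflow/contrib/lite/context.h and have more fields. The `Counter*` functions and `Register_COUNTER` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-ins (assumptions) for the types in context.h.
typedef enum { kTfLiteOk = 0, kTfLiteError = 1 } TfLiteStatus;
struct TfLiteContext {};
struct TfLiteNode {
  void* user_data;  // holds whatever init() returned
};

typedef struct {
  void* (*init)(TfLiteContext* context, const char* buffer, size_t length);
  void (*free)(TfLiteContext* context, void* buffer);
  TfLiteStatus (*prepare)(TfLiteContext* context, TfLiteNode* node);
  TfLiteStatus (*invoke)(TfLiteContext* context, TfLiteNode* node);
} TfLiteRegistration;

// Hypothetical per-node state owned by the op.
struct CounterState {
  int invocations;
};

// init: allocate per-node state; the serialized op options are unused here.
void* CounterInit(TfLiteContext*, const char*, size_t) {
  return new CounterState{0};
}

// free: release what init allocated.
void CounterFree(TfLiteContext*, void* buffer) {
  delete static_cast<CounterState*>(buffer);
}

// prepare: a real op would validate inputs and resize outputs here.
TfLiteStatus CounterPrepare(TfLiteContext*, TfLiteNode*) { return kTfLiteOk; }

// invoke: the actual computation; this one only bumps a counter.
TfLiteStatus CounterInvoke(TfLiteContext*, TfLiteNode* node) {
  CounterState* state = static_cast<CounterState*>(node->user_data);
  if (state == nullptr) return kTfLiteError;
  ++state->invocations;
  return kTfLiteOk;
}

// Returns the registration a resolver would hand to the interpreter.
TfLiteRegistration* Register_COUNTER() {
  static TfLiteRegistration r = {CounterInit, CounterFree, CounterPrepare,
                                 CounterInvoke};
  return &r;
}
```

The interpreter is expected to call init once per node, prepare whenever tensors are (re)allocated, invoke on every run, and free at teardown, which is the lifecycle the OpInit(), OpFree(), OpPrepare(), and OpInvoke() helpers wrap.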
Fake Quantization
• How hard can it be? How much time is needed?

• Several pre-tested models are available

• https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/models.md

• but only one of them (https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip) is a quantized one

• as we can guess from related docs, retraining is kinda required to
get accuracy back
• BLAS part: eigen (http://eigen.tuxfamily.org/) and
gemmlowp (https://github.com/google/gemmlowp)
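Fake quantization, as the term is generally used, means simulating 8-bit rounding in float during training so the weights can adapt to it, which is why retraining comes up. A minimal sketch of the forward pass only, assuming max > min and 256 levels; `FakeQuant` is a hypothetical helper, and unlike TensorFlow's own fake-quant ops it does not nudge the range so that 0.0 lands exactly on a level.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Simulate 8-bit quantization while staying in float: clamp x to
// [min, max], then snap it to the nearest of 256 evenly spaced levels.
// Assumes max > min (an assumption of this sketch).
float FakeQuant(float x, float min, float max) {
  const float scale = (max - min) / 255.0f;  // width of one level
  const float clamped = std::min(max, std::max(min, x));
  const float level = std::round((clamped - min) / scale);
  return min + level * scale;
}
```

Training against outputs like these exposes the model to the rounding and clamping error it will see at inference time, so the accuracy lost by naive post-training quantization can be recovered.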
The End

UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 

Introduction to TensorFlow Lite

  • 1. What I Know about TensorFlow Lite Koan-Sin Tan freedom@computer.org Hsinchu Coding Serfs Meeting Dec 7th, 2017
  • 2. • We heard about Android NN and TensorFlow Lite back at Google I/O 2017 • My COSCUP 2017 slide deck “TensorFlow on Android” • https://www.slideshare.net/kstan2/tensorflow-on-android • People knew a bit about the Android NN API before it was announced and released • No information about TensorFlow Lite was available, at least to me, before it was released in November 2017
  • 3. tf-lite and android NN in Google I/O • New TensorFlow runtime • Optimized for mobile and embedded apps • Runs TensorFlow models on device • Leverage Android NN API • Soon to be open sourced from Google I/O 2017 video
  • 4. Actual Android NN API • Announced with Android 8.1 Preview 1 • Available to developers in the NDK • yes, NDK • The Android Neural Networks API (NNAPI) is an Android C API designed for running computationally intensive operations for machine learning on mobile devices • NNAPI is designed to provide a base layer of functionality for higher-level machine learning frameworks (such as TensorFlow Lite, Caffe2, or others) that build and train neural networks • The API is available on all devices running Android 8.1 (API level 27) or higher. https://developer.android.com/ndk/images/nnapi/nnapi_architecture.png
  • 5. Android NN on Pixel 2 • Only the CPU fallback is available • Actually, Android NN API related code has been visible in AOSP since the Oreo MR1 (8.1) release • user-level code: see https://android.googlesource.com/platform/frameworks/ml/+/oreo-mr1-release • HAL: see https://android.googlesource.com/platform/hardware/interfaces/+/oreo-mr1-release/neuralnetworks/
  • 6. TensorFlow Lite • TensorFlow Lite is TensorFlow’s lightweight solution for mobile and embedded devices • It enables on-device machine learning inference with low latency and a small binary size • Low latency techniques: optimizing the kernels for mobile apps, pre-fused activations, and quantized kernels that allow smaller and faster (fixed-point math) models • TensorFlow Lite also supports hardware acceleration with the Android Neural Networks API https://www.tensorflow.org/mobile/tflite/
  • 7. What does TensorFlow Lite contain? • a set of core operators, both quantized and float, which have been tuned for mobile platforms • pre-fused activations and biases to further enhance performance and quantized accuracy • using custom operations in models is also supported • a new model file format, based on FlatBuffers • the primary difference is that FlatBuffers does not need a parsing/unpacking step to a secondary representation before you can access data • the code footprint of FlatBuffers is an order of magnitude smaller than protocol buffers • a new mobile-optimized interpreter • key goals: keeping apps lean and fast • a static graph ordering and a custom (less-dynamic) memory allocator to ensure minimal load, initialization, and execution latency • an interface to the Android NN API, if available https://www.tensorflow.org/mobile/tflite/
  • 8. why a new mobile-specific library? • Innovation at the silicon layer is enabling new possibilities for hardware acceleration, and frameworks such as the Android Neural Networks API make it easy to leverage these • Recent advances in real-time computer-vision and spoken language understanding have led to mobile-optimized benchmark models being open sourced (e.g. MobileNets, SqueezeNet) • Widely-available smart appliances create new possibilities for on-device intelligence • Interest in stronger user data privacy paradigms where user data does not need to leave the mobile device • Ability to serve ‘offline’ use cases, where the device does not need to be connected to a network https://www.tensorflow.org/mobile/tflite/
  • 9. • A set of core operators, both quantized and float, many of which have been tuned for mobile platforms. These can be used to create and run custom models. Developers can also write their own custom operators and use them in models • A new FlatBuffers-based model file format • On-device interpreter with kernels optimized for faster execution on mobile • TensorFlow converter to convert TensorFlow-trained models to the TensorFlow Lite format. • Smaller in size: TensorFlow Lite is smaller than 300KB when all supported operators are linked and less than 200KB when using only the operators needed for supporting InceptionV3 and Mobilenet • Pre-tested models • Inception V3, MobileNet, On Device Smart Reply • Quantized versions of the MobileNet model, which runs faster than the non-quantized (float) version on CPU. • New Android demo app to illustrate the use of TensorFlow Lite with a quantized MobileNet model for object classification • Java and C++ API support https://www.tensorflow.org/mobile/tflite/
  • 10. • Java API: A convenience wrapper around the C++ API on Android • C++ API: Loads the TensorFlow Lite Model File and invokes the Interpreter. The same library is available on both Android and iOS https://www.tensorflow.org/mobile/tflite/
  • 11. • Let $TF_ROOT be the root of tensorflow • source of tf-lite: ${TF_ROOT}/tensorflow/contrib/lite/ • https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/README.md • examples • two for Android, two for iOS • APIs: ${TF_ROOT}/tensorflow/contrib/lite/g3doc/apis.md, https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/apis.md • no benchmark_model: well, there is one, https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/tools/benchmark_model.cc • it’s incomplete • no command line label_image (https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/label_image)
  • 12. • model: .tflite model • resolver: if there are no custom ops, the builtin op resolver is enough • interpreter: we need it to compute the graph • interpreter->AllocateTensors(): allocate stuff for you, e.g., input tensor(s) • fill the input • interpreter->Invoke(): run the graph • process the output std::unique_ptr<tflite::FlatBufferModel> model = tflite::FlatBufferModel::BuildFromFile(path_to_model); tflite::ops::builtin::BuiltinOpResolver resolver; std::unique_ptr<tflite::Interpreter> interpreter; tflite::InterpreterBuilder(*model, resolver)(&interpreter); /* Resize input tensors, if desired. */ interpreter->AllocateTensors(); float* input = interpreter->typed_input_tensor<float>(0); /* Fill `input`. */ interpreter->Invoke(); float* output = interpreter->typed_output_tensor<float>(0);
  • 13. beyond basic stuff • https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/interpreter.h • const char* GetInputName(int index): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/interpreter.h#L157 • const char* GetOutputName(int index): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/interpreter.h#L166 • int tensors_size(): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/interpreter.h#L171 • TfLiteTensor* tensor(int tensor_index) • int nodes_size(): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/interpreter.h#L174 • const std::pair<TfLiteNode, TfLiteRegistration>* node_and_registration(int node_index) • Yes, we should be able to enumerate/traverse tensors and nodes
  • 14. beyond basic stuff • void UseNNAPI(bool enable) • void SetNumThreads(int num_threads) • TfLiteTensor: https://github.com/freedomtan/tensorflow/blob/label_image_tflite_pr/tensorflow/contrib/lite/context.h#L163 • my label_image for tflite • https://github.com/freedomtan/tensorflow/blob/label_image_tflite_pr/tensorflow/contrib/lite/examples/label_image/label_image.cc#L103
  • 15. • logs of running label_image • https://drive.google.com/file/d/11LAI_b1fVOM2GxOT_gOpOMpASaqe5m_U/view?usp=sharing • builtin state dump function • void PrintInterpreterState(Interpreter* interpreter): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/optional_debug_tools.h#L25 • https://github.com/freedomtan/tensorflow/blob/label_image_tflite_pr/tensorflow/contrib/lite/examples/label_image/label_image.cc#L147 • Mapping TF operations to TF Lite operations is not trivial • https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/tf_ops_compatibility.md • https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/nnapi_delegate.cc
  • 16. speed of quantized one • It seems it's much better than naive quantization as we saw before • On Nexus 9 (MobileNet 1.0/224) • Quantized • ./label_image -t 2: ~ 160 ms • ./label_image -t 2 -c 100: ~ 60 ms • Floating point • ./label_image -t 2 -f 1 -m ./mobilenet_v1_1.0_224.tflite: ~ 300 ms • ./label_image -t 2 -c 100 -f 1 -m ./mobilenet_v1_1.0_224.tflite: ~ 82 ms • TFLiteCameraDemo: 130 - 180 ms • Pixel 2 • TFLiteCameraDemo: ~ 100 ms • didn’t see significant difference, w/ or w/o Android NN runtime
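To put the Nexus 9 numbers on slide 16 side by side, here is a small sketch that turns them into float-over-quantized ratios. The latencies are the approximate values from the slide, taken at face value; the `Measurement` struct and `Speedup` helper are names made up for this illustration and are not part of label_image or TensorFlow Lite.

```cpp
#include <cassert>

// Latencies in milliseconds, copied from the approximate figures on slide 16.
struct Measurement {
  double float_ms;      // floating-point MobileNet 1.0/224
  double quantized_ms;  // quantized MobileNet 1.0/224
};

// Runs with "./label_image -t 2" (plus "-f 1 -m ..." for the float model).
constexpr Measurement kDefaultRun = {300.0, 160.0};
// Runs with "./label_image -t 2 -c 100" (plus "-f 1 -m ..." for the float model).
constexpr Measurement kCount100Run = {82.0, 60.0};

// Ratio of float latency to quantized latency; values above 1 mean the
// quantized model was faster.
double Speedup(const Measurement& m) {
  assert(m.float_ms > 0.0 && m.quantized_ms > 0.0);
  return m.float_ms / m.quantized_ms;
}
```

With these inputs the quantized model comes out roughly 1.9x faster in the first pair of runs and roughly 1.4x faster in the `-c 100` pair.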
  • 17. Custom Operators • https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/custom_operators.md • OpInit(), OpFree(), OpPrepare(), and OpInvoke() in interpreter.cc typedef struct { void* (*init)(TfLiteContext* context, const char* buffer, size_t length); void (*free)(TfLiteContext* context, void* buffer); TfLiteStatus (*prepare)(TfLiteContext* context, TfLiteNode* node); TfLiteStatus (*invoke)(TfLiteContext* context, TfLiteNode* node); } TfLiteRegistration;
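To make the four callbacks on slide 17 more concrete, here is a self-contained sketch of a toy "scale" op and a driver that calls init, prepare, invoke, and free in that order. Everything in the `sketch` namespace is a simplified stand-in written for this illustration: `Context`, `Node`, `Status`, and `Registration` only mirror the shape of `TfLiteContext`, `TfLiteNode`, `TfLiteStatus`, and `TfLiteRegistration`; they are not the real definitions from context.h. The single-float options buffer and the `RunOnce` driver are assumptions too, and the real interpreter spreads these calls over model building, `AllocateTensors()`, `Invoke()`, and teardown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace sketch {

// Simplified stand-ins; NOT the real TensorFlow Lite types.
enum Status { kOk = 0, kError = 1 };
struct Context {};
struct Node {
  void* user_data = nullptr;  // per-op state returned by init()
  float input = 0.0f;         // stand-in for an input tensor
  float output = 0.0f;        // stand-in for an output tensor
};

// Same four-callback layout as the struct quoted on the slide.
struct Registration {
  void* (*init)(Context* context, const char* buffer, std::size_t length);
  void (*free)(Context* context, void* buffer);
  Status (*prepare)(Context* context, Node* node);
  Status (*invoke)(Context* context, Node* node);
};

struct ScaleState { float factor; };

// init: parse the op's options (here assumed to be one raw float) into state.
void* ScaleInit(Context*, const char* buffer, std::size_t length) {
  if (buffer == nullptr || length != sizeof(float)) return nullptr;
  auto* state = new ScaleState;
  std::memcpy(&state->factor, buffer, sizeof(float));
  return state;
}

// free: release whatever init allocated (deleting nullptr is a no-op).
void ScaleFree(Context*, void* buffer) {
  delete static_cast<ScaleState*>(buffer);
}

// prepare: validate before running; fail if init could not build state.
Status ScalePrepare(Context*, Node* node) {
  return node->user_data != nullptr ? kOk : kError;
}

// invoke: the actual computation.
Status ScaleInvoke(Context*, Node* node) {
  const auto* state = static_cast<const ScaleState*>(node->user_data);
  node->output = node->input * state->factor;
  return kOk;
}

Registration* RegisterScale() {
  static Registration r = {ScaleInit, ScaleFree, ScalePrepare, ScaleInvoke};
  return &r;
}

// Hypothetical driver: one node, one pass through the whole lifecycle.
Status RunOnce(const Registration& reg, const char* options,
               std::size_t length, float input, float* output) {
  assert(output != nullptr);
  Context context;
  Node node;
  node.input = input;
  node.user_data = reg.init(&context, options, length);
  Status status = reg.prepare(&context, &node);
  if (status == kOk) status = reg.invoke(&context, &node);
  if (status == kOk) *output = node.output;
  reg.free(&context, node.user_data);
  return status;
}

}  // namespace sketch
```

Calling `sketch::RunOnce(*sketch::RegisterScale(), ...)` with a buffer holding the float 2.5 and an input of 4.0 writes 10.0 to the output; passing an empty buffer makes prepare fail, so invoke is never reached.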
  • 18. Fake Quantization • How hard can it be? How much time is needed? • Several pre-tested models are available • https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/models.md • but only one of them (https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip) is a quantized one • as we can guess from related docs, retraining is more or less required to get the accuracy back
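For readers new to the term on slide 18: the general idea of fake quantization is to round a float value onto an 8-bit grid and immediately convert it back to float, so that retraining sees the rounding error the quantized model will have. The sketch below shows that round trip with a plain affine scheme (real value = scale * (q - zero_point)). It is a simplified illustration, not TensorFlow's implementation; the names `QuantParams`, `ChooseQuantParams`, `Quantize`, `Dequantize`, and `FakeQuantize` are made up here, and TensorFlow's own fake-quant ops may differ in details such as how the range is adjusted.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Affine 8-bit quantization parameters: real = scale * (q - zero_point).
struct QuantParams {
  float scale;
  int zero_point;
};

// Derive parameters from a float range. The range is widened to include 0.0
// so that zero maps onto an integer code.
QuantParams ChooseQuantParams(float min, float max) {
  assert(min < max);
  min = std::min(min, 0.0f);
  max = std::max(max, 0.0f);
  const float scale = (max - min) / 255.0f;
  const int zero_point = static_cast<int>(std::lround(-min / scale));
  return {scale, std::min(255, std::max(0, zero_point))};
}

// Round to the nearest code and clamp to [0, 255].
std::uint8_t Quantize(float x, const QuantParams& p) {
  const long q = std::lround(x / p.scale) + p.zero_point;
  return static_cast<std::uint8_t>(std::min(255L, std::max(0L, q)));
}

float Dequantize(std::uint8_t q, const QuantParams& p) {
  return p.scale * static_cast<float>(static_cast<int>(q) - p.zero_point);
}

// Quantize then dequantize: the result is still a float, but it now carries
// the rounding and clamping error of the 8-bit representation.
float FakeQuantize(float x, const QuantParams& p) {
  return Dequantize(Quantize(x, p), p);
}
```

For a range of [-128, 127] the scale is 1 and the zero point is 128, so `FakeQuantize(3.4f, p)` gives 3.0 and out-of-range inputs such as 1000.0 are clamped to 127.0.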
  • 19. • BLAS part: eigen (http://eigen.tuxfamily.org/) and gemmlowp (https://github.com/google/gemmlowp)