On-device ML with TFLite

Margaret Maynard-Reid, 2/12/2020
@margaretmz

@margaretmz | #ML | #GDE
Topics
● Why on-device ML?
● On-device ML options
● E2E tf.Keras to TFLite to Android
○ train a model from scratch
○ convert to TFLite
○ deploy to mobile and IoT
● TFLite on microcontroller & Coral Edge TPU

Intro
Why On-device ML?
● Access to more data
● Faster user interaction
● Preserve privacy
Unique constraints:
● Less compute power
● Limited memory
● Battery consumption

TensorFlow for mobile & edge devices
● 2015: TF open sourced
● 2016: TF Mobile
● 2017: TF Lite developer preview
● 2018: ML Kit
● 2019:
○ New ML Kit features
○ TF Mobile deprecated
○ New TFLite features!!!

TFLite on 3 billion+ devices!
Source: TensorFlow Lite team

Dance Like @I/O 2019
Segmentation, Pose, GPU on-device

TensorFlow Lite
● Converter - convert to TFLite file format
● Interpreter - executes inference, optimized for small devices
● Ops/Kernel - limited ops
● Interface to hardware acceleration
○ NN API
○ Edge TPU

Optimization
1. Reduce model size
TFLite model optimization toolkit
● Quantization - convert 32-bit floating point to fixed point (e.g. 8-bit int)
○ Post-training quantization
○ Quantization-aware training
● Pruning - eliminating unnecessary values in the weight tensor
2. Speed up inference
On Android:
● GPU delegate
● Android NNAPI
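
To make the arithmetic behind quantization and pruning concrete, here is a minimal pure-Python sketch. It is not the TFLite model optimization toolkit's implementation: the helper names (quantize_uint8, dequantize, prune_by_magnitude), the affine scale/zero-point scheme, the magnitude-based pruning rule and the sample weights are all illustrative assumptions.

```python
def quantize_uint8(values):
    """Map floats to 8-bit integers with an affine scheme (assumed here):
    real_value ~= scale * (q - zero_point)."""
    # The range must contain 0.0 so that zero maps to an exact integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:  # all values are zero
        scale = 1.0
    zero_point = round(-lo / scale)
    quantized = [min(255, max(0, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate float values from the 8-bit representation."""
    return [scale * (q - zero_point) for q in quantized]


def prune_by_magnitude(values, sparsity):
    """Set the smallest-magnitude fraction of the weights to zero."""
    k = int(len(values) * sparsity)
    smallest = sorted(range(len(values)), key=lambda i: abs(values[i]))[:k]
    pruned = list(values)
    for i in smallest:
        pruned[i] = 0.0
    return pruned


# Hypothetical weights, for illustration only.
weights = [-0.8, -0.1, 0.0, 0.3, 1.2]
q, scale, zero_point = quantize_uint8(weights)
restored = dequantize(q, scale, zero_point)
print("quantized:", q)
print("max round-trip error:", max(abs(a - b) for a, b in zip(weights, restored)))
print("pruned:", prune_by_magnitude(weights, sparsity=0.4))
```

Each weight shrinks from 4 bytes to 1, at the cost of a round-trip error bounded by roughly one quantization step. Pruning trades a little accuracy for zeros that compress well.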

On-device ML
What are your options?
MediaPipe

On-device ML Options
What / how Who Where
Native Android (iOS) apps
● Direct deploy to Android
● With ML Kit
● With MediaPipe
● Fritz.ai
Android (or iOS)
developers
React Native Web developers
TFLite / TF micro Embedded Microcontrollers
Edge TPUs

React Native Support
● Use TF.js ML directly inside React Native with WebGL acceleration
● Load models from the web, or compile into your application
Link to demo video | Link to GitHub

Google ML Kit
Base APIs (out of the box):
● Image labelling
● OCR
● Face detection
● Barcode scanning
● Landmark detection
● Smart reply
● Object detection & tracking
● Translation (56 languages)
● AutoML
Custom models:
● Dynamic model downloads
● A/B testing (via Firebase Remote Config)
● Model conversion (from TensorFlow to TFLite)
Learn more about ML Kit 👉 g.co/mlkit

Why use ML Kit?
[Figure: ML Kit on-device pipeline. Stages shown include pipeline config and calibration, a frame scheduler (image timestamp), frame selection, image conversion (byte array, ByteBuffer/bitmap, grayscale, RGB, resize, rotate), a detector (TF Lite model), a tracker, an object manager, image validation, a classifier (TF Lite model), and output results, spanning Java and native code.]
Source: ML Kit team

Google ML Kit - AutoML
● Firebase console
● AutoML - train model
● Download TFLite
● Mobile & edge
https://firebase.google.com/docs/ml-kit/automl-image-labeling

MediaPipe
A cross-platform AI pipeline framework by Google Research:
● TensorFlow & TFLite
● Desktop, web, mobile, Coral Edge TPUs
● Fast & realtime
● GPU
● WebGL
Source: MediaPipe GitHub

Two talks on MediaPipe
@AI Nextcon 2/13 1PM
@Google Seattle 2/13 5PM
● Google MediaPipe @Seattle by Ming Yong

Fritz.ai
Mobile ML made easy...
● Supports Android & iOS
● Features: image labelling & segmentation, object detection, style transfer, pose estimation…
● Analytics, custom model hosting, perf monitoring…
● Free up to a certain usage level
Source: Embrace your new look with Fritz Hair Segmentation

End to End: model training to inference with TensorFlow 2.0
● Datasets
● Train model
● (Convert to TFLite)
● Deploy for inference

End to end: model training to inference in TF 2.0
Training:
● Data
● Model
○ tf.Keras (TensorFlow)
○ Python libraries: NumPy, Matplotlib etc
● SavedModel or Keras model
Inference (serving):
● Cloud
● Web
● Mobile
● IoT
● Microcontrollers
● Edge TPU

Data
● Existing datasets
○ Part of the deep learning framework:
■ MNIST, CIFAR10, FASHION_MNIST, IMDB movie reviews etc
○ Open datasets:
■ MNIST, MS-COCO, ImageNet, CelebA etc
○ Kaggle datasets: https://www.kaggle.com/datasets
○ Google Dataset search tool: https://toolbox.google.com/datasetsearch
○ TF 2.0: TensorFlow Datasets (TFDS)
● Collect your own data

Models
Options for getting a model:
● Download a pre-trained model (here): Inception-v3, MobileNet etc.
● Transfer learning with a pre-trained model
○ Feature extraction or fine-tuning on a pre-trained model
○ TensorFlow Hub (https://www.tensorflow.org/hub/)
● Train your own model from scratch (example in this talk)

Model saving, conversion, deployment
● Model saving - SavedModel or Keras model
● Model conversion
○ Convert the model to tflite format
○ Validate the converted model before deploying
● Deploy TFLite for inference

End to End: tf.Keras to TFLite to Android

MNIST dataset
● 60,000 training images and 10,000 test images
● 28x28x1 grayscale images
● 10 classes: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
● Popular for computer vision
○ “hello world” tutorials
○ benchmarking ML algorithms

Training the model in Colab
Launch sample code on Colab → mnist_tfkeras_to_tflite.ipynb
1. Import data
2. Define model architecture
3. Train the model
4. Model saving & conversion
○ Save a Keras model
○ convert to tflite format

A typical CNN model architecture
MNIST example:
● Convolutional layer (definition)
● Pooling layer (definition)
● Dense (fully connected) layer (definition)
[Figure: input → conv → pool → conv → pool → conv → pool → Dense → 10 output classes (0 to 9)]

Inspect the model - in Python code
In Python code, after defining the model architecture, use model.summary() to show the model architecture
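
model.summary() lists each layer's output shape and parameter count. As a rough sketch of where those numbers come from, the pure-Python snippet below walks a conv/pool/conv/pool/conv/pool/Dense stack by hand. The 3x3 kernels, the 32/64/64 filter counts, the unpadded ("valid") convolutions and the helper names are assumptions chosen for illustration, not values taken from the talk's notebook.

```python
def window_output(size, window, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - window) // stride + 1


def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units


def summarize(input_size, input_channels, layers, num_classes):
    """Return per-layer (kind, output shape, params) rows and the total."""
    size, channels, total = input_size, input_channels, 0
    rows = []
    for kind, window, filters in layers:
        if kind == "conv":
            params = conv_params(window, channels, filters)
            size, channels = window_output(size, window), filters
        else:  # "pool": stride equals the window size, no weights
            params = 0
            size = window_output(size, window, stride=window)
        rows.append((kind, (size, size, channels), params))
        total += params
    flat = size * size * channels  # flatten before the Dense layer
    params = dense_params(flat, num_classes)
    rows.append(("dense", (num_classes,), params))
    return rows, total + params


# Hypothetical architecture: (kind, window size, filters).
LAYERS = [
    ("conv", 3, 32), ("pool", 2, None),
    ("conv", 3, 64), ("pool", 2, None),
    ("conv", 3, 64), ("pool", 2, None),
]

rows, total = summarize(28, 1, LAYERS, num_classes=10)
for kind, shape, params in rows:
    print(f"{kind:<6} {str(shape):<14} {params}")
print("total params:", total)
```

Comparing a hand calculation like this with the real summary is a quick way to catch an unexpected input shape before converting the model.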

Visualize model
Use a visualization tool:
● TensorBoard
● Netron
(https://github.com/lutzroeder/Netron)
Drop the .tflite model into Netron and see the model visually
Note: model metadata, a new TFLite tool (to be launched), will allow you to inspect the model and modify the metadata

Model saving
When to save as SavedModel or a Keras model?
SavedModel:
● Share pre-trained models and model pieces on TensorFlow Hub
● When you don’t know the deploy target
Keras model:
● Train with tf.Keras and you know your deploy target
Note: In TensorFlow 2.0, tf.keras.Model.save() and tf.keras.models.save_model() default to the SavedModel format (not HDF5). (link to doc)
Model conversion (with TFLite converter)
Command line:

SavedModel
    tflite_convert \
      --saved_model_dir=/tmp/my_saved_model \
      --output_file=/tmp/my_model.tflite

Keras model
    tflite_convert \
      --keras_model_file=/tmp/my_keras_model.h5 \
      --output_file=/tmp/my_model.tflite

Python code (recommended):

    import tensorflow as tf

    # Create a converter from the trained in-memory Keras model (TF 2.x).
    # TF 1.x loaded the model from an .h5 file with from_keras_model_file(keras_model).
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    # Enable post-training quantization (optional)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # Convert the model
    tflite_model = converter.convert()
    # Create the tflite model file
    tflite_model_name = "my_model.tflite"
    with open(tflite_model_name, "wb") as f:
        f.write(tflite_model)

Validate TFLite model after conversion
Protip: validate the TFLite model in Python after conversion by running the same input through both models and comparing the results.

    import numpy as np
    import tensorflow as tf

    # Load TFLite model and allocate tensors.
    interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
    interpreter.allocate_tensors()

    # Get input and output tensors.
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    # TFLite result: test the model on random input data.
    input_shape = input_details[0]['shape']
    input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    tflite_result = interpreter.get_tensor(output_details[0]['index'])

    # TensorFlow result: test the original model on the same input data.
    tf_result = model(tf.constant(input_data))

    # Compare the results.
    for tf_row, tflite_row in zip(tf_result, tflite_result):
        np.testing.assert_almost_equal(tf_row, tflite_row, decimal=5)

TFLite on Android
Android sample code DigitRecognizer, step by step:
● Place the .tflite model under the assets folder
● Update build.gradle dependencies
● Input image - custom view, gallery or camera
● Data preprocessing
● Classify with the model
● Post processing
● Display result in UI

Dependencies
Update build.gradle to include TensorFlow Lite:
    android {
        // Make sure the model doesn't get compressed when the app is built
        aaptOptions {
            noCompress "tflite"
        }
    }
    dependencies {
        ...
        // Add dependency for TensorFlow Lite
        implementation 'org.tensorflow:tensorflow-lite:[version-number]'
    }
Place the mnist.tflite model file under the /assets folder

Input - image data
Input to the classifier is an image, your options:
● Draw on canvas from custom View
● Get image from Gallery or a 3rd party camera
● Live frames from Camera2 API
Make sure the image dimensions (shape) match what your classifier expects:
● 28x28x1 - MNIST or FASHION_MNIST grayscale image
● 299x299x3 - Inception V3
● 224x224x3 - MobileNet

Image preprocessing
● Convert Bitmap to ByteBuffer
● Normalize pixel values to be a certain range
● Convert from color to grayscale, if needed
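
The normalization and grayscale steps above can be sketched in a few lines. The talk's sample app does this on Android with a Bitmap and a ByteBuffer; the pure-Python version below only illustrates the arithmetic. The function names, the [0, 1] target range and the BT.601 luminance weights are assumptions, so check what range and channel layout your own model was trained with.

```python
def to_grayscale(rgb_pixels):
    """Convert (r, g, b) tuples to luminance using the common BT.601 weights."""
    return [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in rgb_pixels]


def normalize(values, max_value=255.0):
    """Scale pixel values from [0, max_value] to [0, 1]."""
    return [v / max_value for v in values]


def preprocess(rgb_pixels, width=28, height=28):
    """Turn a flat list of RGB pixels into the flat float input of a
    width x height x 1 grayscale classifier."""
    if len(rgb_pixels) != width * height:
        raise ValueError(
            f"expected {width * height} pixels, got {len(rgb_pixels)}"
        )
    return normalize(to_grayscale(rgb_pixels))


# Hypothetical input: an all-white 28x28 image.
white_image = [(255, 255, 255)] * (28 * 28)
model_input = preprocess(white_image)
print(len(model_input), min(model_input), max(model_input))
```

A mismatch between training-time and inference-time preprocessing is a common cause of a model that validates well in Python but misclassifies on the device.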

Run inference
Load the model file located under the assets folder
Use the TensorFlow Lite interpreter to run inference on the input image

Post processing
The output is an array of probabilities, each corresponding to a category
Find the category with the highest probability and output the result to the UI
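
Finding the most likely category is an argmax over the output array. A minimal sketch, shown in Python rather than the Android app's own code; the function name, the label list and the sample probabilities are made up for illustration:

```python
def top_category(probabilities, labels):
    """Return the label with the highest probability and that probability."""
    if len(probabilities) != len(labels):
        raise ValueError("expected one probability per label")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best], probabilities[best]


# Hypothetical model output for the 10 MNIST digit classes.
labels = [str(digit) for digit in range(10)]
probabilities = [0.01, 0.02, 0.05, 0.80, 0.02, 0.03, 0.01, 0.03, 0.02, 0.01]

label, confidence = top_category(probabilities, labels)
print(f"Predicted digit: {label} ({confidence:.0%})")
```

In an app it is often worth showing the confidence too, or hiding the result when the top probability is low.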

Summary
● Training with tf.Keras is easy
● Model conversion to TFLite is easier
● Android implementation is getting better:
○ Validate the TFLite model before deploying to Android
○ Image pre-processing
○ Input tensor shape?
○ Color or grayscale?
○ Post processing
My blog post: E2E tf.Keras to TFLite to Android

New TFLite features
Announced at TensorFlow World:
1. New TFLite support library (link)
2. Model metadata (not yet launched)
3. Model repository pre-converted to TFLite format (link to models w/ examples | link to hosted models)
4. Transfer learning made easy - model customization API (link)
5. Ready-to-use end-to-end tutorials and full example apps (link)
6. TFLite course on Udacity (link)

TFLite classification demo app
Check out the classification demo app in the TensorFlow repo

Inference with GPU
● Face contour detection
● Link to blog post: TensorFlow Lite Now Faster with Mobile GPUs

Posenet example
● PoseNet model on Android
● Camera live frames
● Display key body parts in real time
● Link to blog post: Track human poses in real-time on Android with TensorFlow Lite

More TFLite examples

On device ML training is finally here!
● Train with ~20 images
● Use transfer learning
● Quantized MobileNetV2
● Android device (5.0+)
Link to blog | Android sample

TFLite on microcontroller
● Tiny models on tiny computers
● Consumes much less power than CPUs - days on a coin battery
● Tiny RAM and Flash available
● Opens up voice interface to devs
More info here -
● Doc - https://www.tensorflow.org/lite/guide/microcontroller
● Code lab - https://g.co/codelabs/sparkfunTF
● Purchase - https://www.sparkfun.com/products/15170

Coral Edge TPU (beta) - hardware for on-device ML acceleration
● Dev board (+ camera module)
● USB Accelerator (+ camera module + Raspberry Pi)
Link to codelab: https://codelabs.developers.google.com/codelabs/edgetpu-classifier/index.html#0

Coral Edge TPU
MobileNet SSD model running on TPU
Inference time: < ~20 ms, > ~60 fps

Coral Edge TPU demo
MobileNet SSD model running on CPU
Inference time: > ~390 ms, ~3 fps

On-device ML trends
● Why the future of machine learning is tiny - Pete Warden
● Deploying to mobile and IoT will get much easier
● TFLite will have many more features
● Federated learning
● On device training

Awesome TFLite 😎
bit.ly/awesome-tflite - please star ⭐ the repo if you find it useful!

Thank you!
Follow me on Twitter, Medium or GitHub to learn more about deep learning, TensorFlow and on-device ML
● Twitter: @margaretmz
● Medium: @margaretmz
● GitHub: margaretmz
