KaoNet: Face Recognition and Generation
App using Deep Learning
Van Phu Quang Huy
Pham Quang Khang
1
About Us
Van Phu Quang Huy
● AI Lead Engineer at Galapagos Inc
Pham Quang Khang
● Software Engineer at Works Applications
2
Objectives
● What do we want to do?
Introduce the whole process of creating an application
based on Deep Learning
● What will be included:
○ Convolutional Neural Networks (CNNs)
○ Generative Adversarial Networks (GANs)
○ TensorFlow
3
Part 1: Face Recognition
4
First things first: the idea
● Facial recognition is a promising yet challenging research field because of
its many applications:
○ Biometric security systems
○ Monitoring and people search
○ Daily applications
● The tools needed to develop a facial recognition app are already provided by
many companies
=> Why not build a face recognition app?
5
The name: KaoNet
KaoNet = 顔(Kao) + Net
It is the Network of Faces
6
What can the app do?
● Classify the input photos into groups of faces that belong to the same person
● Generate new faces from the input data that look as close to real human
faces as possible
7
Whose faces?
● Training a neural network requires a very large amount of sample data
Who has that many photos available to share? => famous people
● Who would attract the most attention? => singers, models, actresses
8
Where to find those photos?
● Online: the internet is an endless source of all kinds of information, so the
more famous a person is, the more likely his/her photos can be found
with simple keywords
● The search engine we chose: Bing, because its API for crawling photos from
search results is still free
9
How many photos?
● At first, we chose a list of more than 50 popular people as the targets of
the app and expected at least 1K photos for each
● Crawling: we searched Bing with a few simple, fixed keywords and saved
the results to a local server
● Result: around 1K photos were collected for each person, but after
removing wrong results only around 200 correct samples per person were
kept for KaoNet
10
But we only care about the faces
● The whole photo is not a good sample because the background
contains too much noise
● Solution: crop the face out of each photo with OpenCV
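The cropping step could look roughly like the sketch below. The slides only say that OpenCV was used; the choice of the bundled frontal-face Haar cascade, the `scaleFactor` and `minNeighbors` values, and the helper names `crop_boxes` and `detect_faces` are assumptions made for illustration.

```python
import numpy as np


def crop_boxes(image, boxes, margin=0.0):
    """Crop each (x, y, w, h) box from an H x W x C image, clipped to the image."""
    height, width = image.shape[:2]
    crops = []
    for x, y, w, h in boxes:
        pad_x, pad_y = int(w * margin), int(h * margin)
        left, top = max(x - pad_x, 0), max(y - pad_y, 0)
        right, bottom = min(x + w + pad_x, width), min(y + h + pad_y, height)
        crops.append(image[top:bottom, left:right])
    return crops


def detect_faces(image_bgr):
    """Return face boxes found by OpenCV's bundled frontal-face Haar cascade.

    Assumption: the slides do not say which OpenCV detector was used.
    """
    import cv2  # imported lazily so crop_boxes works without OpenCV installed

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
```

A typical use would be `crop_boxes(img, detect_faces(img))`, followed by resizing each crop to the network's input size.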
11
Finally
A Net of Kao (faces)
Result: we only had enough time to
filter the photos of 26 targets
12
Finish?
● That is only the beginning. Now the hard part: training
● Model:
○ CNN: trained to classify samples
○ GAN: trained to generate a face from samples
● Framework: TensorFlow, because it has strong support for CNNs, lets us
observe the training process in real time, and runs the same code on both CPU and GPU
13
[Figure: training progress in real time, two plots against training steps]
Neural Networks
14
[Figure: example image labeled "cat"]
Convolutional Neural Network: convolution layer
Idea: extract the elementary features of an image using local receptive
fields instead of connecting every unit to all points of the original image (Yann LeCun 1998)
15
Fei-Fei Li, Stanford 2016
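The local receptive field idea from the convolution layer slide can be illustrated with a minimal NumPy sketch. This is not KaoNet's code: the function name `conv2d_valid`, the toy image, and the edge-detecting kernel are made up for the example, and the operation is the cross-correlation that deep learning frameworks call convolution.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2-D `image` (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output value depends only on a kh x kw local receptive field.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# A vertical-edge detector applied to an image whose right half is bright.
image = np.zeros((5, 6))
image[:, 3:] = 1.0
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)
feature_map = conv2d_valid(image, kernel)  # responds only around the edge
```

The same small kernel is reused at every position, which is why a convolution layer needs far fewer weights than a fully connected layer over the whole image.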
Pooling layer (sub-sampling)
Local averaging and sub-sampling, reducing the resolution of the feature map and
the sensitivity of the output to shifts and distortions (Yann LeCun 1998)
16
Fei-Fei Li, Stanford 2016
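The sub-sampling described on the pooling slide can be sketched in a few lines of NumPy. The helper name `pool2d` and the toy feature map are illustrative assumptions; the sketch covers both local averaging (as in LeCun 1998) and the max pooling that KaoNet uses.

```python
import numpy as np


def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping size x size pooling over a 2-D feature map."""
    h, w = feature_map.shape
    # Drop trailing rows/columns that do not fill a complete window.
    trimmed = feature_map[: h - h % size, : w - w % size]
    windows = trimmed.reshape(h // size, size, w // size, size)
    if mode == "max":
        return windows.max(axis=(1, 3))
    return windows.mean(axis=(1, 3))


fmap = np.arange(16, dtype=float).reshape(4, 4)
max_pooled = pool2d(fmap, mode="max")    # 4x4 -> 2x2, keeps the largest value
avg_pooled = pool2d(fmap, mode="mean")   # 4x4 -> 2x2, local averaging
```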
CNN architecture in KaoNet
● Structure of each convolutional block:
Convolution + Batch Normalization + ReLU +
Max Pooling
● Architecture of KaoNet:
4 convolutional layers (conv) + 2 fully
connected layers (fc)
17
CNN architecture in KaoNet
layer    size-in       size-out      kernel, stride
conv1    128⨉128⨉3     128⨉128⨉32    7⨉7, 1
pool1    128⨉128⨉32    64⨉64⨉32      2⨉2, 2
conv2    64⨉64⨉32      64⨉64⨉64      5⨉5, 1
pool2    64⨉64⨉64      32⨉32⨉64      2⨉2, 2
conv3    32⨉32⨉64      32⨉32⨉128     3⨉3, 1
pool3    32⨉32⨉128     16⨉16⨉128     2⨉2, 2
conv4    16⨉16⨉128     16⨉16⨉192     3⨉3, 1
pool4    16⨉16⨉192     8⨉8⨉192       2⨉2, 2
reshape  8⨉8⨉192       1⨉12288       -
fc1      1⨉12288       1⨉1024        -
fc2      1⨉1024        1⨉512         -
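The sizes in the layer table can be checked with a little arithmetic. The sketch below assumes that each convolution uses stride 1 with zero padding that preserves height and width, which is what the equal size-in and size-out resolutions imply; the helper names are hypothetical.

```python
def conv_same(shape, filters):
    """Stride-1 convolution with zero padding that preserves height and width."""
    h, w, _ = shape
    return (h, w, filters)


def pool(shape, size=2, stride=2):
    """Pooling without padding: each spatial dimension shrinks by the stride."""
    h, w, c = shape
    return ((h - size) // stride + 1, (w - size) // stride + 1, c)


shape = (128, 128, 3)
for filters in (32, 64, 128, 192):
    shape = pool(conv_same(shape, filters))

flattened = shape[0] * shape[1] * shape[2]   # input width of fc1
fc1_weights = flattened * 1024               # weights in fc1, excluding biases
```

After four blocks the feature map is 8⨉8⨉192, which flattens to 12288 values, so fc1 alone holds roughly 12.6 million weights.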
18
Hyper-parameters in KaoNet
● Number of layers: 4 conv, 2 fc
● Size and number of filters in each convolutional layer (previous slide)
● Size of fully connected layer (previous slide)
● Weight-decay (for fc only, weight-decay = 0.004)
● Optimization algorithm (AdamOptimizer)
● Initial learning rate (0.004)
● Initial weights (normal distribution with mean = 0, stddev = 5e-4)
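Two of these hyper-parameters translate directly into code. The sketch below uses NumPy rather than TensorFlow; the weight-decay term is assumed to be `rate * sum(w**2) / 2`, the convention of TensorFlow's `l2_loss`, since the slides do not spell out the exact form, and the function names are illustrative.

```python
import numpy as np

WEIGHT_DECAY = 0.004   # from the slide, applied to fc layers only
INIT_STDDEV = 5e-4     # from the slide


def init_weights(shape, rng, stddev=INIT_STDDEV):
    """Draw initial weights from a normal distribution with mean 0."""
    return rng.normal(loc=0.0, scale=stddev, size=shape)


def weight_decay_penalty(weights, rate=WEIGHT_DECAY):
    """L2 penalty added to the loss (assumed form: rate * sum(w^2) / 2)."""
    return rate * 0.5 * np.sum(weights ** 2)


rng = np.random.default_rng(0)
fc2 = init_weights((1024, 512), rng)      # shape of KaoNet's second fc layer
penalty = weight_decay_penalty(fc2)
```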
19
Data partition
● Data is split into 2 parts, training data and validation data, with a ratio
of 80:20
● After each epoch, the trained model is applied to the validation data to evaluate the
loss and prediction accuracy
● At each training step, a batch of batch_size samples (KaoNet: 64) is loaded for
training. The samples are drawn randomly from the training set
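A minimal sketch of this partitioning, working on sample indices with NumPy. The function names and the fixed seed are assumptions for illustration; the 80:20 ratio and the batch size of 64 come from the slide.

```python
import numpy as np


def split_train_validation(num_samples, rng, train_fraction=0.8):
    """Shuffle the sample indices and split them into train and validation."""
    indices = rng.permutation(num_samples)
    cut = int(num_samples * train_fraction)
    return indices[:cut], indices[cut:]


def random_batch(train_indices, rng, batch_size=64):
    """Draw one batch of distinct samples at random from the training set."""
    return rng.choice(train_indices, size=batch_size, replace=False)


rng = np.random.default_rng(42)
train_idx, val_idx = split_train_validation(1000, rng)
batch = random_batch(train_idx, rng)   # indices of the 64 images for one step
```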
20
Source code
● The TensorFlow tutorials for CIFAR-10 and MNIST are good starting points
https://www.tensorflow.org/tutorials/deep_cnn
https://www.tensorflow.org/tutorials/layers
● Our source code (not public yet)
https://github.com/vanhuyz/KaoNet
21
Let’s run
● Training with 26 targets gave fair accuracy on the training set but
extremely poor accuracy on the validation set => Overfitting
22
[Figure: two plots against training steps, each comparing the train set and the validation set]
Why did it fail?
Causes:
○ The model is too complex compared to the number of samples in each training set
○ The number of samples per target varied too much; some targets had
several times more samples than others
Solutions:
● Simplify the model => not a good choice if the application is to be extended
● Increase the number of samples => not enough time
● Train only with targets that have a sufficient number of samples => worth trying
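The last option, keeping only the targets with enough samples, is easy to sketch. The threshold of 200 and the toy labels below are made up for illustration; the slides do not state the cut-off that was actually used.

```python
from collections import Counter


def select_targets(labels, min_samples):
    """Return the targets that have at least `min_samples` labeled photos."""
    counts = Counter(labels)
    return sorted(name for name, n in counts.items() if n >= min_samples)


# Hypothetical label list: one entry per photo, naming the person in it.
labels = ["a"] * 250 + ["b"] * 40 + ["c"] * 200
kept = select_targets(labels, min_samples=200)
```

Photos whose label is not in `kept` would then be left out of both the training and the validation set.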
23
The Ultimate 2
● One way to fix the problem is to use a sample set that is evenly balanced
across targets and contains more data
● The Ultimate 2: 10K photos for each target
24
Validation accuracy is greatly improved
● Loss drops close to zero after 10K training steps
● Training accuracy reached 100% before 5K steps
● Validation accuracy improved greatly compared to the previous data set
25
[Figure: two plots against training steps]
Training Environment
● Use all the resources we can get
○ MacBook Pro (CPU)
○ Dell Vostro desktop (CPU)
○ AWS GPU instance g2.8xlarge ($2.7/h) → cost us about $100 in total
○ GeForce GTX 1080 (GPU) → thanks to Galapagos Inc for the support!
26
Demo
27
Embedding Visualization
・Show the output vector of the last fully
connected layer for each input image
・Each image is represented by a
512-dimensional vector
・The high-dimensional vectors are
compressed into 3-dimensional
vectors using PCA for visualization
→ Let’s check on TensorBoard
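The PCA step can be sketched with NumPy's SVD. In the talk the projection is done inside TensorBoard; the code below is only an illustration of the same reduction, with random vectors standing in for the real 512-dimensional outputs and a hypothetical helper name `pca_project`.

```python
import numpy as np


def pca_project(embeddings, num_components=3):
    """Project row vectors onto their top principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:num_components].T


rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 512))   # stand-in for the fc2 outputs
points_3d = pca_project(embeddings)        # one 3-D point per image
```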
28
Future of KaoNet
● Biometric security: using face recognition to replace physical locks
● Face search
● Finding criminals using CCTV footage
29
Part 2: Face Generation
30
Generative Model [1]
● Explicitly or implicitly model the distribution of data
● By sampling from that model, it is possible to generate synthetic data
points in the data space
31
[1] C. Bishop, 2006. Pattern Recognition and Machine Learning, p. 43
Generative Adversarial Networks (GAN)
What are some recent and potentially upcoming breakthroughs in deep
learning? (from Quora 2016)
32
The most important one, in my opinion, is adversarial training (also called
GAN for Generative Adversarial Networks)...
This, and the variations that are now being proposed is the most
interesting idea in the last 10 years in ML, in my opinion.
- Yann LeCun, Director of AI Research at Facebook
(https://www.quora.com/What-are-some-recent-and-potentially-upcoming-breakthroughs-in-deep-learning)
GAN [1]
● Based on a game theoretic scenario in which the generator network must
compete against an adversary [2]
○ The generator network directly produces “fake” samples
○ The discriminator network attempts to distinguish between samples drawn from the
training data and samples drawn from the generator
● Train the 2 networks simultaneously
○ The discriminator learns to correctly classify samples as real or fake
○ The generator learns to fool the discriminator into believing its samples are real
● At convergence, the generator’s samples are indistinguishable from real
data, and the discriminator outputs ½ everywhere
33
[1] Goodfellow et al, 2014
[2] Goodfellow et al, 2016. Deep Learning, p702
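The game described on the GAN slide is usually written as the minimax objective from Goodfellow et al, 2014. The symbols below ($G$, $D$, $p_{\text{data}}$, $p_z$, $p_g$) follow that paper and do not appear on the slides; the second line shows why the discriminator outputs ½ once the generator's distribution matches the data.

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% For a fixed generator, the optimal discriminator is
D^{*}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)},
\qquad\text{so}\qquad
p_g = p_{\text{data}} \;\Longrightarrow\; D^{*}(x) = \tfrac{1}{2}.
```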
GAN in easy words...
● A criminal tries to print fake money
● A police officer tries to distinguish fake money from real money
● At first, with outdated technology, the criminal just prints some “random
papers”, so the police officer can easily detect the fake money
● The criminal learns from that, then improves his technique
34
[Figure: criminal vs. police]
GAN in easy words...
● As the fake money becomes more and more realistic, the police officer also has
to improve his detection skills
● As a result, the criminal and the police officer learn from each other and
continuously improve
● Finally, when the fake money looks so realistic that the police officer cannot
tell the difference, the world is over!
35
In the GAN world
● The criminal is called the Generator
● The police officer is called the Discriminator
● The Generator and the Discriminator are usually neural networks (but this is not
required)
● GAN’s problems:
○ Unstable to train
○ Non-convergence
36
GAN’s application: Image to Image Translation
37
(Isola et al, 2016)
Deep Convolutional Generative Adversarial Networks (DCGAN) [1]
● Both Generator and Discriminator are Deep Convolutional Neural
Networks
● Apply some techniques for stable training
○ Replace pooling layers with strided convolutions (discriminator) and fractional-strided
convolutions (generator)
○ Use batch normalization
○ Remove fully connected hidden layers
○ Use LeakyReLU activation in the discriminator for all layers
○ ...
38
[1] Radford et al, 2015
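The fractional-strided convolutions in the DCGAN generator grow the image step by step, from a 4⨉4 feature map up to a 64⨉64 output in Radford et al, 2015. The sketch below only checks that size arithmetic; the 4⨉4 kernel, the padding values, and the helper name are assumptions that reflect common DCGAN implementations, not settings taken from the slides.

```python
def transposed_conv_size(size_in, kernel, stride, padding):
    """Output width of a fractional-strided (transposed) convolution."""
    return (size_in - 1) * stride - 2 * padding + kernel


# Assumed settings (not from the slides): a 4x4 kernel throughout; the first
# layer expands the 1x1 noise input, later layers double the resolution.
sizes = [transposed_conv_size(1, kernel=4, stride=1, padding=0)]
for _ in range(4):
    sizes.append(transposed_conv_size(sizes[-1], kernel=4, stride=2, padding=1))
```

With stride 2 each layer doubles the resolution, which is how the generator replaces the up-sampling that pooling layers cannot provide.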
Generator Network in DCGAN
39
(Radford et al, 2015)
Experiment: Train DCGAN on our celebrity dataset
40
[Figure: generated samples at step 0, step 1000, and step 40000]
Experiment: Train DCGAN on the Ultimate 2 dataset
41
[Figure: generated samples at step 0, step 1000, and step 15000]
Conclusion
● We have walked step by step through developing an application based on
Deep Learning
● Succeeded in creating a face classification app based on a CNN
● Achieved 98% accuracy on the validation set and good results on test data
● Successfully generated images using DCGAN
42

 

KaoNet: Face Recognition and Generation App using Deep Learning

  • 1. KaoNet: Face Recognition and Generation App using Deep Learning. Van Phu Quang Huy, Pham Quang Khang
  • 2. About Us ● Van Phu Quang Huy: AI Lead Engineer at Galapagos Inc ● Pham Quang Khang: Software Engineer at Works Applications
  • 3. Objectives ● What do we want to do? Introduce the whole process of creating an application based on Deep Learning ● What will be covered: ○ Convolutional Neural Networks (CNNs) ○ Generative Adversarial Networks (GANs) ○ TensorFlow
  • 4. Part 1: Face Recognition
  • 5. First things first: the idea ● Facial recognition is a promising yet challenging research field with a wide range of applications: ○ Biometric security systems ○ Monitoring and people search ○ Everyday applications ● The tools needed to develop a facial recognition app are already provided by many companies => Why not build a face recognition app?
  • 6. The name: KaoNet ● KaoNet = 顔 (Kao) + Net ● It is the Network of Faces
  • 7. What can the app do? ● Classify input images into groups of faces belonging to the same person ● Generate faces from the input so that the generated faces look as human as possible
  • 8. Whose faces? ● Training a neural network requires a very large amount of sample data. Who has that many photos available to share? => Famous people ● Who attracts the most attention? => Singers, models, actresses
  • 9. Where to find those photos? ● Online: the internet is a practically unlimited source of all kinds of information, so the more famous a person is, the more likely his/her photos can be found with simple keywords ● The search engine we chose: Bing, because its API for crawling photos from search results is still free
  • 10. How many photos? ● At first, a list of more than 50 popular people was chosen as the targets of the app; we expected at least about 1K photos for each ● Crawling: photos were collected by searching Bing with a few simple fixed keywords and saving the results to a local server ● Result: around 1K photos were collected for each person, but after removing wrong results, only around 200 correct samples per person were kept for KaoNet
  • 11. But we only care about the faces ● The whole photo is not a good sample because the background contains too much noise ● Solution: crop out the face using OpenCV
  • 12. Finally: a Net of Kao (faces) ● Result: we only had enough time to filter 26 targets
  • 13. Finished? ● That is only the beginning. Now the hard part: training ● Models: ○ CNN: trained to classify samples ○ GAN: trained to generate a face from samples ● Framework: TensorFlow, because it has strong support for CNNs, allows the training process to be observed in real time, and runs the same code on both CPU and GPU ● [Figure: training progress in real time; x-axis: steps]
  • 15. Convolutional Neural Network: convolution layer ● Idea: extract the elementary features of the image using local receptive fields instead of training on all points of the original image (Yann LeCun, 1998) ● Figure: Fei-Fei Li, Stanford, 2016
  • 16. Pooling layer (sub-sampling) ● Local averaging and sub-sampling, reducing the resolution of the feature map and reducing the sensitivity of the output to shifts and distortions (Yann LeCun, 1998) ● Figure: Fei-Fei Li, Stanford, 2016
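
The two operations on slides 15 and 16 can be illustrated with a minimal pure-Python sketch. It is not the KaoNet code: the toy image and the 2⨉2 edge kernel are made up for illustration, the "convolution" is the cross-correlation that deep learning frameworks usually compute, and max pooling is used (as in the KaoNet architecture on slide 17) rather than the local averaging described by LeCun.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of rows), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Each output value only looks at a small local receptive field.
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(feature_map, size=2, stride=2):
    """Keep the maximum of each size x size window, reducing the resolution."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(feature_map[i * stride + di][j * stride + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 5x5 image with a vertical edge between columns 1 and 2 (illustrative).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 kernel that responds to dark-to-bright vertical edges.
edge_kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d_valid(image, edge_kernel)  # 4x4, non-zero only at the edge
pooled = max_pool(feature_map)                  # 2x2 after 2x2 pooling, stride 2
```

In a trained network the kernel values are learned rather than hand-written; the sketch only shows how a local filter produces a feature map and how pooling halves its resolution.
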
  • 17. CNN architecture in KaoNet ● Structure of each convolutional layer: Convolution + Batch Normalization + ReLU + Max Pooling ● Architecture of KaoNet: 4 convolutional layers (conv) + 2 fully connected layers (fc)

        layer     size-in       size-out      kernel (size, stride)
        conv1     128⨉128⨉3     128⨉128⨉32    7⨉7, 1
        pool1     128⨉128⨉32    64⨉64⨉32      2⨉2, 2
        conv2     64⨉64⨉32      64⨉64⨉64      5⨉5, 1
        pool2     64⨉64⨉64      32⨉32⨉64      2⨉2, 2
        conv3     32⨉32⨉64      32⨉32⨉128     3⨉3, 1
        pool3     32⨉32⨉128     16⨉16⨉128     2⨉2, 2
        conv4     16⨉16⨉128     16⨉16⨉192     3⨉3, 1
        pool4     16⨉16⨉192     8⨉8⨉192       2⨉2, 2
        reshape   8⨉8⨉192       1⨉12288
        fc1       1⨉12288       1⨉1024
        fc2       1⨉1024        1⨉512
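
The sizes in the slide 17 table can be checked with a short shape trace. This is only a sketch under two assumptions that the table implies but does not state: the convolutions use "same" padding with stride 1 (they keep the spatial size), and each pooling layer halves height and width. The helper names are hypothetical.

```python
def conv_same(shape, filters):
    """'Same'-padded, stride-1 convolution: spatial size kept, channels replaced."""
    height, width, _ = shape
    return (height, width, filters)


def pool(shape, stride=2):
    """2x2 pooling with stride 2: height and width are halved."""
    height, width, channels = shape
    return (height // stride, width // stride, channels)


shape = (128, 128, 3)  # input image size from the table
trace = []
for index, filters in enumerate([32, 64, 128, 192], start=1):
    shape = conv_same(shape, filters)
    trace.append((f"conv{index}", shape))
    shape = pool(shape)
    trace.append((f"pool{index}", shape))

# Flattened size fed to fc1, and the number of weights in fc1 (without biases).
flat = shape[0] * shape[1] * shape[2]
fc1_weights = flat * 1024

for name, out_shape in trace:
    print(name, out_shape)
print("reshape", flat, "fc1 weights", fc1_weights)
```

Under these assumptions most of the weights sit in fc1 rather than in the convolutional layers, which is consistent with the slides applying weight decay to the fully connected layers only.
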
  • 18. CNN architecture in KaoNet
  • 19. Hyper-parameters in KaoNet ● Number of layers: 4 conv, 2 fc ● Size and number of filters in each convolutional layer (see slide 17) ● Size of the fully connected layers (see slide 17) ● Weight decay (fc layers only, weight decay = 0.004) ● Optimization algorithm (AdamOptimizer) ● Initial learning rate (0.004) ● Initial weights (normal distribution with mean = 0, stddev = 5e-4)
  • 20. Data partition ● Data is split into two parts, training data and validation data, with a ratio of 80:20 ● After each epoch, the trained model is applied to the validation data to evaluate the loss and prediction accuracy ● At each training step, a batch of batch_size (KaoNet: 64) samples is loaded at random from the training set
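
The data partition on slide 20 can be sketched in plain Python as follows. The file names, labels, and function names are hypothetical, and sampling each batch without replacement is an assumption; the slides only say that data is loaded randomly from the training set.

```python
import random


def split_train_validation(samples, train_ratio=0.8, seed=0):
    """Shuffle once, then cut into training and validation parts (80:20)."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def random_batch(train_set, batch_size=64, rng=random):
    """Draw one training batch at random (without replacement inside the batch)."""
    return rng.sample(train_set, batch_size)


# Hypothetical data set: (file name, label) pairs.
samples = [(f"img_{i:04d}.jpg", i % 2) for i in range(1000)]

train_set, validation_set = split_train_validation(samples)
batch = random_batch(train_set, rng=random.Random(1))

print(len(train_set), len(validation_set), len(batch))
```

Keeping the validation part disjoint from the training part is what makes the per-epoch validation loss a useful signal for the overfitting discussed on the following slides.
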
  • 21. Source code ● The TensorFlow tutorials for CIFAR-10 and MNIST are good samples: https://www.tensorflow.org/tutorials/deep_cnn https://www.tensorflow.org/tutorials/layers ● Our source code (not public yet): https://github.com/vanhuyz/KaoNet
  • 22. Let’s run ● Training with 26 targets resulted in fair accuracy on the training set but extremely poor accuracy on the validation set => Overfitting ● [Figures: two plots over training steps, each comparing train set and validation set]
  • 23. Why did it fail? ● Causes: ○ The model is too complex compared to the number of samples in the training set ○ The number of samples per target varied too much; some targets had several times more samples than others ● Solutions: ○ Simplify the model => not a good choice if the application is to be extended ○ Increase the number of samples => not enough time ○ Train only on targets that have a sufficient number of samples => worth trying
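
The last option on slide 23 (train only on targets with enough samples) amounts to a simple filter over per-target counts. The sketch below uses made-up target names, counts, and a made-up threshold; none of these numbers come from the KaoNet data set.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Largest class size divided by smallest class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def select_targets(labels, min_samples):
    """Names of the targets that have at least `min_samples` samples."""
    counts = Counter(labels)
    return sorted(name for name, n in counts.items() if n >= min_samples)


# Hypothetical label list: one entry per cropped face image.
labels = ["person_a"] * 600 + ["person_b"] * 550 + ["person_c"] * 120

print(imbalance_ratio(labels))  # some targets have several times more samples

kept = select_targets(labels, min_samples=500)
kept_labels = [name for name in labels if name in kept]
print(kept, imbalance_ratio(kept_labels))
```
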
  • 24. The Ultimate 2 ● One way to fix the problem is to use a sample set that is more evenly balanced and contains more data ● The Ultimate 2: 10K photos for each target
  • 25. Validation accuracy is greatly improved ● Loss drops close to zero after 10K training steps ● Training accuracy reached 100% before 5K steps ● Validation accuracy improved greatly compared to the previous data set ● [Figures: plots over training steps]
  • 26. Training environment ● Use all the resources we can: ○ MacBook Pro (CPU) ○ Dell Vostro desktop (CPU) ○ AWS GPU instance g2.8xlarge ($2.7/h) → cost us about $100 in total ○ GeForce GTX 1080 (GPU) → thanks to Galapagos Inc for the support!
  • 28. Embedding visualization ● Present the output vector of the last fully connected layer for each input image ● Each image is represented by a 512-dimensional vector ● The high-dimensional vectors are compressed into 3-dimensional vectors using PCA for visualization → Let’s check on TensorBoard
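
The PCA step on slide 28 can be sketched with NumPy. This is an illustration, not the TensorBoard implementation: the embeddings here are random stand-ins for the 512-dimensional fc2 outputs, and `pca_project` is a hypothetical helper name.

```python
import numpy as np


def pca_project(embeddings, n_components=3):
    """Project row vectors onto their first `n_components` principal axes."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Random stand-in for the 512-dimensional embeddings of 200 images.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 512))

points_3d = pca_project(embeddings)
print(points_3d.shape)
```

With real embeddings from a trained classifier, images of the same person would be expected to land close together in the projected space.
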
  • 29. Future of KaoNet ● Biometric security: using face recognition to replace physical locks ● Face search ● Finding criminals using CCTV
  • 30. Part 2: Face Generation
  • 31. Generative model [1] ● Explicitly or implicitly models the distribution of the data ● By sampling from that model, it is possible to generate synthetic data points in the data space ● [1] C. Bishop, 2006. Pattern Recognition and Machine Learning, p. 43
  • 32. Generative Adversarial Networks (GAN) ● Question (Quora, 2016): "What are some recent and potentially upcoming breakthroughs in deep learning?" ● Answer: "The most important one, in my opinion, is adversarial training (also called GAN for Generative Adversarial Networks)... This, and the variations that are now being proposed is the most interesting idea in the last 10 years in ML, in my opinion." - Yann LeCun, Director of AI Research at Facebook (https://www.quora.com/What-are-some-recent-and-potentially-upcoming-breakthroughs-in-deep-learning)
  • 33. GAN [1] ● Based on a game-theoretic scenario in which the generator network must compete against an adversary [2] ○ The generator network directly produces “fake” samples ○ The discriminator network attempts to distinguish between samples drawn from the training data and samples drawn from the generator ● The 2 networks are trained simultaneously ○ The discriminator learns to correctly classify samples as real or fake ○ The generator learns to fool the discriminator into believing its samples are real ● At convergence, the generator’s samples are indistinguishable from real data, and the discriminator outputs ½ everywhere ● [1] Goodfellow, 2014 [2] Goodfellow et al, 2016. Deep Learning, p. 702
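
The two competing objectives on slide 33 can be written as cross-entropy losses. The sketch below follows the standard formulation from Goodfellow (2014), with the non-saturating generator loss suggested in that paper; the discriminator outputs are made-up numbers, not values from the KaoNet experiments.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Mean of -[log D(x) + log(1 - D(G(z)))] over paired real/fake samples."""
    pairs = list(zip(d_real, d_fake))
    return -sum(math.log(r) + math.log(1.0 - f) for r, f in pairs) / len(pairs)


def generator_loss(d_fake):
    """Non-saturating generator loss: mean of -log D(G(z))."""
    return -sum(math.log(f) for f in d_fake) / len(d_fake)


# Early in training (illustrative values): the discriminator is confident,
# so its loss is small and the generator's loss is large.
early_d = discriminator_loss([0.9, 0.95], [0.1, 0.05])
early_g = generator_loss([0.1, 0.05])

# At convergence the discriminator outputs 1/2 everywhere.
converged_d = discriminator_loss([0.5, 0.5], [0.5, 0.5])  # equals 2 * ln 2
converged_g = generator_loss([0.5, 0.5])                  # equals ln 2

print(early_d, early_g, converged_d, converged_g)
```

When the discriminator outputs ½ for every sample it can do no better than guessing, which is the equilibrium the last bullet of slide 33 describes.
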
  • 34. GAN in easy words... ● A criminal tries to print fake money ● A police officer tries to distinguish fake money from real money ● At first, with outdated technology, the criminal just prints some “random papers”, so the police officer can easily detect the fake money ● The criminal learns from that and then improves his technology
  • 35. GAN in easy words... ● As the fake money becomes more and more realistic, the police officer also has to improve his detection skills ● As a result, the criminal and the police officer learn from each other and continuously improve themselves ● Finally, when the fake money looks so realistic that the police officer cannot tell the difference, the world is over!
  • 36. In the GAN world ● The criminal is called the Generator ● The police officer is called the Discriminator ● The Generator and the Discriminator are usually neural networks (but this is not required) ● GAN’s problems: ○ Training is unstable ○ Non-convergence
  • 37. GAN’s application: Image-to-Image Translation (Isola et al, 2016)
  • 38. Deep Convolutional Generative Adversarial Networks (DCGAN) [1] ● Both the Generator and the Discriminator are deep convolutional neural networks ● Some techniques are applied for stable training: ○ Replace pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator) ○ Use batch normalization ○ Remove fully connected hidden layers ○ Use LeakyReLU activation in the discriminator for all layers ○ ... ● [1] Radford et al, 2015
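
How strided and fractional-strided convolutions replace pooling (slide 38) can be seen from the spatial sizes alone. The sketch below assumes stride 2 with "same" padding and a 4⨉4 to 64⨉64 generator, as in the architecture figure of Radford et al (2015); the helper names are hypothetical, and the LeakyReLU slope of 0.2 is the value used in that paper.

```python
def strided_conv_size(size, stride=2):
    """'Same'-padded strided convolution: spatial size shrinks by the stride."""
    return -(-size // stride)  # ceiling division


def fractional_strided_conv_size(size, stride=2):
    """'Same'-padded fractional-strided convolution: spatial size grows."""
    return size * stride


def leaky_relu(x, alpha=0.2):
    """Like ReLU, but negative inputs keep a small slope instead of becoming 0."""
    return x if x > 0 else alpha * x


# Generator: upsample a 4x4 feature map to a 64x64 image, no pooling involved.
generator_sizes = [4]
while generator_sizes[-1] < 64:
    generator_sizes.append(fractional_strided_conv_size(generator_sizes[-1]))

# Discriminator: downsample the 64x64 image with strided convolutions.
discriminator_sizes = [64]
while discriminator_sizes[-1] > 4:
    discriminator_sizes.append(strided_conv_size(discriminator_sizes[-1]))

print(generator_sizes, discriminator_sizes, leaky_relu(-1.0))
```

Because the down- and up-sampling are done by convolutions with learned weights, the networks learn their own spatial resampling instead of relying on a fixed pooling rule.
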
  • 39. Generator network in DCGAN (Radford et al, 2015)
  • 40. Experiment: Train DCGAN on our celebrity dataset ● [Figures: generated samples at step 0, step 1000, and step 40000]
  • 41. Experiment: Train DCGAN on the Ultimate 2 dataset ● [Figures: generated samples at step 0, step 1000, and step 15000]
  • 42. Conclusion ● We have introduced the step-by-step process of developing an application based on Deep Learning ● Succeeded in creating a face classification app based on a CNN ● Achieved 98% accuracy on the validation set and good results on test data ● Successfully generated images using DCGAN