SlideShare una empresa de Scribd logo
1 de 27
Image Search at Facebook:
Making sense of one of the
largest image databases in the
world
Fedor Borisyuk, engineering leader at Facebook
A bit about me
• Fedor Borisyuk
• At Facebook since April 2017
• Lead ML teams in the domains
• Computer vision
Agenda
1. Photo Search product
2. Photo Search at FB
3. Deep dive: Large scale image classification
4. Deep dive: Optical character recognition
5. Q & A
1. Overview of Photo Search product
Photo Search at Facebook
•Social Photos – posted by
friends
•Public photos – posted by
people to be publicly visible
•Over a billion images uploaded
every day
What people are searching for
https://unsplash.com/photos/eIvu9C94UfY
https://unsplash.com/photos/c9H7UzXK7uk
https://unsplash.com/photos/yihlaRCCvd4
https://unsplash.com/photos/UWw9OD3pIMo
https://unsplash.com/photos/4V07cUP8Sxc
https://unsplash.com/photos/FBXuXp57eM0
https://unsplash.com/photos/PGnqT0rXWLs
Friends photos
Celebrities
Products
Memes
https://unsplash.com/photos/EzH46XCDQRY
Recipes
Music/Movies
Places
Sport events
News
https://www.nps.gov/locations/alaska/news.htm
What people are searching for
https://unsplash.com/photos/yihlaRCCvd4
Query: running dog meme
https://unsplash.com/photos/DIZBFTl7c-A
Query: child pink skirt
https://en.wikipedia.org/wiki/Strelitzia#/media/File:Strelitzia_larger.jpg
Query: strelitzia
2. Photo Search at FB
Unicorn: Infrastructure of search
* Unicorn: A System for Searching the Social Graph, VLDB, 2013, Mike Curtiss et al.
Photo Search Ranking pipeline
Search
request
Retrieval
1st stage
ranking
2nd stage
ranking
Models
https://code.fb.com/ml-applications/under-the-hood-photo-search/
Overview ML Technologies
• CNNs for large scale image classification
• Ranking
• Neural networks
• GBDTs
• Features based on:
• Image clustering
• Image tagging
• Image quality
• Multimodal relationship between Query and Image
• Optical character recognition
Modeling similarity between query and image
• Multilingual query embeddings trained using Fasttext (https://github.com/facebookresearch/fastText)
• Image embeddings trained on ResNeXt
Extending Photos with textual description
Publication: Multi-model similarity propagation and its application for web image retrieval, Xin-Jing Wang at el.
Photos are coming from:
https://unsplash.com/photos/3WhQe8sEBZU
https://unsplash.com/photos/ie8giTVBVxE
https://unsplash.com/photos/9FWfFy4N4R8
https://unsplash.com/photos/a90WklNaPBM
https://unsplash.com/photos/9EwxGJdTJNo
3. Deep dive: Large scale image classification
Large scale Image classification
• Architecture: ResNeXt 101 with >800
million parameters
• Train data: 3.5 billion public images and
17,000 hashtags
ECCV, 2018
Supervised Unsupervised
ImageNet: Cat, dog, … #cat, #dog, …
Weakly supervised
Large scale Image classification: Noise
Large scale Image
classification
• Labels collision
• utilize WordNet to merge some
hashtags into a single canonical form
(e.g., #brownbear and #ursusarctos
are merged)
• Skewed label distribution
• Square root sampling
Image search at facebook - making sense of one of the largest image databases in the world
4. Deep dive: Optical character recognition
Optical Character Recognition
• OCR is a process of conversion of electronic images into machine
encoded text
Optical character recognition
KDD, 2018
OCR End-to-end Process
Text Detection Model
• Faster R-CNN performs detection and object recognition by:
Learn
CNN Image
Representati
on
Learn region
proposal network
to produce
bounding boxes
Learn classifier to
recognize if box
contains text
Remove duplicate
overlapping boxes
Learn regression to
refine boxes
coordinates
• CNN ResNet-18 architecture
• Cast as sequence prediction problem:
• Input: the image containing the text
• Output: sequence of characters
• Use Connectionist Temporal Classification (CTC) loss to train
Text Recognition Model
• Recognition model inference:
• in linear time by greedily taking the most likely
character at every position
• recognize words of arbitrary length and out-of-
vocabulary words
Text Recognition Model
• CTC model harder to train as model consistently diverged
• Curriculum learning – start easy:
• short words <= 5 characters
• low learning rate so the model doesn’t diverge
Curriculum learning training
Q & A

Más contenido relacionado

Similar a Image search at facebook - making sense of one of the largest image databases in the world

Analyzing data about our data
Analyzing data about our dataAnalyzing data about our data
Analyzing data about our dataHeather Piwowar
 
Bring Satellite and Drone Imagery into your Data Science Workflows
Bring Satellite and Drone Imagery into your Data Science WorkflowsBring Satellite and Drone Imagery into your Data Science Workflows
Bring Satellite and Drone Imagery into your Data Science WorkflowsDatabricks
 
NISO REST Training IIIF
NISO REST Training IIIF NISO REST Training IIIF
NISO REST Training IIIF Glen Robson
 
Mainstreaming Digital Imaging: Missouri Botanical Garden Archives
Mainstreaming Digital Imaging: Missouri Botanical Garden Archives Mainstreaming Digital Imaging: Missouri Botanical Garden Archives
Mainstreaming Digital Imaging: Missouri Botanical Garden Archives Chris Freeland
 
Mobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesMobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesUnited States Air Force Academy
 
Generating Audio-Visual Slideshows from Text Articles Using Word Concreteness
Generating Audio-Visual Slideshows from Text Articles Using Word ConcretenessGenerating Audio-Visual Slideshows from Text Articles Using Word Concreteness
Generating Audio-Visual Slideshows from Text Articles Using Word Concretenessivaderivader
 
Phd Colloquium Spatial Analysis
Phd Colloquium Spatial AnalysisPhd Colloquium Spatial Analysis
Phd Colloquium Spatial Analysisalistairleak
 
IIIF Introduction and Opportunities at Cornell
IIIF Introduction and Opportunities at CornellIIIF Introduction and Opportunities at Cornell
IIIF Introduction and Opportunities at CornellSimeon Warner
 
Searching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonSearching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonJonathon Hare
 
Searching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonSearching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonJonathon Hare
 
How the Web of Data Will be Won
How the Web of Data Will be WonHow the Web of Data Will be Won
How the Web of Data Will be WonJeni Tennison
 
Evolving a Medical Image Similarity Search
Evolving a Medical Image Similarity SearchEvolving a Medical Image Similarity Search
Evolving a Medical Image Similarity SearchSujit Pal
 
There is a method to it: Making meaning in information research through a mix...
There is a method to it: Making meaning in information research through a mix...There is a method to it: Making meaning in information research through a mix...
There is a method to it: Making meaning in information research through a mix...Lynn Connaway
 
Similarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia contentSimilarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia contentSymeon Papadopoulos
 
Using a Wiki for Collaboration and Coordination
Using a Wiki for Collaboration and CoordinationUsing a Wiki for Collaboration and Coordination
Using a Wiki for Collaboration and CoordinationConnie Crosby
 
DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinDBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinAnja Jentzsch
 
Introduction talk to Computer Vision
Introduction talk to Computer Vision Introduction talk to Computer Vision
Introduction talk to Computer Vision Chen Sagiv
 
Conservation's Digital Landscape: one conservator's perspective
Conservation's Digital Landscape: one conservator's perspectiveConservation's Digital Landscape: one conservator's perspective
Conservation's Digital Landscape: one conservator's perspectiveNancie Ravenel
 
Exploring the Application Potential of Relational Web Tables
Exploring the Application Potential of Relational Web TablesExploring the Application Potential of Relational Web Tables
Exploring the Application Potential of Relational Web TablesChris Bizer
 

Similar a Image search at facebook - making sense of one of the largest image databases in the world (20)

Analyzing data about our data
Analyzing data about our dataAnalyzing data about our data
Analyzing data about our data
 
Bring Satellite and Drone Imagery into your Data Science Workflows
Bring Satellite and Drone Imagery into your Data Science WorkflowsBring Satellite and Drone Imagery into your Data Science Workflows
Bring Satellite and Drone Imagery into your Data Science Workflows
 
NISO REST Training IIIF
NISO REST Training IIIF NISO REST Training IIIF
NISO REST Training IIIF
 
Mainstreaming Digital Imaging: Missouri Botanical Garden Archives
Mainstreaming Digital Imaging: Missouri Botanical Garden Archives Mainstreaming Digital Imaging: Missouri Botanical Garden Archives
Mainstreaming Digital Imaging: Missouri Botanical Garden Archives
 
Mobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesMobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large Repositories
 
Generating Audio-Visual Slideshows from Text Articles Using Word Concreteness
Generating Audio-Visual Slideshows from Text Articles Using Word ConcretenessGenerating Audio-Visual Slideshows from Text Articles Using Word Concreteness
Generating Audio-Visual Slideshows from Text Articles Using Word Concreteness
 
Phd Colloquium Spatial Analysis
Phd Colloquium Spatial AnalysisPhd Colloquium Spatial Analysis
Phd Colloquium Spatial Analysis
 
IIIF Introduction and Opportunities at Cornell
IIIF Introduction and Opportunities at CornellIIIF Introduction and Opportunities at Cornell
IIIF Introduction and Opportunities at Cornell
 
Searching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonSearching Images: Recent research at Southampton
Searching Images: Recent research at Southampton
 
Searching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonSearching Images: Recent research at Southampton
Searching Images: Recent research at Southampton
 
How the Web of Data Will be Won
How the Web of Data Will be WonHow the Web of Data Will be Won
How the Web of Data Will be Won
 
Evolving a Medical Image Similarity Search
Evolving a Medical Image Similarity SearchEvolving a Medical Image Similarity Search
Evolving a Medical Image Similarity Search
 
Public art project
Public art projectPublic art project
Public art project
 
There is a method to it: Making meaning in information research through a mix...
There is a method to it: Making meaning in information research through a mix...There is a method to it: Making meaning in information research through a mix...
There is a method to it: Making meaning in information research through a mix...
 
Similarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia contentSimilarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia content
 
Using a Wiki for Collaboration and Coordination
Using a Wiki for Collaboration and CoordinationUsing a Wiki for Collaboration and Coordination
Using a Wiki for Collaboration and Coordination
 
DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinDBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
 
Introduction talk to Computer Vision
Introduction talk to Computer Vision Introduction talk to Computer Vision
Introduction talk to Computer Vision
 
Conservation's Digital Landscape: one conservator's perspective
Conservation's Digital Landscape: one conservator's perspectiveConservation's Digital Landscape: one conservator's perspective
Conservation's Digital Landscape: one conservator's perspective
 
Exploring the Application Potential of Relational Web Tables
Exploring the Application Potential of Relational Web TablesExploring the Application Potential of Relational Web Tables
Exploring the Application Potential of Relational Web Tables
 

Más de MLconf

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...MLconf
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingMLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...MLconf
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushMLconf
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceMLconf
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...MLconf
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMLconf
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionMLconf
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLMLconf
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksMLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldMLconf
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...MLconf
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...MLconf
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...MLconf
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeMLconf
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...MLconf
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareMLconf
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesMLconf
 

Más de MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Último

UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataSafe Software
 
RAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AIRAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AIUdaiappa Ramachandran
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
Things you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceThings you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceMartin Humpolec
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?SANGHEE SHIN
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdfJamie (Taka) Wang
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 

Último (20)

UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
 
RAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AIRAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AI
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
Things you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceThings you didn't know you can use in your Salesforce
Things you didn't know you can use in your Salesforce
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 

Image search at facebook - making sense of one of the largest image databases in the world

  • 1. Image Search at Facebook: Making sense of one of the largest image databases in the world Fedor Borisyuk, engineering leader at Facebook
  • 2. A bit about me • Fedor Borisyuk • At Facebook since April 2017 • Lead ML teams in the domains • Computer vision
  • 3. Agenda 1. Photo Search product 2. Photo Search at FB 3. Deep dive: Large scale image classification 4. Deep dive: Optical character recognition 5. Q & A
  • 4. 1. Overview of Photo Search product
  • 5. Photo Search at Facebook •Social Photos – posted by friends •Public photos – posted by people to be publicly visible •Over a billion images uploaded every day
  • 6. What people are searching for https://unsplash.com/photos/eIvu9C94UfY https://unsplash.com/photos/c9H7UzXK7uk https://unsplash.com/photos/yihlaRCCvd4 https://unsplash.com/photos/UWw9OD3pIMo https://unsplash.com/photos/4V07cUP8Sxc https://unsplash.com/photos/FBXuXp57eM0 https://unsplash.com/photos/PGnqT0rXWLs Friends photos Celebrities Products Memes https://unsplash.com/photos/EzH46XCDQRY Recipes Music/Movies Places Sport events News https://www.nps.gov/locations/alaska/news.htm
  • 7. What people are searching for https://unsplash.com/photos/yihlaRCCvd4 Query: running dog meme https://unsplash.com/photos/DIZBFTl7c-A Query: child pink skirt https://en.wikipedia.org/wiki/Strelitzia#/media/File:Strelitzia_larger.jpg Query: strelitzia
  • 9. Unicorn: Infrastructure of search * Unicorn: A System for Searching the Social Graph, VLDB, 2013, Mike Curtiss et al.
  • 10. Photo Search Ranking pipeline Search request Retrieval 1st stage ranking 2nd stage ranking Models https://code.fb.com/ml-applications/under-the-hood-photo-search/
  • 11. Overview ML Technologies • CNNs for large scale image classification • Ranking • Neural networks • GBDTs • Features based on: • Image clustering • Image tagging • Image quality • Multimodal relationship between Query and Image • Optical character recognition
  • 12. Modeling similarity between query and image • Multilingual query embeddings trained using Fasttext (https://github.com/facebookresearch/fastText) • Image embeddings trained on ResNeXt
  • 13. Extending Photos with textual description Publication: Multi-model similarity propagation and its application for web image retrieval, Xin-Jing Wang at el. Photos are coming from: https://unsplash.com/photos/3WhQe8sEBZU https://unsplash.com/photos/ie8giTVBVxE https://unsplash.com/photos/9FWfFy4N4R8 https://unsplash.com/photos/a90WklNaPBM https://unsplash.com/photos/9EwxGJdTJNo
  • 14. 3. Deep dive: Large scale image classification
  • 15. Large scale Image classification • Architecture: ResNeXt 101 with >800 million parameters • Train data: 3.5 billion public images and 17,000 hashtags ECCV, 2018 Supervised Unsupervised ImageNet: Cat, dog, … #cat, #dog, … Weakly supervised
  • 16. Large scale Image classification: Noise
  • 17. Large scale Image classification • Labels collision • utilize WordNet to merge some hashtags into a single canonical form (e.g., #brownbear and #ursusarctos are merged) • Skewed label distribution • Square root sampling
  • 19. 4. Deep dive: Optical character recognition
  • 20. Optical Character Recognition • OCR is a process of conversion of electronic images into machine encoded text
  • 23. Text Detection Model • Faster R-CNN performs detection and object recognition by: Learn CNN Image Representati on Learn region proposal network to produce bounding boxes Learn classifier to recognize if box contains text Remove duplicate overlapping boxes Learn regression to refine boxes coordinates
  • 24. • CNN ResNet-18 architecture • Cast as sequence prediction problem: • Input: the image containing the text • Output: sequence of characters • Use Connectionist Temporal Classification (CTC) loss to train Text Recognition Model
  • 25. • Recognition model inference: • in linear time by greedily taking the most likely character at every position • recognize words of arbitrary length and out-of- vocabulary words Text Recognition Model
  • 26. • CTC model harder to train as model consistently diverged • Curriculum learning – start easy: • short words <= 5 characters • low learning rate so the model doesn’t diverge Curriculum learning training
  • 27. Q & A