Deriving Conversational Insight by Learning Emoji Representations
Jeff Weintraub
VP, Technology
a You & Mr Jones company
//BigDataLA2017
AGENDA
1. Emoji Adoption
2. Emojineering
3. Conversational Insight
Product & Technology Development
1. Emoji Adoption
Emoji Adoption - Instagram
October 2011: Emoji keyboard launches on iOS
10% of Instagram comments contained emoji (Nov 2011)
50%+ of Instagram comments contained emoji (March 2015)
See https://engineering.instagram.com/emojineering-part-1-machine-learning-for-emoji-trends-7f5f9cb979ad
Emoji Adoption - Instagram
2,666 emojis in the Unicode Standard as of May 2017
-0.93 correlation coefficient within respective cohorts
See https://engineering.instagram.com/emojineering-part-1-machine-learning-for-emoji-trends-7f5f9cb979ad
2. Emojineering
Emojineering
Ford GTs are the [emoji] (Pos)
Ford GTs are [emoji] (Neg)
Emojineering
NLP Semantic Analysis
- N-gram Neural Network Language Model (NNLM)
- Trained with stochastic gradient descent (SGD) and backpropagation
Continuous Skip-gram Model
- Maximize classification of a word based on another word in the same sentence.
- Q = training complexity; the goal is to minimize it so the model can be trained efficiently on more data
- C is the maximum distance of the words
- V is the size of the vocabulary (the output layer dimensionality)
See Mikolov et al., Efficient Estimation of Word Representations in Vector Space, 2013
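The symbols Q, C, and V on this slide match the training-complexity expression for the continuous skip-gram model in the cited Mikolov et al. paper. The slide's own formula did not survive extraction, so the expression below is taken from that paper rather than from the slide. D, the dimensionality of the word vectors, is not defined on the slide.

```latex
Q = C \times \left( D + D \times \log_2 V \right)
```

The \log_2 V term reflects the hierarchical softmax output layer described in that paper, which is why a larger vocabulary raises the cost only logarithmically.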
Emojineering
Skip-gram Model
- If we choose C = 5, for each training word we will randomly select a number R in the range <1; C>, and then use R words from the history and R words from the future of the current word as correct labels.
- Increasing the range improves the quality of the resulting word vectors, but it also increases the computational complexity.
See Mikolov et al., Efficient Estimation of Word Representations in Vector Space, 2013
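The window sampling described on this slide can be sketched as follows. This is a minimal illustration and not the code used in the talk: the function name `skipgram_pairs`, the sample tokens, and the specific emoji are assumptions made for the example.

```python
import random

def skipgram_pairs(tokens, max_window=5, rng=None):
    """Yield (center, context) training pairs with a randomly sized window.

    For each position, R is drawn uniformly from 1..max_window (C on the
    slide). Up to R tokens before and R tokens after the center word are
    then emitted as the labels the model should predict.
    """
    rng = rng or random.Random()
    for i, center in enumerate(tokens):
        r = rng.randint(1, max_window)       # R in <1; C>, both ends inclusive
        lo = max(0, i - r)                   # clip the window at sentence start
        hi = min(len(tokens), i + r + 1)     # clip the window at sentence end
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]

# Hypothetical comment, tokenized so that the emoji is its own token.
tokens = ["ford", "gts", "are", "the", "best", "🔥"]
pairs = list(skipgram_pairs(tokens, max_window=5, rng=random.Random(0)))
```

Because R is redrawn for every center word, nearby words end up in more training pairs than distant ones, which weights close context more heavily without any explicit weighting term.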
Emojineering
Distributional Hypothesis
Words that occur in similar contexts tend to have similar meanings (Harris, 1954; Firth, 1957; Deerwester et al., 1990)
Training Accuracy
- 300-dimensional vectors; words and emojis
- 3 million phrases
- 6B tokens
the, Ford, GT
cars, Ford, :)
See https://engineering.instagram.com/emojineering-part-1-machine-learning-for-emoji-trends-7f5f9cb979ad
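The distributional hypothesis can be illustrated with plain co-occurrence counts before any neural model is involved. The sketch below is an assumption-laden toy: the corpus is invented, the helper names `context_vectors` and `cosine` are hypothetical, and raw counts stand in for the learned 300-dimensional vectors mentioned on the slide.

```python
from collections import Counter
from math import sqrt

def context_vectors(sentences, window=2):
    """Count, for every token, which tokens appear within `window` positions."""
    vectors = {}
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            ctx = vectors.setdefault(tok, Counter())
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i:
                    ctx[tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Toy corpus (invented for illustration); ":)" is treated as an ordinary token.
corpus = [
    ["the", "ford", "gt", "is", "great"],
    ["the", "ford", "gt", "is", ":)"],
    ["my", "taxes", "are", "late"],
]
vecs = context_vectors(corpus)
sim_related = cosine(vecs["great"], vecs[":)"])      # same contexts, high similarity
sim_unrelated = cosine(vecs["great"], vecs["late"])  # no shared contexts
```

Here ":)" and "great" appear after the same words, so their context vectors point in the same direction, while "late" shares no context with either.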
Emojineering - Visualization
the, Ford, GT
cars, Ford, :)
Emojineering
Distributional Hypothesis
Words that occur in similar contexts tend to have similar meanings (Harris, 1954; Firth, 1957; Deerwester et al., 1990)
100 Billion Words
Model contains 300-dimensional vectors for 3 million words and phrases
the, Ford, GT
cars, Ford, :)
3. Conversational Insight
Conversational Insight - Entertainment Vertical
65.23% of emojis used were Top 10 emojis
34.7% of emoji uses were 😂 and 😍
30.01% of emojis used were semantically relevant to key words
Conversational Insight - Retail Vertical
58.14% of emojis used were Top 10 emojis
22.5% of emoji uses were ❤ and 😍
11.78% of emojis used were semantically relevant to key words
Conversational Insight - Beauty Vertical
71.22% of emojis used were Top 10 emojis
37.8% of emoji uses were 😂 and 😍
4% of emojis used were semantically relevant to key words
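A "Top 10" share like the ones reported for each vertical can be computed from a flat list of emoji occurrences. The sketch below only illustrates that arithmetic: the function name `top_k_share` and the sample data are invented, and it assumes emoji have already been extracted from the comments upstream.

```python
from collections import Counter

def top_k_share(emoji_uses, k=10):
    """Fraction of all emoji uses accounted for by the k most frequent emojis."""
    counts = Counter(emoji_uses)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / total

# Invented sample: one list entry per emoji occurrence in a set of comments.
uses = ["😂"] * 5 + ["😍"] * 3 + ["❤"] * 1 + ["🔥"] * 1
share_top2 = top_k_share(uses, k=2)   # 8 of the 10 uses come from the top two
```

Calling the same function with `k=2` gives the kind of two-emoji share reported on the slides, so one helper covers both figures.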
Thank You!
jeff@theamplify.com
@jeff_weintraub
a You & Mr Jones company
