Why Neural Net Field Aware Factorization Machines are able to break ground in digital behaviours prediction
Presenter: Gunjan Sharma
Co-Author: Varun Kumar Modi
About the Authors
Presenter: Gunjan Sharma
System Architect @ InMobi (3 years)
SE @Facebook (2.5 Years)
DPE @Google (1 year)
Twitter Handle: @gunjan_1409
LinkedIn: https://www.linkedin.com/in/gunjan-sharma-a6794414/
Co-author: Varun Kumar Modi
Sr Research Scientist @ InMobi(5 years)
LinkedIn: https://www.linkedin.com/in/varun-modi-33800652/
Content
1) The problem and context
2) The Motivation
3) Building the model theory: piece by piece
4) Results of the 2 use cases
5) Understanding exactly why it works
6) Implementation at InMobi scale
InMobi is one of the largest advertising platforms at scale globally
InMobi reaches >2 billion MAU across the world, specialised in mobile in-app advertising
[Map: InMobi regions - North America, Latin America, EMEA, Africa, India+, SEA, China, Korea, Japan, ANZ]
Regional callouts on the map (China, APAC, North America (only Video)):
● Consolidation has taken place to clean up the ecosystem; few advertising platforms at scale exist
● Very limited number of players have a presence in Asia; InMobi is dominating
● Few players control each component of the chain; no presence of global players, except InMobi
Problem statement and why it matters
● What are the problems:
Use case 1 - Conversion ratio (CVR) prediction:
- CVR = Install rate of users = Probability of an install given a click
- Usage: CPM = CTR * CVR * CPI
Use case 2 - Video completion rate (VCR) prediction:
- Video completion rate of users watching advertising videos, given a click
● Why are they important:
○ Performance business, based on arbitrage: the model directly determines the margin/profit of the business and the ability of a campaign to achieve significant scale => multi-million dollar businesses!
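As a worked example of the usage formula above, the sketch below multiplies out CTR, CVR and CPI. All numbers are hypothetical, and the per-thousand scaling in `ecpm` is an assumption (the slide writes the product without it).

```python
def expected_value_per_impression(ctr, cvr, cpi):
    """P(click) * P(install | click) * payout per install."""
    return ctr * cvr * cpi


# Hypothetical numbers, for illustration only.
ctr = 0.02   # 2% of impressions are clicked
cvr = 0.01   # 1% of clicks lead to an install
cpi = 2.50   # payout per install, in dollars

value = expected_value_per_impression(ctr, cvr, cpi)
ecpm = 1000 * value  # assumption: CPM is quoted per thousand impressions
```

An error in the predicted CVR scales the bid by the same factor, which is why the model feeds directly into margin.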
Existing context and challenges
● Models traditionally used: Linear/Logistic Regression (LR) and tree-based models
● Both have their strengths and weaknesses when used in production
● What we need is an awesome model that sits somewhere in the middle and brings in the best of both worlds
LR | Tree based
Generalises to unseen combinations | Could not, in our use cases
Can underfit at times | Can overfit at times
Requires less RAM | Can bloat RAM usage at times, especially with high-cardinality features
Why think of NN for CVR/VCR prediction
● Using cross features in LR wasn’t cutting it for us.
● Plus, at some point it becomes cumbersome at both training and prediction time.
● All the major predictions noted here follow a complex curve.
● LR left much to be desired compared to tree-based models, for example because its interaction terms are limited.
● We tried a couple of awesome models that were also not able to beat tree-based models.
We all agreed that neural nets are a suitable technology for finding higher-order interactions between our features.
At the same time, they have the power of generalising to unseen combinations.
Challenges Involved
● Traditionally NNs are more utilized for classification problems
● We want to model our predictions as a regression problem
● Most of the features are categorical, which means we need to use one-hot encoding
● This causes NNs to give very bad results, as they need a lot of data to train efficiently.
● Plus, the cardinality of some features is very high, which makes life more troublesome.
● Model should be easy to productionise for both training and serving
● Spark isn’t suited for custom NN networks.
● Model should be as debuggable as possible, to be able to explain the business changes
● The resistance to using NNs for a long time came from the lack of understanding of their internals
Consider the following dummy dataset
Publisher Advertiser Gender CVR
ESPN Nike Male 0.01
CNBC Nike Male 0.0004
ESPN Adidas Female 0.008
Sony Coke Female 0.0005
Sony P&G Male 0.002
Factorization Machine (FM) - What are those
One-hot columns: ESPN CNBC SONY | Adidas Nike Coke P&G | Male Female
Publisher Advertiser Gender CVR
ESPN Nike Male 0.01
One-hot encoding: 1 0 0 | 0 1 0 0 | 1 0
Publisher Latent Vector (PV) = $(X_0, X_1, X_2)$
Advertiser Latent Vector (AV) = $(Y_0, Y_1, Y_2)$
Gender Latent Vector (GV) = $(Z_0, Z_1, Z_2)$
$PV^T \cdot AV + AV^T \cdot GV + GV^T \cdot PV = pCVR$
NOTE: All vectors are K-dimensional, where K is a hyperparameter of the algorithm (three components are shown here)
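The second-order sum above can be sketched in plain Python. This is a minimal illustration only: the latent vectors are random rather than learned, and the bias and first-order terms of a full FM are left out. The names `latent` and `fm_second_order` are hypothetical.

```python
import random
from itertools import combinations

K = 3  # latent dimension (hyperparameter)
random.seed(0)

values = ["ESPN", "CNBC", "Sony", "Adidas", "Nike", "Coke", "P&G", "Male", "Female"]
# One K-dimensional latent vector per feature value (random here, learned in practice).
latent = {v: [random.gauss(0, 0.1) for _ in range(K)] for v in values}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fm_second_order(active):
    """Sum of dot products over every pair of active feature values."""
    return sum(dot(latent[a], latent[b]) for a, b in combinations(active, 2))


row = ["ESPN", "Nike", "Male"]   # the one-hot row 1 0 0 | 0 1 0 0 | 1 0
score = fm_second_order(row)     # PV.AV + PV.GV + AV.GV
```

Because only the active (non-zero) one-hot columns contribute, the cost depends on the number of fields rather than on feature cardinality.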
Factorization Machine (FM) - What are those
● K-dimensional representation for every feature value
● Captures second-order interactions across all the features ($A^T B = |A| \cdot |B| \cdot \cos\theta$)
● Essentially a combination of hyperbolas summed up to form the final prediction
● Works better than LR, but tree-based models are still more powerful.
● E.g.: predict a movie’s revenue:
○ Features: Movie, City, Gender
○ Latent features: Horror, Comedy, Action, Romance
Second Order Intuition
● For every latent feature
● For every pair of original features
● How much does this latent feature affect revenue when considering this pair?
Final predicted revenue is a linear sum over all latent features
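Field aware Factorization Machine (FFM)
One-hot columns: ESPN CNBC SONY | Adidas Nike Coke P&G | Male Female
Publisher Advertiser Gender CVR
ESPN Nike Male 0.01
One-hot encoding: 1 0 0 | 0 1 0 0 | 1 0
Publisher latent vectors: $PV_A = (X^A_0, X^A_1, X^A_2)$, $PV_G = (X^G_0, X^G_1, X^G_2)$
Advertiser latent vectors: $AV_P = (Y^P_0, Y^P_1, Y^P_2)$, $AV_G = (Y^G_0, Y^G_1, Y^G_2)$
Gender latent vectors: $GV_P = (Z^P_0, Z^P_1, Z^P_2)$, $GV_A = (Z^A_0, Z^A_1, Z^A_2)$
$PV_A^T \cdot AV_P + AV_G^T \cdot GV_A + GV_P^T \cdot PV_G = pCVR$
NOTE: All vectors are K-dimensional, where K is a hyperparameter of the algorithm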
Field aware Factorization Machine (FFM)
● We have a K-dimensional vector for every feature value, one for every other feature type (field)
● Still second-order interactions, but with more degrees of freedom than FM
● Intuition: latent features interact with every other cross feature differently
Works significantly better than FM, but at certain cuts was still not able to beat the tree-based model
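A minimal sketch of the field-aware lookup, assuming random (untrained) vectors and omitting bias and linear terms. The key convention `latent[(value, other_field)]` is a hypothetical choice for illustration: it holds the vector a value uses when it interacts with a value from `other_field`.

```python
import random
from itertools import combinations

K = 3
random.seed(0)

fields = {
    "Publisher": ["ESPN", "CNBC", "Sony"],
    "Advertiser": ["Adidas", "Nike", "Coke", "P&G"],
    "Gender": ["Male", "Female"],
}

# One vector per (feature value, other field) pair.
latent = {
    (v, other): [random.gauss(0, 0.1) for _ in range(K)]
    for f, vals in fields.items()
    for v in vals
    for other in fields
    if other != f
}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def ffm_second_order(row):
    """row maps field name -> active value."""
    total = 0.0
    for (f1, v1), (f2, v2) in combinations(row.items(), 2):
        # v1 uses its vector for field f2, v2 uses its vector for field f1.
        total += dot(latent[(v1, f2)], latent[(v2, f1)])
    return total


row = {"Publisher": "ESPN", "Advertiser": "Nike", "Gender": "Male"}
score = ffm_second_order(row)
```

With 9 feature values and 2 other fields each, this toy model stores 18 vectors where FM stores 9. That is the extra degree of freedom the slide refers to.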
Deep neural-net with Factorisation Machine: DeepFM
Sigmoid(FM + NeuralNet(PV :+ AV :+ GV)) = pCVR
(here :+ joins the latent vectors end to end, i.e. concatenation)
DeepFM
● Now we are entering the neural net world
● This model is a combination of FM and NN; the final prediction is the sum of the outputs of the 2 models
● Here we optimize the entire graph together.
● It performs better than taking the latent vectors from FM and then running them through a neural net as a secondary optimization (FNN)
● It performs better than FM but not better than FFM
● Intuition: FM finds the second-order interactions, while the neural net uses the latent vectors to find the higher-order nonlinear interactions.
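The DeepFM formula above can be sketched as a forward pass in plain Python. This is an untrained illustration under stated assumptions: random weights, one hidden ReLU layer of hypothetical size `HIDDEN`, and no first-order terms.

```python
import math
import random
from itertools import combinations

random.seed(0)
K, HIDDEN = 3, 4


def rand_vec(n):
    return [random.gauss(0, 0.1) for _ in range(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dense(xs, W, b):
    """Fully connected layer: one output per weight row."""
    return [dot(row, xs) + bi for row, bi in zip(W, b)]


def relu(xs):
    return [max(0.0, x) for x in xs]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Latent vectors shared by the FM part and the deep part.
pv, av, gv = rand_vec(K), rand_vec(K), rand_vec(K)
W1, b1 = [rand_vec(3 * K) for _ in range(HIDDEN)], [0.0] * HIDDEN
W2, b2 = [rand_vec(HIDDEN)], [0.0]


def deepfm(vectors):
    fm = sum(dot(a, b) for a, b in combinations(vectors, 2))   # second order
    concat = [x for v in vectors for x in v]                   # PV :+ AV :+ GV
    deep = dense(relu(dense(concat, W1, b1)), W2, b2)[0]       # higher order
    return sigmoid(fm + deep)


p_cvr = deepfm([pv, av, gv])
```

Both parts read the same `pv`, `av`, `gv`, so in training the gradients from the FM term and from the network would update the same embeddings. This is the joint optimisation the slide contrasts with FNN.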
Neural Factorization Machine: NFM
NeuralNet((PV.*AV .+ AV.*GV .+ GV.*PV)^T) = pCVR
(here .* is the element-wise product and .+ the element-wise sum)
NFM
● In this architecture you run only the second-order features through the NN, instead of the raw latent vectors
● Intuition: the neural net takes the second-order interactions and uses them to find the higher-order nonlinear interactions
● Performs better than DeepFM, mostly attributed to 2 facts:
○ The size of the net is smaller, hence it converges faster.
○ The neural net can take the second-order interactions and convert them easily to higher-order interactions.
● But still not better than FFM
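A minimal sketch of the NFM forward pass, assuming random untrained weights and a single hidden ReLU layer (the layer sizes and the helper name `bi_interaction` are hypothetical). The pairwise element-wise products are summed into one K-dimensional vector, which is the only input the network sees.

```python
import random
from itertools import combinations

random.seed(0)
K, HIDDEN = 3, 4


def rand_vec(n):
    return [random.gauss(0, 0.1) for _ in range(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dense(xs, W, b):
    return [dot(row, xs) + bi for row, bi in zip(W, b)]


def relu(xs):
    return [max(0.0, x) for x in xs]


def bi_interaction(vectors):
    """Element-wise products of every pair, summed into one K-dim vector."""
    pooled = [0.0] * K
    for a, b in combinations(vectors, 2):
        pooled = [p + x * y for p, x, y in zip(pooled, a, b)]
    return pooled


pv, av, gv = rand_vec(K), rand_vec(K), rand_vec(K)
W1, b1 = [rand_vec(K) for _ in range(HIDDEN)], [0.0] * HIDDEN
W2, b2 = [rand_vec(HIDDEN)], [0.0]


def nfm(vectors):
    pooled = bi_interaction(vectors)
    return dense(relu(dense(pooled, W1, b1)), W2, b2)[0]


y_pred = nfm([pv, av, gv])
```

The network input is K wide rather than 3K as in the DeepFM sketch, which matches the "smaller net" point above. Summing the components of the pooled vector recovers the plain FM second-order score.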
InMobi Spec: DeepFFM
[Architecture diagram: Sparse Features (Feature1, Feature2, Feature3) -> Dense Embeddings (Feature1: F2E, F3E; Feature2: F1E, F3E; Feature3: F1E, F2E) -> Hidden Layers and FF Machine -> Act -> Ypred]
InMobi Spec: DeepFFM
● A simple upgrade to DeepFM
● Performs better than both DeepFM and FFM
● Training is slower
● The FFM part does the majority of the prediction heavy lifting, evidently due to faster gradient convergence.
● Intuition: take the latent vectors and run them through the NN for higher-order interactions, and use FFM for second-order interactions.
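One way to read the DeepFFM intuition above is sketched below. It is not the authors' implementation: the sigmoid output activation (borrowed from the DeepFM formula), the single hidden layer, and the random weights are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations

random.seed(0)
K, HIDDEN = 3, 4
FIELDS = ["Publisher", "Advertiser", "Gender"]


def rand_vec(n):
    return [random.gauss(0, 0.1) for _ in range(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dense(xs, W, b):
    return [dot(row, xs) + bi for row, bi in zip(W, b)]


def relu(xs):
    return [max(0.0, x) for x in xs]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# emb[(f, g)]: embedding of the active value of field f, used against field g.
emb = {(f, g): rand_vec(K) for f in FIELDS for g in FIELDS if f != g}

N_IN = len(emb) * K  # every field-aware embedding, concatenated
W1, b1 = [rand_vec(N_IN) for _ in range(HIDDEN)], [0.0] * HIDDEN
W2, b2 = [rand_vec(HIDDEN)], [0.0]


def ffm_term(emb):
    """Second-order part: field-aware dot product for every pair of fields."""
    return sum(dot(emb[(f, g)], emb[(g, f)]) for f, g in combinations(FIELDS, 2))


def deep_term(emb):
    """Higher-order part: raw embeddings through the hidden layer."""
    concat = [x for key in sorted(emb) for x in emb[key]]
    return dense(relu(dense(concat, W1, b1)), W2, b2)[0]


def deepffm(emb):
    return sigmoid(ffm_term(emb) + deep_term(emb))


y_pred = deepffm(emb)
```

With three fields the network already takes 6 x K inputs, and the count grows quadratically with the number of fields. That is consistent with the slide's note that training is slower.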
InMobi Spec: NFFM
[Architecture diagram: Sparse Features (Feature1, Feature2, Feature3) -> Dense Embeddings (Feature1: F2E, F3E; Feature2: F1E, F3E; Feature3: F1E, F2E) -> FF Machine -> K inputs -> Hidden Layers -> Ypred]
InMobi Spec: NFFM
● A simple upgrade to NFM
● Does significantly better than all the other models.
● Converges faster than DeepFFM
● Intuition: take the second-order interactions from FFM and run them through a neural net to find higher-order nonlinear interactions.
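The NFFM intuition can be sketched by combining the two earlier ideas: field-aware vectors for the pairwise products, and a network on top of the pooled result. This is a hedged reading, not the authors' code. Summing the pairwise products into a single K-dimensional input is an assumption based on the "K inputs" label in the diagram; the weights are random and the output is left without an activation.

```python
import random
from itertools import combinations

random.seed(0)
K, HIDDEN = 3, 4
FIELDS = ["Publisher", "Advertiser", "Gender"]


def rand_vec(n):
    return [random.gauss(0, 0.1) for _ in range(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dense(xs, W, b):
    return [dot(row, xs) + bi for row, bi in zip(W, b)]


def relu(xs):
    return [max(0.0, x) for x in xs]


# emb[(f, g)]: embedding of the active value of field f, used against field g.
emb = {(f, g): rand_vec(K) for f in FIELDS for g in FIELDS if f != g}
W1, b1 = [rand_vec(K) for _ in range(HIDDEN)], [0.0] * HIDDEN
W2, b2 = [rand_vec(HIDDEN)], [0.0]


def pooled_interactions(emb):
    """Field-aware element-wise products, summed over all field pairs."""
    pooled = [0.0] * K
    for f, g in combinations(FIELDS, 2):
        pooled = [p + a * b for p, a, b in zip(pooled, emb[(f, g)], emb[(g, f)])]
    return pooled


def nffm(emb):
    hidden = relu(dense(pooled_interactions(emb), W1, b1))
    return dense(hidden, W2, b2)[0]


y_pred = nffm(emb)
```

Under this reading the network input stays K wide however many fields there are, which fits the claim that NFFM converges faster than DeepFFM.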
Use case 1 - Results CVR
Accuracy function: $\frac{\sum_i W_i \cdot |Y^{act}_i - Y^{pred}_i|}{\sum_i W_i}$
Model | FFM | DeepFM | DeepFFM | NFFM
Accuracy % improvement over linear model (small DS) | 44% | 35% | 48% | 64%
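The accuracy function above is a weighted mean absolute error. A small sketch, with hypothetical weights and rates chosen only to make the arithmetic easy to follow:

```python
def weighted_abs_error(weights, actual, predicted):
    """sum(W_i * |Yact_i - Ypred_i|) / sum(W_i)"""
    num = sum(w * abs(a - p) for w, a, p in zip(weights, actual, predicted))
    return num / sum(weights)


# Hypothetical data: two segments, weighted by 100 and 300.
weights = [100, 300]
actual = [0.010, 0.002]
predicted = [0.012, 0.001]

err = weighted_abs_error(weights, actual, predicted)
# (100 * 0.002 + 300 * 0.001) / 400 = 0.00125
```

Lower is better, so the percentages in the table are read as reductions in this error relative to the linear model.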
Use case 1 - Results CVR
Training data dates | Test date | Accuracy % improvement over linear model
T1-T7 | T7 | 21%
T1-T7 | T8 | 14%
T2-T8 | T8 | 20%
T2-T8 | T9 | 14%
Cut | % improvement over tree model
Cut1 | 21.7%
Cut2 | 18.5%
Use case 2 - Results VCR
Error function (AEPV - Absolute Error Per View):
$\text{AEPV} = \frac{\sum_i \left[ (Views_i - Cmpltd_i) \cdot |Y^{pred}_i| + Cmpltd_i \cdot |1 - Y^{pred}_i| \right]}{\sum_i Views_i}$
% AEPV improvement by Country OS cut, over the last-7-day average model:
Cut | Logistic Reg | Logistic Reg (2nd order autoregressive features) | LR (GBT based feature engineering) | NFFM
Cut1 | -3.71% | 2.30% | 2.51% | 3.00%
Cut2 | -2.16% | 3.05% | 4.48% | 28.83%
Cut3 | -0.31% | -0.56% | 5.65% | 12.47%
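The AEPV error function above can be computed from aggregated counts. In this sketch each row is a segment with a view count, a completed-view count and one predicted completion rate; the numbers are hypothetical.

```python
def aepv(views, completed, predicted):
    """Absolute Error Per View over aggregated segments.

    Views that were not completed contribute |Ypred - 0| each,
    completed views contribute |1 - Ypred| each.
    """
    num = sum(
        (v - c) * abs(p) + c * abs(1 - p)
        for v, c, p in zip(views, completed, predicted)
    )
    return num / sum(views)


# Hypothetical data for two segments.
views = [100, 50]
completed = [80, 10]
predicted = [0.7, 0.3]

error = aepv(views, completed, predicted)
# (20*0.7 + 80*0.3 + 40*0.3 + 10*0.7) / 150 = 57 / 150 = 0.38
```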
Use case 2 - Results VCR
● LR with L2 regularisation
● 2nd-order features were selected based on an Information Gain criterion
● The GBT package in Spark MLlib was used (numTrees = 400, maxDepth = 8, sampling = 0.5, minInstancePerNode = 10).
○ The training process was too slow, even with large enough resources.
○ XGBoost with Spark (tried later) was faster and resulted in further improvements
● NFFM: increasing the number of layers up to 3 resulted in a further 20% improvement in validation error; no significant improvement after that
Building the full intuition
Factorisation machine:
● Handles categorical features and a sparse data matrix
● Extracts latent variables, e.g., identifying non-explicit segment profiles in the population
Field-aware:
● Dimensionality reduction (high-cardinality features to a K-dimensional representation)
● Increases degrees of freedom (compared to FM, in terms of field-specific values) to enable an exhaustive set of second-order interactions
Neural network:
● Explores and weights higher-order interactions - went up to 3 layers of interaction successfully
● Generates a numerical prediction
● Trains the factors based on the performance of both the FM and the neural net (instead of training them separately, which limits the latent vectors to the power of FM)
Implementation details
● Hyperparameters are k, lambda, number of layers, number of nodes per layer, and activation functions
● Implemented in TensorFlow
● Adam optimizer
● L2 regularization. No dropout
● No batch normalization
● 1 layer of 100 nodes performs well enough and saves compute
● ReLU activations (converge faster)
● k=16 (try powers of 2)
● Weighted RMSE as the loss function for both use cases
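The slide does not spell out the weighting scheme, so the sketch below assumes the common form: per-row weights on the squared error, normalised by the total weight. The L2 penalty on the parameters (the `lambda` hyperparameter) is left out.

```python
import math


def weighted_rmse(weights, actual, predicted):
    """sqrt( sum(w_i * (y_i - p_i)^2) / sum(w_i) ), an assumed definition."""
    num = sum(w * (a - p) ** 2 for w, a, p in zip(weights, actual, predicted))
    return math.sqrt(num / sum(weights))


# Hypothetical data: the second row carries three times the weight.
loss = weighted_rmse([1, 3], [1.0, 0.0], [0.0, 0.0])
# sqrt((1 * 1.0 + 3 * 0.0) / 4) = 0.5
```

A natural choice of weight would be the number of clicks or views behind each aggregated row, though the deck does not say which was used.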
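Predicting for unseen feature values
● Avg latent feature interactions per feature for unknown values
Publisher values: ESPN, CNBC, SONY, UNKNOWN?
ESPN: $(X^A_0, X^A_1, X^A_2)$, $(X^G_0, X^G_1, X^G_2)$
CNBC: $(Y^A_0, Y^A_1, Y^A_2)$, $(Y^G_0, Y^G_1, Y^G_2)$
SONY: $(Z^A_0, Z^A_1, Z^A_2)$, $(Z^G_0, Z^G_1, Z^G_2)$
UNKNOWN: $\left(\frac{X^A_0+Y^A_0+Z^A_0}{3}, \frac{X^A_1+Y^A_1+Z^A_1}{3}, \frac{X^A_2+Y^A_2+Z^A_2}{3}\right)$, $\left(\frac{X^G_0+Y^G_0+Z^G_0}{3}, \frac{X^G_1+Y^G_1+Z^G_1}{3}, \frac{X^G_2+Y^G_2+Z^G_2}{3}\right)$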
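The averaging rule above amounts to a component-wise mean over the known values of a field, computed separately for each field-aware vector. A minimal sketch with made-up numbers (the dictionary name and its values are hypothetical):

```python
def mean_vector(vectors):
    """Component-wise average of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Hypothetical publisher vectors used against the Advertiser field.
publisher_vs_advertiser = {
    "ESPN": [0.75, 0.00, 1.50],
    "CNBC": [0.00, 0.75, 0.75],
    "SONY": [0.75, 0.75, 0.00],
}

unknown_vs_advertiser = mean_vector(list(publisher_vs_advertiser.values()))
# -> [0.5, 0.5, 0.75]
```

The same call would be repeated for the vectors used against the Gender field, giving the unseen publisher one averaged vector per other field.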
Implementing @ low-latency, high-scale
● MLeap: the MLeap framework supports models trained in both Spark and TensorFlow. It lets us train tree-based models in Spark and NN-based models in TF
● Offline training and challenges: we cannot train TF models on the YARN cluster, hence we use a GPU machine as a gateway to pull data from HDFS and train on the GPU
● Online serving challenges: TF Serving has pretty low throughput and wasn’t scaling for our QPS. Hence we use a local LRU cache with a decent TTL to scale TF Serving
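The caching idea in the last bullet can be sketched as a small LRU cache with a TTL in front of the model call. This is an illustration, not InMobi's serving code: `TTLLRUCache`, `predict_with_cache` and `remote_predict` are hypothetical names, and `remote_predict` stands in for whatever client actually calls TF Serving.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """LRU cache whose entries also expire after ttl_seconds."""

    def __init__(self, max_size, ttl_seconds, clock=time.monotonic):
        self.max_size = max_size
        self.ttl = ttl_seconds
        self.clock = clock
        self._data = OrderedDict()  # key -> (expiry_time, value)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        expiry, value = item
        if self.clock() >= expiry:      # stale entry: drop it
            del self._data[key]
            return None
        self._data.move_to_end(key)     # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self.clock() + self.ttl, value)
        self._data.move_to_end(key)
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict least recently used


def predict_with_cache(cache, features, remote_predict):
    """Serve from cache when possible, else call the model and remember it."""
    key = tuple(sorted(features.items()))
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = remote_predict(features)
    cache.put(key, value)
    return value
```

This works because many requests share the same categorical feature combination, so repeated combinations skip the model call. The TTL bounds how long a prediction from an older model version can be served.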
Future research that we are currently pursuing...
● Hybrid Binning NFFM
● Distributed training and serving
● Dropouts & Batch Normalization
● Methods to interpret the latent vectors (using methods like t-Distributed Stochastic Neighbour Embedding (t-SNE), etc.)
References
FM: https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf
FFM: http://research.criteo.com/ctr-prediction-linear-model-field-aware-factorization-machines/
DeepFM: https://arxiv.org/pdf/1703.04247.pdf
NFM: https://arxiv.org/pdf/1708.05027.pdf
GBT Based Feature Engg: http://quinonero.net/Publications/predicting-clicks-facebook.pdf
Thank You!

More Related Content

What's hot

Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender SystemsJustin Basilico
 
[Phd Thesis Defense] CHAMELEON: A Deep Learning Meta-Architecture for News Re...
[Phd Thesis Defense] CHAMELEON: A Deep Learning Meta-Architecture for News Re...[Phd Thesis Defense] CHAMELEON: A Deep Learning Meta-Architecture for News Re...
[Phd Thesis Defense] CHAMELEON: A Deep Learning Meta-Architecture for News Re...Gabriel Moreira
 
Accelerated Training of Transformer Models
Accelerated Training of Transformer ModelsAccelerated Training of Transformer Models
Accelerated Training of Transformer ModelsDatabricks
 
The Factorization Machines algorithm for building recommendation system - Paw...
The Factorization Machines algorithm for building recommendation system - Paw...The Factorization Machines algorithm for building recommendation system - Paw...
The Factorization Machines algorithm for building recommendation system - Paw...Evention
 
Recent Object Detection Research & Person Detection
Recent Object Detection Research & Person DetectionRecent Object Detection Research & Person Detection
Recent Object Detection Research & Person DetectionKai-Wen Zhao
 
Transformer in Computer Vision
Transformer in Computer VisionTransformer in Computer Vision
Transformer in Computer VisionDongmin Choi
 
Introduction To Machine Learning | Edureka
Introduction To Machine Learning | EdurekaIntroduction To Machine Learning | Edureka
Introduction To Machine Learning | EdurekaEdureka!
 
Data Quality for Machine Learning Tasks
Data Quality for Machine Learning TasksData Quality for Machine Learning Tasks
Data Quality for Machine Learning TasksHima Patel
 
Graph neural networks overview
Graph neural networks overviewGraph neural networks overview
Graph neural networks overviewRodion Kiryukhin
 
Fine tuning large LMs
Fine tuning large LMsFine tuning large LMs
Fine tuning large LMsSylvainGugger
 
Presentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIPresentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIQualcomm Research
 
Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Hady Elsahar
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsBenjamin Le
 
Distributed machine learning
Distributed machine learningDistributed machine learning
Distributed machine learningStanley Wang
 
Generative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsGenerative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsArtifacia
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Hwa Pyung Kim
 

What's hot (20)

Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
lecun-01.ppt
lecun-01.pptlecun-01.ppt
lecun-01.ppt
 
BERT
BERTBERT
BERT
 
[Phd Thesis Defense] CHAMELEON: A Deep Learning Meta-Architecture for News Re...
[Phd Thesis Defense] CHAMELEON: A Deep Learning Meta-Architecture for News Re...[Phd Thesis Defense] CHAMELEON: A Deep Learning Meta-Architecture for News Re...
[Phd Thesis Defense] CHAMELEON: A Deep Learning Meta-Architecture for News Re...
 
Accelerated Training of Transformer Models
Accelerated Training of Transformer ModelsAccelerated Training of Transformer Models
Accelerated Training of Transformer Models
 
The Factorization Machines algorithm for building recommendation system - Paw...
The Factorization Machines algorithm for building recommendation system - Paw...The Factorization Machines algorithm for building recommendation system - Paw...
The Factorization Machines algorithm for building recommendation system - Paw...
 
Recent Object Detection Research & Person Detection
Recent Object Detection Research & Person DetectionRecent Object Detection Research & Person Detection
Recent Object Detection Research & Person Detection
 
Transformer in Computer Vision
Transformer in Computer VisionTransformer in Computer Vision
Transformer in Computer Vision
 
Introduction To Machine Learning | Edureka
Introduction To Machine Learning | EdurekaIntroduction To Machine Learning | Edureka
Introduction To Machine Learning | Edureka
 
Swin transformer
Swin transformerSwin transformer
Swin transformer
 
Bert
BertBert
Bert
 
Data Quality for Machine Learning Tasks
Data Quality for Machine Learning TasksData Quality for Machine Learning Tasks
Data Quality for Machine Learning Tasks
 
Graph neural networks overview
Graph neural networks overviewGraph neural networks overview
Graph neural networks overview
 
Fine tuning large LMs
Fine tuning large LMsFine tuning large LMs
Fine tuning large LMs
 
Presentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIPresentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AI
 
Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Neural Language Generation Head to Toe
Neural Language Generation Head to Toe
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender Systems
 
Distributed machine learning
Distributed machine learningDistributed machine learning
Distributed machine learning
 
Generative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsGenerative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their Applications
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)
 

Similar to Neural Field aware Factorization Machine

Automated Speech Recognition
Automated Speech Recognition Automated Speech Recognition
Automated Speech Recognition Pruthvij Thakar
 
IRJET- American Sign Language Classification
IRJET- American Sign Language ClassificationIRJET- American Sign Language Classification
IRJET- American Sign Language ClassificationIRJET Journal
 
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...Edge AI and Vision Alliance
 
NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...
NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...
NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...ssuser4b1f48
 
Intelligent Systems Project: Bike sharing service modeling
Intelligent Systems Project: Bike sharing service modelingIntelligent Systems Project: Bike sharing service modeling
Intelligent Systems Project: Bike sharing service modelingAlessio Villardita
 
IRJET- Mango Classification using Convolutional Neural Networks
IRJET- Mango Classification using Convolutional Neural NetworksIRJET- Mango Classification using Convolutional Neural Networks
IRJET- Mango Classification using Convolutional Neural NetworksIRJET Journal
 
STOCK MARKET PREDICTION USING NEURAL NETWORKS
STOCK MARKET PREDICTION USING NEURAL NETWORKSSTOCK MARKET PREDICTION USING NEURAL NETWORKS
STOCK MARKET PREDICTION USING NEURAL NETWORKSIRJET Journal
 
Web-Based Online Embedded Security System And Alertness Via Social Media
Web-Based Online Embedded Security System And Alertness Via Social MediaWeb-Based Online Embedded Security System And Alertness Via Social Media
Web-Based Online Embedded Security System And Alertness Via Social MediaIRJET Journal
 
Zipline—Airbnb’s Declarative Feature Engineering Framework
Zipline—Airbnb’s Declarative Feature Engineering FrameworkZipline—Airbnb’s Declarative Feature Engineering Framework
Zipline—Airbnb’s Declarative Feature Engineering FrameworkDatabricks
 
Understand and Harness the Capabilities of Intel® Xeon Phi™ Processors
Understand and Harness the Capabilities of Intel® Xeon Phi™ ProcessorsUnderstand and Harness the Capabilities of Intel® Xeon Phi™ Processors
Understand and Harness the Capabilities of Intel® Xeon Phi™ ProcessorsIntel® Software
 
FACE COUNTING USING OPEN CV & PYTHON FOR ANALYZING UNUSUAL EVENTS IN CROWDS
FACE COUNTING USING OPEN CV & PYTHON FOR ANALYZING UNUSUAL EVENTS IN CROWDSFACE COUNTING USING OPEN CV & PYTHON FOR ANALYZING UNUSUAL EVENTS IN CROWDS
FACE COUNTING USING OPEN CV & PYTHON FOR ANALYZING UNUSUAL EVENTS IN CROWDSIRJET Journal
 
Realtime selenium interview questions
Realtime selenium interview questionsRealtime selenium interview questions
Realtime selenium interview questionsKuldeep Pawar
 
IRJET- Implementation of Gender Detection with Notice Board using Raspberry Pi
IRJET- Implementation of Gender Detection with Notice Board using Raspberry PiIRJET- Implementation of Gender Detection with Notice Board using Raspberry Pi
IRJET- Implementation of Gender Detection with Notice Board using Raspberry PiIRJET Journal
 
MN-3, MN-Core and HPL - SC21 Green500 BOF
MN-3, MN-Core and HPL - SC21 Green500 BOFMN-3, MN-Core and HPL - SC21 Green500 BOF
MN-3, MN-Core and HPL - SC21 Green500 BOFPreferred Networks
 
Iwsm2014 sizing the entire development process (mauricio aguiar & luigi bug...
Iwsm2014   sizing the entire development process (mauricio aguiar & luigi bug...Iwsm2014   sizing the entire development process (mauricio aguiar & luigi bug...
Iwsm2014 sizing the entire development process (mauricio aguiar & luigi bug...Nesma
 
Choosing the Right Transformer for Your Data Challenge
Choosing the Right Transformer for Your Data ChallengeChoosing the Right Transformer for Your Data Challenge
Choosing the Right Transformer for Your Data ChallengeSafe Software
 

Similar to Neural Field aware Factorization Machine (20)

Automated Speech Recognition
Automated Speech Recognition Automated Speech Recognition
Automated Speech Recognition
 
Bitcoin Price Prediction
Bitcoin Price PredictionBitcoin Price Prediction
Bitcoin Price Prediction
 
IRJET- American Sign Language Classification
IRJET- American Sign Language ClassificationIRJET- American Sign Language Classification
IRJET- American Sign Language Classification
 
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
 
AI and Deep Learning
AI and Deep Learning AI and Deep Learning
AI and Deep Learning
 
NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...
NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...
NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...
 
Icbai 2018 ver_1
Icbai 2018 ver_1Icbai 2018 ver_1
Icbai 2018 ver_1
 
Intelligent Systems Project: Bike sharing service modeling
Intelligent Systems Project: Bike sharing service modelingIntelligent Systems Project: Bike sharing service modeling
Intelligent Systems Project: Bike sharing service modeling
 
IRJET- Mango Classification using Convolutional Neural Networks
IRJET- Mango Classification using Convolutional Neural NetworksIRJET- Mango Classification using Convolutional Neural Networks
IRJET- Mango Classification using Convolutional Neural Networks
 
STOCK MARKET PREDICTION USING NEURAL NETWORKS
STOCK MARKET PREDICTION USING NEURAL NETWORKSSTOCK MARKET PREDICTION USING NEURAL NETWORKS
STOCK MARKET PREDICTION USING NEURAL NETWORKS
 
Web-Based Online Embedded Security System And Alertness Via Social Media
Web-Based Online Embedded Security System And Alertness Via Social MediaWeb-Based Online Embedded Security System And Alertness Via Social Media
Web-Based Online Embedded Security System And Alertness Via Social Media
 
Zipline—Airbnb’s Declarative Feature Engineering Framework
Zipline—Airbnb’s Declarative Feature Engineering FrameworkZipline—Airbnb’s Declarative Feature Engineering Framework
Zipline—Airbnb’s Declarative Feature Engineering Framework
 
Understand and Harness the Capabilities of Intel® Xeon Phi™ Processors
Understand and Harness the Capabilities of Intel® Xeon Phi™ ProcessorsUnderstand and Harness the Capabilities of Intel® Xeon Phi™ Processors
Understand and Harness the Capabilities of Intel® Xeon Phi™ Processors
 
FACE COUNTING USING OPEN CV & PYTHON FOR ANALYZING UNUSUAL EVENTS IN CROWDS
FACE COUNTING USING OPEN CV & PYTHON FOR ANALYZING UNUSUAL EVENTS IN CROWDSFACE COUNTING USING OPEN CV & PYTHON FOR ANALYZING UNUSUAL EVENTS IN CROWDS
FACE COUNTING USING OPEN CV & PYTHON FOR ANALYZING UNUSUAL EVENTS IN CROWDS
 
Realtime selenium interview questions
Realtime selenium interview questionsRealtime selenium interview questions
Realtime selenium interview questions
 
IRJET- Implementation of Gender Detection with Notice Board using Raspberry Pi
IRJET- Implementation of Gender Detection with Notice Board using Raspberry PiIRJET- Implementation of Gender Detection with Notice Board using Raspberry Pi
IRJET- Implementation of Gender Detection with Notice Board using Raspberry Pi
 
Jc nov.07.2019
Jc nov.07.2019Jc nov.07.2019
Jc nov.07.2019
 
MN-3, MN-Core and HPL - SC21 Green500 BOF
MN-3, MN-Core and HPL - SC21 Green500 BOFMN-3, MN-Core and HPL - SC21 Green500 BOF
MN-3, MN-Core and HPL - SC21 Green500 BOF
 
Iwsm2014 sizing the entire development process (mauricio aguiar & luigi bug...
Iwsm2014   sizing the entire development process (mauricio aguiar & luigi bug...Iwsm2014   sizing the entire development process (mauricio aguiar & luigi bug...
Iwsm2014 sizing the entire development process (mauricio aguiar & luigi bug...
 
Choosing the Right Transformer for Your Data Challenge
Choosing the Right Transformer for Your Data ChallengeChoosing the Right Transformer for Your Data Challenge
Choosing the Right Transformer for Your Data Challenge
 

More from InMobi

Responding to Coronavirus: How marketers can leverage digital responsibly
Responding to Coronavirus: How marketers can leverage digital responsiblyResponding to Coronavirus: How marketers can leverage digital responsibly
Responding to Coronavirus: How marketers can leverage digital responsiblyInMobi
 
2020: Celebrating the Era of the Connected Consumer
2020: Celebrating the Era of the Connected Consumer2020: Celebrating the Era of the Connected Consumer
2020: Celebrating the Era of the Connected ConsumerInMobi
 
Winning the Indian Festive Shopper in 2019
Winning the Indian Festive Shopper in 2019Winning the Indian Festive Shopper in 2019
Winning the Indian Festive Shopper in 2019InMobi
 
The Changing Face of the Indian Mobile User
The Changing Face of the Indian Mobile UserThe Changing Face of the Indian Mobile User
The Changing Face of the Indian Mobile UserInMobi
 
Unlocking the True Potential of Data on Mobile
Unlocking the True Potential of Data on MobileUnlocking the True Potential of Data on Mobile
Unlocking the True Potential of Data on MobileInMobi
 
InMobi State of Mobile Video Advertising Report 2018
InMobi State of Mobile Video Advertising Report 2018InMobi State of Mobile Video Advertising Report 2018
InMobi State of Mobile Video Advertising Report 2018InMobi
 
The Essential Mediation Toolkit - Korean
The Essential Mediation Toolkit - KoreanThe Essential Mediation Toolkit - Korean
The Essential Mediation Toolkit - KoreanInMobi
 
A Comprehensive Guide for App Marketers
A Comprehensive Guide for App MarketersA Comprehensive Guide for App Marketers
A Comprehensive Guide for App MarketersInMobi
 
A Cure for Ad-Fraud: Turning Fraud Detection into Fraud Prevention
A Cure for Ad-Fraud: Turning Fraud Detection into Fraud PreventionA Cure for Ad-Fraud: Turning Fraud Detection into Fraud Prevention
A Cure for Ad-Fraud: Turning Fraud Detection into Fraud PreventionInMobi
 
[Webinar] driving accountability in mobile advertising
[Webinar] driving accountability in mobile advertising[Webinar] driving accountability in mobile advertising
[Webinar] driving accountability in mobile advertisingInMobi
 
The Brand Marketer's Guide to Mobile Video Viewability
Neural Field aware Factorization Machine

  • 1. Why Neural Net Field Aware Factorization Machines are able to break ground in digital behaviours prediction Presenter: Gunjan Sharma Co-Author: Varun Kumar Modi
  • 2. About the Authors
      Presenter: Gunjan Sharma
      System Architect @ InMobi (3 years), SE @ Facebook (2.5 years), DPE @ Google (1 year)
      Twitter handle: @gunjan_1409
      LinkedIn: https://www.linkedin.com/in/gunjan-sharma-a6794414/
      Co-author: Varun Kumar Modi
      Sr Research Scientist @ InMobi (5 years)
      LinkedIn: https://www.linkedin.com/in/varun-modi-33800652/
  • 3. Content 1) The problem and context 2) The Motivation 3) Building the model theory: piece by piece 4) Results of the 2 use cases 5) Understanding exactly why it works 6) Implementation at InMobi scale
  • 5. InMobi is one of the largest advertising platforms at scale globally
      InMobi reaches >2 billion MAU across the world, specialised in mobile in-app advertising
      [Map regions: North America, Latin America, EMEA, Africa, India + SEA, China, Japan, Korea, ANZ, APAC]
      Map callouts:
      ● Consolidation has taken place to clean up the ecosystem; few advertising platforms at scale exist
      ● North America (only Video)
      ● Very limited number of players have presence in Asia; InMobi is dominating
      ● Few players control each component of the chain; no presence of global players, except InMobi
  • 6. Problem statement and why it matters
      ● What are the problems:
        Use case 1 - Conversion ratio (CVR) prediction:
          - CVR = install rate of users = probability of an install given a click
          - Usage: CPM = CTR * CVR * CPI
        Use case 2 - Video completion rate (VCR) prediction:
          - Completion rate of users watching advertising videos, given a click
      ● Why they are important:
        ○ This is a performance business based on arbitrage, so the model directly determines the margin/profit of the business and the ability of a campaign to achieve significant scale => multi-million dollar businesses!
  • 7. Existing context and challenges
      ● Models traditionally used: Linear/Logistic Regression (LR) and tree-based models
      ● Both have their strengths and weaknesses when used in production
      ● What we need is an awesome model that sits somewhere in the middle and brings in the best of both worlds
      LR                                  | Tree based
      Generalises to unseen combinations  | Our use cases could not
      Can underfit at times               | Can overfit at times
      Requires less RAM                   | Can at times bloat RAM usage, especially with high-cardinality features
  • 9. Why think of NN for CVR/VCR prediction
      ● Using cross features in LR wasn't cutting it for us.
      ● Plus, at some point it becomes cumbersome at both training and prediction time.
      ● All the major predictions noted here follow a complex curve.
      ● LR left much to be desired compared to tree-based models, for example because interaction terms are limited.
      ● We tried a couple of awesome models that were also not able to beat tree-based models.
      We all agreed that neural nets are a suitable technology for finding higher-order interactions between our features. At the same time, they have the power of generalising to unseen combinations.
  • 10. Challenges Involved
      ● Traditionally, NNs are used more for classification problems
      ● We want to model our predictions as a regression problem
      ● Most of the features are categorical, which means we need to use one-hot encoding
      ● This causes NNs to produce very bad results, as they need a lot of data to train efficiently
      ● Plus, the cardinality of some features is very high, which makes life more troublesome
      ● The model should be easy to productionise for both training and serving
      ● Spark isn't suited for custom neural networks
      ● The model should be as debuggable as possible, to be able to explain business changes
      ● The long-standing resistance to using NNs came from the lack of understanding of their internals
  • 12. Consider the following dummy dataset
      Publisher | Advertiser | Gender | CVR
      ESPN      | Nike       | Male   | 0.01
      CNBC      | Nike       | Male   | 0.0004
      ESPN      | Adidas     | Female | 0.008
      Sony      | Coke       | Female | 0.0005
      Sony      | P&G        | Male   | 0.002
  • 13. Factorization Machine (FM) - What are those
      One-hot encoded features: Publisher (ESPN, CNBC, SONY), Advertiser (Adi, Nike, Coke, P&G), Gender (Male, Female)
      Example row: Publisher = ESPN, Advertiser = Nike, Gender = Male, CVR = 0.01; one-hot input: 1 0 0 | 0 1 0 0 | 1 0
      (X0, X1, X2) = Publisher Latent Vector (PV)
      (Y0, Y1, Y2) = Advertiser Latent Vector (AV)
      (Z0, Z1, Z2) = Gender Latent Vector (GV)
      PV^T * AV + AV^T * GV + GV^T * PV = pCVR
      NOTE: All vectors are K-dimensional, where K is a hyperparameter of the algorithm
  • 14. Factorization Machine (FM) - What are those
      ● A K-dimensional representation for every feature value
      ● Captures second-order interactions across all the features (A^T B = |A| * |B| * cos(Θ))
      ● Essentially a combination of hyperbolas summed up to form the final prediction
      ● Works better than LR, but tree-based models are still more powerful
      ● E.g., predicting a movie's revenue:
        Features: Movie, City, Gender
        Latent features: Horror, Comedy, Action, Romance
      Second-order intuition:
      ● For every latent feature
      ● For every pair of original features
      ● How much does this latent feature affect revenue when considering this pair?
      The final predicted revenue is a linear sum over all latent features
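The pairwise formula on the FM slide can be sketched in a few lines of plain Python. This is only an illustration: the latent values and the names `latent`, `dot` and `fm_second_order` are made up for the example (K = 2), and it computes only the second-order terms shown on the slide, whereas the FM paper in the references also includes a bias and linear terms.

```python
# Hypothetical 2-dimensional latent vectors (K = 2) for one row of the dummy dataset.
latent = {
    "publisher=ESPN": [0.10, 0.20],    # PV
    "advertiser=Nike": [0.30, -0.10],  # AV
    "gender=Male": [0.05, 0.40],       # GV
}

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def fm_second_order(active):
    """Sum of dot products over every pair of active feature values."""
    vecs = [latent[name] for name in active]
    total = 0.0
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            total += dot(vecs[i], vecs[j])
    return total

# PV.AV + PV.GV + AV.GV for the row (ESPN, Nike, Male)
score = fm_second_order(["publisher=ESPN", "advertiser=Nike", "gender=Male"])
```

Because each feature value owns a single vector, a pair that never co-occurred in training (say CNBC with Adidas) still gets a score from the two individually learned vectors.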
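  • 15. Field aware Factorization Machine (FFM)
      One-hot encoded features: Publisher (ESPN, CNBC, SONY), Advertiser (Adi, Nike, Coke, P&G), Gender (Male, Female)
      Example row: Publisher = ESPN, Advertiser = Nike, Gender = Male, CVR = 0.01; one-hot input: 1 0 0 | 0 1 0 0 | 1 0
      Each feature value has one latent vector per other field (P = Publisher, A = Advertiser, G = Gender):
      (X^A_0, X^A_1, X^A_2) = PV_A    (X^G_0, X^G_1, X^G_2) = PV_G
      (Y^P_0, Y^P_1, Y^P_2) = AV_P    (Y^G_0, Y^G_1, Y^G_2) = AV_G
      (Z^P_0, Z^P_1, Z^P_2) = GV_P    (Z^A_0, Z^A_1, Z^A_2) = GV_A
      PV_A^T * AV_P + AV_G^T * GV_A + GV_P^T * PV_G = pCVR
      NOTE: All vectors are K-dimensional, where K is a hyperparameter of the algorithm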
  • 16. Field aware Factorization Machine (FFM)
      ● We have a K-dimensional vector for every feature value, for every other feature type (field)
      ● Still second-order interactions, but with more degrees of freedom than FM
      ● Intuition: latent features interact with every other cross feature differently
      Works significantly better than FM, but at certain cuts was still not able to beat the tree-based model
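To make the "one vector per other field" idea concrete, here is a minimal sketch of the FFM pairwise sum in plain Python. The numbers and the names `ffm_latent`, `field_of` and `ffm_second_order` are hypothetical (K = 2); only the structure follows the slide.

```python
# Hypothetical field-aware latent vectors, keyed by (feature value, other field).
ffm_latent = {
    ("publisher=ESPN", "advertiser"): [0.1, 0.2],   # PV_A
    ("publisher=ESPN", "gender"): [0.3, 0.1],       # PV_G
    ("advertiser=Nike", "publisher"): [0.2, 0.0],   # AV_P
    ("advertiser=Nike", "gender"): [0.1, 0.1],      # AV_G
    ("gender=Male", "publisher"): [0.0, 0.5],       # GV_P
    ("gender=Male", "advertiser"): [0.4, 0.2],      # GV_A
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def field_of(name):
    """'publisher=ESPN' -> 'publisher'."""
    return name.split("=", 1)[0]

def ffm_second_order(active):
    """For each pair, use the vector each value keeps for the other one's field."""
    total = 0.0
    for i in range(len(active)):
        for j in range(i + 1, len(active)):
            a, b = active[i], active[j]
            va = ffm_latent[(a, field_of(b))]
            vb = ffm_latent[(b, field_of(a))]
            total += dot(va, vb)
    return total

# PV_A.AV_P + PV_G.GV_P + AV_G.GV_A for the row (ESPN, Nike, Male)
score = ffm_second_order(["publisher=ESPN", "advertiser=Nike", "gender=Male"])
```

Compared with FM, the number of vectors per feature value grows with the number of fields, which is where the extra degrees of freedom come from.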
  • 17. Deep neural-net with Factorisation Machine: DeepFM Sigmoid(FM + NeuralNet(PV :+ AV :+ GV)) = pCVR
  • 18. DeepFM
      ● Now we are entering the neural net world
      ● This model is a combination of FM and NN; the final prediction is the sum of the outputs of the two models
      ● Here we optimize the entire graph together
      ● It performs better than taking the latent vectors from FM and then running them through a neural net as a secondary optimization (FNN)
      ● It performs better than FM but not better than FFM
      ● Intuition: FM finds the second-order interactions, while the neural net uses the latent vectors to find the higher-order nonlinear interactions
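The DeepFM formula, Sigmoid(FM + NeuralNet(PV :+ AV :+ GV)), can be sketched as a forward pass in plain Python. This is a toy under stated assumptions: `:+` is read as concatenation, the net is a single hidden ReLU layer, and all weights and names (`dense_net`, `deepfm_predict`, `hidden_w`, ...) are invented for the example rather than taken from the talk.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fm_pairwise(vecs):
    """Second-order FM term: sum of dot products over all pairs."""
    return sum(dot(vecs[i], vecs[j])
               for i in range(len(vecs))
               for j in range(i + 1, len(vecs)))

def dense_net(inputs, hidden_w, hidden_b, out_w, out_b):
    """One hidden ReLU layer followed by a linear output unit."""
    hidden = [max(0.0, dot(w, inputs) + b) for w, b in zip(hidden_w, hidden_b)]
    return dot(out_w, hidden) + out_b

def deepfm_predict(vecs, hidden_w, hidden_b, out_w, out_b):
    # Both branches read the same latent vectors, so training updates them jointly.
    concat = [x for v in vecs for x in v]
    return sigmoid(fm_pairwise(vecs) + dense_net(concat, hidden_w, hidden_b, out_w, out_b))

# Hypothetical latent vectors (K = 2) and toy weights: 6 inputs -> 2 hidden -> 1 output.
pv, av, gv = [0.10, 0.20], [0.30, -0.10], [0.05, 0.40]
hidden_w = [[1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0]]
hidden_b = [0.0, 0.0]
out_w = [1.0, 1.0]
out_b = 0.0
p_cvr = deepfm_predict([pv, av, gv], hidden_w, hidden_b, out_w, out_b)
```

Here the FM branch contributes 0.07 and the toy net contributes 0.10 before the sigmoid; in a real model both the vectors and the weights would be learned together.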
  • 19. Neural Factorization Machine: NFM
      NeuralNet((PV.*AV .+ AV.*GV .+ GV.*PV)^T) = pCVR
  • 20. NFM
      ● In this architecture you run only the second-order features through the NN, instead of the raw latent vectors
      ● Intuition: the neural net takes the second-order interactions and uses them to find the higher-order nonlinear interactions
      ● Performs better than DeepFM, mostly attributed to two facts:
        ○ The net is smaller and hence converges faster
        ○ The neural net can take the second-order interactions and convert them easily to higher-order interactions
      ● Results were better than DeepFM, but still not better than FFM
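A rough sketch of the NFM formula, NeuralNet((PV.*AV .+ AV.*GV .+ GV.*PV)^T): the element-wise products of every pair of latent vectors are summed into one K-dimensional vector, and only that vector is fed to the net. The values, the one-hidden-layer net and the names `bi_interaction` and `nfm_predict` are assumptions made for the illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def bi_interaction(vecs):
    """Sum of element-wise products over all pairs; the result is K-dimensional."""
    k = len(vecs[0])
    pooled = [0.0] * k
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            for d in range(k):
                pooled[d] += vecs[i][d] * vecs[j][d]
    return pooled

def nfm_predict(vecs, hidden_w, hidden_b, out_w, out_b):
    """Feed the pooled second-order vector through one hidden ReLU layer."""
    pooled = bi_interaction(vecs)
    hidden = [max(0.0, dot(w, pooled) + b) for w, b in zip(hidden_w, hidden_b)]
    return dot(out_w, hidden) + out_b

# Hypothetical latent vectors (K = 2).
pv, av, gv = [0.10, 0.20], [0.30, -0.10], [0.05, 0.40]
pooled = bi_interaction([pv, av, gv])

# With a single hidden unit that just adds its inputs, NFM collapses to the FM score.
fm_like = nfm_predict([pv, av, gv], hidden_w=[[1.0, 1.0]], hidden_b=[0.0],
                      out_w=[1.0], out_b=0.0)
```

The net sees K inputs instead of K times the number of fields, which is consistent with the slide's point that the smaller net converges faster.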
  • 21. InMobi Spec: DeepFFM
      [Architecture diagram. Labels: Sparse Features (Feature1, Feature2, Feature3); Dense Embeddings (Feature1: F2E, F3E; Feature2: F1E, F3E; Feature3: F1E, F2E); Hidden Layers; FF Machine; Act; Ypred]
  • 22. InMobi Spec: DeepFFM
      ● A simple upgrade to DeepFM
      ● Performs better than both DeepFM and FFM
      ● Training is slower
      ● The FFM part does the majority of the prediction heavy lifting, evidently due to faster gradient convergence
      ● Intuition: take the latent vectors, run them through the NN for higher-order interactions, and use FFM for second-order interactions
  • 23. InMobi Spec: NFFM
      [Architecture diagram. Labels: Sparse Features (Feature1, Feature2, Feature3); Dense Embeddings (Feature1: F2E, F3E; Feature2: F1E, F3E; Feature3: F1E, F2E); FF Machine; K inputs; Hidden Layers; Ypred]
  • 24. InMobi Spec: NFFM
      ● A simple upgrade to NFM
      ● Does significantly better than all the other models
      ● Converges faster than DeepFFM
      ● Intuition: take the second-order interactions from FFM and run them through a neural net to find higher-order nonlinear interactions
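One way to read the NFFM intuition is sketched below in plain Python: field-aware vectors are multiplied element-wise per pair, summed into K values (matching the "K inputs" label on the diagram), and passed through a small net. This is an interpretation, not the authors' implementation; the pooling by summation, the values and the names `ffm_interaction_vector` and `nffm_predict` are all assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def field_of(name):
    return name.split("=", 1)[0]

def ffm_interaction_vector(active, vectors):
    """Sum over pairs of element-wise products of field-aware vectors (K values)."""
    k = len(next(iter(vectors.values())))
    pooled = [0.0] * k
    for i in range(len(active)):
        for j in range(i + 1, len(active)):
            a, b = active[i], active[j]
            va = vectors[(a, field_of(b))]
            vb = vectors[(b, field_of(a))]
            for d in range(k):
                pooled[d] += va[d] * vb[d]
    return pooled

def nffm_predict(active, vectors, hidden_w, hidden_b, out_w, out_b):
    """Run the K pooled second-order values through one hidden ReLU layer."""
    pooled = ffm_interaction_vector(active, vectors)
    hidden = [max(0.0, dot(w, pooled) + b) for w, b in zip(hidden_w, hidden_b)]
    return dot(out_w, hidden) + out_b

# Hypothetical field-aware vectors (K = 2), keyed by (feature value, other field).
vectors = {
    ("publisher=ESPN", "advertiser"): [0.1, 0.2],
    ("publisher=ESPN", "gender"): [0.3, 0.1],
    ("advertiser=Nike", "publisher"): [0.2, 0.0],
    ("advertiser=Nike", "gender"): [0.1, 0.1],
    ("gender=Male", "publisher"): [0.0, 0.5],
    ("gender=Male", "advertiser"): [0.4, 0.2],
}
row = ["publisher=ESPN", "advertiser=Nike", "gender=Male"]
pooled = ffm_interaction_vector(row, vectors)
y_pred = nffm_predict(row, vectors, hidden_w=[[1.0, 1.0]], hidden_b=[0.0],
                      out_w=[1.0], out_b=0.0)
```

With the trivial net used here the output equals the plain FFM score; a trained net with more units would reweight and combine the K values nonlinearly.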
  • 26. Use case 1 - Results CVR
      Accuracy function: (Σ Wᵢ * abs(Yactᵢ - Ypredᵢ)) / Σ Wᵢ
      Accuracy % improvement over linear model (small DS):
      Model       | FFM | DeepFM | DeepFFM | NFFM
      Improvement | 44% | 35%    | 48%     | 64%
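The accuracy function on this slide is a weighted mean absolute error, so lower values are better. A minimal sketch, with made-up weights and rates and a hypothetical function name:

```python
def weighted_abs_error(weights, actual, predicted):
    """(sum_i W_i * |Yact_i - Ypred_i|) / (sum_i W_i)."""
    numerator = sum(w * abs(a - p) for w, a, p in zip(weights, actual, predicted))
    return numerator / sum(weights)

# Two hypothetical segments: weights (e.g. click counts), actual CVR, predicted CVR.
error = weighted_abs_error(
    weights=[100, 300],
    actual=[0.010, 0.002],
    predicted=[0.012, 0.001],
)
# (100 * 0.002 + 300 * 0.001) / 400 = 0.00125
```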
  • 27. Use case 1 - Results CVR
      Training data dates | Test date | Accuracy % improvement over linear model
      T1-T7               | T7        | 21%
      T1-T7               | T8        | 14%
      T2-T8               | T8        | 20%
      T2-T8               | T9        | 14%
      % improvement over tree model:
      Cut1 | 21.7%
      Cut2 | 18.5%
  • 28. Use case 2 - Results VCR
      Error function (AEPV - Absolute Error Per View): (Σ ((Viewsᵢ - Cmpltdᵢ) * abs(Ypredᵢ) + Cmpltdᵢ * abs(1 - Ypredᵢ))) / Σ Viewsᵢ
      % AEPV improvement by Country OS cut over last 7 day average model:
      Cut  | Logistic Reg | Logistic Reg (2nd order Autoregressive features) | LR (GBT based Feature Engineering) | NFFM
      Cut1 | -3.71%       | 2.30%                                            | 2.51%                              | 3.00%
      Cut2 | -2.16%       | 3.05%                                            | 4.48%                              | 28.83%
      Cut3 | -0.31%       | -0.56%                                           | 5.65%                              | 12.47%
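The AEPV formula treats each view as a 0/1 outcome: a view that was not completed costs abs(Ypred), a completed one costs abs(1 - Ypred). A small sketch with invented numbers and a hypothetical function name:

```python
def aepv(views, completed, predicted):
    """Absolute Error Per View, aggregated over rows of (views, completions, prediction)."""
    total_error = 0.0
    for v, c, p in zip(views, completed, predicted):
        total_error += (v - c) * abs(p)   # views that were not completed (label 0)
        total_error += c * abs(1.0 - p)   # views that were completed (label 1)
    return total_error / sum(views)

# Two hypothetical segments.
error = aepv(views=[10, 20], completed=[4, 15], predicted=[0.5, 0.8])
# Row 1: 6 * 0.5 + 4 * 0.5 = 5; row 2: 5 * 0.8 + 15 * 0.2 = 7; (5 + 7) / 30 = 0.4
```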
  • 29. Use case 2 - Results VCR
      ● LR with L2 regularisation
      ● 2nd-order features were selected based on an information gain criterion
      ● The GBT package in Spark MLlib was used (numTrees = 400, maxDepth = 8, sampling = 0.5, minInstancePerNode = 10)
        ○ The training process was too slow, even with large enough resources
        ○ XGBoost with Spark (tried later) was faster and resulted in further improvements
      ● NFFM: increasing the number of layers up to 3 resulted in a further 20% improvement in the validation errors; no significant improvement after that
  • 31. Building the full intuition
      Factorisation machine:
      ● Handles categorical features and a sparse data matrix
      ● Extracts latent variables, e.g., identifying non-explicit segment profiles in the population
      Field-aware:
      ● Dimensionality reduction (high-cardinality features to a K-dimensional representation)
      ● Increases the degrees of freedom (compared to FM, in terms of field-specific values) to enable an exhaustive set of second-order interactions
      Neural network:
      ● Explores and weights higher-order interactions; went up to 3 layers of interaction successfully
      ● Generates the numerical prediction
      ● Trains the factors based on the performance of both the FM and the neural net (instead of training them separately, which would limit the latent vectors to the power of FM alone)
  • 33. Implementation details
      ● Hyperparameters are k, lambda, number of layers, number of nodes per layer, and activation functions
      ● Implemented in TensorFlow
      ● Adam optimizer
      ● L2 regularization; no dropout
      ● No batch normalization
      ● 1 layer with 100 nodes performs well enough and saves compute
      ● ReLU activations (converge faster)
      ● k = 16 (try powers of 2)
      ● Weighted RMSE as the loss function for both use cases
  • 34. Predicting for unseen feature values
      ● Average the latent feature interactions per feature for unknown values
      Publisher | Vector for the Advertiser field                                      | Vector for the Gender field
      ESPN      | (X^A_0, X^A_1, X^A_2)                                                | (X^G_0, X^G_1, X^G_2)
      CNBC      | (Y^A_0, Y^A_1, Y^A_2)                                                | (Y^G_0, Y^G_1, Y^G_2)
      SONY      | (Z^A_0, Z^A_1, Z^A_2)                                                | (Z^G_0, Z^G_1, Z^G_2)
      UNKNOWN   | ((X^A_0+Y^A_0+Z^A_0)/3, (X^A_1+Y^A_1+Z^A_1)/3, (X^A_2+Y^A_2+Z^A_2)/3) | ((X^G_0+Y^G_0+Z^G_0)/3, (X^G_1+Y^G_1+Z^G_1)/3, (X^G_2+Y^G_2+Z^G_2)/3)
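The averaging rule for unseen values can be sketched as follows. The vectors, the dictionary `publisher_for_advertiser` and the helper names are hypothetical; the only thing taken from the slide is that the UNKNOWN vector is the per-dimension mean of the known values' vectors for the same field pairing.

```python
def average_vector(vectors):
    """Per-dimension mean of a list of equal-length vectors."""
    n = len(vectors)
    k = len(vectors[0])
    return [sum(v[d] for v in vectors) / n for d in range(k)]

# Hypothetical publisher vectors used against the Advertiser field (K = 3).
publisher_for_advertiser = {
    "ESPN": [0.3, 0.0, 0.6],
    "CNBC": [0.0, 0.3, 0.3],
    "SONY": [0.3, 0.3, 0.0],
}

# Fallback vector for any publisher that was not seen during training.
unknown_publisher = average_vector(list(publisher_for_advertiser.values()))

def lookup(table, value, fallback):
    """Return the learned vector, or the averaged fallback for unseen values."""
    return table.get(value, fallback)

vec_seen = lookup(publisher_for_advertiser, "ESPN", unknown_publisher)
vec_unseen = lookup(publisher_for_advertiser, "BBC", unknown_publisher)
```

The same averaging would be repeated for every (field, other field) table, so an unseen publisher gets one fallback vector per field it interacts with.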
  • 35. Implementing @ low-latency, high-scale
      ● MLeap: the MLeap framework supports models trained in both Spark and TensorFlow. It lets us train tree-based models in Spark and NN-based models in TensorFlow
      ● Offline training and challenges: we cannot train TF models on the YARN cluster, hence we use a GPU machine as a gateway to pull data from HDFS and train on the GPU
      ● Online serving challenges: TF Serving has pretty low throughput and wasn't scaling for our QPS. Hence we use a local LRU cache with a decent TTL to scale TF Serving
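The serving bullet mentions a local LRU cache with a TTL in front of the model server. The talk gives no implementation details, so the following is only a generic sketch of that pattern in standard-library Python; the class `TTLLRUCache`, the helper `cached_predict` and the `predict_fn` callback (standing in for the remote model call) are hypothetical names.

```python
import time
from collections import OrderedDict

class TTLLRUCache:
    """Small LRU cache whose entries also expire after ttl_seconds."""

    def __init__(self, max_size, ttl_seconds, clock=time.monotonic):
        self._data = OrderedDict()  # key -> (value, expires_at), oldest first
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._data[key]          # expired: drop and report a miss
            return None
        self._data.move_to_end(key)      # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (value, self._clock() + self._ttl)
        self._data.move_to_end(key)
        if len(self._data) > self._max_size:
            self._data.popitem(last=False)  # evict the least recently used entry

def cached_predict(cache, features, predict_fn):
    """Serve from the cache when possible, otherwise call the model and store the result."""
    key = tuple(sorted(features.items()))
    hit = cache.get(key)
    if hit is not None:
        return hit
    value = predict_fn(features)
    cache.put(key, value)
    return value
```

Since many requests share the same categorical feature combination, a cache keyed on the feature tuple can absorb repeated lookups; the TTL bounds how long a stale prediction survives after a model refresh.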
  • 36. Future research that we are currently pursuing...
      ● Hybrid Binning NFFM
      ● Distributed training and serving
      ● Dropout and batch normalization
      ● Methods to interpret the latent vectors (e.g., t-Distributed Stochastic Neighbour Embedding (t-SNE))
  • 37. References
      FM: https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf
      FFM: http://research.criteo.com/ctr-prediction-linear-model-field-aware-factorization-machines/
      DeepFM: https://arxiv.org/pdf/1703.04247.pdf
      NFM: https://arxiv.org/pdf/1708.05027.pdf
      GBT Based Feature Engg: http://quinonero.net/Publications/predicting-clicks-facebook.pdf