Science Engineering Product
• What we invest in
• How we structure and integrate the team
• What deliverables we expect
Actions and Reactions
Types of Problems
• Existing product, existing solution
• Existing product, new solution
• New product, existing solution
• New product, new solution
Optimizing Existing Product
Good thing:
• well-defined and controlled environment
Bad thing:
• integration with existing infrastructure
Case N1. Replacing heuristics in existing product with ML
[Figure: production pipeline. Training Component (AWS CF + AWS Data Pipeline): Data Collection, Data Archive (AWS S3), Data Querying (AWS Athena), Pre-processing, Model Training (scikit-learn) on Training Data (7 recent days), Model Validation on Validation Data (5% of the last day), "Passed?" check leading to Update Model or Report Failure. Serving Component: Skyscanner Traffic split 5% / 5% / 90%, Current Model, Experiments with Challenger Model. Other label: Apache Kafka.]
ECMLPKDD’2018, https://arxiv.org/pdf/1812.01735.pdf
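The training and validation steps named in the pipeline above (train on the 7 most recent days, validate on 5% of the last day, then either update the model or report a failure) can be sketched roughly as follows. This is a minimal illustration, not the pipeline from the paper: the function names, the `(day, features, label)` record layout, the accuracy metric, the `min_accuracy` threshold, and the choice to hold the validation sample out of training are all assumptions made for the example.

```python
import random
from datetime import date, timedelta


def split_recent_data(records, today, train_days=7, validation_fraction=0.05, seed=0):
    """Split (day, features, label) records into training and validation rows.

    Assumption: the validation sample (a random fraction of the last day)
    is held out of training. The slides do not say whether this is the case.
    """
    rng = random.Random(seed)
    last_day = today - timedelta(days=1)
    first_day = today - timedelta(days=train_days)
    train, validation = [], []
    for day, features, label in records:
        if day < first_day or day > last_day:
            continue  # outside the window of recent days
        row = (day, features, label)
        if day == last_day and rng.random() < validation_fraction:
            validation.append(row)
        else:
            train.append(row)
    return train, validation


def accuracy(model, rows):
    """Share of rows the model labels correctly; 0.0 when there are no rows."""
    if not rows:
        return 0.0
    hits = sum(1 for _, features, label in rows if model(features) == label)
    return hits / len(rows)


def validation_gate(candidate, validation, min_accuracy=0.9):
    """Decide between the two outcomes of the "Passed?" check.

    `min_accuracy` is a hypothetical threshold chosen for illustration.
    """
    score = accuracy(candidate, validation)
    if score >= min_accuracy:
        return "update_model", score
    return "report_failure", score
```

Here a model is any callable that maps features to a label, so a fitted scikit-learn estimator could be wrapped in a small function before being passed to `validation_gate`. Treating an empty validation set as a failure is a deliberately conservative choice: the serving model is only replaced when there is evidence the candidate works.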
Iterating over Existing Algorithm
Good thing:
• return on infrastructure investments
Bad thing:
• possibly limited impact
Case N2. Iterating over existing ML algorithm in existing product
Building a New Product
Good thing:
• fewer dependencies
Bad thing:
• high level of uncertainty
Case N3. Building a first version of a new data product
Managing Uncertainty
Levels of uncertainty:
– Is the position right?
– Is the user flow right?
– Is the message right?
– Is the design right?
– Is the algorithm right?
– What is the baseline?
– etc.
+30% engagement
Skyscanner Backpack
Start really simple
Study previous experience
Case N4: New Science in New Products