2. who am i? what does Ravelin do?
building intelligent data products
things to think about when building them
3. stephen whitworth
2 years at Hailo as data scientist/jack of some trades out of
university
product and marketplace analytics, agent-based
modelling, data engineering, stream processing
services
data science/engineering at ravelin, specifically
focused on our detection capabilities
4. what is ravelin?
online fraud detection and prevention platform
stream data to us
we give fraud probability instantly + beautiful data
visualisation to understand your customers
backed by techstars/passion/playfair/amadeus/indeed.com
founder/wonga founder amongst other great investors
11. hybrid: augment expertise by learning rules from data
cards don’t commit fraud, people do
stop the customer before they even get to ordering
12. ‘a random forest is like a room full of
experts who have seen different
cases of fraud from different
perspectives’
14. measure and optimise for the right thing(s) in your data
product
account for the fact that your customers are at different
stages to one another, and optimise for different things
15. precision: of all the customers I flagged as fraud, what % actually were?
recall: out of all of the fraudsters, what % did I catch?
implicit tradeoff between conversion and fraud loss
‘accuracy’ is a useless metric for fraud: fraud is rare, so
predicting ‘never fraud’ scores highly
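a minimal sketch of why accuracy misleads on fraud data while precision and recall do not. the order counts below are made up for illustration (10 fraudsters in 1,000 orders), and the helper names are hypothetical, not part of any Ravelin API.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), where 1 means fraud and 0 means legitimate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision(y_true, y_pred):
    """Of everything flagged as fraud, what fraction really was fraud?"""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred):
    """Of all the real fraudsters, what fraction was flagged?"""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(y_true, y_pred):
    """Fraction of all predictions that were correct."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


# Made-up data: 1,000 orders, of which 10 are fraud.
y_true = [1] * 10 + [0] * 990

# Strategy A: never flag anyone.
never_flag = [0] * 1000

# Strategy B: catch 8 of the 10 fraudsters, wrongly flag 12 good customers.
model = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 978

for name, preds in [("never flag", never_flag), ("model", model)]:
    print(
        name,
        "accuracy=%.3f" % accuracy(y_true, preds),
        "precision=%.2f" % precision(y_true, preds),
        "recall=%.2f" % recall(y_true, preds),
    )
```

on this toy data the do-nothing strategy has the higher accuracy even though it catches no fraud at all. the 12 wrongly flagged customers are the conversion cost, and the 2 missed fraudsters are the fraud loss.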
17. use tools that make you disproportionately productive
shameless fans of BigQuery
our analysis stack: BigQuery, JupyterHub,
pandas, scikit-learn
internal Google network is super fast, so wise to
co-locate with your data
18. enable fast iteration by keeping model interfaces simple
hide arbitrarily complex transformations behind them
expose them over REST or a queue
version control them, roll backwards/forwards/sideways
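one way the points on this slide could look in code, as a sketch under assumptions: every model is just a function from a feature mapping to a fraud probability, and a small registry picks which version answers. `ModelRegistry`, `model_v1`, `model_v2` and the feature names are all hypothetical, invented for illustration. the REST or queue layer is not shown; it would simply call `predict`.

```python
from typing import Callable, Dict, Mapping, Optional

Features = Mapping[str, float]
# The whole interface: features in, fraud probability out.
Model = Callable[[Features], float]


class ModelRegistry:
    """Keeps named model versions and routes predictions to the active one."""

    def __init__(self) -> None:
        self._models: Dict[str, Model] = {}
        self._active: Optional[str] = None

    def register(self, version: str, model: Model) -> None:
        self._models[version] = model

    def activate(self, version: str) -> None:
        # Rolling forwards, backwards or sideways is the same operation.
        if version not in self._models:
            raise KeyError("unknown model version: %s" % version)
        self._active = version

    @property
    def active_version(self) -> Optional[str]:
        return self._active

    def predict(self, features: Features) -> float:
        if self._active is None:
            raise RuntimeError("no model version is active")
        return self._models[self._active](features)


def model_v1(features: Features) -> float:
    """Hypothetical v1: a single hand-written velocity rule."""
    return 0.9 if features.get("orders_last_hour", 0) > 5 else 0.1


def model_v2(features: Features) -> float:
    """Hypothetical v2: a derived feature, hidden behind the same interface."""
    age_days = max(features.get("account_age_days", 0), 1)
    velocity = features.get("orders_last_hour", 0) / age_days
    return min(1.0, velocity)


registry = ModelRegistry()
registry.register("v1", model_v1)
registry.register("v2", model_v2)

customer = {"orders_last_hour": 6, "account_age_days": 12}

registry.activate("v2")
score_v2 = registry.predict(customer)

registry.activate("v1")  # roll back without touching any caller
score_v1 = registry.predict(customer)

print(score_v2, score_v1)
```

callers only ever see `predict`, so swapping what sits behind it does not require changing them.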
20. q: do you always trade performance for explainability?
a: no
if someone’s neck is on the line for your decision,
allow them to understand how you came to it
22. always be monitoring, probing for edge cases
dogfood - use robot customers
run strategies in ‘dark mode’ to determine performance
many ways things could break - be paranoid
‘machine learning: the high-interest credit card of
technical debt’ - Google
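a minimal sketch of the ‘dark mode’ idea, assuming a strategy is a function from an order to a decision string: the candidate strategy sees every order and its answer is logged, but only the live strategy's decision is ever returned. the function names, thresholds and order fields are hypothetical, made up for illustration.

```python
def live_strategy(order):
    """Hypothetical strategy currently in production."""
    return "block" if order["amount"] > 500 else "allow"


def dark_strategy(order):
    """Hypothetical candidate strategy being evaluated in dark mode."""
    return "block" if order["amount"] > 200 else "allow"


def decide(order, live, dark, log):
    """Return the live decision; record what the dark strategy would have done."""
    live_decision = live(order)
    try:
        dark_decision = dark(order)
    except Exception as exc:
        # Be paranoid: a broken candidate must never affect real customers.
        dark_decision = "error: %s" % exc
    log.append(
        {"order_id": order["id"], "live": live_decision, "dark": dark_decision}
    )
    return live_decision


# Made-up orders, e.g. generated by robot customers.
orders = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 300},
    {"id": 3, "amount": 900},
]

log = []
decisions = [decide(o, live_strategy, dark_strategy, log) for o in orders]

# Orders where the candidate disagrees with production are the ones to review.
disagreements = [row["order_id"] for row in log if row["live"] != row["dark"]]

print(decisions)
print(disagreements)
```

once the true outcomes are known, the logged dark decisions can be scored for precision and recall before the strategy is ever switched on.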
23. in beta and signing up clients
looking for on-demand services/marketplaces,
payment service providers that are facing fraud
problems
talk to me afterwards
24. obligatory: we are hiring!
junior machine learning engineers/data scientists
stephen.whitworth@ravelin.com or talk to me after