Data-driven approaches in a technology startup
1. Copyrights: Proprietary and confidential. Not to be distributed or reproduced without permission.
Confidential. Prepared by Michal Szczecinski. Ver. 1.0
- Hong Kong: since 07.2013
- Singapore: since 06.2014
- Taiwan: since 11.2014
- China (300+ cities): since 02.2015; merged with 58 Suyun in 08.2017
- South Korea (2 cities): since 10.2015
- India: since 03.2016
Established in 2013, GOGOVAN is the first app-based platform for
delivering goods in Asia.
What will I talk about?
Startup context
Goals of Analytics
Why data matters
Work cases
Lessons learnt
Hong Kong
Oxford
London
Corporate vs Startup
Corporate: established multi-team contribution:
• Business Intelligence
• Data engineering
• Data Science (data products)
• Data quality
• Digital Marketing/Growth
• Product analytics
• Financial modelling/forecasting
• Strategy analysis
• Big Data Research
• Data compliance
• …
Startup: multidisciplinary, tech whiz, “all-knowing” “Data guy”
Goals of Analytics
The underlying vision is to make GOGOVAN data-driven.
1. Decision support
2. Knowledge discovery
3. Optimization
Data team
“Everything data” in GOGOVAN
- Supporting all teams (product, operations, marketing, customer service, engineering, finance, management, legal and more…)
- Supporting all countries
- Everything related to data
- Multiple outputs (in-house built dashboards, ETL jobs, interactive tools, notebooks, ML models, scientific papers, ad hoc queries, alerts, infrastructure and tools)
- Multiple inputs
- Data users across the whole organisation
Why does data matter?
Just 3 examples (there are more…)
read more: https://towardsdatascience.com/what-does-a-data-team-really-do-12484482e683
Service level
Improving key components of our service
1. Price - what the user pays/what the driver earns.
2. Time - response, arrival, completion.
3. Quality - customer experience, effort, reliability...
Completion rate
Growing business activity
1. Frontier - Beyond a certain point x, as the volume of orders grows, the completion rate starts to fall exponentially.
2. Wall - There is also a wall, or soft limit, on the number of orders that can be completed no matter what the volume of orders is.
3. Improvement - In the whole history of GOGOVAN that wall has been overcome just once, very recently. The wall has also been steadily rising.
*axes and details removed for data confidentiality purposes
Anomaly detection
Avoiding the unexpected
1. Transactions - is there any unusual activity?
2. Partners - do all partners play fair?
3. Systems - are systems working fine?
4. Community - what are people saying?
5. Safety - are people and goods safe?
6. Competition - what’s going on in other camps?
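The first question in the anomaly detection list, whether transaction activity is unusual, can be illustrated with a simple z-score check on daily order counts. This is a minimal sketch and not the method used at GOGOVAN: the function name `flag_anomalies`, the threshold of 2.5 and the sample counts are all hypothetical.

```python
from statistics import mean, stdev

def flag_anomalies(counts, threshold=2.5):
    """Return indices of values more than `threshold` sample standard
    deviations away from the mean (a basic z-score rule)."""
    if len(counts) < 2:
        return []
    mu = mean(counts)
    sigma = stdev(counts)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, c in enumerate(counts) if abs(c - mu) / sigma > threshold]

# Hypothetical daily order counts with one spike on day index 8.
daily_orders = [100, 102, 98, 101, 99, 103, 97, 100, 250, 101]
print(flag_anomalies(daily_orders))
```

With short windows a single extreme value inflates the standard deviation, so the threshold has to stay modest; a median-based rule would be a more robust alternative.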
What are we working on?
Applications - examples of projects and solutions.
Real use cases and tools (with transformed, hidden or masked details)
Decision Support
Data Platform
Operating data services
1. All-in-one place for data
2. Multi-use - reports, interactive tools, self-service, dashboards, algorithms, docs, training videos etc.
3. Search
4. Tagging
5. Collaboration
Main dashboard charts
Goal: Provide decision support on all important areas of the company for the respective team members.
Action: Get important metrics by different breakdowns and time periods. Monitor progress and
Outcome: Lower Costs/More GMV/More Users
Next-generation self-service analytics
Goal: Enable end users to effectively analyse and retrieve the data.
Action: Build custom reports, share comments and insights, optimised UX.
Outcome: Lower costs/Better Service
Knowledge discovery
Notebooks
Scaling deep knowledge
1. Focused on a particular problem/question
2. Thousands of searchable and reproducible reports
3. Publishing tools
4. Auto-generated reports and alerts
5. Metadata and templates
6. Analytics meetings
Real Time Heatmap
1. Interactive monitoring
2. Adopted - used by ops
3. Goal: Visualize drivers and orders
4. Action: identify idle drivers and pending orders, understand and affect the distribution of supply/demand
5. Outcome: Higher GMV/Better service
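The action described for the heatmap, identifying idle drivers and pending orders by area, can be sketched as counting both per grid cell. This is an illustrative assumption about how such a heatmap might be fed, not the actual tool: the function names, the 0.01-degree cell size and the coordinates are made up.

```python
import math
from collections import Counter

def grid_cell(lat, lon, cell_deg=0.01):
    """Map a coordinate to an integer grid cell (cell size is an assumption)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def supply_demand_by_cell(idle_drivers, pending_orders, cell_deg=0.01):
    """Return {cell: (idle_driver_count, pending_order_count)}."""
    drivers = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in idle_drivers)
    orders = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in pending_orders)
    return {cell: (drivers[cell], orders[cell]) for cell in set(drivers) | set(orders)}

# Made-up coordinates, for illustration only.
idle_drivers = [(22.2855, 114.1577), (22.2858, 114.1572), (22.3193, 114.1694)]
pending_orders = [(22.3197, 114.1691), (22.3191, 114.1699), (22.3364, 114.1747)]

cells = supply_demand_by_cell(idle_drivers, pending_orders)
# Cells where pending orders outnumber idle drivers are candidates for rebalancing.
shortage = sorted(cell for cell, (d, o) in cells.items() if o > d)
print(shortage)
```

A fixed latitude/longitude grid is the simplest choice; a production heatmap would more likely use a proper spatial index.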
Marketplace analysis
Monitoring and stimulating the GOGOVAN ecosystem
1. Arrival time
2. Distribution of orders
3. Supply/demand proportion
4. Completion time
5. Utilization rate
Optimization
Algorithms
Selected examples
- Predicting demand
- Predicting unmet demand
- Predicting order status
- Driver matching
- Route optimization
- Churn prediction
Demand prediction
Causal inference
1. Answering questions such as: what was the impact of x?
2. ARMA with exogenous variables model (ARMAX)
3. Exogenous variables: day of week (DOW), weather, holidays
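For reference, the ARMAX model named on this slide has a standard textbook form, written out below. This is the general definition, not necessarily the exact specification used at GOGOVAN; the orders p and q and the number m of exogenous regressors are left open as assumptions.

```latex
y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i}
        + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}
        + \sum_{k=1}^{m} \beta_k \, x_{k,t}
        + \varepsilon_t
```

Here y_t is demand at time t, the x_{k,t} are exogenous regressors such as day-of-week dummies, weather measurements and a holiday indicator, and ε_t is the error term. Under this form, the estimated coefficient β_k is one way to read "what was the impact of x?".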
Unmet demand prediction
Balancing supply/demand
1. Goal: Predict unmet demand and balance supply/demand.
2. Action: Know how many more drivers we need in particular regions at a particular time in order to fulfill expected demand.
3. Outcome: Better Service/Higher GMV
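The action on this slide, knowing how many more drivers are needed per region and time, reduces to simple arithmetic once a demand forecast exists. The sketch below assumes a forecast is already available; the function name `driver_gap`, the throughput of 2 orders per driver per hour and all figures are hypothetical.

```python
import math

def driver_gap(predicted_orders, available_drivers, orders_per_driver_hour=2.0):
    """Extra drivers needed per (region, hour) slot; 0 when supply covers demand."""
    gaps = {}
    for slot, orders in predicted_orders.items():
        needed = math.ceil(orders / orders_per_driver_hour)
        gaps[slot] = max(0, needed - available_drivers.get(slot, 0))
    return gaps

# Hypothetical forecast and supply for the 09:00 hour in two regions.
predicted = {("Kowloon", 9): 45, ("Central", 9): 20}
available = {("Kowloon", 9): 15, ("Central", 9): 12}
print(driver_gap(predicted, available))
```

In practice the per-driver throughput would itself vary by region and hour, so a constant is only a starting point.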
Dynamic supply and demand dispatching in a spatially structured region based on big data analytics
Matching best driver.
1. Goal: Optimise supply and demand.
2. Action: Match drivers to orders better so that we optimise key operational KPIs.
3. Outcome: Lower Costs/Better Service/Higher GMV
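The slide does not describe the matching algorithm, so the following is only a stand-in to make the idea concrete: a greedy heuristic that repeatedly pairs the closest remaining driver and order by great-circle distance. It is not an optimal assignment and not GOGOVAN's dispatcher; the identifiers and coordinates are invented.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def greedy_match(drivers, orders):
    """Pair the closest remaining (driver, order) until one side runs out.
    Returns {order_id: driver_id}. A heuristic, not an optimal assignment."""
    pairs = sorted((haversine_km(d_pos, o_pos), d_id, o_id)
                   for d_id, d_pos in drivers.items()
                   for o_id, o_pos in orders.items())
    matched, used_drivers = {}, set()
    for _, d_id, o_id in pairs:
        if d_id not in used_drivers and o_id not in matched:
            matched[o_id] = d_id
            used_drivers.add(d_id)
    return matched

# Made-up positions, for illustration only.
drivers = {"d1": (22.2855, 114.1577), "d2": (22.3193, 114.1694)}
orders = {"o1": (22.3200, 114.1700), "o2": (22.2800, 114.1600)}
print(greedy_match(drivers, orders))
```

Distance is only one possible cost; the "key operational KPIs" mentioned above could be substituted for it in the sort key.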
Route Optimization
Increasing operations efficiency: route optimization. Bundling and scheduling.
1. Goal: Plan routes in a way that utilizes drivers' time and provides cost benefits for the customer.
2. Action: choose the quickest route; avoid obstacles, traffic and hot spots; predict ETA; bundle orders so that it is more cost efficient for the driver
3. Outcome: Higher GMV/Lower costs/Better Service
4. Scalable
5. Cost efficient
6. High performance
7. Customizable
8. In-house competitive advantage
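As a rough illustration of ordering bundled stops, the sketch below uses the nearest-neighbour heuristic: from the current position, always drive to the closest unvisited stop. This is a well-known baseline chosen for brevity, not the in-house optimizer described on the slide; it ignores traffic, obstacles and ETA, and the planar coordinates are hypothetical.

```python
import math

def nearest_neighbour_route(start, stops):
    """Order stops by repeatedly visiting the closest unvisited one.
    `stops` maps a stop name to planar (x, y) coordinates."""
    route, remaining, current = [], dict(stops), start
    while remaining:
        name = min(remaining, key=lambda n: math.dist(current, remaining[n]))
        current = remaining.pop(name)
        route.append(name)
    return route

# Hypothetical drop-off points on a flat grid (units are arbitrary).
stops = {"A": (5.0, 0.0), "B": (1.0, 0.0), "C": (2.0, 1.0)}
print(nearest_neighbour_route((0.0, 0.0), stops))
```

Nearest-neighbour routes can be noticeably longer than the optimum, which is one reason a dedicated solver is a competitive advantage.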
Order status prediction
Estimating attractiveness of the order
1. Interactive real-time tool predicting if/how fast an order will be picked up
2. Response time (percentiles, absolute)
3. Zero rated probability
4. Feature importance
5. Action: identify risky orders, assign orders before they are cancelled by the user, add bonus/subsidy, redirect badly performing orders to a specified pool of drivers/incentivize drivers/notify the user to add a bonus in the app
6. Outcome: More revenue/More users/Improved experience for user
Predicting churn
Engaging clients
1. Predicting churn
2. Identifying things that lead to churn
3. Preventing churn
How to become a data-driven organisation?
Lessons learnt
read more: article coming soon, “Principles for becoming data-driven”.
Data Infrastructure (GOGOTRACK)
Real-time analytics source
1. Goals: 1) cross-system data integration 2) analytics abstraction 3) data analytics 4) real-time data services
2. Minimal management cost
3. Scalable
4. Well integrated with data analytics tools
5. Universal, able to support different types of systems and events
6. Facilitating productivity of the data science team, with minimized maintenance effort and cognitive load
7. Ideally a unified data science workflow across batch and real time
ML Logistics (GOGOMI)
Operating data services
1. Scaling
2. Traceability
3. Multiple models
4. Flexibility
5. Multiple consumers
6. Reproducibility
7. Performance and availability
Data-driven Framework
ML/AI Initiatives
Ops Data Brain
Real-time Heatmap on steroids with ML recommendations for ops
Thank you
Michal Szczecinski
https://www.linkedin.com/in/michalszczecinski/
michal@gogotech.hk