@neal_lathia
computer laboratory: university of cambridge
online, offline
urban, data mining, web
urbanmining.wordpress.com
online
user data + algorithms → relevance ☺
public transport
user data + algorithms → relevance
“smart” cards
1 facilitate payment
2 collect user data
“smart” cards
time-stamped locations,
modality, payments,
user categories


anonymised with
persistent user ids
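
The fields listed on this slide suggest a record shape roughly like the one below. This is a minimal sketch: the `Trip` class, every field name, and the sample values are assumptions made for illustration, not the actual schema of the dataset.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Trip:
    """One smart-card journey. All field names here are hypothetical."""
    user_id: str          # persistent but anonymised identifier
    user_category: str    # illustrative value: "adult"
    modality: str         # illustrative value: "tube"
    origin: str           # where the card was touched in
    touch_in: datetime
    # Assumption: a touch-out may not always be recorded, so these are optional.
    destination: Optional[str] = None
    touch_out: Optional[datetime] = None
    fare_pence: Optional[int] = None  # payment, if one was charged

    def duration_minutes(self) -> Optional[float]:
        """Time between touch in and touch out, or None if incomplete."""
        if self.touch_out is None:
            return None
        return (self.touch_out - self.touch_in).total_seconds() / 60.0


# Made-up example record.
trip = Trip(
    user_id="u-0001",
    user_category="adult",
    modality="tube",
    origin="Station A",
    touch_in=datetime(2010, 3, 1, 8, 15),
    destination="Station B",
    touch_out=datetime(2010, 3, 1, 8, 42),
    fare_pence=180,
)
print(trip.duration_minutes())
```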
“smart” cards datasets
100% - 1 month
~5.1 million people
~78.8 million trips

5% - 2 x 83 days
~300k people
~7.7 million trips
[Figure: "Purchase Geography" (PAYG vs Travel Cards) and "Mobility Flow" (arrivals, Zones 1 to 6)]
using transport data for...

    1 predicting disruption relevance
    2 personalised travel time
    3 fare purchase recommendation
can we use transport data for...

       predicting disruption relevance
       i.e., rank station importance correctly?
       (where you will go in the future)
percentile ranking

0.0 (best)
...
0.05 (“those who touch in here also touch in at...”)
...
0.06 (factor in user's history)
...
0.25 (rank stations by popularity)
...
0.5 (random)
...
1.0 (inverse)
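
The slides do not give the formula behind these scores, so the following is a sketch of one common definition of percentile ranking, paired with the popularity baseline. The function names and the toy station labels are assumptions for illustration only.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def rank_by_popularity(destinations: Iterable[str]) -> List[str]:
    """Order stations from most to least visited (the popularity baseline)."""
    counts = Counter(destinations)
    return [station for station, _ in counts.most_common()]


def percentile_rank(ranking: Sequence[str], visited: Sequence[str]) -> float:
    """Mean relative position of the visited stations within `ranking`.

    A visited station at the top of the list scores 0.0 and one at the
    bottom scores 1.0. Raises KeyError if a visited station is not ranked.
    """
    if len(ranking) < 2 or not visited:
        raise ValueError("need at least two ranked stations and one visit")
    position = {station: i for i, station in enumerate(ranking)}
    last = len(ranking) - 1
    return sum(position[s] / last for s in visited) / len(visited)


# Made-up history of past destinations across all users.
history = ["A", "B", "A", "C", "A", "B"]
ranking = rank_by_popularity(history)
score = percentile_rank(ranking, ["B"])  # the user later travels to B
print(ranking, score)
```

Under this definition a ranking that places future destinations near the top scores close to 0.0, and the fully reversed list scores 1.0, which matches the "best" and "inverse" ends of the scale.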
accurate ranking without

    1 explicitly asking
    2 network topology, rail schedule
using transport data for...

    1 predicting disruption relevance
    2 personalised travel time
can we use transport data for...

    2 predict your travel time
      i.e., time between touch in/out?
mean absolute error (minutes)

0.0 (best)
...
3.13 (“your trips in the past...”)
3.17 (“people who are as familiar as you...”)
3.28 (“people who travel at this time...”)
3.30 (mean time)
...
9.82 (timetabled)
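
As a rough illustration of how such a score can be computed, the sketch below fits a "mean time" baseline and measures its mean absolute error. Treating "mean time" as the historical average per origin-destination pair is an assumption, as are the tuple layout, the function names, and the toy numbers.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

# Hypothetical layout: (origin, destination, minutes between touch in and out).
Trip = Tuple[str, str, float]


def fit_mean_times(trips: Iterable[Trip]) -> Dict[Tuple[str, str], float]:
    """Average observed travel time for each origin-destination pair."""
    by_pair = defaultdict(list)
    for origin, destination, minutes in trips:
        by_pair[(origin, destination)].append(minutes)
    return {pair: mean(times) for pair, times in by_pair.items()}


def mean_absolute_error(predicted: List[float], actual: List[float]) -> float:
    """Average of |predicted - actual|, in the same unit as the inputs."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Made-up training and test trips.
train = [("A", "B", 20.0), ("A", "B", 24.0), ("A", "C", 10.0)]
test = [("A", "B", 25.0), ("A", "C", 9.0)]

model = fit_mean_times(train)
predicted = [model[(o, d)] for o, d, _ in test]
actual = [minutes for _, _, minutes in test]
mae = mean_absolute_error(predicted, actual)
print(mae)
```

The personalised variants on the slide would replace `fit_mean_times` with a predictor that conditions on time of day, familiarity, or the user's own history; the error measure stays the same.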
accurate predictions without

    1 explicitly asking
    2 network topology, rail schedule
    3 ongoing disruptions, delays
using transport data for...

    1 predicting disruption relevance
    2 personalised travel time
    3 fare purchase recommendation
[Figure: "Purchase Behaviour" (% purchases by day of week, Mon to Sun; Travel Cards vs PAYG)]
[Figure: "Purchase Geography" (PAYG vs Travel Cards) and "Mobility Flow" (arrivals, Zones 1 to 6)]
(a) high regularity in purchases & movements
(b) small increments, short terms
(c) purchase on refused entry?
are people making the right choice?
£200 million overspend
(a) failure to predict your movements
(b) failure to match mobility with fares
can we use transport data for...

    3 predict the fares you should buy
      i.e., what will be cheapest?
classification accuracy

0% (worst)
...
77% (everyone on pay as you go)
80% (naïve Bayes)
...
97% (“people like you should have bought...”)
98% (decision trees)
100% (oracle)
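
One way to read these numbers: the oracle label for each traveller is whichever fare would have been cheapest given the trips they actually made, and accuracy is the fraction of travellers for whom a method picks that label. The sketch below follows that reading. The two-fare simplification, the function names, and all prices are assumptions for illustration, not real fares or the classifiers from the talk.

```python
from typing import List


def cheapest_fare(payg_cost: float, travelcard_price: float) -> str:
    """Oracle label: the cheaper option given the trips actually made."""
    return "payg" if payg_cost <= travelcard_price else "travelcard"


def accuracy(predicted: List[str], oracle: List[str]) -> float:
    """Fraction of travellers whose predicted fare matches the oracle."""
    if len(predicted) != len(oracle) or not oracle:
        raise ValueError("need two non-empty lists of equal length")
    return sum(p == o for p, o in zip(predicted, oracle)) / len(oracle)


# Made-up pay-as-you-go totals for four travellers over one period,
# and a made-up travel card price for the same period.
payg_costs = [12.0, 31.5, 8.4, 40.0]
travelcard_price = 27.6

oracle = [cheapest_fare(cost, travelcard_price) for cost in payg_costs]

# Baseline from the slide: recommend pay as you go to everyone.
baseline = ["payg"] * len(payg_costs)
baseline_accuracy = accuracy(baseline, oracle)
print(oracle, baseline_accuracy)
```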
money saved

£0.00 (worst)
...
£326,447.95 (everyone on pay as you go)
£393,585.81 (naïve Bayes)
...
£465,822.17 (“people like you...”)
£473,918.38 (decision trees)
£479,583.91 (oracle)
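
To put the amounts on one scale, each method's figure can be expressed as a share of the oracle's. The snippet below only does that arithmetic on the numbers from the slide; the dictionary keys are shortened labels and the variable names are arbitrary.

```python
# Amounts in pounds, copied from the slide.
saved = {
    "everyone on pay as you go": 326_447.95,
    "naive bayes": 393_585.81,
    "people like you": 465_822.17,
    "decision trees": 473_918.38,
}
oracle_saving = 479_583.91

# Share of the oracle's figure reached by each method.
share = {name: amount / oracle_saving for name, amount in saved.items()}

for name, fraction in share.items():
    print(f"{name}: {fraction:.1%} of oracle")
```

On these figures decision trees reach roughly 99% of the oracle's amount, while recommending pay as you go to everyone reaches roughly 68%.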
“smart” cards
1 facilitate payment
2 collect user data

3 enable powerful, personalised information systems
using transport data for...

    1 behaviours ~ policy & incentives
    2 community well-being
References
N. Lathia, J. Froehlich, L. Capra. Mining Public Transport Usage for Personalised Intelligent
Transport Systems. In IEEE International Conference on Data Mining. December 2010, Sydney,
Australia.

N. Lathia, C. Smith, J. Froehlich, L. Capra. Individuals Among Commuters: Building
Personalised Transport Information Systems from Fare Collection Systems. Under submission.

N. Lathia, L. Capra. Mining Mobility Data to Minimise Travellers' Spending on Public Transport.
In ACM International Conference on Knowledge Discovery and Data Mining. August 2011. San
Diego, USA.

N. Lathia, L. Capra. How Smart is Your Smart Card? Measuring Travel Behaviours,
Perceptions, and Incentives. In ACM International Conference on Ubiquitous Computing.
September 2011. Beijing, China.

N. Lathia, D. Quercia, J. Crowcroft. The Hidden Image of the City: Sensing Community Well-Being
from Urban Mobility. To Appear, 10th International Conference on Pervasive Computing. June 2012.
Newcastle, UK.

Turning Oyster Cards into Information
