Probabilistic Programming
A Brief introduction to
Probabilistic Programming and Python
EuroSciPy - University of Cambridge August 2015
peadarcoyle@googlemail.com
All opinions my own
Who am I?
I work as a Data Scientist for a large Telecommunications Company
Masters in Mathematics
Interned at Amazon
Was a consultant for a while
Occasional contributor to Pandas and other projects
Co-organizer of the Data Science Meetup in Luxembourg
Member of Royal Statistical Society and NumFOCUS
@springcoil
What is Probabilistic Programming?
Basically using random variables instead of variables
Allows you to create a generative story rather than a black box
A different tool to Machine Learning
A different paradigm to frequentist statistics
Forces you to be explicit about your 'subjective' assumptions
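A minimal sketch of what "random variables instead of variables" can look like, using only the Python standard library rather than PyMC3. The coin example, the Uniform(0, 1) prior and the function name simulate_coin_story are illustrative assumptions, not something taken from the talk.

```python
import random

def simulate_coin_story(n_flips, rng):
    """Generative story: draw an unknown bias, then flip a coin with that bias."""
    # Rather than hard-coding bias = 0.5 (an ordinary variable), treat the bias
    # as a random variable with an explicit prior: here Uniform(0, 1).
    bias = rng.random()
    # Given the bias, every flip is an independent Bernoulli draw.
    flips = [1 if rng.random() < bias else 0 for _ in range(n_flips)]
    return bias, flips

rng = random.Random(42)
bias, flips = simulate_coin_story(10, rng)
print(f"hidden bias = {bias:.2f}, heads = {sum(flips)} of {len(flips)}")
```

The point of writing the story down this way is that the assumption (the prior on the bias) is visible in the code instead of being hidden inside a black box.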
Source: Olivier Grisel
Bayesian Statistics
I studied Mathematics, and encountered Bayesians in textbooks
This is a hard area to do with pen and paper, and most integrals can't be
solved in closed form
Thankfully, Monte Carlo simulation was invented
These simulations are used to approximate your posterior distribution
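To make the Monte Carlo idea concrete, here is a small random-walk Metropolis sampler written with the standard library only. It is a sketch under assumed data (7 heads and 3 tails) and an assumed Uniform(0, 1) prior, chosen because the exact posterior is then Beta(8, 4) with mean 8/12, so the simulated answer can be checked. The function names and the step size are hypothetical choices.

```python
import math
import random

def log_posterior(p, heads, tails):
    """Unnormalised log posterior of a coin bias p under a Uniform(0, 1) prior."""
    if not 0.0 < p < 1.0:
        return float("-inf")
    return heads * math.log(p) + tails * math.log(1.0 - p)

def metropolis(heads, tails, n_samples, step, rng):
    """Random-walk Metropolis sampler; returns a list of draws of p."""
    p = 0.5
    current = log_posterior(p, heads, tails)
    draws = []
    for _ in range(n_samples):
        proposal = p + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, heads, tails)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            p, current = proposal, candidate
        draws.append(p)
    return draws

draws = metropolis(heads=7, tails=3, n_samples=20000, step=0.2, rng=random.Random(0))
kept = draws[2000:]                 # discard the early draws as burn-in
estimate = sum(kept) / len(kept)    # Monte Carlo estimate of the posterior mean
exact = 8 / 12                      # mean of Beta(8, 4), the exact posterior
print(f"simulated mean = {estimate:.3f}, exact mean = {exact:.3f}")
```

No integral is solved anywhere: the sampler only ever evaluates the unnormalised posterior, which is why the approach still works when the integrals have no closed form.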
Some terminology
Attribution: Quantopian blog
How do you pick your prior?
This is a bit of an art
You generally base the prior on experience
As you add more data this matters less and less
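The claim that the prior matters less as data accumulates can be checked with a conjugate Beta-Binomial update. The two priors and the 60% heads rate below are made-up numbers for illustration.

```python
def beta_posterior_mean(prior_a, prior_b, heads, tails):
    """Posterior mean of a coin bias under a Beta(prior_a, prior_b) prior."""
    return (prior_a + heads) / (prior_a + prior_b + heads + tails)

# Two analysts with very different prior beliefs about the same coin.
priors = {"sceptical": (2.0, 8.0), "optimistic": (8.0, 2.0)}

# The same 60% heads rate observed at increasing sample sizes.
datasets = [(6, 4), (60, 40), (6000, 4000)]

gaps = []
for heads, tails in datasets:
    means = {name: beta_posterior_mean(a, b, heads, tails)
             for name, (a, b) in priors.items()}
    gap = means["optimistic"] - means["sceptical"]
    gaps.append(gap)
    print(f"n = {heads + tails:>5}: {means}  gap = {gap:.4f}")
```

With these priors the gap between the two posterior means works out to 6 / (10 + n), so it shrinks towards zero as the number of observations n grows.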
Huh, but isn't Probabilistic Programming just Stan and BUGS?
No, in Python you have PyMC3
A complete rewrite of PyMC2 now in 'Beta' status
Based upon Theano
Computational techniques for handling gradients
Automatic Differentiation and GPU speedup
Theano is also used in deep learning!
Currently there is a project to port 'BMH' from PyMC2 to PyMC3
I gave a thorough tutorial on this - my github
Key authors: John Salvatier, Thomas Wiecki, Chris Fonnesbeck
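For readers who have not met automatic differentiation, here is a toy illustration. Theano differentiates a symbolic computation graph; the sketch below swaps in a much simpler technique, forward-mode dual numbers, purely to show a gradient being computed exactly from ordinary-looking code. The Dual class and the coin log-likelihood are hypothetical and are not part of Theano or PyMC3.

```python
import math

class Dual:
    """A number that carries its value and its derivative together."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def log(x):
    """Natural log of a Dual, using d/dx log(x) = 1/x."""
    return Dual(math.log(x.value), x.deriv / x.value)

def log_likelihood(p):
    """Log-likelihood of 7 heads and 3 tails for a coin with bias p."""
    return 7 * log(p) + 3 * log(1 + (-1) * p)

# Seed the derivative with 1.0 to differentiate with respect to p.
at_half = log_likelihood(Dual(0.5, 1.0))
at_mle = log_likelihood(Dual(0.7, 1.0))
print(f"gradient at p=0.5: {at_half.deriv:.4f}")
print(f"gradient at p=0.7: {at_mle.deriv:.4f}")
```

Gradient-based samplers need exactly this kind of derivative of the log-probability, which is why a library with built-in differentiation is a convenient foundation.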
Case study: Rugby Analytics
I wanted to do a model of the Six Nations last year.
I wanted to build an understandable model to predict the winner
Key Info: Inferring the 'strength' of each team.
We only have scoring data, which is noisy, hence Bayesian stats
What did I do?
1. I picked Gamma as a prior for all teams
2. I used a Hierarchical Model because I wanted home advantage to be
stronger for stronger teams
3. From this I was able to create a novel model based only on historical
results and scoring intensity
4. I simulated the posterior distribution using MCMC
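The generative side of steps 1 to 3 can be sketched in plain Python. The model code from the talk did not survive as text, so everything below is an assumption for illustration: the Gamma(4, 5) prior on points per match, the multiplicative home_boost of 1.2, the Poisson scoring model and all function names are hypothetical, and defensive strength is ignored to keep the sketch short.

```python
import math
import random

TEAMS = ["England", "France", "Ireland", "Italy", "Scotland", "Wales"]

def sample_poisson(rate, rng):
    """Draw a Poisson count with Knuth's method (fine for modest rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_team_strengths(rng, shape=4.0, scale=5.0):
    """Prior: each team's scoring intensity (points per match) is Gamma distributed."""
    return {team: rng.gammavariate(shape, scale) for team in TEAMS}

def simulate_match(home, away, strengths, rng, home_boost=1.2):
    """Simulate one match; returns (home_points, away_points)."""
    # A multiplicative boost gives stronger teams a larger home advantage
    # in absolute points.
    home_rate = strengths[home] * home_boost
    away_rate = strengths[away]
    return sample_poisson(home_rate, rng), sample_poisson(away_rate, rng)

rng = random.Random(2015)
strengths = sample_team_strengths(rng)
home_points, away_points = simulate_match("Ireland", "England", strengths, rng)
print(f"Ireland {home_points} - {away_points} England")
```

Inference (step 4) runs this story in reverse: given observed scores, MCMC searches for the team strengths that make those scores plausible.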
Run the model
What actually happened
The model incorrectly predicted that England would come out on top.
Ireland actually won on points difference, by 6 points.
It really came down to the wire!
"Prediction is difficult, especially about the future"
One of the problems is what we call 'over-shrinkage'. You can
delve into the results to see what the errors are; my model was within
the errors.
Hat tip: Thanks to Abraham Flaxman and the PyMC3 team for helping me
port this from PyMC2 to PyMC3
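Shrinkage is easiest to see in a simpler setting than the rugby model. The sketch below uses the textbook normal-normal case, where a team's estimate is a precision-weighted average of its own mean and the league mean. This is a swapped-in illustration, not the model from the talk, and all the numbers are made up.

```python
def shrunk_estimate(team_mean, n_matches, league_mean, noise_var, between_team_var):
    """Posterior mean for one team in a normal-normal hierarchical model."""
    data_precision = n_matches / noise_var
    prior_precision = 1.0 / between_team_var
    weight = data_precision / (data_precision + prior_precision)
    # weight near 1 trusts the team's own results; near 0 trusts the league mean.
    return weight * team_mean + (1.0 - weight) * league_mean

# A strong team averaging 30 points over 5 matches, in a league averaging 20.
tight = shrunk_estimate(30.0, 5, 20.0, noise_var=100.0, between_team_var=4.0)
loose = shrunk_estimate(30.0, 5, 20.0, noise_var=100.0, between_team_var=100.0)
print(f"tight prior on team differences: {tight:.2f}")
print(f"loose prior on team differences: {loose:.2f}")
```

When the model believes teams differ very little (the tight case), a genuinely strong team is pulled most of the way back to the league average, which is the kind of over-shrinkage that can blur the gap between the top sides.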
Lessons learned
I can build an explainable model using PyMC2 and PyMC3
Generative stories help you build up interest with your colleagues
Communication is the 'last mile' problem of Data Science
PyMC3 is cool: please use it and please contribute
Wanna learn more?
BMH
Jake VanderPlas
PyMC3
peadarcoyle@googlemail.com