The Snake in Your Data
How Python is Used Today by Data Science Teams
Matt Price
Principal Research Engineer
2019.09.24
2SLIDE
Agenda
● About ZeroFOX
● The Data Science Lifecycle
● Data Science at ZeroFOX
● Data Science Tools
● Prodigy Demo
● Q & A
3
About ZeroFOX
It’s a Digital World. Engage Securely.
Our Mission
ZeroFOX exists to protect digital engagement
Our Story
ZeroFOX was founded with the goal of creating
customer champions
With global reach and operation centers in the
United States, United Kingdom, Chile and India,
ZeroFOX provides best-in-class software, support
and services to organizations of all sizes.
Most Recognized. Most Awarded.
4
Social and Digital Channels
Your Organization
Domains | Executives | VIPs | Employees | Brands | Locations
AI-Driven Analysis
Automated Analysis | Alerts | Reporting
Human-Driven Analysis
ZeroFOX OnWatch™ | ZeroFOX Alpha Team
Remediation
Takedown-as-a-Service™
Complete Digital Visibility & Protection
The ZeroFOX
Platform
Identify
Risks on social and
digital platforms
Protect
What matters to
your organization
Remediate
Threats to your brand
and business
Protection
Identification
Analysis
Remediation
6SLIDE
The Data Science Lifecycle
● Each stage builds on the previous stages
● Most effort goes into data collection
● Iterative process
● Python is used throughout the
entire workflow
8SLIDE
ZeroFOX AI
[Diagram labels: Artificial Intelligence, Machine Learning, Deep Learning, NLP, CV]
Artificial Intelligence (AI)
The simulation of intelligent behavior
in machines
AI Techniques
Machine Learning (ML)
Study and use of algorithms and
statistical models that learn from data
Deep Learning
A technique within ML that uses
“large” Neural Networks
9SLIDE
ZeroFOX Data Science Architecture
● Tied into production data ingest
● Feedback loop from analysts
● Labeling is open to the entire
company
● Architecture is optimized for quick
iterations
11SLIDE
Python Tooling Categories
Data manipulation
Data structures and data transformations
Data visualization
Understanding what the data is
Modeling
Teaching machines to learn the underlying patterns in the data
Deployment
Integrating with the platform and making models available to the end customer
12SLIDE
Data Manipulation Tools
NumPy
● Multi-dimensional arrays and matrices
● High-level mathematical functions
● Fast, vectorized operations
Pandas
● Multi-dimensional matrices wrapped in DataFrames
● Time series logic and operations
● Data analysis functions and tools
OpenCV
● CV and ML library
● Fast operations, with a focus on real-time video
● Low-level operations
Pillow
● PIL fork
● General image processing library
● High-level operations
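A minimal sketch of the vectorized operations and time series logic listed above, assuming NumPy and Pandas are installed. The counts and dates are made-up illustration data, not ZeroFOX data.

```python
import numpy as np
import pandas as pd

# Hypothetical daily counts of collected posts (illustration data only).
counts = np.array([12, 15, 9, 30, 28, 7], dtype=float)

# Vectorized operation: standardize every element without a Python loop.
z_scores = (counts - counts.mean()) / counts.std()

# Wrap the arrays in a DataFrame with a date index to use time series logic.
index = pd.date_range("2019-09-24", periods=len(counts), freq="D")
frame = pd.DataFrame({"count": counts, "z": z_scores}, index=index)

# Resample daily rows into 3-day buckets and sum the counts in each bucket.
per_3d = frame["count"].resample("3D").sum()
print(per_3d)
```

The same DataFrame could then be handed to a plotting or modeling library, which is one reason these tools show up at several stages of the workflow.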
13SLIDE
ZeroFOX Data Science Architecture
[Architecture diagram annotated with NumPy, OpenCV, Pillow, and Pandas]
14SLIDE
Data Visualization Tools
Jupyter
● Interactive computing via notebooks
● Kernels run code and return output
● Focus on scientific computing
Matplotlib
● Plotting library
● Low-level plotting interface
● Compatible with a number of GUI toolkits
Seaborn
● Built on top of Matplotlib
● High-level plotting interface
● Categorical variable support
Plotly
● Framework for building data visualization apps
● Open source and enterprise versions
● Interactive charts
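A small sketch of the low-level plotting interface mentioned above, assuming Matplotlib is installed. The label names and counts are hypothetical, and the Agg backend is chosen only so the example runs without a GUI toolkit.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend; no GUI toolkit required
import matplotlib.pyplot as plt

# Hypothetical label counts from an annotation session (illustration only).
labels = ["benign", "spam", "phishing"]
counts = [120, 45, 15]

# Low-level interface: create the figure and axes explicitly, then draw on them.
fig, ax = plt.subplots(figsize=(4, 3))
bars = ax.bar(labels, counts)
ax.set_xlabel("label")
ax.set_ylabel("examples")
ax.set_title("Class balance")
fig.tight_layout()

# Render the chart to an in-memory PNG instead of opening a window.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

In a notebook the figure would normally be displayed inline; a higher-level library such as Seaborn wraps this kind of axes-level code behind shorter calls.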
15SLIDE
ZeroFOX Data Science Architecture
[Architecture diagram annotated with Jupyter, Matplotlib, Seaborn, and Plotly]
16SLIDE
Modeling Tools
Prodigy
● Solves the labeling problem
● Enables active learning
● Programmatic workflow definitions
● Extremely flexible
Scikit-learn
● Machine learning and data analysis library
● Built on top of NumPy, SciPy, LIBSVM, and Matplotlib
● One of several scikits (SciPy toolkits) available
Keras + TensorFlow
● High-level deep learning library
● Serves as an interface to lower-level backends
● TensorFlow supplies low-level building blocks
● Pre-defined models
spaCy
● Production-focused NLP framework
● Deep learning models powered by Thinc
● Define a pipeline that outputs annotated documents
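As a rough illustration of the machine learning library described above, here is a sketch of a scikit-learn text classifier. The training texts and labels are invented for the example, and TF-IDF plus logistic regression is an assumed model choice, not the one used at ZeroFOX.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up training set (illustration only): 1 = suspicious, 0 = benign.
texts = [
    "verify your account now to claim your prize",
    "click this link to reset your password immediately",
    "free gift card, confirm your login details",
    "team lunch is moved to noon on friday",
    "the quarterly report is attached for review",
    "see you at the python meetup tonight",
]
labels = [1, 1, 1, 0, 0, 0]

# Chain feature extraction and a linear classifier into a single estimator.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# predict_proba returns one row per input, with one probability per class.
probabilities = model.predict_proba(["confirm your password to claim a prize"])
print(probabilities)
```

Keeping the vectorizer and classifier in one pipeline means the same object can be trained, evaluated, and later deployed without re-implementing the preprocessing.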
17SLIDE
ZeroFOX Data Science Architecture
[Architecture diagram annotated with Prodigy, Scikit-learn, Keras + TensorFlow, and spaCy]
18SLIDE
Deployment
Sanic
● Web server and framework focused on high performance
● Secondarily focused on ease of use
● Flask-like framework API
● Decent extension ecosystem
● Python 3.6+ (relies heavily on async/await)
Django
● MVC web framework
● Focused on easing development of database-driven websites
● Large extension ecosystem
● CRUD interface for administrative tasks
19SLIDE
ZeroFOX Data Science Architecture
[Architecture diagram annotated with Sanic and Django]
21SLIDE
Prodigy
● Created by Explosion.AI (Matthew Honnibal and Ines Montani)
○ Same company that develops spaCy and Thinc
● Designed to make annotating data simple but can do much more
● Is a tool (Python package) that you purchase
● Why Prodigy?
○ Solves the “hardest” problem in applied data science
○ Can programmatically define entire model workflow in a recipe
○ Out of the box support for spaCy
○ Supports computer vision annotation
○ Exports trained models as Python packages
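Prodigy's own recipe API is not shown here. As a generic illustration of the active learning idea it supports, the sketch below uses scikit-learn to pick which unlabeled example to annotate next by uncertainty sampling. All texts, labels, and variable names are invented for the example.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Small labeled seed set and an unlabeled pool (made-up illustration data).
labeled_texts = ["claim your free prize now", "meeting notes attached"]
labeled_y = [1, 0]
pool = [
    "free prize inside, claim now",
    "notes from the meeting",
    "claim your meeting notes",
]

vectorizer = TfidfVectorizer().fit(labeled_texts + pool)
model = LogisticRegression().fit(vectorizer.transform(labeled_texts), labeled_y)

# Uncertainty sampling: the closer P(class 1) is to 0.5, the less sure the model is.
proba = model.predict_proba(vectorizer.transform(pool))[:, 1]
uncertainty = 1.0 - np.abs(proba - 0.5) * 2.0

# Ask the annotator about the most uncertain example first.
ask_order = np.argsort(-uncertainty)
next_to_label = pool[ask_order[0]]
print(next_to_label)
```

In a real loop the annotator's answer would be added to the labeled set and the model retrained, so each new label goes to the example the model is least sure about.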
22Slide
Prodigy Live Demo
24Slide
Questions?

Python meetup

  • 1. The Snake in Your Data How Python is Used Today by Data Science Teams Matt Price Principal Research Engineer 2019.09.24
  • 2. Agenda ● About ZeroFOX ● The Data Science Lifecycle ● Data Science at ZeroFOX ● Data Science Tools ● Prodigy Demo ● Q & A
  • 3. About ZeroFOX ● It’s a Digital World. Engage Securely. ● Our Mission: ZeroFOX exists to protect digital engagement ● Our Story: ZeroFOX was founded with the goal of creating customer champions. With global reach and operation centers in the United States, United Kingdom, Chile and India, ZeroFOX provides best-in-class software, support and services to organizations of all sizes. ● Most Recognized. Most Awarded.
  • 4. The ZeroFOX Platform: Complete Digital Visibility & Protection ● Social and Digital Channels ● Your Organization: Domains | Executives | VIPs | Employees | Brands | Locations ● AI-Driven Analysis: Automated Analysis | Alerts | Reporting ● Human-Driven Analysis: ZeroFOX OnWatch™ | ZeroFOX Alpha Team ● Remediation: Takedown-as-a-Service™ ● Identify: Risks on social and digital platforms ● Protect: What matters to your organization ● Remediate: Threats to your brand and business ● Protection | Identification | Analysis | Remediation
  • 5. Agenda ● About ZeroFOX ● The Data Science Lifecycle ● Data Science at ZeroFOX ● Data Science Tools ● Prodigy Demo ● Q & A
  • 6. The Data Science Lifecycle ● Each stage builds on the preceding stages ● Most of the effort goes into data collection ● Iterative process ● Python is used throughout the entire workflow
  • 7. Agenda ● About ZeroFOX ● The Data Science Lifecycle ● Data Science at ZeroFOX ● Data Science Tools ● Prodigy Demo ● Q & A
  • 8. ZeroFOX AI (diagram labels: Artificial Intelligence, Machine Learning, Deep Learning, NLP, CV) ● Artificial Intelligence (AI): The simulation of intelligent behavior in machines ● AI Techniques ● Machine Learning (ML): Study and use of algorithms and statistical models that learn from data ● Deep Learning: A technique within ML that uses “large” Neural Networks
  • 9. ZeroFOX Data Science Architecture ● Tied into production data ingest ● Feedback loop from analysts ● Labeling is open to the entire company ● Architecture is optimized for quick iterations
  • 10. Agenda ● About ZeroFOX ● The Data Science Lifecycle ● Data Science at ZeroFOX ● Data Science Tools ● Prodigy Demo ● Q & A
  • 11. Python Tooling Categories ● Data manipulation: Data structures and data transformations ● Data visualization: Understanding what the data is ● Modeling: Teaching machines to learn the underlying patterns in the data ● Deployment: Integrating with the platform and making models available to the end customer
  • 12. Data Manipulation Tools. NumPy: ● Multi-dimensional arrays and matrices ● High level mathematical functions ● Fast, vectorized operations. Pandas: ● Multi-dimensional matrices wrapped in DataFrames ● Time series logic and operations ● Data analysis functions and tools. OpenCV: ● CV and ML library ● Fast operations, with a focus on real-time video ● Low level operations. Pillow: ● PIL fork ● General image processing library ● High level operations
  • 13. ZeroFOX Data Science Architecture (diagram labels: NumPy, OpenCV, Pillow, Pandas)
  • 14. Data Visualization Tools. Jupyter: ● Interactive computing via notebooks ● Kernels run code and return output ● Focus on scientific computing. Matplotlib: ● Plotting library ● Low level plotting interface ● Compatible with a number of GUI toolkits. Seaborn: ● Built on top of matplotlib ● High level plotting interface ● Categorical variable support. Plotly: ● Framework for building data visualization apps ● Open source and enterprise versions ● Interactive charts
  • 15. ZeroFOX Data Science Architecture (diagram labels: Jupyter, Matplotlib, Seaborn, Plotly)
  • 16. Modeling Tools. Prodigy: ● Solves the labeling problem ● Enables active learning ● Programmatic workflow definitions ● Extremely flexible. Scikit-learn: ● Machine learning and data analysis library ● Built on top of NumPy, SciPy, LIBSVM, and matplotlib ● A number of other scikits are available. Keras + TensorFlow: ● High level deep learning library ● Serves as an interface to lower level backends ● TensorFlow supplies low level building blocks ● Pre-defined models. spaCy: ● Production-focused NLP framework ● Deep learning models powered by Thinc ● Define a pipeline which outputs annotated documents
  • 17. ZeroFOX Data Science Architecture (diagram labels: Prodigy, Scikit-learn, Keras + TensorFlow, spaCy)
  • 18. Deployment. Sanic: ● Web server and framework focused on high performance ● Secondarily focused on ease of use ● Flask-like framework API ● Decent extension ecosystem ● Python 3.6+ (heavily relies on async/await). Django: ● MVC web framework ● Focused on easing development of database-driven websites ● Large extension ecosystem ● CRUD interface for administrative tasks
  • 19. ZeroFOX Data Science Architecture (diagram labels: Sanic, Django)
  • 20. Agenda ● About ZeroFOX ● The Data Science Lifecycle ● Data Science at ZeroFOX ● Data Science Tools ● Prodigy Demo ● Q & A
  • 21. Prodigy ● Created by Explosion.AI (Matthew Honnibal and Ines Montani) ○ Same company that develops spaCy and Thinc ● Designed to make annotating data simple but can do much more ● Is a tool (Python package) that you purchase ● Why Prodigy? ○ Solves the “hardest” problem in applied data science ○ Can programmatically define entire model workflow in a recipe ○ Out of the box support for spaCy ○ Supports computer vision annotation ○ Exports trained models as Python packages
  • 23. Agenda ● About ZeroFOX ● The Data Science Lifecycle ● Data Science at ZeroFOX ● Data Science Tools ● Prodigy Demo ● Q & A
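
The Modeling Tools and Prodigy slides mention active learning as a way to ease the labeling problem. A minimal sketch of one common form of it, uncertainty sampling, is given here in plain Python. It is not Prodigy's API, and the function names, the toy scores, and the example texts are all hypothetical stand-ins for a real classifier and a real unlabeled data stream. The idea is to send annotators the examples the current model is least sure about.

```python
from typing import Callable, List, Sequence, Tuple


def uncertainty(prob: float) -> float:
    """Score in [0, 1]; 1.0 means the model is maximally unsure (prob == 0.5)."""
    return 1.0 - abs(prob - 0.5) * 2.0


def pick_for_annotation(
    texts: Sequence[str],
    predict_proba: Callable[[str], float],
    batch_size: int = 2,
) -> List[Tuple[str, float]]:
    """Return the `batch_size` texts whose predicted probability is closest to 0.5."""
    scored = [(text, predict_proba(text)) for text in texts]
    # Most uncertain first, so annotators label the most informative examples.
    scored.sort(key=lambda pair: uncertainty(pair[1]), reverse=True)
    return scored[:batch_size]


# Hypothetical stand-in for a trained binary classifier: fixed scores per text.
TOY_SCORES = {
    "win a free phone now": 0.97,
    "meeting moved to 3pm": 0.04,
    "verify your account here": 0.55,
    "new login from unknown device": 0.42,
}

queue = pick_for_annotation(
    list(TOY_SCORES), lambda text: TOY_SCORES[text], batch_size=2
)
for text, prob in queue:
    print(f"{prob:.2f}  {text}")
```

In a real loop, the newly labeled batch would be added to the training set, the model retrained, and the selection repeated, which matches the iterative lifecycle described in the talk.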