SlideShare una empresa de Scribd logo
1 de 36
Descargar para leer sin conexión
DataOps
Data Science Empowerment through
DevOps, Cloud Computing and Building
your own Applications
Kelly O’Briant
Data Science Product Engineer
kelly@rladies.org
@kellrstats | @RLadiesDC
• R-Ladies Washington DC Chapter Founder
and Organizer
• R-Ladies Global unofficial “cloud expert”
• Publish a monthly series called .rprofile
on the rOpenSci blog
• Business Science University
course developer
My Talk Goal:
I want you to leave this conference so
excited, you go back to work and completely
ignore whatever project you’re supposed to
be working on because you’re so pumped up
about building a data product and you can’t
stop yourself from doing it.
Motivation
Why I talk about Data Science Empowerment
R-Ladies events
• How do I get a job as a data scientist/analyst/anything?
• What should I study/learn/do/produce to be a data scientist?
• Am I even a data scientist? Is what I do data science?
Why are data products empowering?
• I use data products to justify/prove to myself that I belong, that my
ideas are valid and to help me communicate with people who are bad
at listening (or when I’m bad at speaking)
Motivation
Traumatic Experiences!
Windows Lab Linux Lab Mac Lab
R-Ladies + International Women’s Day
Twitter Campaign
• Create a twitter bot using R code
to tweet out a profile for every
woman in our Global speaker
directory
• Project collaboration through GitHub
• Docker linked to a local volume
• Twitter Application(s)
Deploy and Use H2O Machine Learning
Models in Production
• Build and validate a model in python
working in a Jupyter Notebook with the
H2O machine learning API
• Package the model code as a POJO or
MOJO file
• Deploy the model to H2O.ai STEAM to
create an ML prediction service complete
with a REST API query URL
Create and Maintain a Personal Website
• Use the blogdown package in an
RStudio project to create the
framework for a Hugo static
website
• Create content for the site by
writing Rmarkdown files
• Compile and deploy the static site –
choose a hosting mechanism:
GitHub? Continuous Integration
with Netlify?
Why are you so into R?
• It’s great for Data Science
• The community at large is awesome
• The female community is awesome
• R integrates with other tech
• It’s growing really fast in cool ways
• I can use it to build cool stuff
Why are you so into R?
• It’s great for Data Science
• The community at large is awesome
• The female community is awesome
• R integrates with other tech
• It’s growing really fast in cool ways
• I can use it to build cool stuff
#rstats
Why are you so into R?
• It’s great for Data Science
• The community at large is awesome
• The female community is awesome
• R integrates with other tech
• It’s growing really fast in cool ways
• I can use it to build cool stuff
Worldwide organization
that promotes gender diversity
in the R community via meetups
and mentorship in a friendly and
safe environment
Why are you so into R?
• It’s great for Data Science
• The community at large is awesome
• The female community is awesome
• R integrates with other tech
• It’s growing really fast in cool ways
• I can use it to build cool stuff
Why are you so into R?
• It’s great for Data Science
• The community at large is awesome
• The female community is awesome
• R integrates with other tech
• It’s growing really fast in cool ways
• I can use it to build cool stuff
Why are you so into R?
• It’s great for Data Science
• The community at large is awesome
• The female community is awesome
• R integrates with other tech
• It’s growing really fast in cool ways
• I can use it to build cool stuff
Back to the topic: DataOps
1. It usually takes a little DevOps to build a Data Product
2. Building more Data Products is empowering – good for your portfolio and soul
What is DevOps
And why should Data-oriented people care about it?
DevOps is…
“A combination of cultural philosophies, practices
and tools that increases an organizations ability to
deliver applications and services at high velocity.
- AWS DevOps Blog
Deliver applications and services at high velocity
Do This – without pulling all
your hair out?
Deliver applications and services at high velocity
Do This – Super Effectively
Host your analysis
• Share
• Publish
• Collaborate
• Prove a point
• Serve a purpose
• Be reproducible
• Save the day
What is DataOps?
DataOps?
Anywhere you can put a little DevOps magic into your data science workflow
Build More Data Products
So that you and others can use them to solve real problems
Try Shiny!
The Iris Dataset
Do Machine Learning!
So Hot Right Now
What Species
is this iris??
Credit: xkcd
1. Turn your ideas into R code
• Write functions to generate the
plots you’re envisioning
• Package: ggplot2
• Train and validate a machine
learning model to use
• Package: caret
geom_hist_basic <- function(var){
ggplot(iris, aes_string(x = var)) +
geom_histogram() +
facet_wrap(~ Species)
}
predict_matrix(fit.knn, validation)
Confusion Matrix and Statistics
Prediction setosa versicolor virginica
setosa 10 0 0
versicolor 0 8 1
virginica 0 2 9
2. Turn your R code into an R Shiny app
Client Side Code:
User Interface and
Input Elements
Server Side Code:
(Reactive) R Output
Elements
shinyApp(ui = fluidPage, server = serverFunction)
fluidPage
Code
serverFunction
Code
Try Plumber!
Let’s Build a REST API with R
1. Write Functions in R
Expose Data or Model
Produce Analysis or Visualization
Data Agnostic
Perform Analysis on New Data
2. Create Plumber
API Endpoints
- Get
- Post
4. Send Requests to
the Plumber Service
Through external (or
internal) Applications
- Jupyter Notebooks
- Web Apps
3. Host the Plumber
Script on a Server
- Create Plumber
router object
- Run in an R Session
Docker Image
RStudio
Server
R Session
Running
Plumber
REST API
My Local File
System
- Plumber.R
- Dockerfile
Local Volume Link
Applications
&
Notebooks
Requests!
Demo Framework
That’s it!
Now go build some sweet data products
Resources for Learning R
R-Ladies Global Meetups
• Get involved!
• More female speakers,
leaders, teachers, builders,
friends!
RLadies.org
@RLadiesGlobal
RStudio Webinars
• All of the talks
from RStudio::conf
2018 have just
been published
• Highly
recommend!
Resources for Learning Shiny Development
shiny.rstudio.com
Resources for Learning Plumber
www.rplumber.io
@TrestleJeff
on Twitter!
Note to self: Remember to give
out stickers
I have R-Ladies and R-Ladies Plumber Stickers!
I’m Kelly!
@kellrstats on Twitter

Más contenido relacionado

La actualidad más candente

Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...DataKitchen
 
Do Agile Data in Just 5 Shocking Steps!
Do Agile Data in Just 5 Shocking Steps!Do Agile Data in Just 5 Shocking Steps!
Do Agile Data in Just 5 Shocking Steps!DataKitchen
 
devopsdays Warsaw 2018 - Chaos while deploying ML
devopsdays Warsaw 2018 - Chaos while deploying MLdevopsdays Warsaw 2018 - Chaos while deploying ML
devopsdays Warsaw 2018 - Chaos while deploying MLThiago de Faria
 
The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...Domino Data Lab
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 
The Model Enterprise: A Blueprint for Enterprise Data Governance
The Model Enterprise: A Blueprint for Enterprise Data GovernanceThe Model Enterprise: A Blueprint for Enterprise Data Governance
The Model Enterprise: A Blueprint for Enterprise Data GovernanceEric Kavanagh
 
Bridged Overview by CodeData
Bridged Overview by CodeDataBridged Overview by CodeData
Bridged Overview by CodeDataSam Sur
 
What’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningWhat’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningDatabricks
 
Understanding DataOps and Its Impact on Application Quality
Understanding DataOps and Its Impact on Application QualityUnderstanding DataOps and Its Impact on Application Quality
Understanding DataOps and Its Impact on Application QualityDevOps.com
 
Moving to the Cloud: Modernizing Data Architecture in Healthcare
Moving to the Cloud: Modernizing Data Architecture in HealthcareMoving to the Cloud: Modernizing Data Architecture in Healthcare
Moving to the Cloud: Modernizing Data Architecture in HealthcarePerficient, Inc.
 
Beyond Batch: Is ETL still relevant in the API economy?
Beyond Batch: Is ETL still relevant in the API economy?Beyond Batch: Is ETL still relevant in the API economy?
Beyond Batch: Is ETL still relevant in the API economy?SnapLogic
 
Big Data for Managers: From hadoop to streaming and beyond
Big Data for Managers: From hadoop to streaming and beyondBig Data for Managers: From hadoop to streaming and beyond
Big Data for Managers: From hadoop to streaming and beyondDataWorks Summit/Hadoop Summit
 
The lean principles of data ops
The lean principles of data opsThe lean principles of data ops
The lean principles of data opsLars Albertsson
 
The DBA Is Dead (Again). Long Live the DBA !
The DBA Is Dead (Again). Long Live the DBA !The DBA Is Dead (Again). Long Live the DBA !
The DBA Is Dead (Again). Long Live the DBA !Christian Bilien
 
Introduction to Big Data Technologies: Hadoop/EMR/Map Reduce & Redshift
Introduction to Big Data Technologies:  Hadoop/EMR/Map Reduce & RedshiftIntroduction to Big Data Technologies:  Hadoop/EMR/Map Reduce & Redshift
Introduction to Big Data Technologies: Hadoop/EMR/Map Reduce & RedshiftDataKitchen
 
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku
 

La actualidad más candente (20)

Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
 
Do Agile Data in Just 5 Shocking Steps!
Do Agile Data in Just 5 Shocking Steps!Do Agile Data in Just 5 Shocking Steps!
Do Agile Data in Just 5 Shocking Steps!
 
devopsdays Warsaw 2018 - Chaos while deploying ML
devopsdays Warsaw 2018 - Chaos while deploying MLdevopsdays Warsaw 2018 - Chaos while deploying ML
devopsdays Warsaw 2018 - Chaos while deploying ML
 
The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
The Model Enterprise: A Blueprint for Enterprise Data Governance
The Model Enterprise: A Blueprint for Enterprise Data GovernanceThe Model Enterprise: A Blueprint for Enterprise Data Governance
The Model Enterprise: A Blueprint for Enterprise Data Governance
 
Bridged Overview by CodeData
Bridged Overview by CodeDataBridged Overview by CodeData
Bridged Overview by CodeData
 
What’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningWhat’s New with Databricks Machine Learning
What’s New with Databricks Machine Learning
 
Understanding DataOps and Its Impact on Application Quality
Understanding DataOps and Its Impact on Application QualityUnderstanding DataOps and Its Impact on Application Quality
Understanding DataOps and Its Impact on Application Quality
 
Moving to the Cloud: Modernizing Data Architecture in Healthcare
Moving to the Cloud: Modernizing Data Architecture in HealthcareMoving to the Cloud: Modernizing Data Architecture in Healthcare
Moving to the Cloud: Modernizing Data Architecture in Healthcare
 
Beyond Batch: Is ETL still relevant in the API economy?
Beyond Batch: Is ETL still relevant in the API economy?Beyond Batch: Is ETL still relevant in the API economy?
Beyond Batch: Is ETL still relevant in the API economy?
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Big Data for Managers: From hadoop to streaming and beyond
Big Data for Managers: From hadoop to streaming and beyondBig Data for Managers: From hadoop to streaming and beyond
Big Data for Managers: From hadoop to streaming and beyond
 
Data engineering design patterns
Data engineering design patternsData engineering design patterns
Data engineering design patterns
 
Hadoop dev 01
Hadoop dev 01Hadoop dev 01
Hadoop dev 01
 
The lean principles of data ops
The lean principles of data opsThe lean principles of data ops
The lean principles of data ops
 
The DBA Is Dead (Again). Long Live the DBA !
The DBA Is Dead (Again). Long Live the DBA !The DBA Is Dead (Again). Long Live the DBA !
The DBA Is Dead (Again). Long Live the DBA !
 
Introduction to Big Data Technologies: Hadoop/EMR/Map Reduce & Redshift
Introduction to Big Data Technologies:  Hadoop/EMR/Map Reduce & RedshiftIntroduction to Big Data Technologies:  Hadoop/EMR/Map Reduce & Redshift
Introduction to Big Data Technologies: Hadoop/EMR/Map Reduce & Redshift
 
Surviving the Hadoop Revolution
Surviving the Hadoop RevolutionSurviving the Hadoop Revolution
Surviving the Hadoop Revolution
 
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
 

Similar a Kelly O'Briant - DataOps in the Cloud: How To Supercharge Data Science with a Hint of DevOps

Maintainable Machine Learning Products
Maintainable Machine Learning ProductsMaintainable Machine Learning Products
Maintainable Machine Learning ProductsAndrew Musselman
 
Business in the Driver’s Seat – An Improved Model for Integration
Business in the Driver’s Seat – An Improved Model for IntegrationBusiness in the Driver’s Seat – An Improved Model for Integration
Business in the Driver’s Seat – An Improved Model for IntegrationInside Analysis
 
Data science presentation
Data science presentationData science presentation
Data science presentationMSDEVMTL
 
Wsrest13 gilherme keynote
Wsrest13 gilherme keynoteWsrest13 gilherme keynote
Wsrest13 gilherme keynoteruyalarcon
 
The New Frontier: Optimizing Big Data Exploration
The New Frontier: Optimizing Big Data ExplorationThe New Frontier: Optimizing Big Data Exploration
The New Frontier: Optimizing Big Data ExplorationInside Analysis
 
SciPy Latin America 2019
SciPy Latin America 2019SciPy Latin America 2019
SciPy Latin America 2019Travis Oliphant
 
Big Data for Data Scientists - Info Session
Big Data for Data Scientists - Info SessionBig Data for Data Scientists - Info Session
Big Data for Data Scientists - Info SessionWeCloudData
 
Drupal - Changing the Web by Connecting Open Minds - Josef Dabernig
Drupal - Changing the Web by Connecting Open Minds - Josef DabernigDrupal - Changing the Web by Connecting Open Minds - Josef Dabernig
Drupal - Changing the Web by Connecting Open Minds - Josef DabernigDrupalCampDN
 
Let's analyze how world reacts to road traffic by sentiment analysis final
Let's analyze how world reacts to road traffic by sentiment analysis finalLet's analyze how world reacts to road traffic by sentiment analysis final
Let's analyze how world reacts to road traffic by sentiment analysis finalSajeetharan
 
Enabling Data centric Teams
Enabling Data centric TeamsEnabling Data centric Teams
Enabling Data centric TeamsData Con LA
 
The Right Data Warehouse: Automation Now, Business Value Thereafter
The Right Data Warehouse: Automation Now, Business Value ThereafterThe Right Data Warehouse: Automation Now, Business Value Thereafter
The Right Data Warehouse: Automation Now, Business Value ThereafterInside Analysis
 
The Future of Data Science
The Future of Data ScienceThe Future of Data Science
The Future of Data ScienceDataWorks Summit
 
Ncku csie talk about Spark
Ncku csie talk about SparkNcku csie talk about Spark
Ncku csie talk about SparkGiivee The
 
Data science tools of the trade
Data science tools of the tradeData science tools of the trade
Data science tools of the tradeFangda Wang
 
Building successful data science teams
Building successful data science teamsBuilding successful data science teams
Building successful data science teamsVenkatesh Umaashankar
 
From SQL to Python - A Beginner's Guide to Making the Switch
From SQL to Python - A Beginner's Guide to Making the SwitchFrom SQL to Python - A Beginner's Guide to Making the Switch
From SQL to Python - A Beginner's Guide to Making the SwitchRachel Berryman
 
Reproducible Research with R, The Tidyverse, Notebooks, and Spark
Reproducible Research with R, The Tidyverse, Notebooks, and SparkReproducible Research with R, The Tidyverse, Notebooks, and Spark
Reproducible Research with R, The Tidyverse, Notebooks, and SparkAdaryl "Bob" Wakefield, MBA
 
Become Efficient or Die: The Story of BackType
Become Efficient or Die: The Story of BackTypeBecome Efficient or Die: The Story of BackType
Become Efficient or Die: The Story of BackTypenathanmarz
 
SPSNYC2019 - What is Common Data Model and how to use it?
SPSNYC2019 - What is Common Data Model and how to use it?SPSNYC2019 - What is Common Data Model and how to use it?
SPSNYC2019 - What is Common Data Model and how to use it?Nicolas Georgeault
 

Similar a Kelly O'Briant - DataOps in the Cloud: How To Supercharge Data Science with a Hint of DevOps (20)

Lean Analytics: How to get more out of your data science team
Lean Analytics: How to get more out of your data science teamLean Analytics: How to get more out of your data science team
Lean Analytics: How to get more out of your data science team
 
Maintainable Machine Learning Products
Maintainable Machine Learning ProductsMaintainable Machine Learning Products
Maintainable Machine Learning Products
 
Business in the Driver’s Seat – An Improved Model for Integration
Business in the Driver’s Seat – An Improved Model for IntegrationBusiness in the Driver’s Seat – An Improved Model for Integration
Business in the Driver’s Seat – An Improved Model for Integration
 
Data science presentation
Data science presentationData science presentation
Data science presentation
 
Wsrest13 gilherme keynote
Wsrest13 gilherme keynoteWsrest13 gilherme keynote
Wsrest13 gilherme keynote
 
The New Frontier: Optimizing Big Data Exploration
The New Frontier: Optimizing Big Data ExplorationThe New Frontier: Optimizing Big Data Exploration
The New Frontier: Optimizing Big Data Exploration
 
SciPy Latin America 2019
SciPy Latin America 2019SciPy Latin America 2019
SciPy Latin America 2019
 
Big Data for Data Scientists - Info Session
Big Data for Data Scientists - Info SessionBig Data for Data Scientists - Info Session
Big Data for Data Scientists - Info Session
 
Drupal - Changing the Web by Connecting Open Minds - Josef Dabernig
Drupal - Changing the Web by Connecting Open Minds - Josef DabernigDrupal - Changing the Web by Connecting Open Minds - Josef Dabernig
Drupal - Changing the Web by Connecting Open Minds - Josef Dabernig
 
Let's analyze how world reacts to road traffic by sentiment analysis final
Let's analyze how world reacts to road traffic by sentiment analysis finalLet's analyze how world reacts to road traffic by sentiment analysis final
Let's analyze how world reacts to road traffic by sentiment analysis final
 
Enabling Data centric Teams
Enabling Data centric TeamsEnabling Data centric Teams
Enabling Data centric Teams
 
The Right Data Warehouse: Automation Now, Business Value Thereafter
The Right Data Warehouse: Automation Now, Business Value ThereafterThe Right Data Warehouse: Automation Now, Business Value Thereafter
The Right Data Warehouse: Automation Now, Business Value Thereafter
 
The Future of Data Science
The Future of Data ScienceThe Future of Data Science
The Future of Data Science
 
Ncku csie talk about Spark
Ncku csie talk about SparkNcku csie talk about Spark
Ncku csie talk about Spark
 
Data science tools of the trade
Data science tools of the tradeData science tools of the trade
Data science tools of the trade
 
Building successful data science teams
Building successful data science teamsBuilding successful data science teams
Building successful data science teams
 
From SQL to Python - A Beginner's Guide to Making the Switch
From SQL to Python - A Beginner's Guide to Making the SwitchFrom SQL to Python - A Beginner's Guide to Making the Switch
From SQL to Python - A Beginner's Guide to Making the Switch
 
Reproducible Research with R, The Tidyverse, Notebooks, and Spark
Reproducible Research with R, The Tidyverse, Notebooks, and SparkReproducible Research with R, The Tidyverse, Notebooks, and Spark
Reproducible Research with R, The Tidyverse, Notebooks, and Spark
 
Become Efficient or Die: The Story of BackType
Become Efficient or Die: The Story of BackTypeBecome Efficient or Die: The Story of BackType
Become Efficient or Die: The Story of BackType
 
SPSNYC2019 - What is Common Data Model and how to use it?
SPSNYC2019 - What is Common Data Model and how to use it?SPSNYC2019 - What is Common Data Model and how to use it?
SPSNYC2019 - What is Common Data Model and how to use it?
 

Más de Rehgan Avon

Ezgi Karaesmen - Data Cleaning and Manipulation with R
Ezgi Karaesmen - Data Cleaning and Manipulation with REzgi Karaesmen - Data Cleaning and Manipulation with R
Ezgi Karaesmen - Data Cleaning and Manipulation with RRehgan Avon
 
Dr. Karen Amstutz - Digitizing Health: How Analytics are Disrupting Healthca...
Dr. Karen Amstutz - Digitizing Health:  How Analytics are Disrupting Healthca...Dr. Karen Amstutz - Digitizing Health:  How Analytics are Disrupting Healthca...
Dr. Karen Amstutz - Digitizing Health: How Analytics are Disrupting Healthca...Rehgan Avon
 
Amanda Cinnamon - Treat Your Code Like the Valuable Software It Is
Amanda Cinnamon - Treat Your Code Like the Valuable Software It IsAmanda Cinnamon - Treat Your Code Like the Valuable Software It Is
Amanda Cinnamon - Treat Your Code Like the Valuable Software It IsRehgan Avon
 
Cheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial WorldCheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial WorldRehgan Avon
 
Wei Xu - Innovative Applications of AI Panel
Wei Xu - Innovative Applications of AI PanelWei Xu - Innovative Applications of AI Panel
Wei Xu - Innovative Applications of AI PanelRehgan Avon
 
Helen Patton - Governing Big Data: Security, Privacy & Data Management
Helen Patton - Governing Big Data: Security, Privacy & Data ManagementHelen Patton - Governing Big Data: Security, Privacy & Data Management
Helen Patton - Governing Big Data: Security, Privacy & Data ManagementRehgan Avon
 
Dr. Lara Sucheston-Campbell - Building a working farm: Planning and planting ...
Dr. Lara Sucheston-Campbell - Building a working farm: Planning and planting ...Dr. Lara Sucheston-Campbell - Building a working farm: Planning and planting ...
Dr. Lara Sucheston-Campbell - Building a working farm: Planning and planting ...Rehgan Avon
 
Bijaya Zenchenko - An Embedding is Worth 1000 Words - Start Using Word Embedd...
Bijaya Zenchenko - An Embedding is Worth 1000 Words - Start Using Word Embedd...Bijaya Zenchenko - An Embedding is Worth 1000 Words - Start Using Word Embedd...
Bijaya Zenchenko - An Embedding is Worth 1000 Words - Start Using Word Embedd...Rehgan Avon
 

Más de Rehgan Avon (9)

Ezgi Karaesmen - Data Cleaning and Manipulation with R
Ezgi Karaesmen - Data Cleaning and Manipulation with REzgi Karaesmen - Data Cleaning and Manipulation with R
Ezgi Karaesmen - Data Cleaning and Manipulation with R
 
Dr. Karen Amstutz - Digitizing Health: How Analytics are Disrupting Healthca...
Dr. Karen Amstutz - Digitizing Health:  How Analytics are Disrupting Healthca...Dr. Karen Amstutz - Digitizing Health:  How Analytics are Disrupting Healthca...
Dr. Karen Amstutz - Digitizing Health: How Analytics are Disrupting Healthca...
 
Amanda Cinnamon - Treat Your Code Like the Valuable Software It Is
Amanda Cinnamon - Treat Your Code Like the Valuable Software It IsAmanda Cinnamon - Treat Your Code Like the Valuable Software It Is
Amanda Cinnamon - Treat Your Code Like the Valuable Software It Is
 
Cheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial WorldCheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial World
 
Wei Xu - Innovative Applications of AI Panel
Wei Xu - Innovative Applications of AI PanelWei Xu - Innovative Applications of AI Panel
Wei Xu - Innovative Applications of AI Panel
 
Helen Patton - Governing Big Data: Security, Privacy & Data Management
Helen Patton - Governing Big Data: Security, Privacy & Data ManagementHelen Patton - Governing Big Data: Security, Privacy & Data Management
Helen Patton - Governing Big Data: Security, Privacy & Data Management
 
Dr. Lara Sucheston-Campbell - Building a working farm: Planning and planting ...
Dr. Lara Sucheston-Campbell - Building a working farm: Planning and planting ...Dr. Lara Sucheston-Campbell - Building a working farm: Planning and planting ...
Dr. Lara Sucheston-Campbell - Building a working farm: Planning and planting ...
 
Bijaya Zenchenko - An Embedding is Worth 1000 Words - Start Using Word Embedd...
Bijaya Zenchenko - An Embedding is Worth 1000 Words - Start Using Word Embedd...Bijaya Zenchenko - An Embedding is Worth 1000 Words - Start Using Word Embedd...
Bijaya Zenchenko - An Embedding is Worth 1000 Words - Start Using Word Embedd...
 
BDAA_Newsletter
BDAA_NewsletterBDAA_Newsletter
BDAA_Newsletter
 

Último

Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 

Último (20)

Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 

Kelly O'Briant - DataOps in the Cloud: How To Supercharge Data Science with a Hint of DevOps

  • 1. DataOps Data Science Empowerment through DevOps, Cloud Computing and Building your own Applications
  • 2. Kelly O’Briant Data Science Product Engineer kelly@rladies.org @kellrstats | @RLadiesDC • R-Ladies Washington DC Chapter Founder and Organizer • R-Ladies Global unofficial “cloud expert” • Publish a monthly series called .rprofile on the rOpenSci blog • Business Science University course developer
  • 3. My Talk Goal: I want you to leave this conference so excited, you go back to work and completely ignore whatever project you’re supposed to be working on because you’re so pumped up about building a data product and you can’t stop yourself from doing it.
  • 4. Motivation Why I talk about Data Science Empowerment R-Ladies events • How do I get a job as a data scientist/analyst/anything? • What should I study/learn/do/produce to be a data scientist? • Am I even a data scientist? Is what I do data science? Why are data products empowering? • I use data products to justify/prove to myself that I belong, that my ideas are valid and to help me communicate with people who are bad at listening (or when I’m bad at speaking)
  • 6. R-Ladies + International Women’s Day Twitter Campaign • Create a twitter bot using R code to tweet out a profile for every woman in our Global speaker directory • Project collaboration through GitHub • Docker linked to a local volume • Twitter Application(s)
  • 7. Deploy and Use H2O Machine Learning Models in Production • Build and validate a model in python working in a Jupyter Notebook with the H2O machine learning API • Package the model code as a POJO or MOJO file • Deploy the model to H2O.ai STEAM to create an ML prediction service complete with a REST API query URL
  • 8. Create and Maintain a Personal Website • Use the blogdown package in an RStudio project to create the framework for a Hugo static website • Create content for the site by writing Rmarkdown files • Compile and deploy the static site – choose a hosting mechanism: GitHub? Continuous Integration with Netlify?
  • 9. Why are you so into R? • It’s great for Data Science • The community at large is awesome • The female community is awesome • R integrates with other tech • It’s growing really fast in cool ways • I can use it to build cool stuff
  • 10. Why are you so into R? • It’s great for Data Science • The community at large is awesome • The female community is awesome • R integrates with other tech • It’s growing really fast in cool ways • I can use it to build cool stuff #rstats
  • 11. Why are you so into R? • It’s great for Data Science • The community at large is awesome • The female community is awesome • R integrates with other tech • It’s growing really fast in cool ways • I can use it to build cool stuff Worldwide organization that promotes gender diversity in the R community via meetups and mentorship in a friendly and safe environment
  • 12. Why are you so into R? • It’s great for Data Science • The community at large is awesome • The female community is awesome • R integrates with other tech • It’s growing really fast in cool ways • I can use it to build cool stuff
  • 13. Why are you so into R? • It’s great for Data Science • The community at large is awesome • The female community is awesome • R integrates with other tech • It’s growing really fast in cool ways • I can use it to build cool stuff
  • 14. Why are you so into R? • It’s great for Data Science • The community at large is awesome • The female community is awesome • R integrates with other tech • It’s growing really fast in cool ways • I can use it to build cool stuff
  • 15. Back to the topic: DataOps 1. It usually takes a little DevOps to build a Data Product 2. Building more Data Products is empowering – good for your portfolio and soul
  • 16. What is DevOps And why should Data-oriented people care about it? DevOps is… “A combination of cultural philosophies, practices and tools that increases an organizations ability to deliver applications and services at high velocity. - AWS DevOps Blog
  • 17. Deliver applications and services at high velocity Do This – without pulling all your hair out?
  • 18. Deliver applications and services at high velocity Do This – Super Effectively Host your analysis • Share • Publish • Collaborate • Prove a point • Serve a purpose • Be reproducible • Save the day
  • 19. What is DataOps? DataOps? Anywhere you can put a little DevOps magic into your data science workflow
  • 20.
  • 21. Build More Data Products So that you and others can use them to solve real problems
  • 24. Do Machine Learning! So Hot Right Now What Species is this iris?? Credit: xkcd
  • 25. 1. Turn your ideas into R code • Write functions to generate the plots you’re envisioning • Package: ggplot2 • Train and validate a machine learning model to use • Package: caret geom_hist_basic <- function(var){ ggplot(iris, aes_string(x = var)) + geom_histogram() + facet_wrap(~ Species) } predict_matrix(fit.knn, validation) Confusion Matrix and Statistics Prediction setosa versicolor virginica setosa 10 0 0 versicolor 0 8 1 virginica 0 2 9
  • 26. 2. Turn your R code into an R Shiny app Client Side Code: User Interface and Input Elements Server Side Code: (Reactive) R Output Elements shinyApp(ui = fluidPage, server = serverFunction) fluidPage Code serverFunction Code
  • 28. Let’s Build a REST API with R 1. Write Functions in R Expose Data or Model Produce Analysis or Visualization Data Agnostic Perform Analysis on New Data 2. Create Plumber API Endpoints - Get - Post 4. Send Requests to the Plumber Service Through external (or internal) Applications - Jupyter Notebooks - Web Apps 3. Host the Plumber Script on a Server - Create Plumber router object - Run in an R Session
  • 29. Docker Image RStudio Server R Session Running Plumber REST API My Local File System - Plumber.R - Dockerfile Local Volume Link Applications & Notebooks Requests! Demo Framework
  • 30. That’s it! Now go build some sweet data products
  • 32. R-Ladies Global Meetups • Get involved! • More female speakers, leaders, teachers, builders, friends! RLadies.org @RLadiesGlobal
  • 33. RStudio Webinars • All of the talks from RStudio::conf 2018 have just been published • Highly recommend!
  • 34. Resources for Learning Shiny Development shiny.rstudio.com
  • 35. Resources for Learning Plumber www.rplumber.io @TrestleJeff on Twitter!
  • 36. Note to self: Remember to give out stickers I have R-Ladies and R-Ladies Plumber Stickers! I’m Kelly! @kellrstats on Twitter