Bridging the Gap Between Data Science
& Engineering:
Building High-Performing Teams
How do I hire a data scientist?
Continuum of Skills
Software Engineer | Data Engineer | Data Scientist | Applied Scientist | Research Scientist
[Figure: skills diagram with the labels Math & Stats, Computer Science, Domain Expertise, Machine Learning, Software Engineering, Research, and Data Science Unicorn.]
Many companies try to find all of these skills in a
single person.
Which leads to job requirements like this…
• MSc/PhD in Computer Science, Electrical Engineering, Math or Statistics
• At least 5 years of experience in solving real-world practical problems using Machine Learning
• At least 5 years of experience on mining and modeling large-scale data (hundreds of terabytes)
• Extensive in-depth knowledge of Data Mining, Machine Learning, Algorithms
• Knowledge of at least one high-level programming language (C++, Java)
• Knowledge of at least one scripting language (Perl, Python, Ruby)
• Knowledge of SQL and experience with large relational databases
• Knowledge of at least one ML toolset (R, Weka, KNIME, Octave, Mahout, scikit-learn)
• Strong ability to formalize and provide practical solutions to research problems
• Strong communication skills and ability to work independently to get an idea from inception to
implementation.
• Knowledge of the state of the art in at least one of Bayesian Optimization, Recommendation
Systems, Social Network Analysis, Information Retrieval
• At least 5 years of experience with storing, sampling, querying large-scale data (hundreds of
terabytes) and experimentation frameworks
• At least 5 years of experience with Hadoop, Spark, Mahout or Giraph
Data Science Unicorn
These people do exist, but they are often already
well-compensated, and only want to work on
interesting problems.
What can you do?
Build a team instead.
Look for T-shaped people
[Figure: T-shaped profile. The horizontal bar is the broad-range generalist; the vertical bar is deep expertise.]
[Figure: example T-shaped profile. Labels: Machine Learning, Statistics, Domain Knowledge; Software Engineering; Business Acumen; Distributed Computing; Communication.]
• Compose teams of individuals who
have overlapping skill-sets and
deep expertise in one area
(machine learning, statistics,
engineering, business, etc.)
• The overlap allows them to speak
the same language and work
collaboratively on solving problems
How do I structure my data science team within
my organization?
Data Science Team Structures
Centralized | Embedded | Hub & Spoke
Centralized
• Data Scientists sit on one team that acts as highly powered internal consultants.
• The team fields and answers questions from multiple teams within the organization and defines tools for the organization.
Embedded
• Data Scientists are almost wholly
embedded within one particular team
and focus on solving problems for that
team.
• Teams are assigned to one particular
product or function within the company
and define and answer questions for
that product or function.
Hub & Spoke
• The data science team sits
together physically and works
collaboratively to solve problems.
• However, each data scientist (or
a combination of them) gets
deployed to work on problems
within the organization.
• Tends to apply to companies
that have a lot of users.
Data Science Team Structure
Centralized > Embedded > Hub & Spoke
How do I get my data scientists to work with
engineering?
Data Science: Python, R (modeling & prototyping)
Software Engineering: Java/C++, RoR/JavaScript (production)
• Data scientists learn to write prototypes in production languages
• Engineers learn the basics of data science so they can understand how the models work
• The goal is to have both teams speak the same language and engender trust through communication
[Figure: curriculum diagram. Labels, in order: Data Science and Data Engineering; Common Core; Data Science Curriculum and Data Engineering Curriculum; Data Science and Data Engineering; Projects.]
[Figure: Data Science and Engineering shown side by side at three stages; the first is labeled Initial Planning and the last Production.]
Summary
• Don’t look for unicorns, build collaborative
teams of T-shaped people
• Pay attention to how your data science team is
structured within your organization
• Get your data science and engineering teams to
speak the same language, allowing them to build
trust and work collaboratively
We believe an opportunity belongs to anyone with aptitude and ambition.
Galvanize 2015
NODES ON THE NETWORK
COLORADO (BOULDER, DENVER, FORT COLLINS)
Programs: Full Stack Immersive, Data Science Immersive, Entrepreneurship
SEATTLE, WA
Programs: Full Stack Immersive, Data Science Immersive, Entrepreneurship
SAN FRANCISCO, CA
Programs: Full Stack Immersive, Data Science Immersive, Data Engineering Immersive, Master of Science in Data Science, Entrepreneurship
AUSTIN, TX (OPENING Q1 2016)
Programs: Full Stack Immersive, Data Science Immersive, Entrepreneurship
PLACEMENT STATS
                          FULL STACK IMMERSIVE    DATA SCIENCE IMMERSIVE
Pre-program Salary        $43K                    $72K
Average Starting Salary   $77K                    $114K
Placement Rate*           97%                     94%
*Galvanize is a founding member of NESTA (New Economy Skills Training Association), a trade organization founded to regulate the new “bootcamp” market.
This placement rate is more rigorous than that requested by state licensure agencies. The placement rate is calculated 6 months after graduation.
5 PROGRAMS
• Full Stack Immersive
• Data Science Immersive
• Data Engineering Immersive
• Master of Science in Data Science (University of New Haven)
• Startup Membership
Projected: over 500 Student Member Graduates in 2015
Currently over 1500 Members
FULL STACK IMMERSIVE
• 97% Placement Rate within 6 months
• $77K Average Starting Salary
• 6 Month Program
DATA SCIENCE IMMERSIVE
• 94% Placement Rate within 6 months
• $114K Average Starting Salary
• 3 Month Program
DATA SCIENCE IMMERSIVE
Week 1 - Exploratory Data Analysis and Software Engineering Best Practices
Week 2 - Statistical Inference, Bayesian Methods, A/B Testing, Multi-Armed Bandit
Week 3 - Regression, Regularization, Gradient Descent
Week 4 - Supervised Machine Learning: Classification, Validation, Ensemble Methods
Week 5 - Clustering, Topic Modeling (NMF, LDA), NLP
Week 6 - Network Analysis, Matrix Factorization, and Time Series
Week 7 - Hadoop, Hive, and MapReduce
Week 8 - Data Visualization with D3.js, Data Products, and Fraud Detection Case Study
Weeks 9-10 - Capstone Projects
Week 12 - Onsite Interviews
DATA ENGINEERING IMMERSIVE
• Launched Oct. 2015
• Built in partnership with Nvent and Concurrent
• 3 Month Program
THANK YOU
RYAN ORBAN | EVP OF PRODUCT & STRATEGY
ryan.orban@galvanize.com
@ryanorban
www.galvanize.com

Bridging the Gap Between Data Science & Engineer: Building High-Performance Teams

  • 1. Bridging the Gap Between Data Science & Engineering: Building High-Performing Teams
  • 2. How do I hire a data scientist?
  • 3. Software Engineer Data Engineer Data Scientist Applied Scientist Research Scientist Continuum of Skills
  • 4. Software Engineer Data Engineer Data Scientist Applied Scientist Research Scientist Continuum of Skills
  • 6. Many companies try to find all of these skills in a single person.
  • 7. Which leads to job requirements like this… • MSc/PhD in Computer Science, Electrical Engineering, Math or Statistics • At least 5 years of experience in solving real-world practical problems using Machine Learning • At least 5 years of experience on mining and modeling large-scale data (hundreds of terabytes) • Extensive in-depth knowledge of Data Mining, Machine Learning, Algorithms • Knowledge of at least one high-level programming language (C++, Java) • Knowledge of at least one scripting language (Perl, Python, Ruby) • Knowledge of SQL and experience with large relational databases • Knowledge of at least one ML toolset (R, Weka, KNIME, Octave, Mahout, scikit-learn) • Strong ability to formalize and provide practical solutions to research problems • Strong communication skills and ability to work independently to get an idea from inception to implementation. • Knowledge of the state of the art in at least one of Bayesian Optimization, Recommendation Systems, Social Network Analysis, Information Retrieval • At least 5 years of experience with storing, sampling, querying large-scale data (hundreds of terabytes) and experimentation frameworks • At least 5 years of experience with Hadoop, Spark, Mahout or Giraph
  • 9. These people do exist, but they are often already well-compensated, and only want to work on interesting problems.
  • 10. What can you do? Build a team instead.
  • 13. Machine Learning, Statistics, Domain Knowledge Softw are Engineering Business Acum en Distributed Com puting Com m unication Look for T-shaped people
  • 14. • Compose teams of individuals who have overlapping skill-sets and deep expertise in one area (machine learning, statistics, engineering, business, etc.) • The overlap allows them to speak the same language and work collaboratively on solving problems
  • 15. How do I structure my data science team within my organization?
  • 16. Data Science Team Structures CentralizedEmbeddedHub & Spoke
  • 17. Centralized Data Scientists sit on a team that acts as internal consultants, fielding and answering questions from multiple teams within the organization, defining tools for the organization, and acting as highly powered consultants.
  • 18. Embedded • Data Scientists are almost wholly embedded within one particular team and focus on solving problems for that team. • Teams are assigned to one particular product or function within the company and define and answer questions for that product or function.
  • 19. Hub & Spoke • The data science team sits together physically and works collaboratively to solve problems. • However, each data scientist (or a combination of them) gets deployed to work on problems within the organization. • This structure tends to apply to companies that have a lot of users.
  • 20. Data Science Team Structure: Centralized, Embedded, Hub & Spoke [diagram comparing the three structures]
  • 21. How do I get my data scientists to work with engineering?
  • 22. Data Science (Python, R): modeling & prototyping • Software Engineering (Java/C++, RoR/JavaScript): production
  • 23. Data Science (Python, R) • Software Engineering (Java/C++, RoR/JavaScript) • modeling & prototyping • production
  • 24. • Data scientists learn to write prototypes in production languages • Engineers learn the basics of data science so they can understand how the models work • The goal is to have both teams speak the same language and engender trust through communication
  • 25. Data Science and Data Engineering: Common Core • Data Science Curriculum • Data Engineering Curriculum • Data Science and Data Engineering Projects
  • 26. [Diagram: Data Science and Engineering, from Initial Planning to Production]
  • 27. Summary • Don’t look for unicorns; build collaborative teams of T-shaped people • Pay attention to how your data science team is structured within your organization • Get your data science and engineering teams to speak the same language, allowing them to build trust and work collaboratively
  • 28. We believe an opportunity belongs to anyone with aptitude and ambition.
  • 29. Galvanize 2015 NODES ON THE NETWORK • Colorado (Boulder, Denver, Fort Collins): Full Stack Immersive, Data Science Immersive, Entrepreneurship • Seattle, WA: Full Stack Immersive, Data Science Immersive, Entrepreneurship • San Francisco, CA: Full Stack Immersive, Data Science Immersive, Data Engineering Immersive, Master of Science in Data Science, Entrepreneurship • Austin, TX (opening Q1 2016): Full Stack Immersive, Data Science Immersive, Entrepreneurship
  • 30. PLACEMENT STATS • Full Stack Immersive: $43K pre-program salary, $77K average starting salary, 97% placement rate* • Data Science Immersive: $72K pre-program salary, $114K average starting salary, 94% placement rate* *Galvanize is a founding member of NESTA (New Economy Skills Training Association), a trade organization founded to regulate the new “bootcamp” market. This placement rate is more rigorous than that requested by state licensure agencies. The placement rate is calculated 6 months after graduation.
  • 31. 5 PROGRAMS • Full Stack Immersive • Data Science Immersive • Data Engineering Immersive • Master of Science in Data Science (University of New Haven) • Startup Membership • Project over 500 student member graduates in 2015 • Currently over 1500 members
  • 32. FULL STACK IMMERSIVE • 97% Placement Rate within 6 months • $77K Average Starting Salary • 6 Month Program
  • 34. DATA SCIENCE IMMERSIVE • 94% Placement Rate within 6 months • $114K Average Starting Salary • 3 Month Program
  • 35. DATA SCIENCE IMMERSIVE • Week 1 - Exploratory Data Analysis and Software Engineering Best Practices • Week 2 - Statistical Inference, Bayesian Methods, A/B Testing, Multi-Armed Bandit • Week 3 - Regression, Regularization, Gradient Descent • Week 4 - Supervised Machine Learning: Classification, Validation, Ensemble Methods • Week 5 - Clustering, Topic Modeling (NMF, LDA), NLP • Week 6 - Network Analysis, Matrix Factorization, and Time Series • Week 7 - Hadoop, Hive, and MapReduce • Week 8 - Data Visualization with D3.js, Data Products, and Fraud Detection Case Study • Weeks 9-10 - Capstone Projects • Week 12 - Onsite Interviews
  • 37. DATA ENGINEERING IMMERSIVE • Launched Oct. 2015 • Built in partnership with Nvent and Concurrent • 3 Month Program
  • 38. THANK YOU • RYAN ORBAN | EVP OF PRODUCT & STRATEGY • ryan.orban@galvanize.com • @ryanorban • www.galvanize.com