Demystifying Data Science
Venkatesh
Data Science Expert and Machine Learning Researcher
What is Data Science?
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured,[1][2] similar to data mining.
Well, tell me in layman's terms
Data
Domain Expertise
Algorithms
Insights
Data products
Automation / Optimization
Business value
Intelligent Systems - A simple definition
Systems that perform actions that,
if performed by humans, would be
considered intelligent
Tasks of Intelligence
Sensing
Language Understanding
Planning
Problem Solving
Knowledge
Decision Making
Learning
Inference
Language Generation
Robotics
Control
Companies have AI issues
Engineering wants to get its hands on Machine Learning
The C-Suite needs an “AI” strategy
Marketing wants to include “AI” in product descriptions
Product is afraid of falling behind
Everyone is pitching you technology
Wait... What happened to data warehousing?
First, What is data warehousing?
● Integrated: Constructed by combining data from heterogeneous sources such as relational databases, flat files, etc.
● Time-Variant: Provides information with respect to a particular time period.
● Non-volatile: Data once entered into the warehouse should not change.
However, it does not provide:
1. Automatic discovery of patterns
2. Prediction of likely outcomes
3. Creation of actionable information
Courtesy: https://www.educba.com/data-warehousing-vs-data-mining/
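The three warehouse properties above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SalesFact` and `Warehouse` names, the source labels and the sample rows are hypothetical and do not come from the slides or from any real warehouse product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a loaded row cannot be modified (non-volatile)
class SalesFact:
    source: str    # upstream system the row came from (integrated)
    day: date      # time period the row describes (time-variant)
    amount: float


class Warehouse:
    """Append-only store: rows are loaded, never updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        # Rows from any source land in the same store.
        self._rows.extend(rows)

    def total_for(self, year, month):
        # Every query is scoped to a time period.
        return sum(r.amount for r in self._rows
                   if r.day.year == year and r.day.month == month)


# Hypothetical rows from two heterogeneous sources
crm_rows = [SalesFact("crm_db", date(2019, 1, 5), 120.0)]
csv_rows = [SalesFact("flat_file", date(2019, 1, 9), 80.0),
            SalesFact("flat_file", date(2019, 2, 1), 50.0)]

wh = Warehouse()
wh.load(crm_rows)
wh.load(csv_rows)
january_total = wh.total_for(2019, 1)  # 200.0
```

The store can answer "what was the total for January?", but nothing in it discovers patterns or predicts outcomes, which is the gap the slide points to.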
What about Business intelligence? Reporting?
● Summarizes the factual/historical data
● Delivers reports, KPIs and trends in a visually pleasing manner
● Allows the organisation to see the big picture
● Assists the organisation in making better decisions to support its mission
● BI systems are designed to look backwards, based on real data from real events: “What happened and what needs to change?”
● Data Science looks forward, interpreting the information to predict what might happen in the future: “Why it happened and how to change it?”
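The backward-looking versus forward-looking contrast can be made concrete with a small sketch. The monthly figures are made up, and the straight-line least-squares fit is just one simple stand-in for "prediction", not a method named in the slides.

```python
# Hypothetical monthly sales figures (illustrative only)
monthly_sales = [100.0, 110.0, 120.0, 130.0]

# BI-style view: summarize what already happened.
total = sum(monthly_sales)
average = total / len(monthly_sales)

# Data-science-style view: fit a trend and project it forward.
# Ordinary least squares for y = intercept + slope * x, with x = month index.
n = len(monthly_sales)
xs = range(n)
x_mean = sum(xs) / n
y_mean = average
slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, monthly_sales))
         / sum((x - x_mean) ** 2 for x in xs))
intercept = y_mean - slope * x_mean

# Forecast for the next, not yet observed, month (index n).
forecast_next = intercept + slope * n  # 140.0 for this data
```

The report numbers (`total`, `average`) describe the past; `forecast_next` is a statement about a month that has not happened yet and carries the uncertainty of the model behind it.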
STATISTICAL MACHINE LEARNING
DEEP LEARNING
EVIDENCE-BASED REASONING
RECOMMENDATION SYSTEMS
NATURAL LANGUAGE GENERATION
CHAT/CONVERSATIONAL INTERFACES
ROBOTIC PROCESS AUTOMATION
TEXT ANALYSIS
What makes a Data Science Team?
Research
Courtesy: https://www.business-science.io/business/2018/09/18/data-science-team.html
Who are the members?
Data Engineers
Data Scientists
Full Stack Developers
Product Managers
Research
Data Engineer: is it only ETL?
● Industry has shifted from drag-and-drop ETL tools towards a more programmatic approach
● The nature of the data to be processed keeps changing (processing files/batches --> real-time stream data)
Expected Skill Sets:
● Should not be tied to a fixed set of tools for building data pipelines
● Has to be a good software engineer
● Comfortable working with open source platforms
● Adaptable to constantly evolving open source tools
● Employs a variety of tools and languages to marry systems together
Courtesy: http://podcast.freecodecamp.org/ep-37-the-rise-of-the-data-engineer
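The shift from file/batch processing to real-time streams mentioned above can be sketched as two ways of computing the same average. The function names and the sample readings are hypothetical, and a plain Python iterator stands in for a real event stream.

```python
from typing import Iterable, Iterator, List


def batch_average(readings: List[float]) -> float:
    """Batch style: the whole file is available before processing starts."""
    return sum(readings) / len(readings)


def streaming_average(events: Iterable[float]) -> Iterator[float]:
    """Stream style: update the result as each event arrives."""
    count = 0
    total = 0.0
    for value in events:
        count += 1
        total += value
        yield total / count  # running average after this event


readings = [10.0, 20.0, 30.0]  # illustrative values

batch_result = batch_average(readings)                     # 20.0
stream_results = list(streaming_average(iter(readings)))   # [10.0, 15.0, 20.0]
```

The streaming version never needs the full dataset in memory and produces an answer after every event, which is why pipelines written as code adapt to this change more easily than a fixed drag-and-drop flow.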
Why does a DS team need a Full Stack Developer?
● Development of Pilots and MVP Applications
○ Productize the data science work so it can serve an internal stakeholder
○ Interactive display of results/stats/insights
● Responsible for bringing a Software Engineering culture into the Data Science process
○ Build Infrastructure as Code: automation of the Data Science team's infrastructure and testing
○ Continuous Integration and Version Control
○ Development of APIs to help integrate data products and sources into applications
○ Building tools for internal use, such as tools for data collection and data labelling
Courtesy: https://towardsdatascience.com/what-is-the-role-of-an-ai-software-engineer-in-a-data-science-team-eec987203ceb
Data Scientists come in many types
Type A
● High understanding of domain knowledge
● Uses ready-made tools instead of developing algorithms
● Has little or no hands-on experience in building software applications
● Insight oriented
● Focuses on better understanding of the business
Type B
● Has basic theoretical knowledge in data science
● Has good hands-on experience in building software applications
● Capable of building an end-to-end prototype or MVP
Type C
● Deep understanding of data science algorithms
● Has great hands-on product development skills
Domain Experts
● Experts both by education and experience in that domain
● Aware of what data is available and able to judge how good it is
● Major contribution in Feature Engineering and Modeling
● Use and apply the deliverables of a data science project in the real world
● Communicate with the intended users of the project’s outcome
● Define the framework for a data science project, as they would know
○ What the current challenges are
○ How they must be answered to be practically useful
● Can learn enough data science to make a reasonable model using standardized tools
Courtesy: https://www.linkedin.com/pulse/role-domain-knowledge-data-science-patrick-bangert/
Cutting edge Research
● Seek to understand and develop systems by working on the longer-term academic problems surrounding AI
● Actively engage with the research community through
○ publications
○ participation in technical conferences and workshops
● Have the skills to craft customized data science and machine learning algorithms
● Their focus is research, not solving a business problem
● Data science researchers should not be an early hire
Building a team for Startup
Courtesy: https://thinkgrowth.org/the-startup-founders-guide-to-analytics-1d2176f20ac1
Building a team for an Enterprise
Courtesy: https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/breaking-away-the-secrets-to-scaling-analytics
Who should lead the DS Team
Courtesy: https://www.altexsoft.com/blog/datascience/how-to-structure-data-science-team-key-models-and-roles/
Spotify Case Study
‘Center of Excellence’ Model
Keys to Excellence
Courtesy: https://www.slideshare.net/productschool/the-why-how-of-enterprise-analytics-w-spotify-data-scientist-79046775
AirBnB
● Interview process focused on practical data skills
● ‘Data challenges’ - Airbnb data + real question
● Lightning talks
● Support community
● Multi-stage screening:
○ Recruiter screen
○ Take home data challenge
○ Onsite challenge
○ Trained graders
○ Two graders for each test to ensure consistency
○ 1:1s with hiring manager, business partner, CV
Courtesy: https://www.slideshare.net/Work-Bench/scaling-data-science-at-airbnb
Evolution Of AirBnB’s DS Team
Work structure
● 2012: Centralised (work closely within team)
● 2013: Started working with other team members
● 2014: Embedded with other teams
● 2015: Embedded with other teams
Team size
● 2012: 7
● 2013: 14
● 2014: 28
● 2015: 55
Specialisation
● 2012: All generalists (Data Engineers + Data Scientists + Data Analysts)
● 2013: Hired first Data Engineer
● 2014: Separate team for Data Engineering
● 2015: Data Science Infrastructure team, specialised roles for NLP, CV
Hiring (2012 to 2015, in the order listed on the slide)
● Take home data challenge followed by 1:1 interview with whole team and founders
● Onsite data challenge
● Created rubrics and grading criteria
● Started hiring interns
● Started focusing on diversity and specialised roles for NLP, CV
Courtesy: https://vimeo.com/148942395
Facebook
● On-boards infra data scientists and engineers through the Bootcamp program
● Provides broad exposure to engineering systems in a supportive learning environment.
● Encourages engineering teams to identify mentors to guide new data scientists as they ramp up in their first projects.
● New data scientists receive mentoring on the ways to communicate the results of their complex analyses.
● Data Scientists are presented with the following options:
○ develop deep domain expertise in an area and spend several years embedded with a team
○ move across partner teams every 12 to 18 months in order to develop a broad understanding
● Provides opportunities to learn and master state-of-the-art skills:
○ Internal training sessions and chalk talks
○ External speakers invited to cover important developments in the field
○ Close connections to the academic community
○ Attending and presenting at major conferences such as INFORMS, KDD, and NIPS
Courtesy: https://code.fb.com/core-data/building-data-science-teams-to-have-an-impact-at-scale
Apple’s Acqui-hiring Strategy
● Apple acqui-hires startups to make its technology smarter and faster
● It buys a whole company to get the team and/or technology
● Hoping to compete with Google’s search service, Apple bought Siri in 2010
● Pandora, Spotify, and Google Music started to predict which songs a user will like
● Apple saw this, which likely prompted the company to purchase Beats Music (a streaming music service with a similar algorithm)
● Recently, Apple hired at least 18 people from an enterprise consulting startup called Silicon Valley Data Science, including at least two co-founders, one of whom is the CEO
Where should the focus be
Don’t focus on the technology
Focus on the functionality
The functionality is driven by business needs
The functionality is supported by algorithms & data
The algorithms are instrumental to business
Courtesy: Kristian Hammond, Northwestern University
What you need to ask when considering AI
Data: Do you have the data that support it?
Task: Is your task genuinely data driven?
Scale: Do you need the scale automation provides?
THANK YOU FOR YOUR ATTENTION
DO YOU HAVE ANY QUESTIONS?

Building successful data science teams

  • 1. Demystifying Data Science Venkatesh Data Science Expert and Machine Learning Researcher
  • 2. What is Data Science? Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured,[1][2] similar to data mining.
  • 3. Well Tell me in Layman terms Data Domain Expertise Algorithms Insights Data products Automation / Optimization Business value
  • 4. Intelligent Systems - A simple definition Systems that perform actions that, if performed by humans, would be considered intelligent
  • 6. Companies have AI issues Engineering wants to get its hands on Machine Learning The C-Suite needs an “AI” strategy Marketing wants to include “AI” in product descriptions Product is afraid of falling behind Everyone is pitching you technology
  • 7. Wait.. What happened to Data warehousing? First, What is data warehousing? ● Integrated: Constructed by combining data from heterogeneous sources such as relational databases, flat files, etc. ● Time-Variant: Provides information with respect to a particular time period. ● Non-volatile: Data once entered into the warehouse should not change However, it does not provide: 1. Automatic discovery of patterns 2. Prediction of likely outcomes 3. Creation of actionable information Courtesy: https://www.educba.com/data-warehousing-vs-data-mining/
  • 8. What about Business intelligence? Reporting? ● Summarizes the factual/historical data ● Delivers reports, KPIs and trends in a visually pleasing manner ● Allows the organisation to see the big picture ● Helps it make better decisions to support the mission. ● BI systems are designed to look backwards based on real data from real events. “What happened and what needs to change?” ● Data Science looks forward, interpreting the information to predict what might happen in the future. “Why it happened and how to change it?”
  • 9. STATISTICAL MACHINE LEARNING = Cat DEEP LEARNING 92% EVIDENCE-BASED REASONING RECOMMENDATION SYSTEMS NATURAL LANGUAGE GENERATION CHAT/CONVERSATIONAL INTERFACES ROBOTIC PROCESS AUTOMATION TEXT ANALYSIS
  • 10. What makes a Data Science Team? Research Courtesy: https://www.business-science.io/business/2018/09/18/data-science-team.html
  • 11. Who are the members? Data Engineers Data Scientists Full Stack Developers Product Managers Research
  • 12. Data Engineer. Does he only do ETL? ● Industry has shifted from drag-and-drop ETL tools towards a more programmatic approach ● Nature of data that needs to be processed is changing day by day (Processing Files/Batches --> Real time stream data) Expected Skill Sets: ● Should not stick with a set of tools for building data pipelines ● Has to be a good software engineer ● Comfortable in working with open source platforms ● Adaptable to constantly evolving open source tools ● Employ a variety of tools and languages to marry systems together Courtesy: http://podcast.freecodecamp.org/ep-37-the-rise-of-the-data-engineer
  • 13. Why does a DS team need Full Stack Developer? ● Development of Pilots and MVP Applications ○ Productize the data science work so it can serve an internal stakeholder ○ Interactive display of results/stats/insights ● Responsible for bringing a Software Engineering culture into the Data Science process ○ Build Infrastructure as Code - Automation of the Data Science team infrastructure and testing ○ Continuous Integration and Version Control ○ Development of APIs to help integrate data products and sources into applications ○ Building tools for internal use, such as tools for data collection and data labelling Courtesy: https://towardsdatascience.com/what-is-the-role-of-an-ai-software-engineer-in-a-data-science-team-eec987203ceb
  • 14. Data Scientists come in many types Type A Type B Type C ● High understanding of domain knowledge ● Uses ready-made tools instead of developing algorithms ● Has little or no hands-on experience in building software applications ● Insight oriented ● Focuses on better understanding of the business ● Has basic theoretical knowledge in data science ● Has good hands-on experience in building software applications ● Capable of building an end-to-end prototype or MVP ● Deep understanding of data science algorithms ● Has great hands-on product development skills
  • 15. Domain Experts ● Experts both by education and experience in that domain ● Aware of what data is available and able to judge how good it is ● Major contribution in Feature Engineering and Modeling ● Use and apply the deliverables of a data science project in the real world ● Communicate with the intended users of the project’s outcome ● Define the framework for a data science project as they would know ○ What the current challenges are ○ How they must be answered to be practically useful ● Can learn enough data science to make a reasonable model using standardized tools Courtesy: https://www.linkedin.com/pulse/role-domain-knowledge-data-science-patrick-bangert/
  • 16. Cutting edge Research ● Seek to understand and develop systems by advancing the longer-term academic problems surrounding AI ● Actively engage with the research community through ○ publications ○ participation in technical conferences and workshops ● Has the skills to craft customized data science and machine learning algorithms ● Their focus will be to do research, not solve a business problem ● Data science researchers should not be an early hire
  • 17. Building a team for Startup Courtesy: https://thinkgrowth.org/the-startup-founders-guide-to-analytics-1d2176f20ac1
  • 18. Building a team for an Enterprise Courtesy: https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/breaking-away-the-secrets-to-scaling-analytics
  • 19. Who should lead the DS Team Courtesy: https://www.altexsoft.com/blog/datascience/how-to-structure-data-science-team-key-models-and-roles/
  • 20. Spotify Case Study ‘Center of Excellence’ Model Keys to Excellence Courtesy: https://www.slideshare.net/productschool/the-why-how-of-enterprise-analytics-w-spotify-data-scientist-79046775
  • 21. AirBnB ● Interview process focused on practical data skills ● ‘Data challenges’ - Airbnb data + real question ● Lightning talks ● Support community ● Multi-stage screening: ○ Recruiter screen ○ Take home data challenge ○ Onsite challenge ○ Trained graders ○ Two graders for each test to ensure consistency ○ 1:1s with hiring manager, business partner, CV Courtesy: https://www.slideshare.net/Work-Bench/scaling-data-science-at-airbnb
  • 22. Evolution Of AirBnB’s DS Team (columns: 2012 | 2013 | 2014 | 2015)
      Work Structure: Centralised (work closely within team) | Started working with other team members | Embedded with other teams | Embedded with other teams
      Team size: 7 | 14 | 28 | 55
      Specialisation: All generalists (Data Engineers + Data Scientists + Data Analysts) | Hired first Data Engineer | Separate team for Data Engineering | Data Science Infrastructure team, specialised roles for NLP, CV
      Hiring (in slide order): Take home Data challenge followed by 1:1 interview with whole team and founders; Onsite data challenge; Created rubrics and grading criteria; Started hiring interns; Started focusing on diversity and specialised roles for NLP, CV
      Courtesy: https://vimeo.com/148942395
  • 23. Facebook ● On-boards infra data scientists and engineers through the Bootcamp program ● Provides broad exposure to engineering systems in a supportive learning environment. ● Encourages engineering teams to identify mentors to guide new data scientists as they ramp up in their first projects. ● New data scientists receive mentoring on ways to communicate the results of their complex analyses. ● Data Scientists are presented with the following options: ○ develop deep domain expertise in an area and spend several years embedded with a team ○ move across partner teams every 12 to 18 months in order to develop a broad understanding ● Provides opportunities to learn and master state-of-the-art skills: ○ Internal training sessions and chalk talks ○ Invite external speakers to cover important developments in the field ○ Closely connected to the academic community ○ Attend and present at major conferences such as INFORMS, KDD, and NIPS Courtesy: https://code.fb.com/core-data/building-data-science-teams-to-have-an-impact-at-scale
  • 24. Apple’s Acqui-hiring Strategy ● Apple acqui-hires startups to make its technology smarter and faster ● It buys a whole company to get the team and/or technology ● Hoping to compete with Google’s search service, Apple bought Siri in 2010 ● Pandora, Spotify, and Google Music started to predict songs a user will like. ● Apple saw this, which likely prompted the company to purchase Beats Music (a streaming music service that has a similar algorithm) ● Recently Apple has hired at least 18 people, including at least two co-founders, one of whom is the CEO from an enterprise consulting startup called Silicon Valley Data Science
  • 25. Where should the focus be Don’t focus on the technology Focus on the functionality The functionality is driven by business needs The functionality is supported by algorithms & data The algorithms are instrumental to business Courtesy: Kristian Hammond, Northwestern University
  • 26. What you need to ask when considering AI Data: Do you have the data that support it? Task: Is your task genuinely data driven? Scale: Do you need the scale automation provides?
  • 27. THANK YOU FOR YOUR ATTENTION. DO YOU HAVE ANY QUESTIONS?