Data Science:
How did we get here and where are we going?
June 2017
http://bit.ly/data-la
WIFI: CrossCamp.us Events
About us
We train developers and data
scientists through 1-on-1
mentorship and career prep
About us
• Noel Duarte
• Los Angeles Area
General Manager
• UC Berkeley ’15 — worked
primarily with R for
population genetics
analysis, at Thinkful since
January 2016
• Kyle Polich
• Data science mentor at
Thinkful
• Host for Data Skeptic, a
podcast devoted to all
things data science and
advancements in the
industry
About you
Why are you here?
• I already have a career in data
• I’m curious about switching to a career in data
• I want to learn what data science is and why it’s
important
Today’s goals
• Why is data science important?
• What is a data scientist and what do they do?
• How and why has the field emerged?
• How can one become a data scientist? (And why
would you want to?)
Why is data science important?
By 2018, the United States alone could face a shortage
of 140,000 to 190,000 people with deep analytical skills
as well as 1.5 million managers and analysts with the
know-how to use the analysis of big data to make
effective decisions.
- McKinsey Global Institute (MGI)
Data Scientist:
Case study: LinkedIn (2006)
“[LinkedIn] was like arriving at a conference reception
and realizing you don’t know anyone. So you just stand
in the corner sipping your drink—and you probably
leave early.”
- LinkedIn Manager, June 2006
The new guy
• Joined LinkedIn in 2006,
when it had only 8M users
(450M in 2016)
• Started experiments to
predict people’s networks
• Engineers were dismissive:
“you can already import
your address book”
The result
Data, data everywhere 🚀
• Uber — Where drivers should hang out
• Netflix — movie recommendations
• Ebola epidemic — Mobile mapping in Senegal to
fight disease
Data, data everywhere 🚀
Big Data — what exactly does it mean?
Big Data: datasets whose size is beyond the ability of
typical database software tools to capture, store,
manage, and analyze
Big Data — brief history
• Trend “started” in 2005 (Hadoop!)
• Web 2.0 - Majority of content is created by users
• Mobile accelerates this — data/person skyrockets
Big Data — 3 Vs
Big Data — tldr;
90% of the data in the world today has been created
in the last two years alone.
- IBM, May 2013
In come data scientists!
Intersection of engineering, statistics, & communication
The data science process
Let’s come back to LinkedIn’s evolution in 2006 and
examine it using a typical* data science approach.
• Frame the question
• Collect the raw data
• Process the data
• Explore the data
• Communicate results
Case: Frame the question
What questions do we want to answer?
Case: Frame the question
• What connections (type and number) lead to higher
user engagement?
• Which connections do people want to make but are
currently limited from making?
• How might we predict these types of connections
with limited data from the user?
Case: Collect the data
What data do we need to answer these questions?
Case: Collect the data
• Connection data (who is connected to whom?)
• Demographic data (what is the profile of each connection?)
• Retention data (how do people stay or leave?)
• Engagement data (how do they use the site?)
Case: Process the data
How is the data “dirty” and how can we clean it?
Case: Process the data
• User input
• Redundancies
• Feature changes
• Data model changes
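As a rough illustration of the first two items (user input and redundancies), the sketch below normalizes free-text entries and drops repeated rows. The rows, the field layout, and the helper names normalize_employer and clean_rows are hypothetical and are not taken from LinkedIn's data; feature and data model changes usually need case-by-case handling and are not covered here.

```python
# Hypothetical raw profile rows as (member_id, employer typed by the user).
raw_rows = [
    (1, "  Acme Corp "),
    (2, "acme corp"),
    (2, "acme corp"),   # redundant copy of the previous row
    (3, "ACME  Corp"),
    (4, ""),            # user left the field blank
]


def normalize_employer(text):
    """Collapse whitespace and case so spelling variants compare equal."""
    return " ".join(text.split()).lower()


def clean_rows(rows):
    """Normalize free text, drop blank entries, and remove exact repeats."""
    seen = set()
    cleaned = []
    for member_id, employer in rows:
        employer = normalize_employer(employer)
        if not employer:
            continue
        key = (member_id, employer)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(key)
    return cleaned


cleaned_rows = clean_rows(raw_rows)
print(cleaned_rows)
```

Real cleaning rules depend on the data set; lowercasing and whitespace collapsing are only the simplest starting point.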
Case: Explore the data
What are the meaningful patterns in the data?
Case: Explore the data
• Triangle closing
• Time overlaps
• Geographic clustering
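Triangle closing is the idea that if A knows B and B knows C, then A and C are likely to know each other. A minimal sketch of that idea, assuming a small made-up connection graph and a hypothetical helper named suggest_connections, might rank possible new connections by how many connections two members already share. This illustrates the pattern only and is not LinkedIn's actual method.

```python
from collections import Counter

# Hypothetical connection graph: each member maps to the set of people
# they are already connected to (connections are mutual).
connections = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "cara", "dev"},
    "cara": {"ana", "ben", "dev"},
    "dev": {"ben", "cara", "eli"},
    "eli": {"dev"},
}


def suggest_connections(graph, member):
    """Rank non-connections of `member` by number of shared connections."""
    mutual_counts = Counter()
    for friend in graph[member]:
        for friend_of_friend in graph[friend]:
            if friend_of_friend != member and friend_of_friend not in graph[member]:
                mutual_counts[friend_of_friend] += 1
    # Most shared connections first; ties broken alphabetically.
    return sorted(mutual_counts.items(), key=lambda item: (-item[1], item[0]))


suggestions = suggest_connections(connections, "ana")
print(suggestions)
```

Here "dev" is suggested to "ana" because two of her connections already know him, which would close two open triangles at once.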
Case: Communicate results
How do we communicate this? To whom?
Case: Communicate results
• Tell story at the right technical level for each audience
• Make sure to focus on What’s In It For You (WIIFY!)
• Be objective, don’t lie with statistics
• Be visual! Show, don’t just tell
Tools to explore “big data”
• SQL Queries
• Business Analytics Software
• Machine Learning Algorithms
Tool #1: SQL queries
SQL is the standard query language for accessing and
manipulating relational databases
SQL example
friends
id  full_name     age
1   Dan Friedman  24
2   Jared Jones   27
3   Paul Gu       22
4   Noel Duarte   73
SELECT full_name FROM friends WHERE age=73
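One way to try this query without installing a database server is Python's built-in sqlite3 module. The sketch below makes some assumptions about the setup (an in-memory database, with column types guessed from the sample rows); only the table contents and the SELECT statement come from the slide.

```python
import sqlite3

# Build the slide's "friends" table in a throwaway in-memory database.
# Column types are assumed from the sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE friends (id INTEGER PRIMARY KEY, full_name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO friends (id, full_name, age) VALUES (?, ?, ?)",
    [
        (1, "Dan Friedman", 24),
        (2, "Jared Jones", 27),
        (3, "Paul Gu", 22),
        (4, "Noel Duarte", 73),
    ],
)

# The query from the slide: names of friends whose age is 73.
rows = conn.execute("SELECT full_name FROM friends WHERE age = 73").fetchall()
names = [row[0] for row in rows]
print(names)  # ['Noel Duarte']
conn.close()
```

Only one row has age 73, so the query returns a single name.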
Tool #2: Analytics software
Business analytics software connects to your database,
enabling you to find and communicate insights visually
Tableau example
Tool #3: Machine Learning Algorithms
Machine learning algorithms provide computers
with the ability to learn without being explicitly
programmed — “programming by example”
Iris data set example
Iris data set example
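To make "programming by example" concrete, here is a minimal sketch of a 1-nearest-neighbor classifier in plain Python. The six labeled rows are illustrative measurements in the style of the Iris data set (sepal and petal sizes in centimeters), and the helper name predict_species is hypothetical. No rule for telling species apart is written by hand; the prediction comes entirely from the labeled examples. It assumes Python 3.8 or later for math.dist.

```python
import math

# A few labeled flowers: (sepal length, sepal width, petal length, petal width)
# in centimeters. Illustrative values in the style of the Iris data set.
labeled_examples = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
]


def predict_species(measurements, examples):
    """Return the label of the closest labeled example (1-nearest neighbor)."""
    closest = min(examples, key=lambda ex: math.dist(measurements, ex[0]))
    return closest[1]


prediction = predict_species((5.0, 3.4, 1.5, 0.2), labeled_examples)
print(prediction)  # setosa
```

With real data you would use many more examples and hold some back to check how often the predictions are right.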
Use cases for machine learning
• Classification — Predict categories
• Regression — Predict values
• Anomaly Detection — Find unusual occurrences
• Clustering — Discover structure
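As one small example from this list, anomaly detection can be sketched with nothing more than a mean and a standard deviation. The login counts, the helper name find_anomalies, and the threshold of 2 standard deviations are all assumptions chosen for illustration; production systems typically use more robust methods.

```python
import statistics

# Hypothetical daily login counts for one account.
daily_logins = [12, 11, 13, 12, 14, 11, 95, 12]


def find_anomalies(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.stdev(values)
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / spread > threshold]


anomalies = find_anomalies(daily_logins)
print(anomalies)  # [95]
```

The day with 95 logins is flagged because it sits far from the typical range of 11 to 14.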
I’m in! Where do I start?
• Knowledge of statistics, algorithms, & software
• Comfort with languages & tools (Python, SQL,
Tableau)
• Inquisitiveness and intellectual curiosity
• Strong communication skills
Ways to keep learning
[Chart: learning options arranged by structure (less to more) and support (less to more)]
1-on-1 mentorship enables flexibility
325+ mentors with an average of 10
years of experience in the field
Support ‘round the clock
• You
• Your mentor
• Q&A sessions
• In-person workshops
• Career coach
• Slack
• Program Manager
Want to try us/data science out?
Talk to us now or be on the lookout for our email 📬
Thinkful’s Data Science
Prep Course covers:
- Python fundamentals
- Statistics
- Data science concepts
- Capstone project
$250 for 3 weeks

Getting Started in Data Science

  • 1. Data Science: How did we get here and where are we going? June 2017 http://bit.ly/data-la WIFI: CrossCamp.us Events
  • 2. About us We train developers and data scientists through 1-on-1 mentorship and career prep
  • 3. About us • Noel Duarte • Los Angeles Area General Manager • UC Berkeley ’15 — worked primarily with R for population genetics analysis, at Thinkful since January 2016 • Kyle Polich • Data science mentor at Thinkful • Host for Data Skeptic, a podcast devoted to all things data science and advancements in the industry
  • 4. About you Why are you here? • I already have a career in data • I’m curious about switching to a career in data • I want to learn what data science is and why it’s important
  • 5. Today’s goals • Why is data science important? • What is a data scientist and what do they do? • How and why has the field emerged? • How can one become a data scientist? (And why would you want to?)
  • 6. Why is data science important? By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions. - McKinsey Global Institute (MGI)
  • 8. Case study: LinkedIn (2006) “[LinkedIn] was like arriving at a conference reception and realizing you don’t know anyone. So you just stand in the corner sipping your drink—and you probably leave early.” -LinkedIn Manager, June 2006
  • 9. The new guy • Joined LinkedIn in 2006, only 8M users (450M in 2016) • Started experiments to predict people’s networks • Engineers were dismissive: “you can already import your address book”
  • 11. Data, data everywhere 🚀 • Uber — Where drivers should hang out • Netflix — movie recommendations • Ebola epidemic — Mobile mapping in Senegal to fight disease
  • 13. Big Data — what exactly does it mean? Big Data: datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze
  • 14. Big Data — brief history • Trend “started” in 2005 (Hadoop!) • Web 2.0 - Majority of content is created by users • Mobile accelerates this — data/person skyrockets
  • 15. Big Data — 3 Vs
  • 16. Big Data — tldr; 90% of the data in the world today has been created in the last two years alone. - IBM, May 2013
  • 17. In come data scientists!
  • 18. Intersection of engineering, statistics, & communication
  • 19. The data science process Let’s come back to LinkedIn’s evolution in 2006 and examine it using a typical* data science approach. • Frame the question • Collect the raw data • Process the data • Explore the data • Communicate results
  • 20. Case: Frame the question What questions do we want to answer?
  • 21. Case: Frame the question • What connections (type and number) lead to higher user engagement? • Which connections do people want to make but are currently limited from making? • How might we predict these types of connections with limited data from the user?
  • 22. Case: Collect the data What data do we need to answer these questions?
  • 23. Case: Collect the data • Connection data (who is connected to whom?) • Demographic data (what is the profile of each connection?) • Retention data (how long do people stay, and why do they leave?) • Engagement data (how do they use the site?)
  • 24. Case: Process the data How is the data “dirty” and how can we clean it?
  • 25. Case: Process the data • User input • Redundancies • Feature changes • Data model changes
  • 26. Case: Explore the data What are the meaningful patterns in the data?
  • 27. Case: Explore the data • Triangle closing • Time overlaps • Geographic clustering
  • 28. Case: Communicate results How do we communicate this? To whom?
  • 29. Case: Communicate results • Tell the story at the right technical level for each audience • Make sure to focus on What’s In It For You (WIIFY!) • Be objective; don’t lie with statistics • Be visual! Show, don’t just tell
  • 30. Tools to explore “big data” • SQL Queries • Business Analytics Software • Machine Learning Algorithms
  • 31. Tool #1: SQL queries SQL is the standard querying language to access and manipulate databases
  • 32. SQL example Table friends (columns: id, full_name, age): (1, Dan Friedman, 24); (2, Jared Jones, 27); (3, Paul Gu, 22); (4, Noel Duarte, 73). Query: SELECT full_name FROM friends WHERE age = 73
  • 33. Tool #2: Analytics software Business analytics software for your database enabling you to easily find and communicate insights visually
  • 35. Tool #3: Machine Learning Algorithms Machine learning algorithms provide computers with the ability to learn without being explicitly programmed — “programming by example”
  • 36. Iris data set example
  • 37. Iris data set example
  • 38. Use cases for machine learning • Classification — Predict categories • Regression — Predict values • Anomaly Detection — Find unusual occurrences • Clustering — Discover structure
  • 39. I’m in! Where do I start? • Knowledge of statistics, algorithms, & software • Comfort with languages & tools (Python, SQL, Tableau) • Inquisitiveness and intellectual curiosity • Strong communication skills
  • 40. Ways to keep learning [Chart with two axes: less structure to more structure, and less support to more support]
  • 41. 1-on-1 mentorship enables flexibility 325+ mentors with an average of 10 years of experience in the field
  • 42. Support ‘round the clock: you, your mentor, Q&A sessions, in-person workshops, career coach, Slack, program manager
  • 43. Want to try us/data science out? Talk to us now or be on the lookout for our email 📬 Thinkful’s Data Science Prep Course covers: • Python fundamentals • Statistics • Data science concepts • Capstone project. $250 for 3 weeks
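
The SQL example on slide 32 can be tried without installing a database server. The sketch below is one possible way to do it, assuming Python's built-in sqlite3 module and an in-memory database; the deck does not say which database engine was used, and the helper name names_with_age is hypothetical. The table contents and the query come from the slide.

```python
import sqlite3

# Rows copied from the "friends" table on slide 32.
FRIENDS = [
    (1, "Dan Friedman", 24),
    (2, "Jared Jones", 27),
    (3, "Paul Gu", 22),
    (4, "Noel Duarte", 73),
]


def names_with_age(age):
    """Return the full_name of every friend whose age equals `age`.

    Hypothetical helper for illustration; it rebuilds the table in a
    throwaway in-memory SQLite database on each call.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE friends ("
            "id INTEGER PRIMARY KEY, full_name TEXT, age INTEGER)"
        )
        conn.executemany("INSERT INTO friends VALUES (?, ?, ?)", FRIENDS)
        # Same query as the slide, with the age passed as a parameter.
        rows = conn.execute(
            "SELECT full_name FROM friends WHERE age = ?", (age,)
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(names_with_age(73))  # prints ['Noel Duarte']
```

Passing the age as a `?` parameter instead of pasting it into the query string is a common habit for keeping user input out of the SQL text; the slide's literal `WHERE age=73` returns the same row.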