http://bit.ly/ds-dc
network: In3Guest
March 2017
Intro to Data Science
Me
• TJ Stalcup
• Lead DC Mentor @ Thinkful
• API Evangelist @ WealthEngine
• Github: tjstalcup
• Twitter: @tjstalcup
You
I already have a career in data
I’m serious about switching into a career in data
I’m curious about switching into a career in data
I just want to see what all the fuss is about
Today’s Goals
What is a data scientist and what do they do?
How and why has the field emerged?
How can one become a data scientist?
Why do we care?
“The United States alone faces a shortage of
140,000 to 190,000 people with deep analytical
skills as well as 1.5 million managers and
analysts to analyze big data and make
decisions based on their findings.”
- @McKinsey
Why do we care?
Also… average salaries are $115,000 a year
Nate Silver
FiveThirtyEight.com
“I think data-scientist is a sexed up term for a statistician”
Example: LinkedIn 2006
“[LinkedIn] was like arriving at a conference
reception and realizing you don’t know
anyone. So you just stand in the corner
sipping your drink—and you probably leave
early.”
-LinkedIn Manager, June 2006
Enter: Data Scientist
Joined LinkedIn in 2006, only 8M
users (450M in 2016)
Started experiments to predict
people’s networks
Engineers were dismissive: “you
can already import your address
book”
Jonathan Goldman
The Result
Other Examples
Uber — Where drivers should hang out
Netflix — $1M movie recommendations
contest
Ebola — Mobile mapping in Senegal to fight
disease
Big Data
Big Data: datasets whose size is beyond the
ability of typical database software tools to
capture, store, manage, and analyze
Big Data - History
Trend “started” in 2005 (Hadoop!)
Web 2.0 - Majority of content is created by
users
Mobile accelerates this — data/person
skyrockets
Hadoop?
HDFS
MapReduce
Hadoop Distributed File System
File is too big….Distribute!
Too many files….Distribute!
Yahoo has over 10,000 servers running
Hadoop
MapReduce
Data + Processing Software
Distributed Processing
Map: process each chunk of the data where it is stored
Reduce: combine the partial results into one answer
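The map-then-reduce idea can be sketched on a single machine with a word count, the usual teaching example. This is only an illustration in plain Python, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`) and the sample text are assumptions made up for this sketch.

```python
from collections import defaultdict
from functools import reduce


def map_phase(chunk):
    """Map: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single count."""
    return {word: reduce(lambda a, b: a + b, ones) for word, ones in grouped.items()}


def word_count(chunks):
    # In a real cluster each chunk would be mapped on a different server.
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(pairs))


# Hypothetical input: two "files" standing in for distributed chunks.
chunks = ["big data big", "data science"]
counts = word_count(chunks)
print(counts)
```

Because each chunk is mapped independently, the map step can run on many servers at once, and only the small per-word counts need to be brought together in the reduce step.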
MapReduce
Big Data
90% of the data in the world today has been
created in the last two years alone
- IBM, May 2013
Big Data
Data Scientists - We Can Be Heroes
Data Scientists - Jack of all Trades
The Process - LinkedIn Example
Frame the question
Collect the raw data
Process the data
Explore the data
Communicate results
Case: Frame the Question
What questions do we want to answer?
Case: Frame the Question
What connections (type and number) lead to
higher user engagement?
Which connections do people want to make
but are currently limited from making?
How might we predict these types of
connections with limited data from the user?
Case: Collect the Data
What data do we need to answer these
questions?
Case: Collect the Data
Connection data (who is who connected to?)
Demographic data (what is the profile of the
connection)
Retention data (how do people stay or leave)
Engagement data (how do they use the site)
Case: Process the Data
How is the data “dirty” and how can we clean
it?
Case: Process the Data
User input - 80/20
Redundancies - 2 emails
Feature changes
Data model changes
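As one possible reading of the "2 emails" redundancy, the same person may appear under more than one email address. A minimal sketch of flagging such records is shown here. The record layout, the names, and the rule "same normalized name means possible duplicate" are all assumptions for illustration, and a real cleanup would need stronger matching.

```python
from collections import defaultdict


def normalize_name(name):
    """Lowercase and collapse whitespace so 'Ana  Diaz ' matches 'ana diaz'."""
    return " ".join(name.lower().split())


def find_possible_duplicates(records):
    """Return names that appear with more than one distinct email address."""
    emails_by_name = defaultdict(set)
    for record in records:
        email = record["email"].strip().lower()
        emails_by_name[normalize_name(record["name"])].add(email)
    return {
        name: sorted(emails)
        for name, emails in emails_by_name.items()
        if len(emails) > 1
    }


# Hypothetical user records, invented for this sketch.
records = [
    {"name": "Ana Diaz", "email": "ana@example.com"},
    {"name": "ana  diaz ", "email": "Ana.Diaz@work.example"},
    {"name": "Ben Okoro", "email": "ben@example.com"},
]
duplicates = find_possible_duplicates(records)
print(duplicates)
```

The output is a list of candidates to review, not an automatic merge: two different people can share a name.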
Case: Explore the Data
What are the meaningful patterns in the
data?
Case: Explore the Data
Triangle closing
Time overlaps
Geographic clustering
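Triangle closing means that if A knows B and B knows C, then A and C are likely to know each other. A minimal sketch of that idea ranks people a user is not yet connected to by how many connections they share. The graph, the names, and the function `suggest_connections` are hypothetical, and this is not a description of LinkedIn's actual system.

```python
from collections import Counter


def suggest_connections(graph, user):
    """Rank non-connections of `user` by number of mutual connections."""
    direct = graph.get(user, set())
    mutual = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the user and anyone they are already connected to.
            if candidate != user and candidate not in direct:
                mutual[candidate] += 1
    # Most mutual connections first; ties broken alphabetically.
    return sorted(mutual.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical connection graph: each person maps to their connections.
graph = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "dev", "eli"},
    "cara": {"ana", "dev"},
    "dev": {"ben", "cara"},
    "eli": {"ben"},
}
suggestions = suggest_connections(graph, "ana")
print(suggestions)
```

Here "dev" ranks first because two of ana's connections already know dev, which closes two triangles at once.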
Case: Communicate Findings
How do we communicate this? To whom?
Case: Communicate Findings
Tell story at the right technical level for each
audience
Make sure to focus on What's In It For You
(WIIFY!)
Be objective, don’t lie with statistics
Be visual! Show, don’t just tell
Tools
SQL Queries
Business Analytics Software
Machine Learning Algorithms
#1 - SQL Queries
SQL is the standard querying language
to access and manipulate databases
#1 - SQL Queries
Table: friends
id   full_name       age
1    Dan Friedman    24
2    Tyler Brewer    27
3    David Coulter   22
4    TJ Stalcup      33
SELECT full_name FROM friends WHERE age>22
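To try the query without installing a database server, one option is Python's built-in `sqlite3` module with an in-memory database. The sketch below recreates the `friends` table from the slide. The column types and the added `ORDER BY id` (to make the row order predictable) are assumptions, not part of the original query.

```python
import sqlite3

# An in-memory database disappears when the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE friends (id INTEGER PRIMARY KEY, full_name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO friends (id, full_name, age) VALUES (?, ?, ?)",
    [
        (1, "Dan Friedman", 24),
        (2, "Tyler Brewer", 27),
        (3, "David Coulter", 22),
        (4, "TJ Stalcup", 33),
    ],
)

# Same filter as the slide: only friends strictly older than 22.
rows = conn.execute(
    "SELECT full_name FROM friends WHERE age > 22 ORDER BY id"
).fetchall()
names = [row[0] for row in rows]
conn.close()

print(names)
```

David Coulter is left out because his age is exactly 22 and the condition is `age > 22`, not `age >= 22`.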
#2: Visualization Software
Business analytics software for your database
enabling you to easily find and communicate
insights visually
#2: Visualization Software
#3: Machine Learning Algorithms
Machine learning algorithms provide computers
with the ability to learn without being explicitly
programmed — “programming by example”
Iris Data Set
Iris Data Set
Iris Data Set
?
Use Cases for Machine Learning
Classification — Predict categories
Regression — Predict values
Anomaly Detection — Find unusual occurrences
Clustering — Discover structure
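Classification, the first use case above, can be sketched with a nearest-neighbor rule: label a new flower with the species of the most similar known flower. The few training rows below are in the style of the Iris data set (sepal length, sepal width, petal length, petal width in cm). The choice of a 1-nearest-neighbor rule and the sample measurements being classified are assumptions for illustration, not the method from the talk.

```python
import math

# A handful of labeled examples: (measurements in cm, species).
TRAINING = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
]


def classify(sample, training=TRAINING):
    """Return the species of the training flower closest to `sample`."""
    nearest = min(training, key=lambda row: math.dist(sample, row[0]))
    return nearest[1]


# Hypothetical new flowers whose species we want to predict.
small_petals = classify((5.0, 3.4, 1.5, 0.2))
large_petals = classify((6.5, 3.0, 5.5, 2.0))
print(small_petals, large_petals)
```

No rule such as "short petals mean setosa" was written by hand. The program only compares a new flower with labeled examples, which is the "programming by example" idea. `math.dist` requires Python 3.8 or later.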
It’s not easy but someone has to do it
That someone might be you
Knowledge of statistics, algorithms, &
software
Comfort with languages & tools (Python,
SQL, Tableau)
Inquisitiveness and intellectual curiosity
Strong communication skills
It’s all Teachable!
Ways to keep learning
[Chart axes: level of support, learning methods]
1-on-1 mentorship enables flexibility
325+ mentors with an average of 10
years of experience in the field
Support ‘round the clock
Our results
Job Titles after Graduation
Months until Employed
Try us out!
• Initial 3-week prep
course includes six
mentor sessions for
$250
• Learn Python, Python’s
data science toolkit,
stats
• Option to continue
onto Data Science
bootcamp
• Talk to me (or email
tj@thinkful.com) if
you’re interested


Intro to Data Science

  • 2. March 2017 Intro to Data Science
  • 3. Me: TJ Stalcup; Lead DC Mentor @ Thinkful; API Evangelist @ WealthEngine; Github: tjstalcup; Twitter: @tjstalcup
  • 4. You: I already have a career in data; I’m serious about switching into a career in data; I’m curious about switching into a career in data; I just want to see what all the fuss is about
  • 5. Today’s Goals: What is a data scientist and what do they do? How and why has the field emerged? How can one become a data scientist?
  • 6. Why do we care? “The United States alone faces a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts to analyze big data and make decisions based on their findings.” - @McKinsey
  • 7. Why do we care? Also… average salaries are $115,000 a year
  • 9. Nate Silver, FiveThirtyEight.com: “I think data-scientist is a sexed up term for a statistician”
  • 11. Example: LinkedIn 2006. “[LinkedIn] was like arriving at a conference reception and realizing you don’t know anyone. So you just stand in the corner sipping your drink—and you probably leave early.” - LinkedIn Manager, June 2006
  • 12. Enter: Data Scientist. Jonathan Goldman: joined LinkedIn in 2006, only 8M users (450M in 2016); started experiments to predict people’s networks; engineers were dismissive: “you can already import your address book”
  • 14. Other Examples: Uber (where drivers should hang out); Netflix ($1M movie recommendations contest); Ebola (mobile mapping in Senegal to fight disease)
  • 15. Big Data: datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze
  • 16. Big Data - History: Trend “started” in 2005 (Hadoop!); Web 2.0: majority of content is created by users; Mobile accelerates this: data/person skyrockets
  • 18. Hadoop Distributed File System: File is too big… Distribute! Too many files… Distribute! Yahoo has over 10,000 servers running Hadoop
  • 19. MapReduce: Data + Processing Software; Distributed Processing; Map all of the data, reduce it
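The "map all of the data, reduce it" idea on the MapReduce slide can be sketched on a single machine. This is only an illustration in plain Python, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample `chunks` are made up for this example, and word counting is an assumed task chosen because it is the usual first demonstration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the emitted values by key, as a framework would between the two steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the combined output."""
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    return reduce_phase(shuffle(mapped))


# Hypothetical input: each string stands in for a block stored on a different machine.
chunks = ["big data is big", "data is distributed"]
counts = word_count(chunks)
print(counts)
```

In a real cluster the map calls would run in parallel on the machines that hold each block, and only the grouped pairs would travel over the network to the reducers.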
  • 21. Big Data: 90% of the data in the world today has been created in the last two years alone - IBM, May 2013
  • 23. Data Scientists - We Can Be Heroes
  • 24. Data Scientists - Jack of all Trades
  • 25. The Process - LinkedIn Example: (1) Frame the question; (2) Collect the raw data; (3) Process the data; (4) Explore the data; (5) Communicate results
  • 26. Case: Frame the Question. What questions do we want to answer?
  • 27. Case: Frame the Question. What connections (type and number) lead to higher user engagement? Which connections do people want to make but are currently limited from making? How might we predict these types of connections with limited data from the user?
  • 28. Case: Collect the Data. What data do we need to answer these questions?
  • 29. Case: Collect the Data. Connection data (who is connected to whom?); Demographic data (what is the profile of the connection?); Retention data (how do people stay or leave?); Engagement data (how do they use the site?)
  • 30. Case: Process the Data. How is the data “dirty” and how can we clean it?
  • 31. Case: Process the Data. User input - 80/20; Redundancies - 2 emails; Feature changes; Data model changes
  • 32. Case: Explore the Data. What are the meaningful patterns in the data?
  • 33. Case: Explore the Data. Triangle closing; Time overlaps; Geographic clustering
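"Triangle closing" on the Explore slide refers to the pattern where two people who share a connection are likely to know each other. A minimal sketch of how that pattern could drive suggestions is below. It is an assumed illustration, not LinkedIn's actual method: the `connections` data and the `suggest_connections` helper are hypothetical.

```python
from collections import Counter

# Hypothetical connection data: each member maps to the set of people they are connected to.
connections = {
    "ana": {"ben", "cal"},
    "ben": {"ana", "cal", "dee"},
    "cal": {"ana", "ben", "dee"},
    "dee": {"ben", "cal"},
    "eli": set(),
}


def suggest_connections(graph, member):
    """Rank people `member` is not yet connected to by how many connections they share."""
    mutual = Counter()
    for friend in graph[member]:
        for candidate in graph[friend]:
            # Skip the member themselves and anyone they are already connected to.
            if candidate != member and candidate not in graph[member]:
                mutual[candidate] += 1
    return mutual.most_common()


suggestions = suggest_connections(connections, "ana")
print(suggestions)
```

Each suggestion would close a triangle: "ana" and "dee" are both connected to "ben" and "cal", so linking them completes two triangles at once.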
  • 34. Case: Communicate Findings. How do we communicate this? To whom?
  • 35. Case: Communicate Findings. Tell the story at the right technical level for each audience; Make sure to focus on What’s In It For You (WIIFY!); Be objective, don’t lie with statistics; Be visual! Show, don’t just tell
  • 36. Tools: SQL Queries; Business Analytics Software; Machine Learning Algorithms
  • 37. #1 - SQL Queries: SQL is the standard querying language to access and manipulate databases
  • 38. #1 - SQL Queries
      Table: friends
      id | full_name     | age
      1  | Dan Friedman  | 24
      2  | Tyler Brewer  | 27
      3  | David Coulter | 22
      4  | TJ Stalcup    | 33
      Query: SELECT full_name FROM friends WHERE age > 22
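The query on the SQL slide can be tried without installing a database server. The sketch below loads the slide's `friends` table into an in-memory SQLite database using Python's standard `sqlite3` module. The column types and the `ORDER BY id` clause are assumptions added here, the latter only to make the row order predictable.

```python
import sqlite3

# An in-memory database disappears when the connection closes; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE friends (id INTEGER PRIMARY KEY, full_name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO friends (id, full_name, age) VALUES (?, ?, ?)",
    [
        (1, "Dan Friedman", 24),
        (2, "Tyler Brewer", 27),
        (3, "David Coulter", 22),
        (4, "TJ Stalcup", 33),
    ],
)

# The slide's query, with ORDER BY added so the result order is deterministic.
rows = conn.execute(
    "SELECT full_name FROM friends WHERE age > 22 ORDER BY id"
).fetchall()
names = [row[0] for row in rows]
print(names)
conn.close()
```

David Coulter is left out because `age > 22` is a strict comparison; `age >= 22` would include him.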
  • 39. #2: Visualization Software. Business analytics software for your database enabling you to easily find and communicate insights visually
  • 41. #3: Machine Learning Algorithms. Machine learning algorithms provide computers with the ability to learn without being explicitly programmed: “programming by example”
  • 45. Use Cases for Machine Learning: Classification (predict categories); Regression (predict values); Anomaly Detection (find unusual occurrences); Clustering (discover structure)
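To make "programming by example" concrete, here is a small sketch of the classification use case. It uses a 1-nearest-neighbor rule, chosen for brevity rather than taken from the slides, and the labeled `examples` (sessions per week, number of connections) are invented for illustration. No rule about what counts as "high" engagement is written anywhere; the label comes from the closest example.

```python
import math

# Hypothetical labeled examples: (sessions per week, connections) -> engagement label.
examples = [
    ((1, 5), "low"),
    ((2, 8), "low"),
    ((9, 120), "high"),
    ((12, 200), "high"),
]


def classify(point, labeled_examples):
    """Return the label of the closest labeled example (1-nearest-neighbor)."""
    nearest = min(labeled_examples, key=lambda ex: math.dist(point, ex[0]))
    return nearest[1]


prediction = classify((10, 150), examples)
print(prediction)
```

Adding or changing examples changes the predictions without touching `classify`, which is the sense in which the behavior is learned rather than programmed. On real data the two features would normally be rescaled first so that the larger-valued one does not dominate the distance.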
  • 46. It’s not easy but someone has to do it
  • 47. That someone might be you: Knowledge of statistics, algorithms, & software; Comfort with languages & tools (Python, SQL, Tableau); Inquisitiveness and intellectual curiosity; Strong communication skills. It’s all Teachable!
  • 48. Ways to keep learning (chart: Level of support vs. Learning methods)
  • 49. 1-on-1 mentorship enables flexibility: 325+ mentors with an average of 10 years of experience in the field
  • 51. Our results (charts: Job Titles after Graduation; Months until Employed)
  • 52. Try us out! Initial 3-week prep course includes six mentor sessions for $250; Learn Python, Python’s data science toolkit, stats; Option to continue on to the Data Science bootcamp; Talk to me (or email tj@thinkful.com) if you’re interested