3. Me
• TJ Stalcup
• Lead DC Mentor @ Thinkful
• API Evangelist @ WealthEngine
• GitHub: tjstalcup
• Twitter: @tjstalcup
4. You
I already have a career in data
I’m serious about switching into a career in data
I’m curious about switching into a career in data
I just want to see what all the fuss is about
5. Today’s Goals
What is a data scientist and what do they do?
How and why has the field emerged?
How can one become a data scientist?
6. Why do we care?
“The United States alone faces a shortage of
140,000 to 190,000 people with deep analytical
skills as well as 1.5 million managers and
analysts to analyze big data and make
decisions based on their findings.”
- @McKinsey
7. Why do we care?
Also… average salaries are $115,000 a year
11. Example: LinkedIn 2006
“[LinkedIn] was like arriving at a conference
reception and realizing you don’t know
anyone. So you just stand in the corner
sipping your drink—and you probably leave
early.”
-LinkedIn Manager, June 2006
12. Enter: Data Scientist
Jonathan Goldman
Joined LinkedIn in 2006, only 8M
users (450M in 2016)
Started experiments to predict
people’s networks
Engineers were dismissive: “you
can already import your address
book”
14. Other Examples
Uber — Where drivers should hang out
Netflix — $1M movie recommendations
contest
Ebola — Mobile mapping in Senegal to fight
disease
15. Big Data
Big Data: datasets whose size is beyond the
ability of typical database software tools to
capture, store, manage, and analyze
16. Big Data - History
Trend “started” in 2005 (Hadoop!)
Web 2.0 - Majority of content is created by
users
Mobile accelerates this — data/person
skyrockets
25. The Process
Frame the question
Collect the raw data
Process the data
Explore the data
Communicate results
26. Case: Frame the Question
What questions do we want to answer?
27. Case: Frame the Question
What connections (type and number) lead to
higher user engagement?
Which connections do people want to make
but are currently limited from making?
How might we predict these types of
connections with limited data from the user?
28. Case: Collect the Data
What data do we need to answer these
questions?
29. Case: Collect the Data
Connection data (who is connected to whom?)
Demographic data (what is the profile of each
connection?)
Retention data (do people stay or leave?)
Engagement data (how do they use the site?)
30. Case: Process the Data
How is the data “dirty” and how can we clean
it?
31. Case: Process the Data
User input - 80/20
Redundancies - 2 emails
Feature changes
Data model changes
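A minimal sketch of one cleaning step, reading "Redundancies - 2 emails" as the same person appearing under more than one email address (an assumption; the slide does not spell it out). The function name, the record layout, and the sample rows are all hypothetical, and matching people on a normalized name is deliberately naive.

```python
def clean_records(records):
    """Collapse raw (name, email) rows into one entry per person.

    Assumption: two rows belong to the same person when their names match
    after trimming, collapsing repeated spaces, and lowercasing.
    """
    people = {}
    for name, email in records:
        key = " ".join(name.split()).lower()   # tidy free-text user input
        people.setdefault(key, set()).add(email.strip().lower())
    return people


# Hypothetical raw rows: inconsistent casing, stray spaces, two emails.
raw = [
    ("Jane Doe", "Jane@Example.com"),
    ("jane  doe ", "jane@example.com "),
    ("Jane Doe", "jdoe@work.example"),
    ("Sam Lee", "sam@example.com"),
]

cleaned = clean_records(raw)
for person, emails in sorted(cleaned.items()):
    print(person, sorted(emails))
```

Real user data would need a sturdier matching rule than names alone, since different people can share a name.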
32. Case: Explore the Data
What are the meaningful patterns in the
data?
33. Case: Explore the Data
Triangle closing
Time overlaps
Geographic clustering
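"Triangle closing" can be sketched in a few lines: if two people share connections but are not connected themselves, suggest that they connect. This is only an illustration on a made-up network, not LinkedIn's actual method; the function name, the people, and the links are hypothetical.

```python
from collections import Counter


def suggest_connections(connections, user, top_n=3):
    """Rank people `user` is not connected to by number of shared connections."""
    friends = connections.get(user, set())
    mutual_counts = Counter()
    for friend in friends:
        for candidate in connections.get(friend, set()):
            # Skip the user and anyone they are already connected to.
            if candidate != user and candidate not in friends:
                mutual_counts[candidate] += 1
    # Most shared connections first; ties broken alphabetically.
    ranked = sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Hypothetical toy network; every link is listed in both directions.
network = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "cara", "dev"},
    "cara": {"ana", "ben", "dev", "eli"},
    "dev": {"ben", "cara"},
    "eli": {"cara"},
}

suggestions = suggest_connections(network, "ana")
print(suggestions)
```

Here "dev" ranks first for "ana" because connecting them would close two triangles (through "ben" and through "cara"), while "eli" would close only one.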
35. Case: Communicate Findings
Tell story at the right technical level for each
audience
Make sure to focus on What’s In It For You
(WIIFY!)
Be objective; don’t lie with statistics
Be visual! Show, don’t just tell
41. #3: Machine Learning Algorithms
Machine learning algorithms provide computers
with the ability to learn without being explicitly
programmed — “programming by example”
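To make "programming by example" concrete, here is a rough sketch of a nearest-neighbour classifier: no rules are written by hand, and a new case simply gets the label of the most similar labelled example. The features (number of connections, visits per week), the labels, and the numbers are hypothetical toy data, not real measurements.

```python
import math


def predict(examples, point):
    """Return the label of the labelled example closest to `point`."""
    _, nearest_label = min(
        examples, key=lambda example: math.dist(example[0], point)
    )
    return nearest_label


# Hypothetical examples: ((connections, visits per week), outcome)
examples = [
    ((2, 0), "leaves"),
    ((3, 1), "leaves"),
    ((40, 5), "stays"),
    ((55, 7), "stays"),
]

print(predict(examples, (45, 6)))
print(predict(examples, (4, 1)))
```

Changing the behaviour means changing the examples, not the code. Note that `math.dist` requires Python 3.8 or later.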
46. That someone might be you
Knowledge of statistics, algorithms, &
software
Comfort with languages & tools (Python,
SQL, Tableau)
Inquisitiveness and intellectual curiosity
Strong communication skills
It’s all teachable!
47. Data Science Bootcamp
Syllabus: Python Toolkit, Statistics & Probability,
Experimentation, Machine Learning, Communicating
Data, Algorithms and Big Data
48. or Web Development Bootcamp
Syllabus: Beginner and Intermediate Frontend
Development, Backend Development, CS
Fundamentals, Product Engineering
49. What is Thinkful?
Online skills bootcamp with 1-on-1 mentorship —
learn anytime & anywhere & get a job,
guaranteed.
Anyone who’s committed can learn to code.
52. Special Prep Course Offer
• Three-week program, includes six mentor sessions
• Covers Python programming, Data Science Toolkit, Stats
Refresher
• Option to continue into data science bootcamp
• Prep course costs $500 (can apply to cost of full
bootcamp)
• Talk to us about a special 50% discount (available until
the end of the week).