Career in Data Science (July 2017, DTLA)
1. Data Science:
How did we get here and where are we going?
July 2017
http://bit.ly/data-la
WIFI: CrossCamp.us Events
2. About us
We train developers and data
scientists through 1-on-1
mentorship and career prep
3. About us
• Noel Duarte
• New Markets Manager,
Thinkful
• UC Berkeley ’15 — worked
primarily with R for
population genetics
analysis, at Thinkful since
January 2016
• Kyle Polich
• Data science mentor at
Thinkful
• Host for Data Skeptic, a
podcast devoted to all
things data science and
advancements in the
industry
4. About you
Why are you here?
• I already have a career in data
• I’m curious about switching to a career in data
• I want to learn what data science is and why it’s
important
5. Today’s goals
• Why is data science important?
• What is a data scientist and what do they do?
• How and why has the field emerged?
• How can one become a data scientist? (And why
would you want to?)
6. Why is data science important?
By 2018, the United States alone could face a shortage
of 140,000 to 190,000 people with deep analytical skills
as well as 1.5 million managers and analysts with the
know-how to use the analysis of big data to make
effective decisions.
- McKinsey Global Institute (MGI)
8. Case study: LinkedIn (2006)
“[LinkedIn] was like arriving at a conference reception
and realizing you don’t know anyone. So you just stand
in the corner sipping your drink—and you probably
leave early.”
-LinkedIn Manager, June 2006
9. The new guy
• Joined LinkedIn in 2006,
when it had only 8M users
(450M in 2016)
• Started experiments to
predict people’s networks
• Engineers were dismissive:
“you can already import
your address book”
11. Data, data everywhere 🚀
• Uber — Where drivers should hang out
• Netflix — movie recommendations
• Ebola epidemic — Mobile mapping in Senegal to
fight disease
13. Big Data — what exactly does it mean?
Big Data: datasets whose size is beyond the ability of
typical database software tools to capture, store,
manage, and analyze
14. Big Data — brief history
• Trend “started” in 2005 (Hadoop!)
• Web 2.0 - Majority of content is created by users
• Mobile accelerates this — data/person skyrockets
19. The data science process
Let’s come back to LinkedIn’s evolution in 2006 and
examine it using a typical* data science approach.
• Frame the question
• Collect the raw data
• Process the data
• Explore the data
• Communicate results
20. Case: Frame the question
What questions do we want to answer?
21. Case: Frame the question
• What connections (type and number) lead to higher
user engagement?
• Which connections do people want to make but are
currently limited from making?
• How might we predict these types of connections
with limited data from the user?
22. Case: Collect the data
What data do we need to answer these questions?
23. Case: Collect the data
• Connection data (who is connected to whom?)
• Demographic data (what is the profile of each connection?)
• Retention data (which users stay and which leave?)
• Engagement data (how do they use the site?)
24. Case: Process the data
How is the data “dirty” and how can we clean it?
25. Case: Process the data
• User input
• Redundancies
• Feature changes
• Data model changes
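To make the first two sources of "dirty" data concrete, here is a minimal sketch of a cleaning step. The field names (`full_name`, `company`) and the sample rows are hypothetical and chosen only for illustration; real cleaning rules depend on the dataset.

```python
def clean_records(records):
    """Normalize free-text fields and drop duplicate rows."""
    seen = set()
    cleaned = []
    for rec in records:
        # User input: collapse stray whitespace and unify capitalization.
        name = " ".join(rec["full_name"].split()).title()
        company = " ".join(rec["company"].split()).lower()
        key = (name, company)
        # Redundancies: skip rows that match one we already kept.
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"full_name": name, "company": company})
    return cleaned


# Hypothetical raw rows: the first two describe the same person.
raw = [
    {"full_name": "  jared   jones", "company": "Thinkful "},
    {"full_name": "Jared Jones", "company": "thinkful"},
    {"full_name": "Paul Gu", "company": "THINKFUL"},
]
cleaned = clean_records(raw)
print(cleaned)
```

Feature and data model changes usually need more than this, for example mapping old field names to new ones before the rows can be compared.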
26. Case: Explore the data
What are the meaningful patterns in the data?
27. Case: Explore the data
• Triangle closing (or triadic closure)
• Time overlaps
• Geographic clustering
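Triangle closing can be sketched in a few lines: if two people share several connections but are not yet connected, suggest closing the triangle. This is an illustrative sketch only, not LinkedIn's actual method; the graph, the names, and the function `suggest_connections` are hypothetical.

```python
from collections import Counter


def suggest_connections(graph, user):
    """Rank people `user` is not connected to by number of mutual connections."""
    direct = graph.get(user, set())
    mutual_counts = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the user and anyone already connected to them.
            if candidate != user and candidate not in direct:
                mutual_counts[candidate] += 1
    # Most mutual connections first; ties broken alphabetically.
    return sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical undirected network stored as adjacency sets.
graph = {
    "ana": {"ben", "cy"},
    "ben": {"ana", "cy", "dee"},
    "cy": {"ana", "ben", "dee", "eli"},
    "dee": {"ben", "cy"},
    "eli": {"cy"},
}
suggestions = suggest_connections(graph, "ana")
print(suggestions)
```

Here "dee" ranks first for "ana" because they share two connections, while "eli" shares only one.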
29. Case: Communicate results
• Tell story at the right technical level for each audience
• Make sure to focus on What's In It For You (WIIFY!)
• Be objective, don’t lie with statistics
• Be visual! Show, don’t just tell
30. Tools to explore “big data”
• SQL Queries
• Business Analytics Software
• Machine Learning Algorithms
31. Tool #1: SQL queries
SQL is the standard query language for accessing and
manipulating relational databases
32. SQL example
friends
id  full_name     age
1   Dan Friedman  24
2   Jared Jones   27
3   Paul Gu       22
4   Noel Duarte   73
SELECT full_name FROM friends WHERE age = 73;
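One way to try the query without installing a database is Python's built-in `sqlite3` module. The sketch below rebuilds the `friends` table in memory and runs the same SELECT. The column types are an assumption, since the slide shows only the values.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE friends (id INTEGER PRIMARY KEY, full_name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO friends (id, full_name, age) VALUES (?, ?, ?)",
    [
        (1, "Dan Friedman", 24),
        (2, "Jared Jones", 27),
        (3, "Paul Gu", 22),
        (4, "Noel Duarte", 73),
    ],
)

# Same query as on the slide, with the age passed as a parameter.
rows = conn.execute("SELECT full_name FROM friends WHERE age = ?", (73,)).fetchall()
names = [row[0] for row in rows]
print(names)
conn.close()
```

Only one row has age 73, so the query returns a single name.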
33. Tool #2: Analytics software
Business analytics software connects to your database,
enabling you to find and communicate insights visually
35. Tool #3: Machine Learning Algorithms
Machine learning algorithms provide computers
with the ability to learn without being explicitly
programmed — “programming by example”
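"Programming by example" can be illustrated with one of the simplest learning methods, a nearest-neighbor classifier: instead of writing rules, we hand the program labeled examples and let it copy the label of the closest one. The features (mutual connections, shared employers) and the labels below are hypothetical and chosen only to echo the LinkedIn case.

```python
import math


def nearest_neighbor_predict(examples, point):
    """Return the label of the training example closest to `point`."""
    best_label, best_dist = None, math.inf
    for features, label in examples:
        dist = math.dist(features, point)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical training data:
# (mutual connections, shared employers) -> did the two users connect?
examples = [
    ((0, 0), "no"),
    ((1, 0), "no"),
    ((5, 1), "yes"),
    ((8, 2), "yes"),
]
prediction_a = nearest_neighbor_predict(examples, (6, 1))
prediction_b = nearest_neighbor_predict(examples, (1, 1))
print(prediction_a, prediction_b)
```

No rule such as "more than three mutual connections means yes" is ever written down; the behavior comes entirely from the examples.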
39. I’m in! Where do I start?
• Knowledge of statistics, algorithms, & software
• Comfort with languages & tools (Python, SQL,
Tableau)
• Inquisitiveness and intellectual curiosity
• Strong communication skills
40. Ways to keep learning
[Chart: learning options plotted by structure (less to more) and support (less to more)]
41. 1-on-1 mentorship enables flexibility
325+ mentors with an average of 10
years of experience in the field
42. Support ‘round the clock
You
Your mentor
Q&A Sessions
In-person workshops
Career coach
Slack
Program Manager
43. Want to try us/data science out?
Talk to us now or be on the lookout for our email 📬
Thinkful’s Data Science
Prep Course covers:
- Python fundamentals
- Statistics
- Data science concepts
- Capstone project
$250 for 3 weeks