
Deck 92-146 (3)


  1. Getting Started with Data Science, December 2017. Deskhub-main - stake2017!
  2. About me: Jordan Zurowski, Thinkful Community Manager, MA in Industrial/Organizational Psychology.
  3. About you: You already have a career in data; I'm interested in switching into a data career; I just want to see what all the fuss is about.
  4. About Thinkful: Thinkful helps people become developers or data scientists through 1-on-1 mentorship and project-based learning. These workshops are built using this approach.
  5. Today's Goals: What is data science? How and why has the field emerged? What do data scientists do? Next steps.
  6. Example: LinkedIn 2006. “[LinkedIn] was like arriving at a conference reception and realizing you don’t know anyone. So you just stand in the corner sipping your drink, and you probably leave early.” - LinkedIn Manager, June 2006
  7. Enter: Data Scientist Jonathan Goldman. Joined LinkedIn in 2006, when it had only 8M users (450M in 2016); started experiments to predict people’s networks; engineers were dismissive: “you can already import your address book”.
  8. The Result
  9. Other Examples: Uber (where drivers should hang out); Tala (microfinance loan approval).
  10. Why now? Big Data: datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
  11. Brief history of "big data": the trend "started" in 2005; Web 2.0: the majority of content is created by users; mobile accelerates this: data per person skyrockets.
  12. Big Data: “90% of the data in the world today has been created in the last two years alone” - IBM, May 2013
  13. The Problem
  14. The Solution
  15. Data Scientists - Jack of All Trades
  16. Data Science is just the beginning: “The United States alone faces a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts to analyze big data and make decisions based on their findings.” - McKinsey
  17. The Process - LinkedIn Example: Frame the question; Collect the raw data; Process the data; Explore the data; Communicate results.
  18. Case: Frame the Question. What questions do we want to answer?
  19. Case: Frame the Question. What connections (type and number) lead to higher user engagement? Which connections do people want to make but are currently limited from making? How might we predict these types of connections with limited data from the user?
  20. Case: Collect the Data. What data do we need to answer these questions?
  21. Case: Collect the Data. Connection data (who is connected to whom?); demographic data (what is the profile of the connection?); engagement data (how do they use the site?).
  22. Case: Process the Data. How is the data “dirty” and how can we clean it?
  23. Case: Process the Data. Sources of dirty data: user input; redundancies; feature changes; data model changes.
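
Two of the issues named on this slide, messy user input and redundancies, can be illustrated with a minimal Python sketch. The field names (`name`, `email`), the sample rows, and the choice of email as the deduplication key are all assumptions made for illustration; they are not taken from the LinkedIn case.

```python
def clean_profiles(rows):
    """Normalize free-text fields and drop rows that repeat an email."""
    seen_emails = set()
    cleaned = []
    for row in rows:
        # User input: trim whitespace and unify letter case.
        email = row["email"].strip().lower()
        name = " ".join(row["name"].split()).title()
        # Redundancies: keep only the first row seen for each email.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Hypothetical raw rows, invented for this example.
raw_rows = [
    {"name": "  sam   rivera ", "email": "Sam@Example.com "},
    {"name": "Sam Rivera", "email": "sam@example.com"},
    {"name": "kim lee", "email": "kim@example.com"},
]

cleaned_rows = clean_profiles(raw_rows)
print(cleaned_rows)
```

Feature changes and data model changes usually need case-by-case handling, so they are left out of this sketch.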
  24. Case: Explore the Data. What are the meaningful patterns in the data?
  25. Case: Explore the Data. Patterns: triangle closing; time overlaps; geographic overlaps.
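
"Triangle closing" can be read as: if two people share connections but are not yet connected themselves, suggest that they connect. The sketch below is one simple way to express that idea, assuming a small hypothetical graph stored as a dictionary of sets. It is not LinkedIn's actual algorithm, and the names `suggest_connections` and `graph` are invented for illustration.

```python
from collections import Counter


def suggest_connections(graph, user, top_n=3):
    """Rank non-connections of `user` by number of shared connections."""
    friends = graph.get(user, set())
    mutual_counts = Counter()
    for friend in friends:
        for candidate in graph.get(friend, set()):
            # Skip the user and anyone they are already connected to.
            if candidate != user and candidate not in friends:
                mutual_counts[candidate] += 1
    # Most shared connections first; ties broken alphabetically.
    ranked = sorted(mutual_counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical connection data: each person maps to their connections.
graph = {
    "ana": {"ben", "cho"},
    "ben": {"ana", "cho", "dev"},
    "cho": {"ana", "ben", "dev", "eli"},
    "dev": {"ben", "cho"},
    "eli": {"cho"},
}

suggestions = suggest_connections(graph, "ana")
print(suggestions)  # [('dev', 2), ('eli', 1)]
```

Here "dev" ranks first because connecting "ana" and "dev" would close two triangles (through "ben" and through "cho").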
  26. Case: Communicate Findings. How do we communicate this? To whom?
  27. Case: Communicate Findings. The “People You May Know” feature increased clickthrough by 30% (generating millions more page views).
  28. Tools: SQL queries; business analytics software; machine learning algorithms.
  29. #1: SQL Queries. SQL is the standard querying language to access and manipulate databases.
  30. #1: SQL Queries. SELECT full_name FROM friends WHERE age > 22;
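
To try the query from this slide without installing a database, one option is Python's built-in `sqlite3` module. The table contents below are hypothetical, and the `ORDER BY` clause is an addition to the slide's query so that the result order is predictable.

```python
import sqlite3

# An in-memory database that disappears when the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friends (full_name TEXT, age INTEGER)")

# Hypothetical rows, invented for this example.
conn.executemany(
    "INSERT INTO friends (full_name, age) VALUES (?, ?)",
    [("Ana Ruiz", 21), ("Ben Okafor", 25), ("Cho Min", 30)],
)

# The slide's query, with ORDER BY added for a stable result order.
rows = conn.execute(
    "SELECT full_name FROM friends WHERE age > 22 ORDER BY full_name"
).fetchall()
names = [row[0] for row in rows]
print(names)  # ['Ben Okafor', 'Cho Min']

conn.close()
```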
  31. #2: Visualization Software. Business analytics software for your database, enabling you to easily find and communicate insights visually.
  32. #2: Visualization Software
  33. #3: Machine Learning Algorithms. Machine learning algorithms provide computers with the ability to learn without being explicitly programmed: “programming by example”.
  34. Iris Data Set
  35. Iris Data Set
  36. Iris Data Set
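
The Iris slides are images only, so the following is a sketch of what "programming by example" could look like on that data, not a reconstruction of the slides. It uses a nearest-centroid classifier (a deliberately simple technique chosen for this example) trained on nine rows from the classic Iris data set: sepal length, sepal width, petal length, petal width in cm, plus the species. The flowers being classified at the end are hypothetical measurements.

```python
import math

# Nine rows from the classic Iris data set (three per species).
TRAINING_ROWS = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
    ((7.1, 3.0, 5.9, 2.1), "virginica"),
]


def fit_centroids(rows):
    """Learn one average measurement vector (centroid) per species."""
    sums, counts = {}, {}
    for features, label in rows:
        totals = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            totals[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [total / counts[label] for total in totals]
        for label, totals in sums.items()
    }


def predict(centroids, features):
    """Return the species whose centroid is closest to `features`."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


centroids = fit_centroids(TRAINING_ROWS)

# Hypothetical new flowers; no species-specific rules were written by hand.
print(predict(centroids, (5.0, 3.4, 1.5, 0.2)))  # setosa
print(predict(centroids, (6.5, 3.0, 5.5, 1.8)))  # virginica
```

The program never contains a rule such as "short petals mean setosa"; it derives the species boundaries from labeled examples, which is the point of the slide. `math.dist` requires Python 3.8 or later.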
  37. Use Cases for Machine Learning. Classification: predict categories; regression: predict values; anomaly (fraud) detection: find unusual occurrences; clustering: discover structure.
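
As a small illustration of the anomaly detection use case, the sketch below flags values that sit far from the average, measured in standard deviations (a z-score). The transaction amounts and the threshold of 2.0 are assumptions chosen for this example. Real fraud detection systems are far more involved.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # All values are identical, so nothing stands out.
    return [v for v in values if abs(v - mean) / spread > threshold]


# Hypothetical purchase amounts for one customer.
amounts = [20, 22, 19, 21, 23, 20, 250]

anomalies = find_anomalies(amounts)
print(anomalies)  # [250]
```

One caveat of this simple approach: a large outlier inflates the mean and the standard deviation, so with very small samples it can mask itself. Median-based measures are a common, more robust alternative.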
  38. It may seem like a daunting opportunity
  39. But if you're interested... Knowledge of statistics, algorithms, and software; comfort with languages and tools (Python, SQL, Tableau); inquisitiveness and intellectual curiosity; strong communication skills. It’s all teachable!
  40. Ways to keep learning
  41. For aspiring developers... Source: Bureau of Labor Statistics
  42. 92% of grads placed in full-time tech jobs; job guarantee. Link for the third-party audit jobs report: Thinkful's track record of getting students jobs.
  43. Our students receive unprecedented support: 1-on-1 Learning Mentor; 1-on-1 Career Mentor; Program Manager; San Diego Community; You.
  44. 1-on-1 mentorship enables flexible learning: learn anywhere, anytime, and on your own schedule. You don't have to quit your job to start your career transition.
  45. Thinkful's Free Resource: Introduction to Python, Data Visualization, and Stats course. Unlimited mentor-led Q&A sessions. Personal Program Manager.