[DSC Europe 22] The Future of Data Science Education - Jose Portilla



In this talk, we'll discuss the current state of Data Science and Machine Learning education and what is needed to build a future educational model that works for everyone, including new students and enterprise clients who need to upskill their modern workforce. Topics discussed will include an overview of the current landscape in data science education and how modern technologies allow true scalability in the education space.


  1. 1. The Future of Data Science Education Speaker: Jose Marcial Portilla
  2. 2. About Me ● Jose Marcial Portilla ○ Founder of Pierian Training ■ Programming, Data Science and Machine Learning B2C and B2B Education ○ Instructor on Udemy ■ Teaching over 3 million students through self-paced on-demand courses.
  3. 3. About this Talk ● We’ll be discussing Data Science education. ● Before we discuss the Future of Data Science Education, we’ll need to understand: ○ The need for Data Science education and the present in Data Science education.
  4. 4. One quick note before we continue… ● This presentation stems from an experience rooted in the United States, which has some unique characteristics in higher education when compared to most other countries!
  5. 5. The Need for Data Science Education
  6. 6. Why Data Science Education? ● Why specifically do we need an education designated as “Data Science”? ● Why is “Computer Science” or “Statistics” not enough?
  7. 7. The Need for Data Science Education ● Motivation for Data Science Education: ○ Data science is a uniquely multidisciplinary field. ○ Data science will be a necessary skill set for a modern workforce. ○ There is a huge need for data scientists across many industries.
  8. 8. Data Science is a Multidisciplinary Field ● Data Science skills are at a unique intersection of Computer Science and Statistics. ● In many formal academic institutions Computer Science falls under the School of Engineering while Statistics is under the Department of Mathematics.
  9. 9. Data Science is a Multidisciplinary Field ● Most students receive higher education for future employment opportunities, and employers looking for Data Scientists benefit from degrees with a clear signal of applicable skills.
  10. 10. Data Science is a Multidisciplinary Field ● While “learning on the job” will always be a critical part of the transition from student to employee, many of the day-to-day skills used by practitioners can be taught before employment in the workforce.
  11. 11. Data Science is a Multidisciplinary Field ● Many current data scientists majored in a STEM field, but not directly labeled as “Data Science”. ● You’ve probably met quite a few physics majors who became data scientists! ● If our goal is to help students become employed as data scientists, we need a way to clearly signal to employers that the student has the necessary skill set to be a successful data scientist.
  12. 12. Modern Workforce Skills ● The necessary skill set of the modern workforce is rapidly changing across all industries. ● What “technical skills” were on a resume from just 20 years ago?
  13. 13. Modern Workforce Skills ● Previous technical skills like “Microsoft Office” are now assumed to be known for any applicant. ● In the future, programming skills such as “Python” could also eventually become assumed, even for non-developer positions.
  14. 14. Organizations are Adopting AI Rapidly ● According to a 2022 IBM Report, 35% of companies report using AI in their business, with an additional 42% exploring using AI. ● The number one barrier to AI adoption reported was “limited AI skills, expertise or knowledge”
  15. 15. Organizations are Adopting AI Rapidly ● Data Science and AI represent unique ethical considerations in their implementation as well! ● Typically degrees in just Computer Science or just Statistics won’t cover any AI Ethics topics.
  16. 16. Organizations are Adopting AI Rapidly ● In the 2022 IBM AI Report, 74% of companies using AI reported they had not taken any steps to ensure trustworthy or responsible AI, such as reducing bias.
  17. 17. Digital Skills Readiness ● Salesforce has conducted a Digital Skills Index based on 23,000+ workers across 19 countries, including topics such as the future of work, job readiness, and continuous learning.
  18. 18. Digital Skills Readiness ● According to the Salesforce Digital Skills Index, only 7% of Gen Z respondents believed they had digital skills in AI and only 20% believed they had coding skills.
  19. 19. Digital Skills Readiness ● The skills gap comes at a cost: a RAND Europe report estimated that 14 G20 countries could miss out on $11.5 trillion in cumulative GDP growth if the skills gap isn’t addressed.
  20. 20. The Need for Data Science Education ● We’ve explored clear motivations and needs for Data Science Education: ○ Uniquely multidisciplinary field, deserving of its own major. ○ Skills are necessary for the modern workforce. ○ Employers are expanding work done with AI. ○ Employees feel the need to upskill, even those who already work at tech companies.
  21. 21. The Present in Data Science Education
  22. 22. The Present in Data Science Education ● Clearly, we have a need for data science education! ● Before we discuss the future, what is the present situation in data science education?
  23. 23. The Present in Data Science Education ● To understand the current status of Data Science education, let’s discuss: ○ Sources of Knowledge ○ Online Solutions ○ Academic Institutions ○ Certifications and Degrees ○ Bootcamp Programs
  24. 24. How do developers learn to code? ● 2022 StackOverflow Developer Survey
  25. 25. Online Resources ● Websites such as Udemy can provide meaningful content, directly applicable to current employer needs at an affordable price. ● Instructors have flexibility to have course materials directly relate to real-world job tasks, rather than academic interests.
  26. 26. Online Resources ● Online resources and websites can also reach huge scale, allowing for millions of students to learn with just an internet connection. ● Many of the online resources are free and high quality, including the documentation of many data science libraries.
  27. 27. Limitations of Online Options ● While online resources are a great and popular option, they do have limitations if they are purely self-paced video courses. ● Purely online self-paced VOD courses assume a “one size fits all” approach.
  28. 28. Limitations of Online Options ● Self-paced online learning can be great for motivated individuals, but intimidating for newer, less technical students. ● Often platforms don’t have good one-on-one communication systems in place for individual student support.
  29. 29. Academic Institutions ● Often the process of introducing new majors or updating course curriculums in a University can be slow. ● However many Universities are now offering graduate degrees (most commonly MS programs) in Data Science.
  30. 30. Academic Institutions ● Even in the digital age, more than half of developers report they learned to code in school. ● However, as we’ve discussed, Data Science is more than just learning to code, so there is a strong need for Undergraduate students to have the option to major in Data Science.
  31. 31. Academic Institutions ● Fortunately some academic institutions such as Stanford University have recently introduced “Data Science” as an undergraduate degree.
  32. 32. Academic Institutions ● Unfortunately while academic institutions such as universities solve many of the problems of online learning, such as providing tutoring, class setting, or individualized support, they are not scalable to large audiences.
  33. 33. Academic Institution Certification ● Some Universities are trying to lend their brand to certification programs, but they often outsource the actual teaching to 3rd party providers.
  34. 34. Academic Institution Issues ● The top universities in the USA with Data Science programs are also extremely competitive and don’t increase the size of their student pool, especially in relation to their resources.
  35. 35. Academic Institution Issues ● For example, Stanford University has an endowment of $36.6 Billion, with a 5-year annualized investment return of 10.9%. ● Certain institutions seem to act more like hedge funds that happen to also teach classes!
  36. 36. Academic Institution’s Online Resources ● There is a movement to release more of the information and resources available to students at the university to the entire public. ● A leader in this space has been MIT, releasing virtually all MIT content via MIT OpenCourseware.
  37. 37. Bootcamps ● Another phenomenon that has emerged is the “Coding Bootcamp”, where small groups of students have intensive training in data science topics for about 3 months, with the explicit goal of employment in the field.
  38. 38. Bootcamps ● Many bootcamps have been experimenting with ISAs - Income Sharing Agreements. ● ISAs allow students to pay nothing up-front, instead agreeing to pay a percentage of their salary only once they get a job.
  39. 39. Bootcamps and ISAs ● While at first ISAs may seem like a great solution, directly aligning a bootcamp’s interest with student outcomes, it can be unclear how the ISA actually operates behind the scenes.
  40. 40. Bootcamps and ISAs ● Bootcamps often attach many stipulations to the ISA, including accepting a job in any geographic location or forfeiting the original ISA contract and then having to pay the initial bootcamp fee amount.
  41. 41. Bootcamps and ISAs ● It is also commonplace for bootcamps to package and sell groups of ISAs to 3rd parties as income-producing assets, immediately offloading the risk of the ISA. ● Students are almost never informed of this practice!
  42. 42. The Present of Data Science Education ● While online resources, academic institutions, bootcamps, and other resources have made huge strides in focusing on data science, there are still many trade-offs made in choosing one of the present options.
  43. 43. The Future of Data Science Education
  44. 44. The Future of Data Science Education ● As we look to the future of education, particularly in the field of data science, we should leverage data science to create a better future for data science education and education in general.
  45. 45. The Future of Data Science Education ● The future of data science education will merge the best aspects of the present education options, allowing for a balance between scalability and personalization!
  46. 46. The Future of Data Science Education ● Developments in the Future of Data Science Education: ○ Individualized Support with AI ○ AI Powered Adaptive Learning Software ○ Automated Testing and Grading ○ Automated Course Curriculums ○ Built-in Education inside Software ○ Hybrid Cohort Models
  47. 47. Individualized Support with AI ● Huge improvements in Large Language Models (LLMs) will allow for the ability to have a data science tutor instantaneously and in your pocket.
  48. 48. Individualized Support with AI ● Google is developing AI-powered learning platforms, which are currently being tested at universities, such as Southern New Hampshire University (SNHU). ● Educators build competency skills graphs that feed the platform, which then uses AI to auto-generate learning activities for students.
  49. 49. Individualized Support with AI ● Even today in the testing period, the Google AI-powered tutor offers four types of learning activities that students can choose from: ○ Short-answer questions ○ Multiple-choice questions ○ Paraphrasing practice ○ Guided note-taking
  50. 50. Individualized Support with AI ● Sundar Pichai is a huge proponent of using AI in the educational space, and believes in a future where you have “a personal tutor in your pocket”. ● So expect to see continued efforts on applying Google ML to education for all.
  51. 51. Individualized Support with AI ● We’ve also seen recent developments from Meta, as they have just released Galactica, an LLM specifically designed to deliver informational articles.
  52. 52. AI Powered Adaptive Learning Software ● Adaptive learning is a general term for an educational method which uses computer algorithms to orchestrate the interaction with the learner and deliver customized resources and learning activities to address the unique needs of each learner.
  53. 53. AI Powered Adaptive Learning Software ● Previously, adaptive learning software consisted of predefined steps based on a developer-defined script that reacted to student scores or inputs. ● You may have experienced this “adaptive learning” yourself if you’ve taken certain standardized tests; for example, the GMAT adapts question difficulty based on student answers.
  54. 54. AI Powered Adaptive Learning Software ● With a field as complex and multidisciplinary as Data Science, it's not possible to have a developer manually create adaptive learning software. ● Can AI powered systems provide a solution?
  55. 55. AI Powered Adaptive Learning Software ● AI systems powered by LLMs will be able to dynamically interact with students, pushing them to different material as needed based on their progress.
  56. 56. AI Powered Adaptive Learning Software ● Code in Place 2021 was an endeavour by a group of educators and volunteer teachers from all over the world to teach introductory programming to over 12,000 students. ● Based on Stanford University's CS 106A, the course taught the fundamentals of programming in Python.
  57. 57. AI Powered Adaptive Learning Software ● As part of Code in Place 2021, Chelsea Finn, a Stanford professor and A.I. researcher, helped build an automated feedback system, that could review code submitted by students and offer feedback in natural language.
  58. 58. Automated Testing and Grading ● A major component of these AI powered adaptive learning systems will be the ability to automatically test and grade student work. ● Unlike typical automated grading systems, future AI systems will be able to deliver not just natural language feedback, but personalized feedback for a specific user.
  59. 59. Automated Testing and Grading ● The Stanford developed AI feedback system provided 16,000 pieces of feedback, and students agreed with the feedback 97.9 percent of the time, according to a study by the Stanford researchers. ● By comparison, students agreed with the feedback from human instructors 96.7 percent of the time!
  60. 60. Automated Course Curriculums ● These AI systems will then easily expand to creating custom curriculums for students. ● Students will be able to take the version of the course that is best suited to their current needs. ● The future of educational material will be hyper-personalized, and perhaps one student’s CS course may not be the same as another’s!
  61. 61. Automated Course Curriculums ● Highly customized curriculums, tutors, assignments and grading are clearly on the path to being commonplace for all students. ● We already have many hyper-customized offerings, such as song recommendations, so it’s only natural that this extends into Data Science education!
  62. 62. Automated Course Curriculums ● The combination of AI, LLMs, and software will allow future teachers to greatly expand their abilities to teach students about data science. ● We’ve mainly covered the world of classic academic students or consumers looking to learn about data science, what about the world of enterprise data science training?
  63. 63. Hybrid Cohort Models ● Enterprise learners have distinct needs that are often different than typical students. ● Solutions for enterprise clients can also be applied to students who are currently employed but want to learn data science skills to transition careers.
  64. 64. Hybrid Cohort Models ● Typical enterprise training often happens on-site, via instructor-led training sessions. ● While it’s highly beneficial to have an instructor-led session teach students skills, the synchronous nature of the training means that the whole group of students needs to agree on exact dates and times to actually conduct the training.
  65. 65. Hybrid Cohort Models ● On the other end of the spectrum are fully self-paced VOD offerings, which as we’ve discussed, have great benefits, but lack personalization and small class setting dynamics.
  66. 66. Hybrid Cohort Models ● Hybrid cohort models attempt to bring the best aspects of instructor-led training and self-paced video courses into one cohesive framework.
  67. 67. Hybrid Cohort Models ● Hybrid cohorts operate with a small group of people who watch self-paced video content on individual schedules during the week. ● Then on a regular basis the cohort meets with a human (for now) expert who can directly answer questions for the students in the cohort.
  68. 68. Hybrid Cohort Models ● This is one of our most effective enterprise training offerings at Pierian Training. ● Students get the asynchronous flexibility to watch and learn content on their own time, but also get access to instructors to answer questions specific to their needs at work.
  69. 69. Hybrid Cohort Models ● One of the key challenges to developing effective cohort models is the need for a robust library of self-paced content that can stand on its own. ● Only with an effective library can the additional components of group office hour meetings be added in a meaningful way.
  70. 70. Built-in Education inside Software ● Users of GitHub and VS Code are likely aware that the GPT-3 based Codex model has been productized as GitHub Copilot, allowing for robust auto-completion of programming tasks.
  71. 71. Built-in Education inside Software ● Not only should you expect these LLMs to get better at generating code, but these AI-powered assistants will also be able to explain existing code, and offer suggestions for improving existing code.
  72. 72. Built-in Education inside Software ● The latest in AI code completion assistants will also act as a personalized Stack Overflow, allowing the assistant to not just complete code or explain code, but also directly answer natural language questions about how to perform coding tasks.
  73. 73. The Future of Data Science Education ● Current members of academia may feel threatened by advancements in new tools and software that allow students to directly interact with AI. However, the best tools in artificial intelligence don’t simply replace humans; they allow us to focus on being more human, practicing and teaching in a way that only a human can.
  74. 74. Future State of Data Science in Practice ● A key consideration: as the developments discussed in this presentation come to fruition, what will the practice of data science look like? ● We’ve seen such huge strides in NLP systems and LLMs as teaching assistants; will the future of computing and data science be merely conversing with a computer?
  75. 75. Future State of Data Science in Practice ● If we develop the capabilities to teach data science as just discussed, we need to prepare for future frameworks where the main way of operating in the field of data science is through natural language based AI interactions.
  76. 76. Future State of Data Science in Practice ● Even as LLMs and No-Code tools develop, educational frameworks must make sure to cover foundational topics and ethical considerations, before the student becomes over-reliant on AI-based systems.
  77. 77. The Future of Data Science Education ● Key Takeaways: ○ Huge need for data science skill sets and large skills gap in the workforce. ○ Expect to see LLM-powered software really start to grow in the education field. ○ Hybrid learning models will become more normalized, especially among technical topics.
  78. 78. The Future of Data Science Education ● The world needs more data scientists, so let’s use data science itself to create more! ● Thank you to the Data Science Conference for giving me the chance to speak on this topic!
  79. 79. Let’s keep in touch! ● Our Website: ○ ● My LinkedIn: ○ ● My Twitter: ○
  80. 80. Thank You!
  81. 81. Q&A
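
Slide 53 describes earlier adaptive learning software as predefined steps from a developer-defined script that reacts to student scores or inputs. As a rough illustration of that idea only, here is a minimal Python sketch. The class name `ScriptedAdaptiveQuiz`, the 1 to 5 difficulty scale, and the "one step up or down per answer" rule are all assumptions made for this example; they are not taken from the talk and are not how the GMAT or any real product selects questions.

```python
from dataclasses import dataclass, field


@dataclass
class ScriptedAdaptiveQuiz:
    """Hypothetical scripted rule: raise difficulty after a correct
    answer, lower it after a wrong one, clamped to a fixed range."""

    min_level: int = 1   # assumed easiest difficulty
    max_level: int = 5   # assumed hardest difficulty
    level: int = 3       # assumed starting difficulty
    history: list = field(default_factory=list)

    def record_answer(self, correct: bool) -> int:
        """Log the answer at the current level and return the next level."""
        self.history.append((self.level, correct))
        step = 1 if correct else -1
        # Clamp so the level never leaves [min_level, max_level].
        self.level = max(self.min_level, min(self.max_level, self.level + step))
        return self.level


quiz = ScriptedAdaptiveQuiz()
for correct in [True, True, False]:
    quiz.record_answer(correct)
print(quiz.level)  # 4
```

Every branch of this script has to be written by hand, which is the limitation slide 54 points to: for a subject as broad as data science, a fixed rule like this cannot anticipate what each student needs next.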

Editor's Notes

  • By show of hands, how many people have an Undergraduate degree major in Data Science? That’s right, most likely because it did not exist when you were in school!
  • This is changing, but more on that later
  • Pic: Signal wave line
  • Representative sample of 7,502 business decision makers
    – 500 in each country (United States, China, India, UAE, South Korea, Australia, Singapore, Canada, UK, Italy, Spain, France, Germany)
    – 1,000 in Latin America (Brazil, Mexico, Colombia, Argentina, Chile, Peru)
    – Conducted online through Morning Consult’s proprietary network of online providers
  • Respondents represented a mix of small and large firms
    – 32% of respondents came from firms with more than 1,000 employees
    – 27% of respondents came from firms with between 251 and 1,000 employees
    – 20% came from firms with 51–250 employees
    – 21% came from smaller businesses (50 employees or less)
    – Sole proprietorships were not sampled
  • Salesforce’s 2022 Global Digital Skills Index reveals a growing digital skills crisis. This article takes an in-depth look at the findings, based on what 23,000+ workers across 19 countries say about digital skills, including their impact on the future of work, concerns about job readiness, and the significance of continuous learning.
  • Some key unlocks to help rapidly scale personalized, informative, data science education to users
  • Feedback for a specific user, not just
  • Ben Taylor “Jarvis for Everyone”