This document provides an overview of machine learning, artificial intelligence, and data science from the perspective of a data scientist and product manager. It defines key terms like machine learning, deep learning, and artificial intelligence. It also discusses how data science can be thought of as a process, as people working in teams, as products powered by machine learning, and as services. The document emphasizes the importance of responsibility in developing intelligent technologies and products, including fairness, explainability, accuracy, and privacy. It provides recommendations for evaluating machine learning services and recommends approachable resources for learning more about data science, machine learning, and AI.
Understanding Products Driven by Machine Learning and AI: A Data Scientist's Perspective
1. @amcasari
Understanding Products Driven
by Machine Learning and AI:
A Data Scientist’s Perspective
A.M. Casari
Principal Product Manager + Data Scientist
Concur Labs @ SAP Concur
2. @amcasari
here to there via random walk
senior data scientist
@ SAP Concur
control systems
engineering +
robotics + legos
officer in USN
operations research
analyst
wandering dirtbag +
conservation volunteer
EE + applied math
+ complex systems
underwater robotics
consultant
extraordinaire
SAHM
product + data
@ Concur Labs
4. @amcasari
when we say…
DATA SCIENCE
§ We mean….the interdisciplinary intersection of methods,
processes, algorithms and problem solving techniques to
extract knowledge from data¹
MACHINE LEARNING [ML]
§ We mean….the statistical class of algorithms which allow us
to systematically improve a computer’s ability to perform a
given task²
DEEP LEARNING [DL]
§ We mean….the family of ML methods based on
learning data representations³
ARTIFICIAL INTELLIGENCE [AI]
§ We mean….when a machine mimics cognitive functions
usually observed in animals, such as problem solving and
creativity⁴
hint: AI has not happened yet… +
our community is well represented in Wikipedia
6. @amcasari
when we say
products are…
DATA DRIVEN
§ We mean….product strategy and engineering decisions
are made by qualitative + quantitative analysis of data
INTELLIGENT
§ We mean….users interact with features that thoughtfully
and seamlessly balance context and useful information
FUELED BY MACHINE LEARNING
§ We mean….somewhere in the backend, someone is
using data with some kind of predictor. More or less.
AUGMENTED
§ We mean….intelligent products which guide users through
a new experience without distracting from their purpose
11. @amcasari
when to move on?
“Models are not right or wrong; they're
always wrong. They're always approximations.
The question you have to ask is whether a
model tells you more information than you
would have had otherwise. If it does, it's
skillful.”
- Gavin Schmidt’s excellent TED Talk
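One way to make "more information than you would have had otherwise" concrete is a skill score that compares a model's error against a naive baseline. The sketch below is an illustration only: the data is hypothetical, and the choice of mean squared error and of a "predict the mean" baseline are assumptions, not something the talk prescribes.

```python
from statistics import mean


def mean_squared_error(actual, predicted):
    """Average squared difference between observed and predicted values."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


def skill_score(actual, model_predictions, baseline_predictions):
    """1.0 is a perfect model, 0.0 matches the baseline, negative is worse than it."""
    baseline_error = mean_squared_error(actual, baseline_predictions)
    if baseline_error == 0:
        raise ValueError("baseline is already perfect; skill is undefined")
    return 1.0 - mean_squared_error(actual, model_predictions) / baseline_error


# Hypothetical observations, plus a naive baseline that always predicts their mean.
actual = [10.0, 12.0, 14.0, 16.0]
baseline = [mean(actual)] * len(actual)
model = [11.0, 12.0, 13.0, 16.0]  # imperfect, but closer than the baseline

score = skill_score(actual, model, baseline)
print(f"skill score: {score:.2f}")
```

A score above zero means the model, while still "wrong", tells you more than the baseline did. A score at or below zero is a reasonable signal to move on.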
12. @amcasari
be
responsible
technologists
§ Algorithmic Accountability Review
§ Responsibility
§ Explainability
§ Accuracy
§ Auditability
§ Fairness
§ Example Guiding Questions
§ How could this go south?
§ What social constructs am I modeling implicitly or
explicitly?
§ What are the impacts of the choices I have made in my
data modeling + feature selection?
§ Could the deployment of this work negatively impact a
subset of my users?
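The last guiding question can be checked, at least roughly, by breaking a model's error rate out per user group instead of looking only at the overall number. The sketch below is a minimal example under stated assumptions: the groups "A" and "B", the labels, and the predictions are all hypothetical, and error rate is only one of several metrics worth comparing.

```python
from collections import defaultdict


def error_rate_by_group(records):
    """Return {group: fraction of records where prediction != label}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, label, prediction in records:
        totals[group] += 1
        if label != prediction:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Hypothetical (group, true label, predicted label) triples.
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 1),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 0), ("B", 0, 1),
]

rates = error_rate_by_group(records)
overall = sum(label != pred for _, label, pred in records) / len(records)
for group, rate in sorted(rates.items()):
    print(f"group {group}: error rate {rate:.2f} (overall {overall:.2f})")
```

In this made-up data the overall error rate hides the fact that one group sees three times as many errors as the other, which is the kind of gap the question is asking about.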
15. @amcasari
data science as
a team sport
v1
Cross Functional Team
Data Scientist Team
Cross Functional Team
Cross Functional Team
16. @amcasari
data science as
a team sport
v2
needs - “define the primary stages of leveraging Big Data with stakeholders representing the
domain. analysts usually drive from discovery toward integration, while the engineers tend to
drive from systems toward integration
NB: effective, hands-on management in Data Science must live in the space of integration, not
delegate it”
roles - “leverage different disciplines, opportunities, and risks... there’s great power in
pairing people with complementary skills, in team environments where they can recognize
each other’s priorities and perspectives
blurring these roles is wonderful... however, when businesses get into trouble, they also
tend to “push down” these roles, blurring boundaries in ways which stresses teams and
limits scalability”
diagram and description courtesy of Paco Nathan
17. @amcasari
data science as
a team sport
vNOW
Advanced Engineering Team
Data Science Team Cross Functional Team Cross Functional Team
Research Team Applied Research Team
20. @amcasari
data products
vNEXT
diagrams and description courtesy of Paco Nathan
The playbook on this is being written now…
personal digital stylists via StitchFix
augmented writing via Textio
artwork generation via Netflix
games to teach computers via Google
21. @amcasari
be
responsible
companies
§ Design for Fairness
§ Design for Accountability
§ Design for Transparency
§ Design for Privacy
§ Design for Ethics
§ Example Guiding Questions
§ Who is responsible if users are harmed by this product?
§ Who will have the power to decide on necessary changes
to the algorithmic system during design stage, pre-launch,
and post-launch?
§ How much of your system / algorithm can you explain to
your users and stakeholders?
§ What are realistic worst case scenarios in terms of how
errors might impact society, individuals, and stakeholders?
23. @amcasari
comparing
services +
vendors
§ Why are you asking for my data?
§ Cold-start versus warm-start
§ Evaluation comparison
§ How can your models work for me?
§ Feature Transfer
§ Define results by your business value, not their
metrics
§ All your services should uphold your data science
standards: Fairness, Accountability, Transparency,
Privacy, Ethics
§ What questions should I be asking about their
processes?
§ Privacy > GDPR compliant?
§ Where does the data live?
§ Where do your services live?
§ Who owns the trained models once they are trained?
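"Define results by your business value, not their metrics" can be sketched as scoring each vendor's predictions with your own per-outcome values rather than with accuracy alone. Everything below is hypothetical: the vendor names, the confusion counts, and the dollar values assigned to each outcome are assumptions made up for illustration.

```python
def accuracy(counts):
    """Share of correct predictions (true positives plus true negatives)."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


def business_value(counts, value_per_outcome):
    """Total value when each outcome type is weighted by what it is worth to you."""
    return sum(counts[outcome] * value_per_outcome[outcome] for outcome in counts)


# Hypothetical value of each outcome to the business (a missed positive is costly).
value_per_outcome = {"tp": 50.0, "fp": -5.0, "fn": -100.0, "tn": 0.0}

# Hypothetical confusion counts from running two vendors on the same 1,000 cases.
vendors = {
    "vendor_a": {"tp": 60, "fp": 10, "fn": 40, "tn": 890},
    "vendor_b": {"tp": 90, "fp": 80, "fn": 10, "tn": 820},
}

for name, counts in vendors.items():
    print(
        f"{name}: accuracy {accuracy(counts):.2f}, "
        f"business value {business_value(counts, value_per_outcome):.0f}"
    )
```

With these assumed values, the vendor with the higher accuracy comes out behind once outcomes are weighted by what they cost or earn, so the comparison depends on your numbers, not the vendor's headline metric.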
24. @amcasari
evaluating
during buy or
build
§ Do you have a defensible moat around
this data?
§ How long of a project runway would you
have to build a team?
§ Do you have internal resources you
could leverage to build out a new team?
§ As this project/product scales, will the
cost of the services keep up with your
ARR?
§ What future-thinking, vertical specific
brainshare are you paying someone else
to gain?
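The ARR question above comes down to simple arithmetic once a pricing model is assumed. The sketch below assumes a fixed annual platform fee plus a per-call price. The pricing structure, the function name, and every number are hypothetical and stand in for whatever a real vendor contract says.

```python
def service_cost_share(customers, arr_per_customer, calls_per_customer,
                       price_per_call, fixed_annual_fee):
    """Fraction of annual recurring revenue (ARR) spent on the external service."""
    arr = customers * arr_per_customer
    annual_cost = fixed_annual_fee + customers * calls_per_customer * price_per_call
    return annual_cost / arr


# Hypothetical pricing and usage assumptions, held constant as the product scales.
assumptions = dict(
    arr_per_customer=1200.0,
    calls_per_customer=10_000,
    price_per_call=0.01,
    fixed_annual_fee=24_000.0,
)

for customers in (100, 1_000, 10_000):
    share = service_cost_share(customers, **assumptions)
    print(f"{customers:>6} customers: service cost is {share:.1%} of ARR")
```

Under these assumptions the fixed fee dominates at small scale and the per-call cost dominates later, so the share of ARR levels off instead of falling to zero. Swapping in tiered or usage-growing pricing changes the curve, which is the point of running the numbers before choosing buy or build.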
25. @amcasari
Choose Your Own
Educational Adventure
Data science / ML / AI needs everyone
Approachable Resource Recommendations
Books!
• Python for Data Analysis, Wes McKinney
• Doing Data Science, Cathy O’Neil + Rachel
Schutt
• Data Science from Scratch, Joel Grus
• Machine Learning with Python Cookbook, Chris
Albon
MOOCs!
• Machine Learning, by Andrew Ng on Coursera
• Machine Learning Specialization, by Emily Fox +
Carlos Guestrin on Coursera
• fast.ai, by Jeremy Howard + Rachel Thomas